
Short Option 7

Classical Mechanics
Prof. J. J. Binney
Michaelmas Term 2005
Revised Syllabus
The calculus of variations & Euler-Lagrange equations; principle of least action; equations of motion in strange coordinates; rigid-body dynamics; motion in an electromagnetic field; applications to normal modes; symmetries and Noether's theorem; constrained systems (including Lagrange multipliers). Hamiltonian dynamics: Legendre transformations; Hamilton's equations; applications to harmonic oscillator, rotating coordinates, motion in an e.m. field; Liouville's theorem; Hamilton's principle and connection with quantum mechanics; Poisson brackets; canonical coordinates; generators of canonical maps, symmetries and conserved quantities; canonical transformations; point transformations; action-angle coordinates; Hamilton-Jacobi equation; derivation from quantum mechanics; phase-space volumes and canonical coordinates.
Books
T.W.B. Kibble, Classical Mechanics, Longman Scientific (about £18):
overall the most suitable book.
L.D. Landau & E. Lifshitz, Classical Mechanics, IoP Publishing: one of
the best books in the classic series on theoretical physics.
Tai L. Chow, Classical Mechanics, Wiley, (about £23): a useful source
of additional information but marred by too many typos.
Oliver Johns, Analytical Mechanics for Relativity and Quantum Mechanics,
OUP. As well as elementary stuff this book contains much that
goes beyond the course but should be found stimulating. For your college
library?
V.I. Arnold, Mathematical Methods of Mechanics, Springer: a uniquely
insightful book but too sophisticated for most undergraduates.
1 Lagrangian Mechanics
Mechanics as formulated by Newton suffers from two important limitations: (i) it deals with particles; (ii) it describes their motion in special Cartesian coordinate systems: if the numbers $x_i$ are the coordinates of a particle of mass $m$ in an inertial Cartesian coordinate system, then the position of the particle when subjected to a force with components $f_i(t)$ may be determined by solving the differential equations $m\ddot{x}_i = f_i(t)$. Since an extended body can be decomposed into its constituent particles, and the chain rule can be used to transform the equations of motion from Cartesian coordinates to any reference frame, Newton's machinery enables us to determine the motion of any body in any reference frame notwithstanding these limitations. But in practice it is better to determine the dynamics of complex dynamical systems from a more powerful principle than Newton's laws of motion. Lagrangian dynamics provides just such a principle.
Let $q_i$, $i = 1, \ldots, N$, be generalized coordinates for some system. That is, these $N$ numbers enable us to specify precisely the system's configuration. For example, six numbers suffice to specify a configuration of a rigid body such as a hard-boiled egg: we can take $(q_1, q_2, q_3)$ to be the coordinates in some system, such as spherical polar coordinates, of the body's centre of mass, and $(q_4, q_5, q_6)$ to be the three angles that are required to define its orientation. (Box 1 defines Euler angles, the standard angles for specifying the orientation of a rigid body.) The number of generalized coordinates $N$ required by a system is called the system's number of degrees of freedom.
At each instant our system is at some point in configuration space, an imaginary $N$-dimensional space for which the $q_i$ constitute Cartesian coordinates. As the system moves, its representative point in configuration space sweeps out a path $\mathbf{q}(t)$. Since Newton's laws of motion are 2nd order in time, we expect this path to be uniquely determined by specifying at some time $t_1$ both $\mathbf{q}(t_1)$ and $\dot{\mathbf{q}}(t_1)$. In Lagrangian mechanics we take rather a different point of view: we do not specify $\dot{\mathbf{q}}(t_1)$ but instead specify $\mathbf{q}$ at a second time $t_2$. That is, we ask what path does our system follow if its configuration at time $t_1$ is $\mathbf{q}(t_1)$ and at time $t_2$ is $\mathbf{q}(t_2)$? For reasons that give deep insight into the connection between classical and quantum mechanics, it turns out that the sought-after path $\mathbf{q}(t)$ is the path that extremizes a certain quantity $S$. Our next task is to introduce the mathematical machinery required to define $S$ and to show that it is extremized on the Newtonian path. At the end of the course we shall investigate the connection between the extremization of $S$ and quantum mechanics.
1.1 Paths, functionals & the calculus of variations
Before a plane takes off from New York for London, its computer chooses an optimal path $\mathbf{x}(t)$; i.e., it finds that sequence of longitudes, latitudes and altitudes at each moment $t$ of the flight which, given prevailing winds, will get it to London at the prescribed time with least expenditure of fuel. The quantity of fuel required to get to London in a given time is a single number $F$ that depends on the whole path $\mathbf{x}(t)$; one says that $F$ is a functional $F[\mathbf{x}]$ of the path $\mathbf{x}(t)$.
The simplest functionals are integrals along the path of functions of $\mathbf{x}(t)$ and its derivatives with respect to $t$:
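$$F_1[\mathbf{x}] \equiv \int_{t_1}^{t_2} |\mathbf{x}(t)|^2\,dt, \qquad F_2[\mathbf{x}] \equiv \int_{t_1}^{t_2} |\dot{\mathbf{x}}(t)|^2\,dt, \qquad F_3[\mathbf{x}] \equiv \int_{t_1}^{t_2} \mathbf{x}\cdot\ddot{\mathbf{x}}(t)\,dt.$$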
How do we find the path that minimizes a functional
$$F[\mathbf{x}] \equiv \int_{t_1}^{t_2} f(\mathbf{x}, \dot{\mathbf{x}}, t)\,dt\ ? \tag{1.1}$$
2 Calculus of Variations
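Let $\mathbf{x}(t)$ be the minimizing path and let $\boldsymbol{\eta}(t)$ be a small variation, so that $\mathbf{x}(t) \to \mathbf{x}(t) + \boldsymbol{\eta}(t) \equiv \tilde{\mathbf{x}}(t)$. We insist on $\boldsymbol{\eta}$ vanishing at $t = t_1, t_2$ so that $\mathbf{x}(t)$ and the modified path both start and finish at the same places at the same times. Then (with the dot-product convention of footnote 1)
$$
\begin{aligned}
F[\mathbf{x}] \le F[\tilde{\mathbf{x}}] &= \int_{t_1}^{t_2} f(\mathbf{x}+\boldsymbol{\eta},\, \dot{\mathbf{x}}+\dot{\boldsymbol{\eta}},\, t)\,dt \\
&= \int_{t_1}^{t_2} \left[ f(\mathbf{x}, \dot{\mathbf{x}}, t) + \boldsymbol{\eta}\cdot\frac{\partial f}{\partial \mathbf{x}} + \dot{\boldsymbol{\eta}}\cdot\frac{\partial f}{\partial \dot{\mathbf{x}}} + \cdots \right] dt \\
&= F[\mathbf{x}] + \int_{t_1}^{t_2} \left[ \boldsymbol{\eta}\cdot\frac{\partial f}{\partial \mathbf{x}} + \dot{\boldsymbol{\eta}}\cdot\frac{\partial f}{\partial \dot{\mathbf{x}}} + \cdots \right] dt.
\end{aligned}
\tag{1.2}
$$
We now integrate by parts the second term in the integral of the last line:
$$\int_{t_1}^{t_2} \dot{\boldsymbol{\eta}}\cdot\frac{\partial f}{\partial \dot{\mathbf{x}}}\,dt = \left[ \boldsymbol{\eta}\cdot\frac{\partial f}{\partial \dot{\mathbf{x}}} \right]_{t_1}^{t_2} - \int_{t_1}^{t_2} \boldsymbol{\eta}\cdot\frac{d}{dt}\left( \frac{\partial f}{\partial \dot{\mathbf{x}}} \right) dt. \tag{1.3}$$
Since $\boldsymbol{\eta}(t_1) = \boldsymbol{\eta}(t_2) = 0$, the term in square brackets vanishes. Putting this into (1.2) we have
$$0 \le F[\tilde{\mathbf{x}}] - F[\mathbf{x}] = \int_{t_1}^{t_2} \left\{ \boldsymbol{\eta}\cdot\left[ \frac{\partial f}{\partial \mathbf{x}} - \frac{d}{dt}\frac{\partial f}{\partial \dot{\mathbf{x}}} \right] + \cdots \right\} dt. \tag{1.4}$$
This relation must hold for any $\boldsymbol{\eta}$, no matter how small. So the higher terms indicated by $+\cdots$ can be neglected. The remaining integrand is proportional to $\boldsymbol{\eta}$, so if the integral were non-zero for some particular function $\boldsymbol{\eta}(t)$, it would have the opposite sign for $-\boldsymbol{\eta}$. The inequality on the extreme left would then be violated for one of $\boldsymbol{\eta}$ or $-\boldsymbol{\eta}$. Hence the integral must vanish for all $\boldsymbol{\eta}$. This is possible only if the coefficient of $\boldsymbol{\eta}$ vanishes for all $t_1 < t < t_2$: if it did not vanish for some $t$, say $t'$, the integral would fail to vanish for the particular choice $\boldsymbol{\eta} \propto \delta(t - t')$. So $\mathbf{x}(t)$ can minimize $F$ only if
$$\frac{d}{dt}\frac{\partial f}{\partial \dot{\mathbf{x}}} - \frac{\partial f}{\partial \mathbf{x}} = 0 \tag{1.5}$$
all along the path $\mathbf{x}(t)$.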
Eq (1.5) is called the Euler-Lagrange equation (EL eqn), and the theory that underlies it is called the calculus of variations. It is one of the few results we have in the theory of functionals: everywhere in physics one encounters problems that cry out for a fully fledged calculus of functionals that shows how to integrate, Taylor expand, exponentiate etc. functionals the way we do functions.
Legend has it that the calculus of variations was invented by Newton after dinner one evening to solve this challenge problem (set in 1696 by Johann Bernoulli):
Example 1.1
A bead slides on a smooth wire that passes through two rings, one at the origin, the other at $(x', y', z') = (x_0, 0, -z_0)$ with $z_0 > 0$. To what curve (the brachistochrone) must the wire be bent in order to minimize the time required for the bead to slide from rest at the upper ring to the lower ring?
Solution: The optimal curve obviously lies in the plane $y' = 0$. It is convenient to work in coordinates $(x, y, z)$ such that $z$ increases downwards. Then the time of flight is
$$\tau = \int_0^{z_0} \frac{dz}{\dot{z}}.$$
Footnote 1: We use the convention that $\mathbf{y}\cdot\partial/\partial\mathbf{x} \equiv \sum_i y_i\,\partial/\partial x_i$.
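But $\frac12(\dot{x}^2 + \dot{z}^2) = gz$, so $\dot{z} = \sqrt{2gz/[(dx/dz)^2 + 1]}$ and
$$\tau = \int_0^{z_0} \frac{dz}{\sqrt{2gz}} \sqrt{\left(\frac{dx}{dz}\right)^2 + 1}. \tag{1.6}$$
We need to minimize $\tau[x(z)]$ from (1.6) with respect to the path $x(z)$. We may use the EL-eqn (1.5) provided we make the substitutions
$$t \to z, \qquad f\left(x, \frac{dx}{dz}, z\right) = \frac{1}{\sqrt{2gz}} \sqrt{\left(\frac{dx}{dz}\right)^2 + 1}. \tag{1.7}$$
Since $f$ does not depend on $x$, the optimal path satisfies
$$0 = \frac{d}{dz}\left( \frac{dx/dz}{\sqrt{z}\,\sqrt{(dx/dz)^2 + 1}} \right),$$
which implies
$$x(z) = \int_0^z \sqrt{\frac{Az}{1 - Az}}\,dz,$$
where $A$ is a constant of integration. In terms of the variable $\theta$ defined by $\sin^2\theta \equiv Az$ the answer is
$$x = \frac{1}{A}\left( \theta - \tfrac12 \sin 2\theta \right). \tag{1.8}$$
If we write $\phi \equiv 2\theta$ this may be written $z = (1 - \cos\phi)/2A$, $x = (\phi - \sin\phi)/2A$, which is a cycloid with the origin at its cusp. $A$ may be determined by first solving $x_0/z_0 = (\phi_0 - \sin\phi_0)/(1 - \cos\phi_0)$ for $\phi_0$ and then using this value in $A = \frac12(1 - \cos\phi_0)/z_0$.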
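The last step of Example 1.1, fixing the cycloid through a given lower ring, can be sketched numerically. The block below is only an illustration under stated assumptions: the function names, the bisection bracket and the sample endpoint $(x_0, z_0) = (1, 1)$ are choices made here, not part of the notes. It also uses the descent time $\tau = \phi_0/\sqrt{2gA}$, which follows from $ds/v$ along the cycloid, and compares it with the time taken on a straight wire.

```python
import math

def cycloid_through(x0, z0, g=9.81):
    """Return (phi0, A, tau) for the cycloid from the origin to (x0, z0), z downwards."""
    target = x0 / z0

    def ratio(phi):
        return (phi - math.sin(phi)) / (1.0 - math.cos(phi))

    # ratio(phi) rises monotonically from 0 to infinity on (0, 2*pi), so bisect.
    lo, hi = 1e-3, 2.0 * math.pi - 1e-3
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if ratio(mid) < target:
            lo = mid
        else:
            hi = mid
    phi0 = 0.5 * (lo + hi)
    A = 0.5 * (1.0 - math.cos(phi0)) / z0      # A = (1 - cos phi0) / (2 z0)
    tau = phi0 / math.sqrt(2.0 * g * A)        # descent time along the cycloid
    return phi0, A, tau

def straight_line_time(x0, z0, g=9.81):
    """Time to slide from rest down a straight wire from the origin to (x0, z0)."""
    length = math.hypot(x0, z0)
    return length * math.sqrt(2.0 / (g * z0))

phi0, A, tau_cycloid = cycloid_through(1.0, 1.0)
tau_line = straight_line_time(1.0, 1.0)
x_end = (phi0 - math.sin(phi0)) / (2.0 * A)    # should reproduce x0
z_end = (1.0 - math.cos(phi0)) / (2.0 * A)     # should reproduce z0
```

For this sample endpoint the cycloid time comes out shorter than the straight-wire time, as the variational argument requires.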
1.2 The Principle of Least Action
As was stated above, the path $\mathbf{q}(t)$ taken through configuration space by a dynamical system can be found by identifying the path that extremizes a quantity $S[\mathbf{q}(t)]$ between specified locations $\mathbf{q}(t_1)$ and $\mathbf{q}(t_2)$ of the system at given times $t_1$, $t_2$. $S$ is called the action and is usually (but not invariably) minimized by the dynamical path. Hence the idea that the dynamical path can be determined by extremizing $S$ is called the principle of least action.
$S$ takes the form of an integral over $t$ of a function $L$ of $\mathbf{q}$ and $\dot{\mathbf{q}}$:
$$S = \int_{t_1}^{t_2} L(\mathbf{q}, \dot{\mathbf{q}})\,dt. \tag{1.9}$$
Here $L$ is just a function (rather than a functional) of its arguments. It is called the Lagrangian of the system. Since the dynamical evolution of the system is entirely determined by $L$, writing down $L$ amounts to specifying the physical content of the system.
There is no entirely general rule for writing down $L$ (one would hardly expect one rule to be valid for every possible dynamical system) but there is a rule that works for most simple systems: $L$ is the difference between the system's kinetic energy $T$ and its potential energy $V$;
$$L = T - V. \tag{1.10}$$
Let's see how this works out in a simple case: a particle of mass $m$ moving in a gravitational potential $\Phi(\mathbf{x})$. Now $T = \frac12 m\dot{\mathbf{x}}^2$, $V = m\Phi$. So $L(\mathbf{x}, \dot{\mathbf{x}}) = \frac12 m\dot{\mathbf{x}}^2 - m\Phi(\mathbf{x})$. Setting $f = L$ in the EL equations (1.5) we obtain the equations of motion as
$$\frac{d}{dt}\left(m\dot{\mathbf{x}}\right) + m\frac{\partial \Phi}{\partial \mathbf{x}} = 0 \tag{1.11}$$
as required.
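A small numerical check can make the principle concrete. The sketch below is an illustration only: it assumes one-dimensional vertical motion in a uniform field $\Phi = gz$, a hypothetical flight time of 2 s, and a sine-shaped bump as the trial variation. It evaluates a discretized version of the action (1.9) on the Newtonian path and on paths pushed slightly higher and lower with the same endpoints.

```python
import math

def action(path, t1, t2, m=1.0, g=9.81, n=2000):
    """Estimate S = integral of (T - V) dt for the height function z(t) = path(t)."""
    dt = (t2 - t1) / n
    total = 0.0
    for k in range(n):
        ta, tb = t1 + k * dt, t1 + (k + 1) * dt
        za, zb = path(ta), path(tb)
        velocity = (zb - za) / dt                 # velocity on this segment
        kinetic = 0.5 * m * velocity ** 2
        potential = m * g * 0.5 * (za + zb)       # V = m * Phi with Phi = g * z
        total += (kinetic - potential) * dt
    return total

T_FLIGHT, G = 2.0, 9.81

def newtonian(t):
    """Solution of z'' = -g with z(0) = z(T_FLIGHT) = 0."""
    return 0.5 * G * t * (T_FLIGHT - t)

def bumped(eps):
    """Newtonian path plus a variation that vanishes at both end times."""
    return lambda t: newtonian(t) + eps * math.sin(math.pi * t / T_FLIGHT)

s_true = action(newtonian, 0.0, T_FLIGHT)
s_high = action(bumped(+0.5), 0.0, T_FLIGHT)
s_low = action(bumped(-0.5), 0.0, T_FLIGHT)
```

In this linear potential the first-order change in $S$ vanishes and only the kinetic term contributes at second order, so both varied paths give a larger action than the Newtonian one.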
Exercise (1):
Consider a shell that is fired at $t_1$ and hits its target at $t_2$. Explain in general terms why its action would be larger if it flew on either a higher or a lower trajectory than it actually does.
1.3 Equations of motion from Lagrangians
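The Lagrangian provides a neat way of calculating the eqns of motion of a particle when referred to an odd coordinate system because it is easier to transform a single function to new-fangled coordinates than a set of eqns of motion. Consider, for example, motion in a rotating frame.
Suppose both primed and unprimed coordinates share the same origin, but the primed coordinates rotate with constant angular velocity $\boldsymbol{\omega}$ with respect to the unprimed coordinates, which are inertial. Then
$$\mathbf{v}_{\rm inertial} = \dot{\mathbf{r}}' + \boldsymbol{\omega}\times\mathbf{r}'.$$
So written in terms of the primed coordinates the k.e. is
$$T = \tfrac12 m v^2 = \tfrac12 m\left|\dot{\mathbf{r}}' + \boldsymbol{\omega}\times\mathbf{r}'\right|^2 = \tfrac12 m|\dot{\mathbf{r}}'|^2 + m\dot{\mathbf{r}}'\cdot(\boldsymbol{\omega}\times\mathbf{r}') + \tfrac12 m|\boldsymbol{\omega}\times\mathbf{r}'|^2. \tag{1.12}$$
The p.e. is just $V(\mathbf{r}', t)$ so
$$L = \tfrac12 m|\dot{\mathbf{r}}'|^2 + m\dot{\mathbf{r}}'\cdot(\boldsymbol{\omega}\times\mathbf{r}') + \tfrac12 m|\boldsymbol{\omega}\times\mathbf{r}'|^2 - V. \tag{1.13}$$
In writing down the EL-eqns we recall that $\dot{\mathbf{r}}'\cdot(\boldsymbol{\omega}\times\mathbf{r}') = \mathbf{r}'\cdot(\dot{\mathbf{r}}'\times\boldsymbol{\omega})$. We then find
$$0 = \frac{d}{dt}\frac{\partial L}{\partial \dot{\mathbf{r}}'} - \frac{\partial L}{\partial \mathbf{r}'} = \frac{d}{dt}\left(m\dot{\mathbf{r}}' + m\boldsymbol{\omega}\times\mathbf{r}'\right) - \left[ m\dot{\mathbf{r}}'\times\boldsymbol{\omega} + \frac{\partial}{\partial \mathbf{r}'}\left( \tfrac12 m|\boldsymbol{\omega}\times\mathbf{r}'|^2 - V \right) \right]. \tag{1.14}$$
Collecting everything together we have finally
$$m\ddot{\mathbf{r}}' = -2m\boldsymbol{\omega}\times\dot{\mathbf{r}}' - \frac{\partial V_{\rm eff}}{\partial \mathbf{r}'} \quad\text{where}\quad V_{\rm eff} \equiv V - \tfrac12 m|\boldsymbol{\omega}\times\mathbf{r}'|^2. \tag{1.15}$$
In a rotating frame there is a contribution to the acceleration $\ddot{\mathbf{r}}'$ from the Coriolis force $-2m\boldsymbol{\omega}\times\dot{\mathbf{r}}'$, and the potential needs to be augmented by a term that gives rise to the centrifugal force $m[\mathbf{r}'\omega^2 - \boldsymbol{\omega}(\boldsymbol{\omega}\cdot\mathbf{r}')]$. Forces such as these, which appear because one's frame is non-inertial, are called pseudo-forces.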
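As a sanity check on (1.15), a free particle ($V = 0$) must move in a straight line in the inertial frame, so integrating the rotating-frame equation and rotating the inertial straight line into the primed axes should give the same point. The sketch below does this in the plane normal to $\boldsymbol{\omega}$; the value of `OMEGA`, the initial data and the RK4 integrator are assumptions made for illustration.

```python
import math

OMEGA = 1.3  # assumed angular velocity of the rotating frame about the z axis

def deriv(state):
    """Right-hand side of (1.15) with V = 0, in the plane normal to omega."""
    x, y, vx, vy = state
    ax = 2.0 * OMEGA * vy + OMEGA ** 2 * x     # Coriolis + centrifugal, x part
    ay = -2.0 * OMEGA * vx + OMEGA ** 2 * y    # Coriolis + centrifugal, y part
    return (vx, vy, ax, ay)

def rk4_step(state, h):
    def shifted(s, k, f):
        return tuple(si + f * ki for si, ki in zip(s, k))
    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, 0.5 * h))
    k3 = deriv(shifted(state, k2, 0.5 * h))
    k4 = deriv(shifted(state, k3, h))
    return tuple(s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def rotating_coords(x, y, t):
    """Components along the rotating axes of the inertial point (x, y) at time t."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return (x * c + y * s, -x * s + y * c)

# Inertial initial position and velocity; the frames coincide at t = 0.
x0, y0, ux, uy = 1.0, 0.5, 0.3, -0.2
# Velocity seen in the rotating frame: v_inertial - omega x r'.
state = (x0, y0, ux + OMEGA * y0, uy - OMEGA * x0)

t_end, steps = 2.0, 2000
h = t_end / steps
for _ in range(steps):
    state = rk4_step(state, h)

expected = rotating_coords(x0 + ux * t_end, y0 + uy * t_end, t_end)
error = math.hypot(state[0] - expected[0], state[1] - expected[1])
```

The two answers agree to within the integration error, which confirms the signs of the Coriolis and centrifugal terms.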
A second example illustrates that Lagrangians work even for coordinates that depend explicitly on time. In cosmology it is handy to use comoving coordinates such that the spatial coordinates of particles that move apart as the Universe expands are constant. Let the primed system be inertial and the unprimed system comoving. Then $\mathbf{r}' = a(t)\mathbf{r}$, where $a(t)$ is the cosmic scale factor. So
$$T = \tfrac12 m\dot{\mathbf{r}}'^2 = \tfrac12 m(\dot{a}\mathbf{r} + a\dot{\mathbf{r}})^2. \tag{1.16}$$
Writing the potential energy as $V = m\Phi$ the EL eqns are
$$0 = \frac{d}{dt}\left[ m(\dot{a}\mathbf{r} + a\dot{\mathbf{r}})a \right] - m(\dot{a}\mathbf{r} + a\dot{\mathbf{r}})\dot{a} + m\frac{\partial \Phi}{\partial \mathbf{r}}.$$
Cleaning up we get
$$\ddot{\mathbf{r}} + 2\frac{\dot{a}}{a}\dot{\mathbf{r}} + \frac{\ddot{a}}{a}\mathbf{r} = -\frac{1}{a^2}\frac{\partial \Phi}{\partial \mathbf{r}}. \tag{1.17}$$
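A final example illustrates how to get $T$ in a weird curvilinear coordinate system. Oblate spheroidal coordinates $(u, v, \phi)$ are related to regular cylindrical polars $(R, z, \phi)$ by
$$R = \Delta\cosh u\cos v\ ; \qquad z = \Delta\sinh u\sin v. \tag{1.18}$$
Slightly changing $u$, $v$ and $\phi$ in turn while leaving the other coordinates alone, generates small displacements
$$
\begin{aligned}
\boldsymbol{\delta}_u &= \Delta\,\delta u\,(\sinh u\cos v\,\hat{\mathbf{R}} + \cosh u\sin v\,\hat{\mathbf{z}}) \\
\boldsymbol{\delta}_v &= \Delta\,\delta v\,(-\cosh u\sin v\,\hat{\mathbf{R}} + \sinh u\cos v\,\hat{\mathbf{z}}) \\
\boldsymbol{\delta}_\phi &= R\,\delta\phi\,\hat{\boldsymbol{\phi}}.
\end{aligned}
$$
It is easy to check that these three displacement vectors are mutually perpendicular. So the distance one goes on changing all of $(u, v, \phi)$ simultaneously is
$$
\begin{aligned}
ds^2 &= |\boldsymbol{\delta}_u + \boldsymbol{\delta}_v + \boldsymbol{\delta}_\phi|^2 = \delta_u^2 + \delta_v^2 + \delta_\phi^2 \\
&= \Delta^2\left[ (\delta u)^2(\sinh^2 u\cos^2 v + \cosh^2 u\sin^2 v) + (\delta v)^2(\cosh^2 u\sin^2 v + \sinh^2 u\cos^2 v) + (\delta\phi)^2\cosh^2 u\cos^2 v \right] \\
&= \Delta^2\left\{ (\cosh^2 u - \cos^2 v)\left[(\delta u)^2 + (\delta v)^2\right] + \cosh^2 u\cos^2 v\,(\delta\phi)^2 \right\}.
\end{aligned}
\tag{1.19}
$$
Dividing through by $dt^2$ we get the kinetic energy in terms of $(\dot{u}, \dot{v}, \dot{\phi})$:
$$T = \tfrac12 m\left(\frac{ds}{dt}\right)^2 = \tfrac12 m\Delta^2\left\{ (\cosh^2 u - \cos^2 v)\left[\dot{u}^2 + \dot{v}^2\right] + \cosh^2 u\cos^2 v\,\dot{\phi}^2 \right\}. \tag{1.20}$$
The eqns of motion are therefore
$$
\begin{aligned}
m\Delta^2\left\{ \frac{d}{dt}\left[(\cosh^2 u - \cos^2 v)\dot{u}\right] - \tfrac12\sinh 2u\left(\dot{u}^2 + \dot{v}^2 + \cos^2 v\,\dot{\phi}^2\right) \right\} + \frac{\partial V}{\partial u} &= 0 \\
m\Delta^2\left\{ \frac{d}{dt}\left[(\cosh^2 u - \cos^2 v)\dot{v}\right] - \tfrac12\sin 2v\left(\dot{u}^2 + \dot{v}^2 - \cosh^2 u\,\dot{\phi}^2\right) \right\} + \frac{\partial V}{\partial v} &= 0 \\
m\Delta^2\frac{d}{dt}\left[ \cosh^2 u\cos^2 v\,\dot{\phi} \right] + \frac{\partial V}{\partial \phi} &= 0.
\end{aligned}
$$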
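The kinetic energy (1.20) can be cross-checked against the Cartesian definition $T = \frac12 m|\dot{\mathbf{x}}|^2$. The sketch below is a hedged illustration: `DELTA = 1`, the sample point and the sample rates $(\dot u, \dot v, \dot\phi)$ are arbitrary values chosen here, and the Cartesian velocity is approximated by a central difference.

```python
import math

DELTA = 1.0  # assumed value of the scale parameter in (1.18)

def cartesian(u, v, phi):
    """Cartesian position for oblate spheroidal coordinates, using (1.18)."""
    R = DELTA * math.cosh(u) * math.cos(v)
    z = DELTA * math.sinh(u) * math.sin(v)
    return (R * math.cos(phi), R * math.sin(phi), z)

def kinetic_from_formula(u, v, du, dv, dphi, m=1.0):
    """Kinetic energy from (1.20); it does not depend on phi itself."""
    ch2, c2 = math.cosh(u) ** 2, math.cos(v) ** 2
    return 0.5 * m * DELTA ** 2 * ((ch2 - c2) * (du ** 2 + dv ** 2)
                                   + ch2 * c2 * dphi ** 2)

def kinetic_from_cartesian(u, v, phi, du, dv, dphi, m=1.0, h=1e-6):
    """(1/2) m |dx/dt|^2 with the velocity from a central finite difference."""
    ahead = cartesian(u + h * du, v + h * dv, phi + h * dphi)
    behind = cartesian(u - h * du, v - h * dv, phi - h * dphi)
    speed2 = sum(((a - b) / (2.0 * h)) ** 2 for a, b in zip(ahead, behind))
    return 0.5 * m * speed2

point = (0.7, 0.4, 1.1)     # sample (u, v, phi)
rates = (0.3, -0.5, 0.8)    # sample (du/dt, dv/dt, dphi/dt)
t_formula = kinetic_from_formula(point[0], point[1], *rates)
t_direct = kinetic_from_cartesian(*point, *rates)
```

Agreement of the two numbers is a quick test of the orthogonality claim and of the simplification $\sinh^2 u\cos^2 v + \cosh^2 u\sin^2 v = \cosh^2 u - \cos^2 v$.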
1.4 Lagrangian for a rigid body
Lagrangian dynamics really comes into its own for the dynamics of a rigid body, that is, an object such as a spanner that contains a vast number $N$ of particles that are so strongly coupled to each other that we may consider the distances between them to be fixed. In this approximation, the coordinates of every particle are known as soon as we have determined the six generalized coordinates that are required to specify the position and orientation of the body. Mathematically, if $\mathbf{r}_i$ is the position vector of the $i$th particle, $\mathbf{r}_i = \mathbf{r}_i(q_1, \ldots, q_6)$. Newton's law of motion states that for $i = 1, \ldots, N$
$$m_i\ddot{\mathbf{r}}_i - \mathbf{F}_i = 0 \tag{1.21}$$
where $\mathbf{F}_i$ is the force on the $i$th particle. There are two contributions to $\mathbf{F}_i$: any external force $\mathbf{F}^{(e)}_i$ and the internal stress $\mathbf{f}_i$ that keeps this particle in its allotted position relative to the other particles in the body. Now we imagine instantaneously displacing the body such that $\mathbf{r}_i \to \mathbf{r}_i + \delta\mathbf{r}_i$. In view of (1.21) we have
$$0 = \sum_i^N (m_i\ddot{\mathbf{r}}_i - \mathbf{F}_i)\cdot\delta\mathbf{r}_i = \sum_i^N (m_i\ddot{\mathbf{r}}_i - \mathbf{F}^{(e)}_i - \mathbf{f}_i)\cdot\delta\mathbf{r}_i. \tag{1.22}$$
The contribution $\sum_i \mathbf{f}_i\cdot\delta\mathbf{r}_i = 0$ because the internal stresses do no work (the body is rigid). So
$$0 = \sum_i^N (m_i\ddot{\mathbf{r}}_i - \mathbf{F}^{(e)}_i)\cdot\delta\mathbf{r}_i.$$
Now the $\delta\mathbf{r}_i$ are not all independent: they arise from a displacement of the entire body so they are functions of six independent coordinates $q_1, \ldots, q_6$. Hence we may write
$$0 = \sum_{i=1}^N\sum_{j=1}^6 m_i\ddot{\mathbf{r}}_i\cdot\frac{\partial \mathbf{r}_i}{\partial q_j}\,\delta q_j - \sum_{j=1}^6 Q_j\,\delta q_j, \tag{1.23a}$$
where the generalized force $Q$ is defined by
$$Q_j \equiv \sum_{i=1}^N \mathbf{F}^{(e)}_i\cdot\frac{\partial \mathbf{r}_i}{\partial q_j}. \tag{1.23b}$$
Since the $\delta q_j$ are all independent, (1.23a) implies that the coefficient of each $\delta q_j$ individually vanishes. That is
$$0 = \sum_{i=1}^N m_i\ddot{\mathbf{r}}_i\cdot\frac{\partial \mathbf{r}_i}{\partial q_j} - Q_j. \tag{1.24}$$
In Appendix I some rather intricate algebra is used to recast this equation into the form
$$0 = \frac{d}{dt}\left(\frac{\partial T}{\partial \dot{q}_j}\right) - \frac{\partial T}{\partial q_j} - Q_j, \tag{1.25}$$
where $T = \frac12\sum_i m_i|\dot{\mathbf{r}}_i|^2$ is the body's kinetic energy. When we specialize to the case in which $Q_j$ is generated by a potential $V$, so $Q_j = -(\partial V/\partial q_j)$, equation (1.25) is easily seen to be the EL equation for $L = T - V$.
This analysis shows that we can obtain the equations of motion of any rigid body from the EL equations as soon as we have expressions for the body's kinetic and potential energies in terms of any set of independent coordinates. The analysis is easily extended to the case of a body that is made up of several rigid bodies that swivel or slide smoothly on one another.
Note:
Notice that the dimensions of the generalized force $Q_i$ are energy divided by those of $q_i$. The latter is frequently dimensionless (because it is an angle, for example), so generalized forces don't necessarily have dimensions of force!
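Let $\rho(\mathbf{x})$ be the density of a rigid body that is rotating with angular velocity $\boldsymbol{\omega}$ about the coordinate origin. Then the body's angular momentum about the origin is
$$\mathbf{J} = \int d^3\mathbf{x}\,\rho\,\mathbf{x}\times(\boldsymbol{\omega}\times\mathbf{x}) = \int d^3\mathbf{x}\,\rho\,\left[x^2\boldsymbol{\omega} - (\mathbf{x}\cdot\boldsymbol{\omega})\mathbf{x}\right]. \tag{1.26}$$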
Box 1: Euler Angles
To specify the orientation of a rigid body, we imagine starting with the body axes $\mathbf{b}_i$ aligned with the coordinate axes and then moving to an arbitrary orientation by compounding three rotations. We label the body axes $\mathbf{b}_1$, $\mathbf{b}_2$ and $\mathbf{b}_3$ according to whether they start parallel to $\mathbf{i}$, $\mathbf{j}$ or $\mathbf{k}$. Now we rotate by $\phi$ about $\mathbf{k}$, then we rotate by $\theta$ about the new position of $\mathbf{b}_1$ and finally we rotate by $\psi$ about the new position of $\mathbf{b}_3$.
We rewrite this formula in tensor notation as
$$J_i = \sum_j I_{ij}\omega_j \quad\text{where}\quad I_{ij} \equiv \int d^3\mathbf{x}\,\rho\,(x^2\delta_{ij} - x_i x_j). \tag{1.27}$$
Here $\delta_{ij}$ is the $ij$ element of the identity matrix: it is zero if $i \ne j$, and unity if $i = j$. The matrix $\mathbf{I}$ defined by (1.27) is the body's moment of inertia tensor. Since it is a real symmetric matrix it has real eigenvalues $I_i$ and eigenvectors $\mathbf{b}_i$. The $\mathbf{b}_i$ are called body axes and the $I_i$ are called principal moments of inertia. When the body is rotated, the body axes rotate with it so they should be thought of as fixed within the body. According to (1.27), when the body spins such that its angular velocity lies along a body axis, its angular momentum is parallel to its angular velocity, and the proportionality constant between these two vectors is the appropriate principal moment of inertia.
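For a body made of a few point masses the integrals in (1.26) and (1.27) become sums, and one can confirm directly that $\mathbf{I}\cdot\boldsymbol{\omega}$ reproduces $\sum m\,\mathbf{x}\times(\boldsymbol{\omega}\times\mathbf{x})$. The sketch below assumes such a discrete body; the masses, positions and angular velocity are made-up sample values.

```python
def cross(a, b):
    """Vector product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def inertia_tensor(masses, positions):
    """Discrete analogue of (1.27): I_ij = sum of m * (x^2 delta_ij - x_i x_j)."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, x in zip(masses, positions):
        x2 = sum(c * c for c in x)
        for i in range(3):
            for j in range(3):
                tensor[i][j] += m * ((x2 if i == j else 0.0) - x[i] * x[j])
    return tensor

def angular_momentum_direct(masses, positions, omega):
    """Discrete analogue of (1.26): J = sum of m * x cross (omega cross x)."""
    total = [0.0, 0.0, 0.0]
    for m, x in zip(masses, positions):
        term = cross(x, cross(omega, x))
        for i in range(3):
            total[i] += m * term[i]
    return total

masses = [1.0, 2.0, 0.5]
positions = [(1.0, 0.0, 0.5), (-0.3, 0.8, 0.0), (0.2, -0.4, 1.1)]
omega = (0.1, -0.7, 0.4)

I = inertia_tensor(masses, positions)
J_tensor = [sum(I[i][j] * omega[j] for j in range(3)) for i in range(3)]
J_direct = angular_momentum_direct(masses, positions, omega)
```

The tensor built this way is symmetric by construction, which is what guarantees real principal moments and orthogonal body axes.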
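The kinetic energy of our spinning body is
$$T = \tfrac12\int d^3\mathbf{x}\,\rho\,|\boldsymbol{\omega}\times\mathbf{x}|^2 = \tfrac12\int d^3\mathbf{x}\,\rho\,\boldsymbol{\omega}\cdot[\mathbf{x}\times(\boldsymbol{\omega}\times\mathbf{x})] = \tfrac12\,\boldsymbol{\omega}\cdot\mathbf{I}\cdot\boldsymbol{\omega}. \tag{1.28}$$
This expression is especially simple in the body-axis frame:
$$T = \tfrac12\sum_{i=1}^3 I_i\omega_i^2, \quad\text{(body-axis frame)}. \tag{1.29}$$
If all three moments of inertia are different, evaluating $T$ from (1.29) in terms of the derivatives of Euler angles (Box 1) is tedious. So consider the case $I_1 = I_2$ of an axisymmetric body, such as a saucer. Since the Euler angle $\psi$ is a rotation about the final position of $\mathbf{b}_3$, it is clear that $\dot{\psi}$ contributes $\dot{\psi}\mathbf{b}_3$ to $\boldsymbol{\omega}$. Since $I_1 = I_2$ we can adopt any two mutually orthogonal vectors in the body's equatorial plane as $\mathbf{b}_1$ and $\mathbf{b}_2$. So let's choose $\mathbf{b}_1$ to be the axis about which we rotated through Euler angle $\theta$. Then $\dot{\theta}$ contributes $\dot{\theta}\mathbf{b}_1$ to $\boldsymbol{\omega}$. An increment in $\phi$ rotates the system about $\mathbf{k}$. This lies in the plane of $\mathbf{b}_2$ and $\mathbf{b}_3$ and is inclined at angle $\theta$ to $\mathbf{b}_3$. Hence $\dot{\phi}$ contributes $\dot{\phi}(\cos\theta\,\mathbf{b}_3 - \sin\theta\,\mathbf{b}_2)$ to $\boldsymbol{\omega}$. Adding all three contributions together to form $\boldsymbol{\omega}$ and substituting the result into (1.29) we find that the kinetic energy of an axisymmetric body is
$$T = \tfrac12 I_1(\dot{\phi}^2\sin^2\theta + \dot{\theta}^2) + \tfrac12 I_3(\dot{\phi}\cos\theta + \dot{\psi})^2. \tag{1.30}$$
The potential energy of an axisymmetric body can depend only on $\theta$ and is usually easy to write down for any particular physical situation. Hence with (1.30) in hand the Lagrangian follows easily; see the problems.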
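1.5 Lagrangian for motion in an e.m. field
The simple rule $L = T - V$ does not work for a charged particle that moves in a magnetic field $\mathbf{B}$. To see this, recall that $\mathbf{B}$ does no work on the particle, so it contributes to neither $T$ nor $V$. Hence it cannot appear in equations of motion that are derived from only $T$ and $V$. We now show that the correct equations of motion follow from
$$L = \tfrac12 m\dot{\mathbf{x}}^2 + Q(\dot{\mathbf{x}}\cdot\mathbf{A} - \phi), \tag{1.31}$$
where $Q$ is the particle's charge, $\mathbf{A}(\mathbf{x}, t)$ is the magnetic vector potential and $\phi(\mathbf{x}, t)$ is the electrostatic potential. Equation (1.31) gives the action as
$$S = \int\left[ \tfrac12 m\dot{\mathbf{x}}^2 + Q(\dot{\mathbf{x}}\cdot\mathbf{A} - \phi) \right] dt, \tag{1.32}$$
so the EL eqn is
$$\frac{d}{dt}\left( m\dot{\mathbf{x}} + Q\mathbf{A} \right) + Q\nabla(\phi - \dot{\mathbf{x}}\cdot\mathbf{A}) = 0. \tag{1.33}$$
Here the derivative w.r.t. $t$ is along the path, so
$$\frac{d\mathbf{A}}{dt} = \frac{\partial \mathbf{A}}{\partial t} + (\dot{\mathbf{x}}\cdot\nabla)\mathbf{A}. \tag{1.34}$$
The partial derivative here can be combined with the term in $\nabla\phi$ in (1.33) to produce the electric field $\mathbf{E} = -\nabla\phi - \partial\mathbf{A}/\partial t$. Putting all these things back into the EL eqn (1.33) yields
$$m\ddot{\mathbf{x}} = Q\left[ \mathbf{E} + \nabla(\dot{\mathbf{x}}\cdot\mathbf{A}) - (\dot{\mathbf{x}}\cdot\nabla)\mathbf{A} \right]. \tag{1.35}$$
It's now straightforward to show that the last two terms on the right of (1.35) equal $\dot{\mathbf{x}}\times\mathbf{B}$ as one would hope: bearing in mind that $\nabla\dot{\mathbf{x}} = 0$ we have
$$\dot{\mathbf{x}}\times\mathbf{B} = \dot{\mathbf{x}}\times(\nabla\times\mathbf{A}) = \nabla(\dot{\mathbf{x}}\cdot\mathbf{A}) - (\dot{\mathbf{x}}\cdot\nabla)\mathbf{A}.$$
Thus the EL eqn applied to the action (1.32) gives
$$m\ddot{\mathbf{x}} = Q(\mathbf{E} + \dot{\mathbf{x}}\times\mathbf{B}) \tag{1.36}$$
as required.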
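The vector identity used in the last step is easy to get wrong by an index or a sign, so a component-wise check may help. The sketch below is illustrative only: `vector_potential` is an arbitrary smooth field invented for the test, and the derivatives $\partial A_i/\partial x_j$ are estimated by central differences before both sides of $\dot{\mathbf{x}}\times(\nabla\times\mathbf{A}) = \nabla(\dot{\mathbf{x}}\cdot\mathbf{A}) - (\dot{\mathbf{x}}\cdot\nabla)\mathbf{A}$ are assembled from them.

```python
def vector_potential(x, y, z):
    """An arbitrary smooth A(x), chosen here only to exercise the identity."""
    return (y * z + x * x, x * z * z - y, x * y + z ** 3)

def jacobian(field, point, h=1e-5):
    """jac[i][j] approximates dA_i/dx_j at the given point (central differences)."""
    jac = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        up, down = list(point), list(point)
        up[j] += h
        down[j] -= h
        f_up, f_down = field(*up), field(*down)
        for i in range(3):
            jac[i][j] = (f_up[i] - f_down[i]) / (2.0 * h)
    return jac

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

point = (0.4, -1.2, 0.7)
velocity = (1.5, 0.3, -0.8)
jac = jacobian(vector_potential, point)

# B = curl A, built from the antisymmetric part of the Jacobian.
B = (jac[2][1] - jac[1][2], jac[0][2] - jac[2][0], jac[1][0] - jac[0][1])
lhs = cross(velocity, B)

# [grad(v.A)]_i = sum_j v_j dA_j/dx_i and [(v.grad)A]_i = sum_j v_j dA_i/dx_j,
# with v held fixed under the gradient, as in the text.
rhs = tuple(sum(velocity[j] * (jac[j][i] - jac[i][j]) for j in range(3))
            for i in range(3))
```

Because both sides are built from the same table of derivatives, this verifies the index bookkeeping of the identity rather than the accuracy of the finite differences.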
Note:
The action (1.32) looks rather arbitrary at this stage but is revealed to be beautifully natural when
one looks at the problem in a relativistically covariant way, as one should.
1.6 Normal modes from Lagrangians
Obviously, when a system is in equilibrium all its time derivatives vanish. From the EL eqns we infer that equilibrium configurations correspond to $\partial V/\partial q_i = 0$, where $q_i$ is any coordinate. When disturbed from equilibrium, the system will zoom off if the equilibrium is unstable, or oscillate if the equilibrium is stable. Small amplitude oscillations can be represented as a superposition of normal modes. Lagrangians provide a relatively painless route to the frequencies and forms of these normal modes. The trick is to expand $L(\mathbf{q}, \dot{\mathbf{q}})$ in a Taylor series around the equilibrium configuration $\mathbf{q} = \mathbf{q}_s$, $\dot{\mathbf{q}} = 0$, discarding terms of higher than second order in $\delta\mathbf{q} \equiv \mathbf{q} - \mathbf{q}_s$ and its derivatives. Thus, writing $q_i$ for the components of $\delta\mathbf{q}$, we have
$$L \simeq \tfrac12\sum_{ij}\left( M_{ij}\dot{q}_i\dot{q}_j + C_{ij}q_i\dot{q}_j + F_{ij}q_iq_j \right) + \sum_i A_i\dot{q}_i + L_0, \tag{1.37}$$
where $\mathbf{M}$, $\mathbf{C}$, $\mathbf{F}$ and $\mathbf{A}$ are constant matrices or vectors. Since $F_{ij} = \partial^2 L/\partial q_i\partial q_j$, $\mathbf{F}$ is a symmetric matrix, and the same applies to $\mathbf{M}$.
Since the EL eqns involve only derivatives of $L$, we can discard the constant $L_0$. It is also easy to check that the term involving $\mathbf{A}$ makes no net contribution to the equations of motion. Bearing in mind the symmetry of $\mathbf{M}$ and $\mathbf{F}$, the EL equation of motion for $q_k$ is easily found to be
$$
\begin{aligned}
0 &= \sum_i \frac{d}{dt}\left( M_{ik}\dot{q}_i + \tfrac12 C_{ik}q_i \right) - \sum_j\left( \tfrac12 C_{kj}\dot{q}_j + F_{kj}q_j \right) \\
&= \sum_i\left[ M_{ki}\ddot{q}_i + \tfrac12(C_{ik} - C_{ki})\dot{q}_i - F_{ki}q_i \right].
\end{aligned}
\tag{1.38}
$$
These equations are easily solved by writing $\mathbf{q}(t) = \mathbf{Q}e^{i\omega t}$, whence the eigenfrequencies are the roots of
$$\det(\mathbf{F} + \omega^2\mathbf{M} + i\omega\overline{\mathbf{C}}) = 0, \tag{1.39}$$
where $\overline{C}_{ij} \equiv \frac12(C_{ij} - C_{ji})$ is the antisymmetric part of $\mathbf{C}$. When the dynamics is time-reversible, as is usually the case when we are neither using a rotating frame nor working with a magnetic field, $\overline{\mathbf{C}} = 0$. The equilibrium is stable iff all allowed values of $\omega^2$ are positive, i.e., all eigenfrequencies are real. For simplicity we now consider the case in which $\overline{\mathbf{C}} = 0$.
By expanding $V(\mathbf{q})$ around the stationary point $\mathbf{q}_s$ corresponding to an equilibrium configuration and plugging the expansion into the EL eqns, one sees that the equilibrium is stable if $\mathbf{q}_s$ is a local minimum of $V$, and unstable otherwise.
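1.6.1 Normal coordinates
Let $\mathbf{Q}_\alpha$ be a vector that satisfies the eigenvalue equation
$$(\mathbf{F} + \omega_\alpha^2\mathbf{M})\cdot\mathbf{Q}_\alpha = 0.$$
When we dot this equation through by another eigenvector, $\mathbf{Q}_\beta$, we find
$$\mathbf{Q}_\beta\cdot\mathbf{F}\cdot\mathbf{Q}_\alpha = -\omega_\alpha^2\,\mathbf{Q}_\beta\cdot\mathbf{M}\cdot\mathbf{Q}_\alpha. \tag{1.40}$$
The equation holds if the labels $\alpha$ and $\beta$ are interchanged. Moreover, by the symmetry of $\mathbf{F}$ and $\mathbf{M}$, $\mathbf{Q}_\beta\cdot\mathbf{F}\cdot\mathbf{Q}_\alpha = \mathbf{Q}_\alpha\cdot\mathbf{F}\cdot\mathbf{Q}_\beta$ and similarly for $\mathbf{M}$. So when we subtract from (1.40) the equation with $\alpha$ and $\beta$ interchanged we obtain
$$0 = (\omega_\alpha^2 - \omega_\beta^2)\,\mathbf{Q}_\beta\cdot\mathbf{M}\cdot\mathbf{Q}_\alpha. \tag{1.41}$$
It now follows that $\mathbf{Q}_\beta\cdot\mathbf{M}\cdot\mathbf{Q}_\alpha = 0$ for $\omega_\alpha \ne \omega_\beta$, so if the eigenvectors are appropriately normalized
$$\mathbf{Q}_\beta\cdot\mathbf{M}\cdot\mathbf{Q}_\alpha = \delta_{\alpha\beta}. \tag{1.42}$$
The general solution of the EL eqns (1.38) with $\overline{\mathbf{C}} = 0$ can now be written
$$\mathbf{q}(t) = \sum_{\alpha=1}^N a_\alpha\mathbf{Q}_\alpha\cos(\omega_\alpha t + \phi_\alpha), \tag{1.43}$$
where the $a_\alpha$ and $\phi_\alpha$ are $2N$ arbitrary constants. Premultiplying by $\mathbf{Q}_\beta\cdot\mathbf{M}$ we find with (1.42) that
$$\mathbf{Q}_\beta\cdot\mathbf{M}\cdot\mathbf{q}(t) = a_\beta\cos(\omega_\beta t + \phi_\beta). \tag{1.44}$$
For each possible value of $\beta$ the left side of this equation is a particular linear combination of the original coordinates, and the right side shows that this combination oscillates sinusoidally at angular frequency $\omega_\beta$ regardless of how the system is set into motion. A combination of the coordinates that inevitably oscillates sinusoidally is called a normal coordinate.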
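For a system with two coordinates, (1.39) with $\overline{\mathbf{C}} = 0$ is a quadratic in $\omega^2$, so the whole recipe fits in a few lines. The sketch below is a hedged illustration: the function name `normal_modes_2x2` and the test system (two unit masses on springs of stiffness 1, coupled by a spring of stiffness 0.5, so that $\mathbf{F} = -\partial^2 V/\partial q_i\partial q_j$) are assumptions introduced here, not taken from the notes.

```python
import math

def normal_modes_2x2(M, F):
    """Solve det(F + w2 * M) = 0 for two coordinates; return [(w2, Q), ...]."""
    a = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    b = (M[0][0] * F[1][1] + M[1][1] * F[0][0]
         - M[0][1] * F[1][0] - M[1][0] * F[0][1])
    c = F[0][0] * F[1][1] - F[0][1] * F[1][0]
    disc = math.sqrt(b * b - 4.0 * a * c)
    modes = []
    for w2 in ((-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)):
        # Null vector of F + w2 * M, read off from its first row
        # (this assumes the first row is not identically zero).
        r0 = F[0][0] + w2 * M[0][0]
        r1 = F[0][1] + w2 * M[0][1]
        Q = (-r1, r0)
        # Normalize so that Q . M . Q = 1, as in (1.42).
        norm = math.sqrt(sum(Q[i] * M[i][j] * Q[j]
                             for i in range(2) for j in range(2)))
        modes.append((w2, (Q[0] / norm, Q[1] / norm)))
    return modes

# Hypothetical system: V = (1/2)k x1^2 + (1/2)k x2^2 + (1/2)kappa (x1 - x2)^2.
k, kappa, mass = 1.0, 0.5, 1.0
M = [[mass, 0.0], [0.0, mass]]
F = [[-(k + kappa), kappa], [kappa, -(k + kappa)]]

modes = normal_modes_2x2(M, F)
(w2_a, Q_a), (w2_b, Q_b) = modes
overlap = sum(Q_a[i] * M[i][j] * Q_b[j] for i in range(2) for j in range(2))
```

The slow mode has the two masses moving in phase and the fast mode has them in antiphase, and `overlap` vanishes as (1.41) demands.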
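Example 1.2
The governor of a steam engine contains two balls of mass $m$ that are mounted on light rods of length $a$, and these are in turn attached to a vertical axis. The plane of the rods rotates at constant angular velocity $\Omega$ about the vertical axis. A spring connects the two rods in such a way that the potential energy stored in the spring is $\frac12 k$ times the square of the distance between the centres of the balls. Find a point of equilibrium and determine the frequencies of the normal modes.
Solution: Let $\theta$ and $\phi$ be the angles that the two rods make with the vertical. Application of the cosine law to the triangle formed by the balls and their point of suspension shows that the potential energy is
$$V = -mga(\cos\theta + \cos\phi) + \tfrac12 ka^2\left[ 2 - 2\cos(\theta + \phi) \right].$$
Subtracting this from the kinetic energy, we find that
$$L = \tfrac12 ma^2(\dot{\theta}^2 + \dot{\phi}^2) + \tfrac12 ma^2\Omega^2(\sin^2\theta + \sin^2\phi) + mga(\cos\theta + \cos\phi) - ka^2\left[ 1 - \cos(\theta + \phi) \right].$$
By the system's symmetry, there is a point of equilibrium with $\theta = \phi = \theta_0$. Setting to zero $\partial L/\partial\theta$ evaluated at this point, we find the equilibrium point to satisfy
$$0 = ma^2\Omega^2\sin\theta_0\cos\theta_0 - mga\sin\theta_0 - ka^2\sin 2\theta_0 \quad\Rightarrow\quad \sin\theta_0 = 0 \ \text{ or } \ \cos\theta_0 = \frac{\omega_g^2}{\Omega^2 - 2\omega_s^2},$$
where $\omega_g^2 \equiv g/a$, $\omega_s^2 \equiv k/m$. At $(\theta_0, \theta_0)$ the second derivatives of $L$ are
$$
\begin{aligned}
\frac{\partial^2 L}{\partial\theta^2} &= (m\Omega^2 - k)a^2\cos 2\theta_0 - mga\cos\theta_0 \\
\frac{\partial^2 L}{\partial\phi^2} &= (m\Omega^2 - k)a^2\cos 2\theta_0 - mga\cos\theta_0 \\
\frac{\partial^2 L}{\partial\theta\,\partial\phi} &= -ka^2\cos 2\theta_0.
\end{aligned}
$$
Hence, writing $\delta\theta$ and $\delta\phi$ for small displacements from equilibrium, the equations
$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{\theta}}\right) = \frac{\partial^2 L}{\partial\theta^2}\,\delta\theta + \frac{\partial^2 L}{\partial\theta\,\partial\phi}\,\delta\phi$$
etc. that govern the normal modes are
$$\begin{pmatrix} \delta\ddot{\theta} \\ \delta\ddot{\phi} \end{pmatrix} = \begin{pmatrix} x & y \\ y & x \end{pmatrix}\begin{pmatrix} \delta\theta \\ \delta\phi \end{pmatrix} \quad\text{where}\quad \begin{cases} x = (\Omega^2 - \omega_s^2)\cos 2\theta_0 - \omega_g^2\cos\theta_0 \\ y = -\omega_s^2\cos 2\theta_0 \end{cases} \tag{1.45}$$
The normal frequencies are given by the eigenvalues of the matrix: $-\omega^2 = x \pm y$. The lowest squared frequency, $\omega_g^2\cos\theta_0 - \Omega^2\cos 2\theta_0$, is negative for $\Omega^2 > \omega_g^2\cos\theta_0/\cos 2\theta_0$, which indicates that the system is unstable for large $\Omega$.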
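Example 1.3
A cylinder of mass $m$ and radius $a$ rolls on a rough horizontal table. A second cylinder, mass $m$ and radius $\frac12 a$, rolls inside the first. Find the normal frequencies for small disturbances from equilibrium.
Solution: Let $\theta$ be the angle through which the first cylinder has turned from equilibrium, and $\phi$ be the angle through which the second cylinder has rolled relative to the first (see figure). Then the line between the two centres makes an angle
$$\psi = \theta - \tfrac12\phi \tag{1.46}$$
with the vertical. The kinetic energy of the first cylinder (translational plus rotational) is
$$T_1 = \tfrac12 m(a\dot{\theta})^2 + \tfrac12 ma^2\dot{\theta}^2 = m(a\dot{\theta})^2. \tag{1.47}$$
The motion of the centre of the second cylinder is a compound of the leftward motion $a\dot{\theta}$ of the centre of the first cylinder, plus $\frac12 a\dot{\psi}$ perpendicular to the line joining the centres. The second cylinder rotates with respect to inertial space at angular velocity $\dot{\psi} + \dot{\phi}$. The total kinetic energy is therefore
$$T = m(a\dot{\theta})^2 + \tfrac12 m\left[ (\tfrac12 a\dot{\psi}\cos\psi - a\dot{\theta})^2 + (\tfrac12 a\dot{\psi}\sin\psi)^2 \right] + \tfrac12 m(a/2)^2(\dot{\psi} + \dot{\phi})^2. \tag{1.48}$$
The potential energy is simply
$$V = -mg\,\tfrac12 a\cos\psi. \tag{1.49}$$
In $T$, which is quadratic in the velocities, we set $\psi = 0$. We expand $V$ to second order in $\psi$, to find
$$
\begin{aligned}
T &= \tfrac12 ma^2\left( \tfrac52\dot{\theta}^2 + \tfrac12\dot{\theta}\dot{\phi} + \tfrac18\dot{\phi}^2 \right), \\
V &= \text{constant} + \tfrac14 mga(\theta - \tfrac12\phi)^2.
\end{aligned}
\tag{1.50}
$$
Defining $\omega_0 \equiv \sqrt{g/a}$ the equations of motion become
$$
\begin{aligned}
5\ddot{\theta} + \tfrac12\ddot{\phi} + \omega_0^2(\theta - \tfrac12\phi) &= 0, \\
\tfrac12\ddot{\theta} + \tfrac14\ddot{\phi} - \tfrac12\omega_0^2(\theta - \tfrac12\phi) &= 0.
\end{aligned}
\tag{1.51}
$$
Substituting solutions proportional to $e^{i\omega t}$ and writing $\lambda \equiv \omega^2/\omega_0^2$, the determinant condition reduces to $\lambda(\lambda - 2) = 0$, so the eigenfrequencies are $\omega = 0$ and $\omega = \sqrt{2}\,\omega_0$.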
1.7 Noether's theorem
A constant of motion is any function $C(\mathbf{q}, \dot{\mathbf{q}})$ that satisfies $dC/dt = 0$, where $\mathbf{q}(t)$ is a solution of the eqns of motion. For example, in a conservative system, energy is conserved, so $E(\mathbf{q}, \dot{\mathbf{q}})$ is a constant of motion. Finding a constant of motion is a big step towards obtaining a general solution of the equations of motion.
In general, a system with $N$ degrees of freedom $q_1, \ldots, q_N$ admits $2N - 1$ independent constants of motion. We show this by arguing that given $(\mathbf{q}, \dot{\mathbf{q}})$ at any time $t$, the equations of motion allow us to give the position and velocity $(\mathbf{q}^{(0)}, \dot{\mathbf{q}}^{(0)})$ at any reference time $t_0$. Thus $q^{(0)}_i = f_i(\mathbf{q}, \dot{\mathbf{q}}, t)$, where $f_i$ is some function. Similarly, $\dot{q}^{(0)}_i = g_i(\mathbf{q}, \dot{\mathbf{q}}, t)$, where $g_i$ is another function. On eliminating $t$ between these $2N$ functions, we have $2N - 1$ constants of motion.
It seldom happens that we can find $2N - 1$ constants of motion; a rare exception is the case of motion in a Kepler potential $V \propto 1/r$. In fact it turns out that essentially complete information about solutions of the equations of motion can be extracted from $N$ constants of motion. A system for which $N$ constants of motion can be found is said to be integrable.
A theorem proved by Emmy Noether (1882-1935) provides a powerful way of extracting constants of motion from Lagrangians. Noether's theorem involves identifying a flow in configuration space that leaves $L$ invariant. A flow is an infinitesimal transformation
$$\mathbf{q} \to \mathbf{q}' = \mathbf{q} + \frac{d\mathbf{q}(\mathbf{q})}{d\lambda}\,\delta\lambda. \tag{1.52}$$
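For example, the transformation $\mathbf{x} \to \mathbf{x} + \delta\lambda\,\mathbf{i}$ is a flow.
A flow changes the path $\mathbf{q}(t)$ into the path $\mathbf{q}'(t)$ and thus changes the value of the Lagrangian at time $t$ by
$$\delta L = \frac{\partial L}{\partial \mathbf{q}}\cdot\delta\mathbf{q} + \frac{\partial L}{\partial \dot{\mathbf{q}}}\cdot\delta\dot{\mathbf{q}}. \tag{1.53}$$
Notice that $\delta\dot{\mathbf{q}}$ is well defined: $\delta\dot{\mathbf{q}} = \dfrac{\partial\,\delta\mathbf{q}}{\partial \mathbf{q}}\cdot\dot{\mathbf{q}}$.
Invariance of $L$ just means that $L$ takes the same value at all points that are joined by the flow. Noether's theorem states that if $\delta L$ vanishes along the dynamically determined path, then
$$\frac{d\mathbf{q}}{d\lambda}\cdot\frac{\partial L}{\partial \dot{\mathbf{q}}} \tag{1.54}$$
is a constant of motion. Thus from the invariance of $L$ under translation $\mathbf{x} \to \mathbf{x} + \delta\lambda\,\mathbf{i}$ along the x-axis, Noether's theorem deduces the constancy of
$$\mathbf{i}\cdot\frac{\partial L}{\partial \dot{\mathbf{q}}} = \frac{\partial L}{\partial \dot{x}}. \tag{1.55}$$
For a particle moving in a velocity-independent potential this is just the x-momentum $m\dot{x}$.
The proof of Noether's theorem is simple. Equating to zero equation (1.53) for $\delta L$ we have
$$0 = \delta L = \frac{\partial L}{\partial \mathbf{q}}\cdot\delta\mathbf{q} + \frac{\partial L}{\partial \dot{\mathbf{q}}}\cdot\delta\dot{\mathbf{q}}. \tag{1.56}$$
Using the EL eqns to eliminate $\partial L/\partial\mathbf{q}$ this becomes
$$0 = \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\mathbf{q}}}\right)\cdot\delta\mathbf{q} + \frac{\partial L}{\partial \dot{\mathbf{q}}}\cdot\delta\dot{\mathbf{q}} = \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{\mathbf{q}}}\cdot\delta\mathbf{q}\right), \tag{1.57}$$
and the result follows on writing $\delta\mathbf{q} = (d\mathbf{q}/d\lambda)\,\delta\lambda$.
Consider the proof of conservation of angular momentum by Noether's theorem. A rotation by $\delta\lambda$ about the unit vector $\mathbf{n}$ changes $\mathbf{x}$ by $\delta\lambda\,\mathbf{n}\times\mathbf{x}$. So if $L$ is invariant under this rotation, the following is a constant of motion:
$$J \equiv \mathbf{n}\times\mathbf{x}\cdot\frac{\partial L}{\partial \dot{\mathbf{x}}} = \mathbf{n}\cdot\mathbf{x}\times\frac{\partial L}{\partial \dot{\mathbf{x}}}. \tag{1.58}$$
For a particle moving in a velocity-independent potential this is just the component of $m\mathbf{x}\times\dot{\mathbf{x}}$ parallel to $\mathbf{n}$.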
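Example 1.4
A certain system with coordinates $x$, $y$, and $z$ has Lagrangian
$$L = \tfrac12(m_1\dot{x}^2 + m_1\dot{y}^2 + m_2\dot{z}^2) + A(t)\dot{z} - \tfrac12 k\left[ (x - y)^2 + (y - z)^2 + (z - x)^2 \right],$$
where $m_1$, $m_2$ and $k$ are constants and $A(t)$ is a given function of time. Obtain an expression for $A(t) - A(0)$ in terms of the values of $\dot{x}$, $\dot{y}$ and $\dot{z}$ at time $t$ and at time zero.
Solution: $L$ depends only on the differences between coordinates, so it is invariant under $(x, y, z) \to (x + \delta\lambda, y + \delta\lambda, z + \delta\lambda)$. The associated invariant is
$$\frac{\partial L}{\partial \dot{x}} + \frac{\partial L}{\partial \dot{y}} + \frac{\partial L}{\partial \dot{z}} = m_1(\dot{x} + \dot{y}) + m_2\dot{z} + A(t) \tag{1.59}$$
so
$$A(t) - A(0) = -m_1(\dot{x} - \dot{x}_0 + \dot{y} - \dot{y}_0) - m_2(\dot{z} - \dot{z}_0). \tag{1.60}$$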
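The invariant (1.59) can be checked by integrating the equations of motion that follow from the Lagrangian of Example 1.4. The sketch below is illustrative only: the constants `M1`, `M2`, `K`, the particular function `A(t)`, the initial data and the RK4 integrator are all assumptions made here, since the notes leave $A(t)$ general.

```python
import math

M1, M2, K = 1.0, 2.5, 3.0   # assumed values of m1, m2 and k

def A(t):
    """A hypothetical choice for the given function A(t)."""
    return 0.8 * math.sin(1.7 * t)

def A_dot(t):
    return 0.8 * 1.7 * math.cos(1.7 * t)

def deriv(t, s):
    """EL equations for Example 1.4; s = (x, y, z, vx, vy, vz)."""
    x, y, z, vx, vy, vz = s
    ax = -K * (2.0 * x - y - z) / M1
    ay = -K * (2.0 * y - x - z) / M1
    az = (-A_dot(t) - K * (2.0 * z - x - y)) / M2   # d/dt(m2 vz + A) = dL/dz
    return (vx, vy, vz, ax, ay, az)

def rk4_step(t, s, h):
    def nudge(state, slope, factor):
        return tuple(a + factor * b for a, b in zip(state, slope))
    k1 = deriv(t, s)
    k2 = deriv(t + 0.5 * h, nudge(s, k1, 0.5 * h))
    k3 = deriv(t + 0.5 * h, nudge(s, k2, 0.5 * h))
    k4 = deriv(t + h, nudge(s, k3, h))
    return tuple(a + h / 6.0 * (p + 2.0 * q + 2.0 * r + w)
                 for a, p, q, r, w in zip(s, k1, k2, k3, k4))

def invariant(t, s):
    """The Noether invariant (1.59)."""
    return M1 * (s[3] + s[4]) + M2 * s[5] + A(t)

start = (0.2, -0.4, 0.9, 0.5, 0.1, -0.3)
state, t, h = start, 0.0, 1e-3
for _ in range(3000):
    state = rk4_step(t, state, h)
    t += h

drift = abs(invariant(t, state) - invariant(0.0, start))
lhs = A(t) - A(0.0)
rhs = (-M1 * (state[3] - start[3] + state[4] - start[4])
       - M2 * (state[5] - start[5]))
```

The spring forces cancel in the sum over the three coordinates, so the only change in the total momentum comes from $A(t)$, which is exactly the content of (1.60).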
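Here's an application to motion in a uniform magnetic field $\mathbf{B} = B\mathbf{k}$. Let's choose $\mathbf{A} = (-By, 0, 0)$. Then by (1.31) $L = \frac12 m\dot{\mathbf{x}}^2 - QBy\dot{x}$ is invariant under two flows: (i) $\mathbf{x} \to \mathbf{x} + \delta\lambda\,\mathbf{i}$ and (ii) $\mathbf{x} \to \mathbf{x} + \delta\lambda\,\mathbf{k}$. Hence we have two invariants
$$p_x \equiv \frac{\partial L}{\partial \dot{x}} = mv_x - QBy\ ; \qquad p_z \equiv \frac{\partial L}{\partial \dot{z}} = mv_z. \tag{1.61a}$$
Choosing $\mathbf{A} = (0, Bx, 0)$ we find a third invariant for the same physical problem:
$$p_y \equiv \frac{\partial L}{\partial \dot{y}} = mv_y + QBx. \tag{1.61b}$$
The physical meaning of $p_z$ is obvious, but what do $p_x$ and $p_y$ mean physically? Add them up:
$$P \equiv p_x + ip_y = m(v_x + iv_y) + QB(ix - y) = m\dot{\xi} + iQB\xi \quad\text{where}\quad \xi \equiv x + iy. \tag{1.62}$$
Solving this first-order d.e. for $\xi$ we find
$$\xi(t) = \xi_0 e^{-i\omega t} - \frac{iP}{m\omega}, \quad\text{where}\quad \omega \equiv \frac{QB}{m} \tag{1.63}$$
is the Larmor frequency and $\xi_0$ is a constant of integration. It is now easy to see that the real and imaginary parts of $P$ encode the $y$ and $x$ coordinates of the guiding centre around which the particle gyrates.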
1.8 Constraints
Sometimes it is convenient to work with more coordinates than a system has degrees of freedom. Sup-
pose, for example, that the system consists of a dumbell of length s that is free to slide on a smooth
table. This system has three degrees of freedom, namely the position of the centre of mass and the
orientation of the dumbell. But we might prefer to describe the system in terms of the x and y coords
of the dumbells particles. These are not independent, but satisfy the constraint
(x
1
x
2
)
2
+ (y
1
y
2
)
2
= s
2
. (1.64)
The dynamics of the system are obtained by extremizing the action subject to this constraint equation.
Lagrange multipliers (Box 2) enable us to do this simply. We write the constraint equation as C(q) = 0
and evaluate
0 = S
_
dt C
=
_
t
2
t
1
dt

i
q
i
_
L
q
i

d
dt
_
L
q
i
_
+
C
q
i
_
.
(1.65)
Here (q, t) is an arbitrary function. As in Lagranges standard argument, we choose to ensure that
the coecient of one of the q
i
vanishes, and then conclude from the independence of the remaining q
i
that their coecients must vanish too. Hence we have for every i that
d
dt
_
L
q
i
_
=
L
q
i

C
q
i
. (1.66)
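Specifically for our dumbbell example, $L = \frac12 m(v_1^2 + v_2^2)$, so the equations of motion are
$$\begin{aligned} m\ddot x_1 &= -2\lambda(x_1 - x_2) & \qquad m\ddot y_1 &= -2\lambda(y_1 - y_2) \\ m\ddot x_2 &= 2\lambda(x_1 - x_2) & \qquad m\ddot y_2 &= 2\lambda(y_1 - y_2). \end{aligned} \tag{1.67}$$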
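Box 2: Lagrange Multipliers
Suppose we are given the profit $G(x, y, z)$ when we sell some food with amounts $x$, $y$ and $z$ of additives that are constrained by the health and safety regulations such that we must have $F(x, y, z) = 0$, where $F$ is a specified function. The regulations oblige us to manufacture a product whose representative point in $(x, y, z)$ space lies on the two-dimensional surface $F = 0$, and we maximize our profit by finding the point on this surface at which $G$ is biggest.
If we make small changes in the inputs, our profit changes by
$$dG = \frac{\partial G}{\partial x}dx + \frac{\partial G}{\partial y}dy + \frac{\partial G}{\partial z}dz. \tag{B2.1}$$
Unfortunately, we are obliged to remain on the surface $F = 0$, so our changes have to satisfy
$$0 = dF = \frac{\partial F}{\partial x}dx + \frac{\partial F}{\partial y}dy + \frac{\partial F}{\partial z}dz. \tag{B2.2}$$
We multiply this equation by an arbitrary function $\lambda(x, y, z)$ and subtract the result from equation (B2.1). At the maximum, $dG$ vanishes for every allowed change, so there we have
$$0 = \left(\frac{\partial G}{\partial x} - \lambda\frac{\partial F}{\partial x}\right)dx + \left(\frac{\partial G}{\partial y} - \lambda\frac{\partial F}{\partial y}\right)dy + \left(\frac{\partial G}{\partial z} - \lambda\frac{\partial F}{\partial z}\right)dz. \tag{B2.3}$$
We now choose the function $\lambda$ to make the coefficient of $dz$ vanish, that is, we set $\lambda = (\partial G/\partial z)/(\partial F/\partial z)$. So we now have
$$0 = \left(\frac{\partial G}{\partial x} - \lambda\frac{\partial F}{\partial x}\right)dx + \left(\frac{\partial G}{\partial y} - \lambda\frac{\partial F}{\partial y}\right)dy. \tag{B2.4}$$
The changes $dx$ and $dy$ can be chosen independently because whatever values we adopt for these variables, the constraint (B2.2) will be satisfied for an appropriate value of $dz$. One allowed choice is $dx \neq 0$ with $dy = 0$, and for this choice equation (B2.4) holds only if the coefficient of $dx$ vanishes. Similarly, choosing to set $dx = 0$ with $dy \neq 0$ we infer that the coefficient of $dy$ also vanishes. We now have four equations that must hold at the point that maximizes $G$, namely
$$F = 0\,; \qquad 0 = \frac{\partial G}{\partial x} - \lambda\frac{\partial F}{\partial x}\,; \qquad 0 = \frac{\partial G}{\partial y} - \lambda\frac{\partial F}{\partial y}\,; \qquad 0 = \frac{\partial G}{\partial z} - \lambda\frac{\partial F}{\partial z}. \tag{B2.5}$$
In principle we can solve these four equations for the four unknowns: the values of $x$, $y$ and $z$ at the stationary point, and the numerical value of the function $\lambda$ at that point. This procedure was invented by Lagrange, so $\lambda$ is called a Lagrange multiplier.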
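As an illustration of the four conditions (B2.5), here is a small sketch with a made-up problem that is not in the notes: maximize $G = x + 2y + 3z$ on the unit sphere $F = x^2 + y^2 + z^2 - 1 = 0$. The conditions can be solved by hand for this choice, and the code compares the result with a brute-force random search over the sphere. All names are illustrative.

```python
import math
import random

# Hypothetical example: G = x + 2y + 3z, F = x^2 + y^2 + z^2 - 1.
grad_G = (1.0, 2.0, 3.0)

# Conditions (B2.5): dG/dx_i = lam * dF/dx_i = 2 * lam * x_i, so x_i = grad_G_i / (2 lam).
# Substituting into F = 0 fixes lam; the positive root corresponds to the maximum.
lam = 0.5 * math.sqrt(sum(g * g for g in grad_G))
best = tuple(g / (2.0 * lam) for g in grad_G)
G_best = sum(g * x for g, x in zip(grad_G, best))

def random_point_on_sphere(rng):
    """Uniformly distributed point on the unit sphere (normalized Gaussian vector)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)

rng = random.Random(0)
G_sampled = max(sum(g * x for g, x in zip(grad_G, random_point_on_sphere(rng)))
                for _ in range(10000))
print(best, lam, G_best, G_sampled)
```

No sampled point on the constraint surface should beat `G_best`, which is the sense in which the multiplier conditions pick out the constrained maximum.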
Adding the lower to the upper equations we obtain the equations of motion of the centre of mass: $\ddot{\mathbf{R}} = 0$, where $\mathbf{R} \equiv \frac12(\mathbf{r}_1 + \mathbf{r}_2)$. Subtracting the lower equations from the upper ones and eliminating $\lambda$ between the two results, we obtain $\ddot x y - x\ddot y = 0$, where $x \equiv x_1 - x_2$ etc., which expresses conservation of the system's angular momentum: $\frac{d}{dt}(\dot x y - x\dot y) = 0$.
We shall see below that $p_i \equiv \partial L/\partial\dot q_i$ is the momentum conjugate to $q_i$. Equation (1.66) expresses the rate of change of $p_i$ as a sum of two generalized forces. The term $\partial L/\partial q_i$ is simply minus the gradient of the potential that would be associated with the coordinates in the absence of the constraint. This vanishes in our dumbbell example. The term $-\lambda(\partial C/\partial q_i)$ describes the force associated with maintenance of the constraint. In the case of the dumbbell, for example, we have that the tension $T$ in its bar is given by
$$-T\frac{x}{s} = F_x = m\ddot x_1 = -2\lambda x \quad\Rightarrow\quad T = 2\lambda s. \tag{1.68}$$
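A quick sanity check of (1.68), offered as a sketch rather than as part of the notes. It assumes the sign convention $m\ddot x_1 = -2\lambda(x_1 - x_2)$ in (1.67). Differentiating the constraint (1.64) twice and inserting (1.67) then gives $\lambda = m(\dot x^2 + \dot y^2)/(4s^2)$, with $x \equiv x_1 - x_2$ and $y \equiv y_1 - y_2$. For a dumbbell spinning at a hypothetical angular velocity `Omega`, the tension should equal the centripetal force on one mass at radius $s/2$.

```python
def multiplier(m, s, rel_vx, rel_vy):
    """lambda from d^2C/dt^2 = 0 combined with (1.67).

    (rel_vx, rel_vy) is the velocity of particle 1 relative to particle 2.
    """
    return m * (rel_vx ** 2 + rel_vy ** 2) / (4.0 * s ** 2)

def tension(m, s, rel_vx, rel_vy):
    """T = 2 * lambda * s, equation (1.68)."""
    return 2.0 * multiplier(m, s, rel_vx, rel_vy) * s

m, s, Omega = 1.5, 2.0, 3.0                 # illustrative values
# Dumbbell spinning at Omega about its centre, at the instant the bar lies along x:
# the relative velocity of the two masses is then (0, Omega * s).
T = tension(m, s, 0.0, Omega * s)
centripetal = m * Omega ** 2 * (s / 2.0)    # force holding one mass on a circle of radius s/2
print(T, centripetal)
```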
Example 1.5
A moped engine contains a vertically mounted piston of mass $m$ that is coupled to a fly-wheel of moment of inertia $I$ by a light connecting rod of length $l$. The system has only one degree of freedom but two natural coordinates, $\theta$ and $x$. The constraint equation is
$$l^2 = x^2 + r^2 - 2rx\cos\theta. \tag{1.69}$$
The Lagrangian is
$$L = \tfrac12 I\dot\theta^2 + \tfrac12 m\dot x^2 - mgx. \tag{1.70}$$
From (1.66) the equations of motion are
$$\begin{aligned} \frac{d}{dt}(m\dot x) &= -mg - \lambda(2x - 2r\cos\theta) \\ \frac{d}{dt}(I\dot\theta) &= -2\lambda rx\sin\theta. \end{aligned} \tag{1.71}$$
Eliminating $\lambda$ we find that $x$ and $\theta$ satisfy the d.e.
$$m\ddot x + \left(\frac{\cot\theta}{x} - \frac{\operatorname{cosec}\theta}{r}\right)I\ddot\theta + mg = 0. \tag{1.72}$$
This should be solved in conjunction with the constraint (1.69).
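The elimination of the multiplier that leads from (1.71) to (1.72) can be verified with arbitrary numbers. The sketch below assumes the constraint force enters (1.71) as $-\lambda\,\partial C/\partial q_i$ with $C = x^2 + r^2 - 2rx\cos\theta - l^2$; the final equation (1.72) does not depend on that sign choice. Function names and the sample values are illustrative only.

```python
import math

def xddot_from_171(theta, x, theta_ddot, m, I, r, g):
    """Solve the pair (1.71) for the piston acceleration by eliminating lambda."""
    lam = -I * theta_ddot / (2.0 * r * x * math.sin(theta))                # second of (1.71)
    return (-m * g - lam * (2.0 * x - 2.0 * r * math.cos(theta))) / m     # first of (1.71)

def residual_172(theta, x, x_ddot, theta_ddot, m, I, r, g):
    """Left-hand side of (1.72); it should vanish whenever (1.71) holds."""
    cot = math.cos(theta) / math.sin(theta)
    cosec = 1.0 / math.sin(theta)
    return m * x_ddot + (cot / x - cosec / r) * I * theta_ddot + m * g

args = dict(theta=0.7, x=1.3, theta_ddot=2.1, m=0.4, I=0.9, r=0.5, g=9.81)
x_dd = xddot_from_171(**args)
res = residual_172(x_ddot=x_dd, **args)
print(res)
```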
Sometimes it is in principle possible to write the Lagrangian in terms of as many coordinates as the system has degrees of freedom. In such a case the constraint is called holonomic. Clearly, the constraint (1.64) of the dumbbell is of this class, although in practice holonomic constraints will be more complex than (1.64) and correspondingly algebraically hard to eliminate.
Sometimes a constraint cannot be eliminated, even in principle. Such unavoidable constraints are called non-holonomic. The classic example of a non-holonomic constraint occurs in the problem of a rough ball moving on a rough plane. Five natural coordinates for the problem comprise the $(x, y)$ coordinates of the ball's centre together with three Euler angles to specify the ball's orientation. Two constraints couple the velocities of these coordinates since if the ball is moving parallel to either axis, it must be rolling and therefore the Euler angles must be incrementing in a definite way. On the other hand, it is not possible to eliminate any of these coordinates because it turns out that by rolling the ball to a chosen position, spinning it there about its point of contact with the plane and then rolling it back, one can arrange for any given values of the Euler angles to be associated with given values of $(x, y)$.
We can obtain equations of motion for the ball's five coordinates by a straightforward generalization of the formalism described above: we express the ball's Lagrangian (its kinetic energy) as a function of $q = (x, y, \theta, \phi, \psi)$ and their derivatives and then extremize the action subject to the two constraints $C_\alpha(q, \dot q) = 0$ ($\alpha = 1, 2$) on the positions and velocities.


2 Hamiltonian Dynamics
The Lagrangian of a dynamical system depends on $2N$ variables, the system's $N$ coordinates and $N$ velocities. The $2N$-dimensional space of initial conditions $(\mathbf q, \dot{\mathbf q})$ is called phase space. The equations of motion allow one to determine uniquely the system's future and past from its present position in phase space. Geometrically, through every point of phase space there runs a curve along which the system evolves. These curves never intersect one another.
It turns out that $(\dot{\mathbf q}, \mathbf q)$ are not the ideal coordinates for phase space. The natural coordinates are $(\mathbf p, \mathbf q)$, where
$$\mathbf p \equiv \frac{\partial L}{\partial\dot{\mathbf q}} \tag{2.1}$$
is the momentum conjugate to $\mathbf q$. Changing coordinates from $\dot{\mathbf q}$ to $\mathbf p$ is analogous in thermodynamics to replacing the volume $V$ by the pressure $P$ since $P = -(\partial U/\partial V)_S$ just as $\mathbf p = (\partial L/\partial\dot{\mathbf q})_{\mathbf q}$. We are replacing a variable by the gradient of some function of that variable. Transformations of this type are called Legendre transforms; see Box 3. When in thermodynamics we eliminate $V$ in favour of $P$, it is expedient to introduce a new function $H(S, P) \equiv U + PV$. So here we introduce the Hamiltonian
$$H(\mathbf p, \mathbf q) \equiv \mathbf p\cdot\dot{\mathbf q} - L, \tag{2.2}$$
where it is understood that $\dot{\mathbf q}$ is to be eliminated in favour of $\mathbf q$, $\mathbf p$, and $t$ using equation (2.1).

Box 3: Legendre transforms
Let $g(x)$ be a convex function, that is, a function such that $g''(x) > 0$. Then the Legendre transform $\bar g(p)$ of $g$ is defined by
$$\bar g(p) \equiv xp - g(x), \quad\text{where } x(p) \text{ is implicitly defined as the root for given } p \text{ of}\quad p = \frac{\partial g}{\partial x}. \tag{B3.1}$$
The convexity of $g$ guarantees that the equation defining $x(p)$ can be solved for any $p$ that lies between the maximum and minimum gradients of $g$. Thus $\bar g(p)$ is well defined. It is straightforward to show that Legendre transforms are invertible. In fact a Legendre transform is its own inverse: $\bar{\bar g}(x) = g(x)$.
It is often helpful to consider the function $\mathcal G(x, p) \equiv xp - g(x)$ of two independent variables $(x, p)$. Graphically, $\mathcal G(x, p)$ is the vertical displacement at abscissa $x$ between the straight line $y = px$ and the upward curving graph of $g(x)$.
[Figure: the straight line $y = px$ and the convex curve $g(x)$.]
The Legendre transform $\bar g(p)$ is the value of $\mathcal G$ at the point $x(p)$ at which the curve runs parallel to the line. Since
$$\frac{\partial\mathcal G}{\partial x} = p - \frac{\partial g}{\partial x}, \tag{B3.2}$$
$x(p)$ is the value of $x$ which extremizes $\mathcal G$ for given $p$, as is already evident from the figure.
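To make the Legendre transform concrete, here is a minimal numerical sketch (an illustration, not the notes' own method). It finds the root of $p = \partial g/\partial x$ by bisection, which works because the gradient of a convex function is increasing. It is applied to the free-particle Lagrangian $L(v) = \frac12 mv^2$, for which the transform should be $H(p) = p^2/2m$. Transforming the result again should return the original Lagrangian. The helper name `legendre` and the bracketing interval are assumptions of this example.

```python
def legendre(g, dg, p, lo=-100.0, hi=100.0, iters=200):
    """Legendre transform of a convex function g at slope p.

    Returns (x*p - g(x), x) where x solves dg(x) = p; the root is assumed
    to lie in [lo, hi].
    """
    for _ in range(iters):              # bisection: dg is an increasing function
        mid = 0.5 * (lo + hi)
        if dg(mid) < p:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    return x * p - g(x), x

m = 2.0
lagrangian = lambda v: 0.5 * m * v * v      # L as a function of velocity
d_lagrangian = lambda v: m * v              # p = dL/dv
H, v = legendre(lagrangian, d_lagrangian, 3.0)

hamiltonian = lambda p: p * p / (2.0 * m)   # the expected transform
d_hamiltonian = lambda p: p / m
L_back, p_back = legendre(hamiltonian, d_hamiltonian, v)
print(H, v, L_back, p_back)
```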
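Example 2.1
When the single degree of freedom of the moped of Example 1.5 is taken to be $\theta$ (that is, $x$ is considered to be a function of $\theta$), the momentum conjugate to $\theta$ is
$$p_\theta = \frac{\partial L}{\partial\dot\theta} = I\dot\theta + m\dot x\frac{\partial\dot x}{\partial\dot\theta}. \tag{2.3}$$
Differentiating the constraint equation first w.r.t. $t$ and then w.r.t. $\dot\theta$ we have
$$\begin{aligned} 0 &= 2\dot x(x - r\cos\theta) + 2rx\sin\theta\,\dot\theta \\ 0 &= \frac{\partial\dot x}{\partial\dot\theta}(x - r\cos\theta) + rx\sin\theta. \end{aligned} \tag{2.4}$$
Hence
$$p_\theta = \left[I + m\left(\frac{rx\sin\theta}{x - r\cos\theta}\right)^2\right]\dot\theta. \tag{2.5}$$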
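The total derivative of the Hamiltonian is
$$\begin{aligned} dH &= \mathbf p\cdot d\dot{\mathbf q} + \dot{\mathbf q}\cdot d\mathbf p - \left(\frac{\partial L}{\partial\mathbf q}\right)_{\dot{\mathbf q},t}\!\cdot d\mathbf q - \left(\frac{\partial L}{\partial\dot{\mathbf q}}\right)_{\mathbf q,t}\!\cdot d\dot{\mathbf q} - \left(\frac{\partial L}{\partial t}\right)_{\mathbf q,\dot{\mathbf q}} dt \\ &= \dot{\mathbf q}\cdot d\mathbf p - \left(\frac{\partial L}{\partial\mathbf q}\right)_{\dot{\mathbf q},t}\!\cdot d\mathbf q - \left(\frac{\partial L}{\partial t}\right)_{\mathbf q,\dot{\mathbf q}} dt, \end{aligned} \tag{2.6}$$
where the first and fourth terms cancel by (2.1). But we may also write
$$dH = \left(\frac{\partial H}{\partial\mathbf p}\right)_{\mathbf q,t}\!\cdot d\mathbf p + \left(\frac{\partial H}{\partial\mathbf q}\right)_{\mathbf p,t}\!\cdot d\mathbf q + \left(\frac{\partial H}{\partial t}\right)_{\mathbf q,\mathbf p} dt. \tag{2.7}$$
Since equations (2.6) and (2.7) must be the same, we have
$$\dot{\mathbf q} = \left(\frac{\partial H}{\partial\mathbf p}\right)_{\mathbf q,t}; \qquad \left(\frac{\partial H}{\partial\mathbf q}\right)_{\mathbf p,t} = -\left(\frac{\partial L}{\partial\mathbf q}\right)_{\dot{\mathbf q},t}; \qquad \left(\frac{\partial H}{\partial t}\right)_{\mathbf q,\mathbf p} = -\left(\frac{\partial L}{\partial t}\right)_{\mathbf q,\dot{\mathbf q}}. \tag{2.8}$$
Using the EL equations and simplifying the notation, the first two of these equations lead us to Hamilton's equations
$$\dot{\mathbf q} = \frac{\partial H}{\partial\mathbf p}\,; \qquad \dot{\mathbf p} = -\frac{\partial H}{\partial\mathbf q}. \tag{2.9}$$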
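Along a trajectory $\big(\mathbf q(t), \mathbf p(t)\big)$, the Hamiltonian $H\big(\mathbf q(t), \mathbf p(t), t\big)$ changes at a rate
$$\frac{dH}{dt} = \frac{\partial H}{\partial\mathbf q}\cdot\dot{\mathbf q} + \frac{\partial H}{\partial\mathbf p}\cdot\dot{\mathbf p} + \frac{\partial H}{\partial t} = \frac{\partial H}{\partial t}. \tag{2.10}$$
Hence, if $\partial L/\partial t = 0$, it follows from equation (2.8) that the Hamiltonian is conserved along all dynamical trajectories. We can think of this as an extension of Noether's theorem: the integral $H$ arises from the time-translation invariance of $L$.
For example, consider motion in the time-independent potential $V(\mathbf x)$. If we work in Cartesian coordinates, the Lagrangian $L = \frac12 m\dot{\mathbf x}^2 - V(\mathbf x)$ depends only on $\mathbf x$ and $\dot{\mathbf x}$, so $\partial L/\partial t = 0$. Hence the Hamiltonian $H$ is conserved. The physical quantity to which $H$ corresponds is easily found. We have $\mathbf p = \partial L/\partial\dot{\mathbf x} = m\dot{\mathbf x}$ and
$$H(\mathbf x, \mathbf p) = \mathbf p\cdot\dot{\mathbf x} - L = \frac{p^2}{2m} + V(\mathbf x), \tag{2.11}$$
which is simply the total energy $E$ = k.e. + p.e. Thus for motion in a fixed potential the Hamiltonian is equal to the total energy.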
Consider a harmonic oscillator: a particle of mass $m$ that oscillates at frequency $\omega$. The energy of this system is $\frac12 m\dot x^2 + \frac12 m\omega^2x^2$, so
$$H = \frac{p^2}{2m} + \tfrac12 m\omega^2x^2 \tag{2.12}$$
and Hamilton's equations are
$$\dot p = -\frac{\partial H}{\partial x} = -m\omega^2x\,; \qquad \dot x = \frac{\partial H}{\partial p} = \frac{p}{m}. \tag{2.13}$$
We could solve these equations by differentiating the second equation w.r.t. $t$ and using the first equation to eliminate $p$, but let's have a little quantum-mechanics inspired fun. Consider the variable $A \equiv p + im\omega x$, where the $m\omega$ factor ensures that both terms have the same dimensions (notice that $AA^* = 2mH$). $A$'s equation of motion is
$$\dot A = \dot p + im\omega\dot x = -m\omega^2x + i\omega p = i\omega A. \tag{2.14}$$
Solving this trivial equation of motion yields
$$\begin{aligned} A_t &= p_t + im\omega x_t = e^{i\omega t}(p_0 + im\omega x_0) \\ A_t^* &= p_t - im\omega x_t = e^{-i\omega t}(p_0 - im\omega x_0). \end{aligned}$$
Adding and subtracting these equations, we obtain the complete solution:
$$p_t = p_0\cos(\omega t) - m\omega x_0\sin(\omega t)\,; \qquad x_t = \frac{p_0}{m\omega}\sin(\omega t) + x_0\cos(\omega t). \tag{2.15}$$
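The closed-form solution (2.15) can be compared with a direct numerical integration of Hamilton's equations (2.13). This sketch is an illustration under stated assumptions: the kick-drift-kick (leapfrog) scheme and all parameter values are choices made here, not taken from the notes.

```python
import cmath

def exact(x0, p0, m, w, t):
    """Equation (2.15), evaluated via A_t = exp(i w t) A_0 with A = p + i m w x."""
    A = cmath.exp(1j * w * t) * (p0 + 1j * m * w * x0)
    return A.imag / (m * w), A.real            # (x_t, p_t)

def leapfrog(x0, p0, m, w, t, n=100000):
    """Integrate Hamilton's equations (2.13) with a kick-drift-kick scheme."""
    dt = t / n
    x, p = x0, p0
    for _ in range(n):
        p -= 0.5 * dt * m * w * w * x          # half kick:  dp/dt = -m w^2 x
        x += dt * p / m                         # drift:      dx/dt = p/m
        p -= 0.5 * dt * m * w * w * x          # half kick
    return x, p

x_num, p_num = leapfrog(1.0, 0.5, 2.0, 3.0, 5.0)
x_ex, p_ex = exact(1.0, 0.5, 2.0, 3.0, 5.0)
print(x_num - x_ex, p_num - p_ex)
```

The two answers should agree to within the truncation error of the integrator, which shrinks as the number of steps `n` grows.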
What are $\mathbf p$ and $H$ in a rotating frame? From (2.1) and (1.13) we have
$$\mathbf p = m(\dot{\mathbf r} + \boldsymbol\Omega\times\mathbf r) \tag{2.16}$$
which shows that $\mathbf p$ isn't always the same as $m\dot{\mathbf q}$. In fact, here $\mathbf p$ is identical with mass times velocity in the underlying inertial frame.
Using (2.16) to eliminate $\dot{\mathbf r}$ from (2.2) and (1.13) we find that the Hamiltonian for a rotating frame is
$$H = \mathbf p\cdot\left(\frac{\mathbf p}{m} - \boldsymbol\Omega\times\mathbf r\right) - \frac{p^2}{2m} + V = \frac{p^2}{2m} + V - \boldsymbol\Omega\cdot(\mathbf r\times\mathbf p). \tag{2.17}$$
The first two terms sum to the energy in an underlying inertial frame, and the last term is $-\boldsymbol\Omega\cdot\mathbf J$, where $\mathbf J$ is the angular momentum. Unless $V$ is axisymmetric [$V = V(|\mathbf r|)$], the energy in an inertial frame changes as $V$ does work on the particle, but $H$ is nonetheless constant.
Exercise (2):
Show that in a rotating frame we may write $H = \frac12 m|\dot{\mathbf r}|^2 - \frac12 m|\boldsymbol\Omega\times\mathbf r|^2 + V$. What is the physical interpretation of the second term on the r.h.s.?
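From the Lagrangian (1.31) for non-relativistic motion in an e.m. field we find
$$\mathbf p = m\dot{\mathbf x} + Q\mathbf A. \tag{2.18}$$
Thus in an e.m. field $\mathbf p$ is not just $m\dot{\mathbf x}$. In Problem 6 of Set 2 you can explain this result by demonstrating that the e.m. field contributes $Q\mathbf A$ to $\mathbf p$. In quantum mechanics the distinction between $\mathbf p$ and $m\dot{\mathbf x}$ is of the utmost importance because it turns out that when one quantizes, it is $\mathbf p$ rather than $m\dot{\mathbf x}$ that should be replaced by $-i\hbar\nabla$.
Using (2.18) in (2.2) we find $H$ for motion in an e.m. field is
$$\begin{aligned} H &= (m\dot{\mathbf x} + Q\mathbf A)\cdot\dot{\mathbf x} - \left[\tfrac12 m|\dot{\mathbf x}|^2 + Q(\dot{\mathbf x}\cdot\mathbf A - \phi)\right] \\ &= \tfrac12 m|\dot{\mathbf x}|^2 + Q\phi \\ &= \frac{1}{2m}|\mathbf p - Q\mathbf A|^2 + Q\phi. \end{aligned} \tag{2.19}$$
Although $H$ is just what one would naively think of as the energy, when expressed in terms of $\mathbf p$ it looks odd.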
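2.1 Liouville's theorem
If we imagine releasing a bunch of dynamically identical systems from neighbouring initial conditions, then the phase points describing these systems flow through phase space like a fluid. This flow is governed by Hamilton's equations (2.9). It is an incompressible flow: the velocity of the fluid is $(\dot{\mathbf p}, \dot{\mathbf q})$ and the divergence of this velocity is
$$\operatorname{div}(\dot{\mathbf p}, \dot{\mathbf q}) = \left(\frac{\partial}{\partial\mathbf p}\cdot\dot{\mathbf p} + \frac{\partial}{\partial\mathbf q}\cdot\dot{\mathbf q}\right) = \left(-\frac{\partial^2H}{\partial\mathbf p\,\partial\mathbf q} + \frac{\partial^2H}{\partial\mathbf q\,\partial\mathbf p}\right) = 0.$$
The divergence-freeness of the phase flow is known as Liouville's theorem.
Let $f$ be the probability density of systems in phase space. Then conservation of probability requires that $f$ obey the continuity equation
$$\begin{aligned} 0 &= \frac{\partial f}{\partial t} + \operatorname{div}\big[(\dot{\mathbf p}, \dot{\mathbf q})f\big] = \frac{\partial f}{\partial t} + \frac{\partial f}{\partial\mathbf p}\cdot\dot{\mathbf p} + \frac{\partial f}{\partial\mathbf q}\cdot\dot{\mathbf q} \\ &= \frac{\partial f}{\partial t} - \frac{\partial f}{\partial\mathbf p}\cdot\frac{\partial H}{\partial\mathbf q} + \frac{\partial f}{\partial\mathbf q}\cdot\frac{\partial H}{\partial\mathbf p} \end{aligned} \tag{2.20}$$
where Liouville's theorem has been used. The continuity equation of $f$ in either of the last two forms is known as Liouville's equation.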
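Incompressibility of the phase flow means that the map from initial to final phase-space points has unit Jacobian determinant. The sketch below checks this for a pendulum, $H = \frac12 p^2 - \cos q$, a system chosen here purely for illustration. The flow is approximated with a Runge-Kutta integrator and the Jacobian is estimated by central differences, so the result is 1 only up to small numerical errors.

```python
import math

def flow(q, p, t=2.0, n=2000):
    """Advance (q, p) for time t under H = p^2/2 - cos(q) using RK4."""
    dt = t / n
    def rhs(q, p):
        return p, -math.sin(q)           # (dq/dt, dp/dt) from Hamilton's equations
    for _ in range(n):
        k1q, k1p = rhs(q, p)
        k2q, k2p = rhs(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
        k3q, k3p = rhs(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
        k4q, k4p = rhs(q + dt * k3q, p + dt * k3p)
        q += dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0
        p += dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
    return q, p

def jacobian_det(q0, p0, h=1e-5):
    """Determinant of d(q_t, p_t)/d(q_0, p_0) by central differences."""
    qa, pa = flow(q0 + h, p0)
    qb, pb = flow(q0 - h, p0)
    qc, pc = flow(q0, p0 + h)
    qd, pd = flow(q0, p0 - h)
    dq_dq, dp_dq = (qa - qb) / (2 * h), (pa - pb) / (2 * h)
    dq_dp, dp_dp = (qc - qd) / (2 * h), (pc - pd) / (2 * h)
    return dq_dq * dp_dp - dq_dp * dp_dq

det = jacobian_det(1.0, 0.3)
print(det)
```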
2.2 The Hamiltonian principle of least action
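The principle of least action
$$0 = \delta S = \delta\int_{t_1}^{t_2} dt\,L(\mathbf q, \dot{\mathbf q}) \tag{2.21}$$
is concerned with paths $\mathbf q(t)$ through coordinate space. We can derive classical mechanics from another, closely related, variational principle which involves paths $\big(\mathbf p(t), \mathbf q(t)\big)$ through phase space rather than coordinate space. This principle is that the path actually followed between $(t_i, \mathbf q_i)$ and $(t_f, \mathbf q_f)$ is that for which
$$\delta S = 0 \quad\text{where}\quad S \equiv \int\big[\mathbf p\cdot d\mathbf q - H(\mathbf p, \mathbf q)\,dt\big]. \tag{2.22}$$
Here the path of integration runs between $(t_i, \mathbf q_i)$ and $(t_f, \mathbf q_f)$; neither $\mathbf p(t_i)$ nor $\mathbf p(t_f)$ is constrained. Showing that this principle yields Hamilton's equations (2.9) is easy:
$$\begin{aligned} \delta S &= \int\left(\delta\mathbf p\cdot\dot{\mathbf q} + \mathbf p\cdot\delta\dot{\mathbf q} - \frac{\partial H}{\partial\mathbf p}\cdot\delta\mathbf p - \frac{\partial H}{\partial\mathbf q}\cdot\delta\mathbf q\right)dt \\ &= \int\left[\left(\dot{\mathbf q} - \frac{\partial H}{\partial\mathbf p}\right)\cdot\delta\mathbf p - \left(\dot{\mathbf p} + \frac{\partial H}{\partial\mathbf q}\right)\cdot\delta\mathbf q\right]dt + \big[\mathbf p\cdot\delta\mathbf q\big]_{t_i}^{t_f}. \end{aligned} \tag{2.23}$$
Since $\delta\mathbf q$ vanishes at $t_i$ and $t_f$ by hypothesis, the final term in (2.23) vanishes. Then, with $\delta\mathbf p$ and $\delta\mathbf q$ subject to arbitrary variation, it is clear that $\delta S = 0$ only if the contents of the pairs of large round brackets in (2.23) vanish. But the vanishing of these brackets is precisely the content of Hamilton's equations.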
Notice that a very remarkable thing is being done with the variational principle (2.22): we are treating $\mathbf p$ as quite independent of the value of $\dot{\mathbf q}$ along the path. This makes perfectly good sense from the point of view of phase-space geometry, but it makes a mockery of our original definition (2.1) of $\mathbf p$. This definition is recovered for the true path as a consequence of the variational principle (2.22):
$$\dot{\mathbf q} = \frac{\partial H}{\partial\mathbf p} = \frac{\partial}{\partial\mathbf p}(\mathbf p\cdot\dot{\mathbf q} - L) = \dot{\mathbf q} + \left(\mathbf p - \frac{\partial L}{\partial\dot{\mathbf q}}\right)\cdot\frac{\partial\dot{\mathbf q}}{\partial\mathbf p}. \tag{2.24}$$
Recall that we introduced $H$ as $\mathbf p\cdot\dot{\mathbf q} - L$, with $\dot{\mathbf q}$ eliminated in favour of $\mathbf p$. Now that we are treating $\mathbf p$ as independent of $\dot{\mathbf q}$, $\mathbf p\cdot\dot{\mathbf q} - H$ becomes a quantity different from $L$; indeed, $L$ depends only on the projection of a phase-space path $\big(\mathbf p(t), \mathbf q(t)\big)$ onto configuration space, while $\mathbf p\cdot\dot{\mathbf q} - H$ depends on $\mathbf p(t)$ as well as $\mathbf q(t)$. Thus the action principle (2.22) is entirely different from (2.21), although the extremal values of the two integrals are the same because along the extremal path $\mathbf p = \partial L/\partial\dot{\mathbf q}$.
In Appendix III (2.22) is derived from the Schrödinger equation. The basic idea is simple: from the Schrödinger equation we calculate the quantum amplitude to get from $(t_i, \mathbf q_i)$ to $(t_f, \mathbf q_f)$ and show that it can be expressed as a sum over all possible paths between these events of amplitudes proportional to $e^{iS/\hbar}$, where $S$ is defined by (2.22). Then we argue that the only paths which make a net contribution to the overall amplitude are those whose values of $S$ lie within $\sim\hbar$ of a stationary value, since the contributions of other paths are cancelled by oppositely signed contributions from neighbouring paths. Thus the overall amplitude is dominated by contributions from paths that lie within $\sim\hbar$ of the classical, extremizing, path, and from a macroscopic point of view these paths are identical with the classical path.
2.3 Poisson brackets and canonical coordinates
Let $A(\mathbf q, \mathbf p)$ and $B(\mathbf q, \mathbf p)$ be any two functions of the phase-space coordinates. Then the Poisson bracket $[A, B]$ is defined by
$$[A, B] \equiv \frac{\partial A}{\partial\mathbf q}\cdot\frac{\partial B}{\partial\mathbf p} - \frac{\partial A}{\partial\mathbf p}\cdot\frac{\partial B}{\partial\mathbf q}. \tag{2.25}$$
It is straightforward to verify the following properties of Poisson brackets:
(i) $[A, B] = -[B, A]$ and $[A + B, C] = [A, C] + [B, C]$,
(ii) $[[A, B], C] + [[B, C], A] + [[C, A], B] = 0$ (Jacobi identity),
(iii) The coordinates $(\mathbf q, \mathbf p)$ satisfy the canonical commutation relations
$$[p_i, p_j] = [q_i, q_j] = 0 \quad\text{and}\quad [q_i, p_j] = \delta_{ij}. \tag{2.26}$$
(iv) Hamilton's equations may be written
$$\dot q_i = [q_i, H]\,; \qquad \dot p_i = [p_i, H]. \tag{2.27}$$
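Properties (iii) and (iv) are easy to check numerically. The following sketch evaluates the bracket (2.25) with central finite differences and applies it to the oscillator Hamiltonian (2.12). The function `poisson` and the sample point are illustrative assumptions, not part of the notes.

```python
def poisson(A, B, q, p, h=1e-5):
    """[A, B] of (2.25) at the point (q, p), using central finite differences.

    A and B are functions of two lists (q, p); q and p are lists of equal length.
    """
    def partial(F, slot, i):
        up, dn = [list(q), list(p)], [list(q), list(p)]
        up[slot][i] += h                 # slot 0 perturbs q_i, slot 1 perturbs p_i
        dn[slot][i] -= h
        return (F(*up) - F(*dn)) / (2.0 * h)
    return sum(partial(A, 0, i) * partial(B, 1, i) - partial(A, 1, i) * partial(B, 0, i)
               for i in range(len(q)))

m, w = 2.0, 3.0
H = lambda q, p: p[0] ** 2 / (2 * m) + 0.5 * m * w ** 2 * q[0] ** 2   # oscillator (2.12)
Q0 = lambda q, p: q[0]
P0 = lambda q, p: p[0]

q, p = [0.7], [1.1]
qp = poisson(Q0, P0, q, p)       # canonical relation (2.26): should be 1
qdot = poisson(Q0, H, q, p)      # (2.27): should equal p/m
pdot = poisson(P0, H, q, p)      # (2.27): should equal -m w^2 q
antisym = poisson(H, Q0, q, p)   # property (i): should equal -qdot
print(qp, qdot, pdot, antisym)
```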
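If we write $(w_i \equiv q_i,\ w_{N+i} \equiv p_i,\ i = 1, \ldots, N)$, and define the symplectic matrix $c$ by
$$c_{\alpha\beta} \equiv [w_\alpha, w_\beta] = \begin{cases} \pm1 & \text{for } \alpha = \beta \mp N,\ 1 \le \alpha, \beta \le 2N; \\ 0 & \text{otherwise,} \end{cases} \tag{2.28a}$$
we have
$$[A, B] = \sum_{\alpha,\beta=1}^{2N} c_{\alpha\beta}\frac{\partial A}{\partial w_\alpha}\frac{\partial B}{\partial w_\beta}. \tag{2.28b}$$
Any set of $2N$ phase-space coordinates $W_\alpha$ ($\alpha = 1, \ldots, 2N$) is called a set of canonical coordinates if $[W_\alpha, W_\beta] = c_{\alpha\beta}$. Let $W_\alpha$ be such a set; then with equation (2.28b) and the chain rule we have
$$\begin{aligned} [A, B] &= \sum_{\alpha,\beta=1}^{2N} c_{\alpha\beta}\frac{\partial A}{\partial w_\alpha}\frac{\partial B}{\partial w_\beta} = \sum_{\kappa,\lambda}\frac{\partial A}{\partial W_\kappa}\frac{\partial B}{\partial W_\lambda}[W_\kappa, W_\lambda] \\ &= \sum_{i=1}^{N}\left(\frac{\partial A}{\partial W_i}\frac{\partial B}{\partial W_{N+i}} - \frac{\partial A}{\partial W_{N+i}}\frac{\partial B}{\partial W_i}\right). \end{aligned} \tag{2.29}$$
Thus the derivatives involved in the definition (2.25) of the Poisson bracket can be taken with respect to any set of canonical coordinates, just as the vector formula $\nabla\cdot\mathbf a = \sum_i(\partial a_i/\partial x_i)$ is valid in any Cartesian coordinate system.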
Box 4: Lorentz invariance & Symplectic structure
inertial coordinates ↔ canonical coordinates
Lorentz transformations ↔ canonical transformations
Lorentz invariant $|x|^2$ ↔ $\iint d\mathbf p\cdot d\mathbf q$ (Poincaré invariant)
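The rate of change of an arbitrary canonical coordinate $W_\alpha$ along an orbit is
$$\dot W_\alpha = \sum_{\beta=1}^{2N}\frac{\partial W_\alpha}{\partial w_\beta}\dot w_\beta, \tag{2.30}$$
where, as usual, $w \equiv (\mathbf q, \mathbf p)$. With Hamilton's equations (2.27) and equation (2.29) this becomes
$$\dot W_\alpha = \sum_{\beta=1}^{2N}\frac{\partial W_\alpha}{\partial w_\beta}[w_\beta, H] = \sum_{\beta,\gamma} c_{\beta\gamma}\frac{\partial W_\alpha}{\partial w_\beta}\frac{\partial H}{\partial w_\gamma} = [W_\alpha, H]. \tag{2.31}$$
Choosing to use the $W_i$ as independent coordinates when evaluating the Poisson bracket, we find that $\dot Q_i = \partial H/\partial P_i$, $\dot P_i = -\partial H/\partial Q_i$, so Hamilton's equations (2.9) are valid in any canonical coordinate system.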
Poisson brackets allow us to associate a one-parameter family of maps $\mathcal B_b$ of phase space onto itself with any function $B(\mathbf q, \mathbf p)$ on phase space: from each point $(\mathbf q_0, \mathbf p_0)$ of some $(2N-1)$-dimensional surface in phase space we integrate the coupled ordinary differential equations
$$\frac{d\mathbf q}{db} = [\mathbf q, B] = \frac{\partial B}{\partial\mathbf p}, \qquad \frac{d\mathbf p}{db} = [\mathbf p, B] = -\frac{\partial B}{\partial\mathbf q} \tag{2.32}$$
from the initial conditions $\mathbf q(0) = \mathbf q_0$, $\mathbf p(0) = \mathbf p_0$. If the initial $(2N-1)$-surface is large enough, the integral curves $\mathbf q(b)$, $\mathbf p(b)$ of $B$ reach every point of phase space. Then the map $\mathcal B_b$ is defined by
$$\mathcal B_b\big(\mathbf q(b'), \mathbf p(b')\big) = \big(\mathbf q(b + b'), \mathbf p(b + b')\big). \tag{2.33}$$
The generator of the transformation, $B(\mathbf q, \mathbf p)$, is indistinguishable from a Hamiltonian, since it satisfies Hamilton's equations (2.32), with $b$ playing the role of the time $t$.
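In Lagrangian mechanics, invariance of the Lagrangian under a flow in configuration space gives rise to a conserved quantity (Noether's theorem). In Hamiltonian mechanics the analogue of a flow that doesn't change the Lagrangian is a map $\mathcal B_b$ that doesn't change the value of $H$. For $\mathcal B_b$ to have this property, we must have
$$0 = \frac{dH}{db} = \frac{\partial H}{\partial\mathbf q}\cdot\frac{d\mathbf q}{db} + \frac{\partial H}{\partial\mathbf p}\cdot\frac{d\mathbf p}{db} = \frac{\partial H}{\partial\mathbf q}\cdot\frac{\partial B}{\partial\mathbf p} - \frac{\partial H}{\partial\mathbf p}\cdot\frac{\partial B}{\partial\mathbf q} = [H, B]. \tag{2.34}$$
That is, the generator of a phase-space flow that leaves $H$ invariant has vanishing Poisson bracket with $H$ ("commutes with $H$").
The rate of change of $B$ along our system's trajectory is
$$\frac{dB}{dt} = \frac{\partial B}{\partial\mathbf q}\cdot\dot{\mathbf q} + \frac{\partial B}{\partial\mathbf p}\cdot\dot{\mathbf p} = \frac{\partial B}{\partial\mathbf q}\cdot\frac{\partial H}{\partial\mathbf p} - \frac{\partial B}{\partial\mathbf p}\cdot\frac{\partial H}{\partial\mathbf q} = [B, H]. \tag{2.35}$$
Thus $\dot B = 0$ if and only if $H$ is invariant under the flow that $B$ generates. This Hamiltonian formulation of the connection between constants of motion and invariance under flows goes further than Noether's theorem because it shows that every constant of motion is associated with a flow that leaves $H$ invariant.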
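As a concrete instance of (2.34) and (2.35), consider a particle in a planar central potential. The angular momentum $L_z = xp_y - yp_x$ generates rotations, which leave $H$ invariant, so $[L_z, H]$ should vanish, whereas $[p_x, H] = -\partial V/\partial x$ does not. The sketch below checks this with a finite-difference bracket; the quartic potential and all names are assumptions made for this example.

```python
def poisson(A, B, q, p, h=1e-5):
    """Poisson bracket (2.25) at (q, p) by central finite differences."""
    def partial(F, slot, i):
        up, dn = [list(q), list(p)], [list(q), list(p)]
        up[slot][i] += h                 # slot 0 perturbs q_i, slot 1 perturbs p_i
        dn[slot][i] -= h
        return (F(*up) - F(*dn)) / (2.0 * h)
    return sum(partial(A, 0, i) * partial(B, 1, i) - partial(A, 1, i) * partial(B, 0, i)
               for i in range(len(q)))

# Unit-mass particle in the (hypothetical) central potential V = (x^2 + y^2)^2 / 4.
H = lambda q, p: 0.5 * (p[0] ** 2 + p[1] ** 2) + 0.25 * (q[0] ** 2 + q[1] ** 2) ** 2
Lz = lambda q, p: q[0] * p[1] - q[1] * p[0]     # generator of rotations
px = lambda q, p: p[0]                          # generator of translations in x

q, p = [0.3, -0.8], [1.2, 0.4]
bracket_Lz = poisson(Lz, H, q, p)   # rotations leave H invariant, so this should vanish
bracket_px = poisson(px, H, q, p)   # equals -dV/dx = -(x^2 + y^2) x, which is non-zero here
print(bracket_Lz, bracket_px)
```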
2.4 Canonical transformations
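Suppose you have a function $S(\mathbf P, \mathbf q)$ of some new variables $P_i$, $i = 1, \ldots, N$ and the regular coordinates $q_i$ such that the equation
$$\mathbf p = \frac{\partial S}{\partial\mathbf q} \tag{2.36a}$$
can be interpreted as defining $\mathbf P(\mathbf p, \mathbf q)$. Then it turns out that the coordinates $(\mathbf P, \mathbf Q)$ are canonical, where
$$\mathbf Q \equiv \frac{\partial S}{\partial\mathbf P}. \tag{2.36b}$$
That is, one may show (see Appendix II) that with these definitions, $[Q_i, Q_j] = 0$, $[Q_i, P_j] = \delta_{ij}$, $[P_i, P_j] = 0$. The transformation $(\mathbf p, \mathbf q) \to (\mathbf P, \mathbf Q)$ is called a canonical transformation and $S$ the generating function of the transformation.
The function that generates a canonical transformation need not be of the form $S(\mathbf P, \mathbf q)$; other forms are $S(\mathbf P, \mathbf p)$, $S(\mathbf Q, \mathbf q)$ and $S(\mathbf Q, \mathbf p)$. The generating function is always a function of one old coordinate and one new one. An entertaining transformation is generated by $S = \mathbf Q\cdot\mathbf q$:
$$\mathbf p = \frac{\partial S}{\partial\mathbf q} = \mathbf Q\,; \qquad \mathbf P = -\frac{\partial S}{\partial\mathbf Q} = -\mathbf q. \tag{2.37}$$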
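Canonical transformations are closely connected to the one-parameter maps introduced above. To see this consider functions $S$ of the form
$$S = \mathbf P\cdot\mathbf q + s(\mathbf P, \mathbf q)\,\delta u, \tag{2.38}$$
where $\delta u \ll 1$. For $S$ of this form we have
$$\mathbf Q = \mathbf q + \frac{\partial s}{\partial\mathbf P}\delta u\,; \qquad \mathbf p = \mathbf P + \frac{\partial s}{\partial\mathbf q}\delta u \quad\Rightarrow\quad \mathbf P = \mathbf p - \frac{\partial s}{\partial\mathbf q}\delta u. \tag{2.39}$$
Thus $S = \mathbf P\cdot\mathbf q$ generates the identity transformation $\mathbf P = \mathbf p$, $\mathbf Q = \mathbf q$. Moreover,
$$\frac{\mathbf Q - \mathbf q}{\delta u} = \frac{\partial s}{\partial\mathbf P}\,; \qquad \frac{\mathbf P - \mathbf p}{\delta u} = -\frac{\partial s}{\partial\mathbf q}. \tag{2.40}$$
In the limit $\delta u \to 0$ we can identify $\mathbf P$ with $\mathbf p$ on the right, and these equations become
$$\frac{d\mathbf q}{du} = [\mathbf q, s]\,; \qquad \frac{d\mathbf p}{du} = [\mathbf p, s], \tag{2.41}$$
which is identical with (2.32). Thus canonical transformations generated by functions of the form (2.38) may be thought of as infinitesimal canonical maps.
There is no fundamental difference between a map and a coordinate transformation: every map generates a coordinate transformation and every transformation a map since one can treat changed coordinates as new numbers describing an old point (a coordinate change), or as old numbers describing a new point (a mapping).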
2.5 Point transformations
If $(Q_i(\mathbf q),\ i = 1, \ldots, N)$ are any $N$ independent functions of the generalized coordinates $\mathbf q$, then by equation (2.1) we obtain the new momenta $P_i = (\partial L/\partial\dot Q_i)$ by expressing the Lagrangian as a function $L(\mathbf Q, \dot{\mathbf Q})$ of the $Q_i$ and their time derivatives. The coordinate change $(\mathbf q, \mathbf p) \to (\mathbf Q, \mathbf P)$ is called a point transformation, because the new coordinates are functions only of the old. It is straightforward to show that the new coordinates are canonical, by evaluating their Poisson brackets.
The importance of these results is that it is often convenient to work in curvilinear coordinates $\mathbf Q$ and derive the corresponding momenta $\mathbf P = (\partial L/\partial\dot{\mathbf Q})$. Since the coordinates $(\mathbf Q, \mathbf P)$ are canonical, the Poisson bracket (2.25) can be equally well evaluated by taking derivatives with respect to $\mathbf Q$ and $\mathbf P$ as with respect to $\mathbf q$ and $\mathbf p$. Hence all curvilinear coordinates have equal status in Hamiltonian mechanics.
Example 2.2
A particle of mass $m$ and charge $Q_1$ moves in a bound orbit around a fixed charge $Q_2$ in the plane perpendicular to a constant magnetic field $\mathbf B$. Determine the system's Hamiltonian in polar coordinates $(r, \phi)$ on the orbital plane. Hence show that $mr^2\dot\phi + \frac12 Q_1r^2B$ is constant on the orbit.
Solution: The vector potential can be written $\mathbf A = \frac12 rB\,\mathbf e_\phi$. From (1.31) the Lagrangian is
$$L = \tfrac12 m(\dot r^2 + r^2\dot\phi^2) + Q_1\left(r\dot\phi\,\tfrac12 rB - \frac{Q_2}{4\pi\epsilon_0r}\right), \tag{2.42}$$
so the momenta are
$$p_r = m\dot r\,; \qquad p_\phi = mr^2\dot\phi + \tfrac12 Q_1r^2B. \tag{2.43}$$
Finally, the Hamiltonian is
$$H(p_r, p_\phi, r, \phi) = \frac{p_r^2}{2m} + \frac{(p_\phi - \frac12 Q_1Br^2)^2}{2mr^2} + \frac{Q_1Q_2}{4\pi\epsilon_0r}.$$
The constancy of $p_\phi$ follows because $H$ is independent of $\phi$. Notice that this Hamiltonian is not simply the translation into polar coordinates of equation (2.19), which gives $H$ in Cartesian coordinates: when translating $H$ from one coordinate system to another one must pass through the Lagrangian.
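The conserved quantity of Example 2.2 can be checked by integrating the orbit in Cartesian coordinates, where $mr^2\dot\phi = m(xv_y - yv_x)$. The sketch below is illustrative only: it works in units where $m = Q_1 = 1$, takes $B = 0.5$, and writes `k` for the combination $Q_1Q_2/(4\pi\epsilon_0)$ with `k = -1` (attraction, hence a bound orbit). The integrator and initial conditions are assumptions of this example.

```python
def rhs(s, m=1.0, Q1=1.0, B=0.5, k=-1.0):
    """Planar motion of charge Q1 in B = B k plus the Coulomb field of a fixed charge.

    k stands for Q1*Q2/(4 pi eps0); k < 0 means the charges attract.
    """
    x, y, vx, vy = s
    r3 = (x * x + y * y) ** 1.5
    ax = (Q1 * B * vy + k * x / r3) / m      # Lorentz force plus Coulomb force, x part
    ay = (-Q1 * B * vx + k * y / r3) / m     # same for the y part
    return (vx, vy, ax, ay)

def rk4_step(f, s, dt):
    """One classical fourth-order Runge-Kutta step."""
    shifted = lambda a, kk, c: tuple(ai + c * ki for ai, ki in zip(a, kk))
    k1 = f(s)
    k2 = f(shifted(s, k1, 0.5 * dt))
    k3 = f(shifted(s, k2, 0.5 * dt))
    k4 = f(shifted(s, k3, dt))
    return tuple(si + dt * (a + 2 * b + 2 * c + d) / 6.0
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

def p_phi(s, m=1.0, Q1=1.0, B=0.5):
    """m r^2 phidot + (1/2) Q1 r^2 B of (2.43), written in Cartesian variables."""
    x, y, vx, vy = s
    return m * (x * vy - y * vx) + 0.5 * Q1 * B * (x * x + y * y)

state = (1.0, 0.0, 0.0, 1.0)
start = p_phi(state)
for _ in range(20000):
    state = rk4_step(rhs, state, 0.001)
change = abs(p_phi(state) - start)
print(change)
```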
2.6 Hamilton-Jacobi Equation
Suppose we could find $N$ constants of motion $I_1, \ldots, I_N$. And suppose it were possible to find a system of canonical coordinates $(\mathbf P, \mathbf Q)$ such that $P_i = I_i$ etc. Then the equations of motion for the $P$s would be trivial,
$$0 = \dot P_i = [P_i, H] = -\frac{\partial H}{\partial Q_i}, \tag{2.44}$$
and would demonstrate that $H(\mathbf P)$ would be independent of the $Q$s. This last observation would allow us to solve the equations of motion for the $Q$s: we would have
$$\dot Q_i = \frac{\partial H}{\partial P_i} \equiv \omega_i, \text{ a constant} \quad\Rightarrow\quad Q_i(t) = Q_i(0) + \omega_it. \tag{2.45}$$
So everything would lie at our feet if we could find $N$ constants of the motion and could embed these as the momenta of a system of canonical coordinates.[2] The magic coordinates $\mathbf P \equiv \mathbf I$ and $\mathbf Q$ are called action-angle coordinates, the $I$s being the actions and the $Q$s the angles.
[2] Notice that to be able to embed the $I$s as a set of momenta, we require $[I_i, I_j] = 0$; functions satisfying this condition are said to be in involution.
Let $S(\mathbf I, \mathbf q)$ be the generating function of the transformation between regular coordinates $(\mathbf p, \mathbf q)$ and action-angle coordinates. Then we can use this to eliminate $\mathbf p = \partial S/\partial\mathbf q$ from $H$, expressing $H$ as a function of $(\mathbf I, \mathbf q)$:
$$\tilde H(\mathbf I, \mathbf q) \equiv H\!\left(\frac{\partial S}{\partial\mathbf q}, \mathbf q\right). \tag{2.46}$$
By moving on an orbit we can vary the $q_i$ pretty much at will while holding constant the $I_i$. As we vary the $q_i$ in this way $H$ must remain constant at the energy $E$ of the orbit in question. This suggests that we investigate the non-linear partial differential equation
$$H\!\left(\frac{\partial S}{\partial\mathbf q}, \mathbf q\right) = E \qquad\text{(Hamilton-Jacobi equation)}. \tag{2.47}$$
If we can solve this equation, we identify the arbitrary constants on which the solution $S(\mathbf q)$ depends with functions of the constants of motion $I_i$. For example, the H-J equation for a free particle moving in two dimensions is
$$\frac{|\nabla S|^2}{2m} = E. \tag{2.48}$$
We write $S(\mathbf x) = S_x(x) + S_y(y)$ and solve (2.48) by separation of variables:
$$\text{constant} \equiv I_x = \left(\frac{\partial S_x}{\partial x}\right)^2 = 2mE - \left(\frac{\partial S_y}{\partial y}\right)^2 \equiv 2mE - I_y. \tag{2.49}$$
This example is very tame, but the technique works also for more complicated Hamiltonians that cannot be solved by other means.
The similarity between the H-J equation and the time-independent Schrödinger equation is obvious. We can derive the H-J equation from QM as follows. For simplicity we consider the special case of a particle that moves in a potential $V(\mathbf x)$. If the particle has well-defined energy, its wavefunction $\psi(\mathbf x)$ must satisfy the time-independent Schrödinger equation (TISE) $E\psi = H\psi = (p^2/2m + V)\psi$. Without loss of generality, we can write $\psi = e^{iS/\hbar}$, where $S(\mathbf x)$ is a possibly complex function of $\mathbf x$. Then
$$p^2\psi = -\hbar^2\nabla\cdot\left(e^{iS/\hbar}\frac{i\nabla S}{\hbar}\right) = e^{iS/\hbar}\left(|\nabla S|^2 - i\hbar\nabla^2S\right). \tag{2.50}$$
Since we are dealing with classical mechanics, we are interested in the limit $\hbar \to 0$. Then the second term in the bracket vanishes and the TISE becomes
$$0 = \frac{p^2\psi}{2m} + (V - E)\psi = e^{iS/\hbar}\left(\frac{|\nabla S|^2}{2m} + V - E\right), \tag{2.51}$$
which is just $e^{iS/\hbar}$ times the H-J equation. This derivation reveals that the generating function of the transformation from ordinary to action-angle coordinates is $\hbar$ times the phase of the particle's wavefunction. When one passes from wave optics to geometrical optics, one neglects a term equivalent to that dropped from (2.50). Dropping this term is called making the eikonal approximation. The approximation is good when many wavelengths are contained within the smallest length within which $\nabla S$ changes appreciably.
2.7 Phase-space volumes
Often, for example when doing statistical mechanics, one needs a credible definition of phase-space volume. If one is using Cartesian coordinates to describe a system of $n$ particles of mass $m_i$, it is natural to take the volume element to be $d\tau = \prod_i^n(m_i^3\,d^3\mathbf x_i\,d^3\mathbf v_i)$. But it isn't immediately obvious what to use for $d\tau$ in a more complex case. In particular, if one decided to describe the system of particles by some curvilinear coordinates $\mathbf q(\mathbf x)$ and their conjugate momenta $\mathbf p$, one would expect $d\tau$ to be of the form
$$d\tau = \prod_{i=1}^n\left(\frac{\partial(m_i\mathbf v_i, \mathbf x_i)}{\partial(\mathbf p_i, \mathbf q_i)}\,d^3\mathbf p_i\,d^3\mathbf q_i\right). \tag{2.52}$$
One of the most beautiful and useful results in the subject is that the Jacobian here is just one. In fact, the Jacobian between any pair of canonical coordinates is always one. That is, the volume of an arbitrary region is
$$V = \int_V d^N\mathbf p\,d^N\mathbf q = \int_V d^N\mathbf P\,d^N\mathbf Q, \tag{2.53}$$
where $(\mathbf p, \mathbf q)$ and $(\mathbf P, \mathbf Q)$ are any canonical coordinates.
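The unit Jacobian can be seen explicitly for the point transformation from planar Cartesian coordinates $(x, y, p_x, p_y)$ to polar coordinates $(r, \phi, p_r, p_\phi)$, with $p_r = (xp_x + yp_y)/r$ and $p_\phi = xp_y - yp_x$. The sketch below estimates the $4\times4$ Jacobian matrix by central differences and evaluates its determinant by Gaussian elimination. It is an illustration with hand-picked numbers, and the helper names are assumptions.

```python
import math

def to_polar(w):
    """(x, y, px, py) -> (r, phi, p_r, p_phi) for a planar particle."""
    x, y, px, py = w
    r = math.hypot(x, y)
    return [r, math.atan2(y, x), (x * px + y * py) / r, x * py - y * px]

def jacobian(f, w, h=1e-6):
    """Matrix of partial derivatives d f_i / d w_j by central differences."""
    n = len(w)
    cols = []
    for j in range(n):
        up, dn = list(w), list(w)
        up[j] += h
        dn[j] -= h
        cols.append([(a - b) / (2.0 * h) for a, b in zip(f(up), f(dn))])
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    n, d = len(M), 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        if M[piv][c] == 0.0:
            return 0.0
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d                          # a row swap flips the sign
        d *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= factor * M[c][k]
    return d

J = det(jacobian(to_polar, [0.8, -0.5, 0.3, 1.1]))
print(J)
```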
Appendix I Derivation of equation (1.25)
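Since the particle coordinates $\mathbf r_i$ are functions of the six generalized coordinates $q_k$, we have that
$$\dot{\mathbf r}_i = \sum_{k=1}^6\frac{\partial\mathbf r_i}{\partial q_k}\dot q_k \quad\Rightarrow\quad \ddot{\mathbf r}_i = \sum_{k,l=1}^6\frac{\partial^2\mathbf r_i}{\partial q_l\partial q_k}\dot q_l\dot q_k + \sum_{k=1}^6\frac{\partial\mathbf r_i}{\partial q_k}\ddot q_k, \tag{I.1}$$
so (1.24) can be written
$$0 = \sum_{i=1}^N m_i\left(\sum_{k,l=1}^6\frac{\partial^2\mathbf r_i}{\partial q_l\partial q_k}\dot q_l\dot q_k + \sum_{k=1}^6\frac{\partial\mathbf r_i}{\partial q_k}\ddot q_k\right)\cdot\frac{\partial\mathbf r_i}{\partial q_j} - Q_j. \tag{I.2}$$
By the chain rule the body's k.e. is
$$T = \frac12\sum_{i=1}^N m_i\left|\sum_{k=1}^6\frac{\partial\mathbf r_i}{\partial q_k}\dot q_k\right|^2, \tag{I.3}$$
so
$$\frac{\partial T}{\partial\dot q_j} = \sum_i m_i\left(\sum_{k=1}^6\frac{\partial\mathbf r_i}{\partial q_k}\dot q_k\right)\cdot\frac{\partial\mathbf r_i}{\partial q_j} \tag{I.4}$$
and
$$\frac{d}{dt}\left(\frac{\partial T}{\partial\dot q_j}\right) = \sum_{i=1}^N m_i\left[\left(\sum_{kl}\frac{\partial^2\mathbf r_i}{\partial q_l\partial q_k}\dot q_l\dot q_k + \sum_k\frac{\partial\mathbf r_i}{\partial q_k}\ddot q_k\right)\cdot\frac{\partial\mathbf r_i}{\partial q_j} + \left(\sum_k\frac{\partial\mathbf r_i}{\partial q_k}\dot q_k\right)\cdot\sum_l\frac{\partial^2\mathbf r_i}{\partial q_l\partial q_j}\dot q_l\right]. \tag{I.5}$$
This expression for $(d/dt)(\partial T/\partial\dot q_j)$ contains two of the terms that appear in equation (I.2). Its last term is unwanted. We can obtain an alternative expression for this unwanted term by calculating
$$\frac{\partial T}{\partial q_j} = \sum_{i=1}^N m_i\left(\sum_k\frac{\partial\mathbf r_i}{\partial q_k}\dot q_k\right)\cdot\left(\sum_l\frac{\partial^2\mathbf r_i}{\partial q_j\partial q_l}\dot q_l\right). \tag{I.6}$$
Substituting (I.6) into (I.5) and then using the result to simplify (I.2) we obtain (1.25).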
Appendix II Proof that generating functions generate canonical transformations
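We prove that given $S(\mathbf q, \mathbf P)$, $\mathbf P$ and $\mathbf Q \equiv \partial S/\partial\mathbf P$ satisfy the canonical commutation relations. From the chain rule we have that
$$\left(\frac{\partial}{\partial\mathbf p}\right)_{\mathbf q} = \left(\frac{\partial\mathbf P}{\partial\mathbf p}\right)_{\mathbf q}\cdot\left(\frac{\partial}{\partial\mathbf P}\right)_{\mathbf q}; \qquad \left(\frac{\partial}{\partial\mathbf q}\right)_{\mathbf p} = \left(\frac{\partial}{\partial\mathbf q}\right)_{\mathbf P} + \left(\frac{\partial\mathbf P}{\partial\mathbf q}\right)_{\mathbf p}\cdot\left(\frac{\partial}{\partial\mathbf P}\right)_{\mathbf q}. \tag{II.1}$$
Applying these formulae to $p_i$ and using $\partial p_i/\partial\mathbf P = \partial^2S/\partial q_i\partial\mathbf P = \partial\mathbf Q/\partial q_i$ yields
$$\delta_{ij} = \left(\frac{\partial\mathbf P}{\partial p_j}\right)_{\mathbf q}\cdot\left(\frac{\partial\mathbf Q}{\partial q_i}\right)_{\mathbf P}; \qquad \left(\frac{\partial p_i}{\partial q_j}\right)_{\mathbf P} = -\left(\frac{\partial\mathbf P}{\partial q_j}\right)_{\mathbf p}\cdot\left(\frac{\partial\mathbf Q}{\partial q_i}\right)_{\mathbf P}. \tag{II.2}$$
Multiplying these equations together and summing over $j$ we find
$$\sum_{kl}\left(\frac{\partial Q_k}{\partial q_i}\right)_{\mathbf P}\left(\frac{\partial Q_l}{\partial q_{i'}}\right)_{\mathbf P}[P_k, P_l] = -\left(\frac{\partial p_i}{\partial q_{i'}}\right)_{\mathbf P} + \left(\frac{\partial p_{i'}}{\partial q_i}\right)_{\mathbf P} = -\frac{\partial^2S}{\partial q_{i'}\partial q_i} + \frac{\partial^2S}{\partial q_i\partial q_{i'}} = 0. \tag{II.3}$$
Since the matrix $\partial Q_k/\partial q_i$ has an inverse by (II.2), this shows that $[P_k, P_l] = 0$.
Working again from equations (II.1) we have
$$\begin{aligned} [Q_i, P_j] &= \left(\frac{\partial Q_i}{\partial\mathbf q}\right)_{\mathbf p}\cdot\left(\frac{\partial P_j}{\partial\mathbf p}\right)_{\mathbf q} - \left(\frac{\partial Q_i}{\partial\mathbf p}\right)_{\mathbf q}\cdot\left(\frac{\partial P_j}{\partial\mathbf q}\right)_{\mathbf p} \\ &= \left[\left(\frac{\partial Q_i}{\partial\mathbf q}\right)_{\mathbf P} + \left(\frac{\partial Q_i}{\partial\mathbf P}\right)_{\mathbf q}\cdot\left(\frac{\partial\mathbf P}{\partial\mathbf q}\right)_{\mathbf p}\right]\cdot\left(\frac{\partial P_j}{\partial\mathbf p}\right)_{\mathbf q} - \left(\frac{\partial Q_i}{\partial\mathbf P}\right)_{\mathbf q}\cdot\left(\frac{\partial\mathbf P}{\partial\mathbf p}\right)_{\mathbf q}\cdot\left(\frac{\partial P_j}{\partial\mathbf q}\right)_{\mathbf p} \\ &= \left(\frac{\partial Q_i}{\partial\mathbf q}\right)_{\mathbf P}\cdot\left(\frac{\partial P_j}{\partial\mathbf p}\right)_{\mathbf q} + \left(\frac{\partial Q_i}{\partial\mathbf P}\right)_{\mathbf q}\cdot[\mathbf P, P_j] \\ &= \frac{\partial^2S}{\partial P_i\partial\mathbf q}\cdot\left(\frac{\partial P_j}{\partial\mathbf p}\right)_{\mathbf q} = \left(\frac{\partial\mathbf p}{\partial P_i}\right)_{\mathbf q}\cdot\left(\frac{\partial P_j}{\partial\mathbf p}\right)_{\mathbf q} = \delta_{ij}. \end{aligned} \tag{II.4}$$
Similarly,
$$\begin{aligned} [Q_i, Q_j] &= \left[\left(\frac{\partial Q_i}{\partial\mathbf q}\right)_{\mathbf P} + \left(\frac{\partial Q_i}{\partial\mathbf P}\right)_{\mathbf q}\cdot\left(\frac{\partial\mathbf P}{\partial\mathbf q}\right)_{\mathbf p}\right]\cdot\left(\frac{\partial Q_j}{\partial\mathbf p}\right)_{\mathbf q} - \left(\frac{\partial Q_i}{\partial\mathbf P}\right)_{\mathbf q}\cdot\left(\frac{\partial\mathbf P}{\partial\mathbf p}\right)_{\mathbf q}\cdot\left(\frac{\partial Q_j}{\partial\mathbf q}\right)_{\mathbf p} \\ &= \left(\frac{\partial Q_i}{\partial\mathbf q}\right)_{\mathbf P}\cdot\left(\frac{\partial Q_j}{\partial\mathbf p}\right)_{\mathbf q} + \left(\frac{\partial Q_i}{\partial\mathbf P}\right)_{\mathbf q}\cdot[\mathbf P, Q_j] \\ &= \left(\frac{\partial Q_i}{\partial\mathbf q}\right)_{\mathbf P}\cdot\left(\frac{\partial Q_j}{\partial\mathbf p}\right)_{\mathbf q} - \left(\frac{\partial Q_i}{\partial P_j}\right)_{\mathbf q} \\ &= \sum_k\frac{\partial^2S}{\partial P_i\partial q_k}\left(\frac{\partial Q_j}{\partial\mathbf P}\right)_{\mathbf q}\cdot\left(\frac{\partial\mathbf P}{\partial p_k}\right)_{\mathbf q} - \left(\frac{\partial Q_i}{\partial P_j}\right)_{\mathbf q}. \end{aligned}$$
But $\left(\dfrac{\partial p_k}{\partial\mathbf P}\right)_{\mathbf q} = \dfrac{\partial^2S}{\partial q_k\partial\mathbf P}$, so
$$\begin{aligned} [Q_i, Q_j] &= \sum_{kl}\left(\frac{\partial Q_j}{\partial P_l}\right)_{\mathbf q}\left(\frac{\partial P_l}{\partial p_k}\right)_{\mathbf q}\left(\frac{\partial p_k}{\partial P_i}\right)_{\mathbf q} - \left(\frac{\partial Q_i}{\partial P_j}\right)_{\mathbf q} \\ &= \frac{\partial Q_j}{\partial P_i} - \frac{\partial Q_i}{\partial P_j} = \frac{\partial^2S}{\partial P_i\partial P_j} - \frac{\partial^2S}{\partial P_j\partial P_i} = 0. \end{aligned} \tag{II.5}$$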
Appendix III Derivation of (2.22) from the Schrödinger equation
We start by finding the amplitude $A_{12}$ to get from $(t_1, \mathbf q_1)$ to $(t_2, \mathbf q_2)$, where the interval $t_2 - t_1$ is small. In Dirac's notation, this amplitude is
$$A_{12} = \langle\mathbf q_2|\psi, t_2\rangle, \tag{III.1}$$
where $|\psi, t_2\rangle$ is the ket into which $|\mathbf q_1\rangle$ has evolved at $t_2$. In other words, $|\psi, t_2\rangle$ is the solution of the time-dependent Schrödinger equation (tdse) for initial condition $|\psi, t_1\rangle = |\mathbf q_1\rangle$. This is
$$|\psi, t_2\rangle = e^{-i\hat H(t_2 - t_1)/\hbar}|\mathbf q_1\rangle. \tag{III.2}$$
Here the exponential is the operator with the same eigenkets $|E_n\rangle$ as the Hamiltonian $\hat H$, and eigenvalues equal to $e^{-iE_n(t_2 - t_1)/\hbar}$, where the $E_n$ are the eigenvalues of $\hat H$. That is,
$$e^{-i\hat H(t_2 - t_1)/\hbar} \equiv \sum_n|E_n\rangle e^{-iE_n(t_2 - t_1)/\hbar}\langle E_n|. \tag{III.3}$$
(To prove that (III.2) satisfies the tdse, just substitute (III.3) into (III.2) and differentiate w.r.t. $t_2$.) Our amplitude can now be written
$$A_{12} = \langle\mathbf q_2|e^{-i\hat H(t_2 - t_1)/\hbar}|\mathbf q_1\rangle = \int d^3\mathbf p\,\langle\mathbf q_2|\mathbf p\rangle\langle\mathbf p|e^{-i\hat H(t_2 - t_1)/\hbar}|\mathbf q_1\rangle, \tag{III.4}$$
where use has been made of the fact that $\int d^3\mathbf p\,|\mathbf p\rangle\langle\mathbf p|$ is just the identity operator since the states $|\mathbf p\rangle$ of well-defined momentum form a complete set.
$\hat H$ and thus the function of it appearing in (III.4) is a function of the operators $\hat{\mathbf p}$ and $\hat{\mathbf q}$. Let's assume that every $\hat{\mathbf p}$ has been positioned to the left of every $\hat{\mathbf q}$. Then every $\hat{\mathbf p}$ can be considered to act to the left and be replaced by its eigenvalue $\mathbf p$, while every $\hat{\mathbf q}$ acts similarly to the right. So the complex number $\langle\mathbf p|e^{-i\hat H(t_2 - t_1)/\hbar}|\mathbf q_1\rangle$ becomes simply
$$e^{-iH(t_2 - t_1)/\hbar}\langle\mathbf p|\mathbf q_1\rangle = e^{-iH(t_2 - t_1)/\hbar}\frac{e^{-i\mathbf p\cdot\mathbf q_1/\hbar}}{(2\pi\hbar)^{3/2}}, \tag{III.5}$$
where $H$ is the classical Hamiltonian evaluated at the classical phase-space point $(\mathbf p, \mathbf q)$ and we have used the fact that $\langle\mathbf p|\mathbf q_1\rangle$ is just the complex conjugate of the wavefunction of a particle of well-defined momentum $\mathbf p$. When we insert (III.5) into (III.4) and similarly replace $\langle\mathbf q_2|\mathbf p\rangle$ by a plane wave, we find
$$A_{12} = \frac{1}{h^3}\int d^3\mathbf p\,\exp\left\{\frac{i}{\hbar}\big[\mathbf p\cdot(\mathbf q_2 - \mathbf q_1) - H(t_2 - t_1)\big]\right\}. \tag{III.6}$$
Equation (III.6) for the amplitude to get from one event to another is only valid for infinitesimal $t_2 - t_1$. There are two issues: (i) $\hat H$ may be time-dependent; (ii) for finite $\tau \equiv t_2 - t_1$ the operator $e^{-i\hat H\tau/\hbar} = 1 - i\hat H\tau/\hbar + \frac{1}{2!}(-i\hat H\tau/\hbar)^2 + \cdots$ involves high powers of $\hat H$ and so many reversals of the order of the operators $\hat{\mathbf p}$ and $\hat{\mathbf q}$ will be required to ensure that the $\hat{\mathbf p}$s are to the left of all $\hat{\mathbf q}$s. In view of these objections we use (III.6) only for small $t_2 - t_1$. Given two widely separated events $(t_i, \mathbf q_i)$ and $(t_f, \mathbf q_f)$, we express the amplitude to pass between them by a particular path $\mathbf q_i \to \mathbf q_1 \to \ldots \to \mathbf q_f$ as the product
$$A_{i1}A_{12}\times\cdots\times A_{m,f} \tag{III.7}$$
of $m$ amplitudes of the form (III.6) over small intervals $(t_{j-1}, t_j)$. We then obtain the amplitude to pass between $(t_i, \mathbf q_i)$ and $(t_f, \mathbf q_f)$ by any path by summing (III.7) over all values of the intermediate positions $\mathbf q_j$. The final amplitude is
$$\begin{aligned} A_{if} &= \lim_{m\to\infty}\frac{1}{h^{3m}}\int\prod_j^m(d^3\mathbf p_j\,d^3\mathbf q_j)\exp\left\{\frac{i}{\hbar}\sum_k^m\big[\mathbf p_k\cdot(\mathbf q_{k+1} - \mathbf q_k) - H(t_{k+1} - t_k)\big]\right\} \\ &= \text{constant}\times\int\mathcal D\mathbf p\,\mathcal D\mathbf q\,\exp\left\{\frac{i}{\hbar}\int(\mathbf p\cdot d\mathbf q - H\,dt)\right\}. \end{aligned} \tag{III.8}$$
Here the symbol $\mathcal D\mathbf p\,\mathcal D\mathbf q$ means one is to sum the integrand over all paths $\big(\mathbf p(t), \mathbf q(t)\big)$ which pass through $(t_i, \mathbf q_i)$ and $(t_f, \mathbf q_f)$.
Thus, as claimed in §2.2, the amplitude to get from $(t_i, \mathbf q_i)$ to $(t_f, \mathbf q_f)$ is a sum over all paths of $e^{iS/\hbar}$, where $S$ is the classical action for that path. When $|S| \gg \hbar$ the contributions from paths that do not extremize $S$ will cancel each other out to high precision, and the amplitude for the transition is dominated by the extremizing classical path.
Exercise (3):
In (III.8) replace $H$ with $\frac12p^2/m + V(\mathbf q)$ and $d\mathbf q$ by $\dot{\mathbf q}\,dt$. Then do the integration over every $\mathbf p_j$ by completing the square and using $\int_{-\infty}^{\infty}e^{-x^2}dx = \sqrt\pi$. Explain the relation of the resulting expression for $A_{if}$ to the Lagrangian principle of least action.