CLASSICAL MECHANICS
Institute of Mathematical Sciences
August-December 2014
M.V.N. Murthy
Institute of Mathematical Sciences, Chennai 600 113:
Contents
1 Introduction
5.3 Poisson Brackets
5.4 Symplectic Structure
5.5 Canonical Transformations
5.6 Liouville Volume Theorem
5.7 Action-Angle Variables
5.7.1 Angle variable
5.7.2 Harmonic oscillator in N dimensions
5.8 Integrable systems
5.9 Generating function of canonical transformations
5.9.1 Time dependent transformation
5.9.2 Group of Canonical Transformations
5.10 Hamilton-Jacobi Theory
5.10.1 The Hamilton-Jacobi equation
Chapter 1
Introduction
The history of Classical Mechanics is indeed the development of the laws of mechanics.
The fundamental laws of mechanics were drawn up in the 17th century, beginning with
Galileo. In the Principia, published in 1687, Newton wrote down the three laws of motion
and the law of gravitation. This more or less completed the framework of classical
mechanics. Underlying the development of mechanics is the central tenet that Nature
obeys unchanging laws that can be described by Mathematics.
The basic formulation of Newton underwent many reformulations, involving the
same basic parameters, from the 17th to the late 19th century with contributions
from Euler, Lagrange, Hamilton, Jacobi and many others. The laws of classical
mechanics were formulated in more general terms, like the action principle, which was
extended beyond mechanics to all of classical physics including electrodynamics. The
mathematical formulation enabled one to know everything about the system, past, present and future, given the state of the system completely at some instant of time. The
classical laws of mechanics are deterministic and also reversible.
The successes of classical mechanics during this period were very impressive. So
much so that Laplace wrote in his Philosophical Essay on Probabilities (1814):
We ought then to consider the present state of the universe as the effect
of its previous state and the cause of that which is to follow.... nothing
would be uncertain, and the future like the past, would be open to its
eyes.
This is indeed a forceful depiction of determinism, which is a very basic
property of dynamical systems. We are, it appears, left with the problem of computing
the past, present and future already contained in the system of equations. The
state of the system given at any time in its history determines its past, present
and future. Though Newton's laws as stated are simple, matters can get
complicated if we try to change frames arbitrarily, when we have many particles to
contend with, or when the system is subjected to unusual constraints. Fortunately, in the 19th
century powerful techniques were developed based on the ideas of Euler, Lagrange and
Hamilton. Apart from providing an elegant reformulation of Newtonian mechanics,
these techniques offer many practical advantages and lead to
generalisations to other areas of physics.
Before we jump further with this air of certainty, a word of caution is in order: the Laplacian world view came in for serious criticism nearly a century after his
enunciation. A serious problem with this interpretation of classical mechanics was
pointed out by Poincaré. The following quote from Poincaré, about a hundred years
later, makes it amply clear:
If we could know exactly the laws of nature and the situation of the
universe at the initial instant, we should be able to predict exactly the
situation of this same universe at a subsequent instant. But even when the
natural laws should have no further secret for us, we could know the initial
situation only approximately. If that permits us to foresee the subsequent
situation with the same degree of approximation, this is all we require,
we say the phenomenon has been predicted, that it is ruled by laws. But
this is not always the case; it may happen that slight differences in the
initial conditions produce very great differences in the final phenomena;
a slight error in the former would make an enormous error in the latter.
Prediction becomes impossible and we have the fortuitous phenomenon.
This is the insight that underlies the study of modern dynamics. The study of
the long-term behaviour of systems with sensitivity to initial conditions cannot always
be done with prior knowledge of exact solutions, because such solutions may not
be available or may not even exist. Modern developments following Poincaré in the
20th century provide a basis for studying such systems.
While we keep this view in mind, this course is mainly about classical mechanics as it was established by the end of the 19th century. We begin by providing a more
general setting in which classical mechanics is embedded. We follow this with the
standard development of classical mechanics. For those students who want to learn
the modern approach, some introductory material is given in the last chapter.
Part of these lecture notes is based on classical mechanics as taught by Professor
K.N. Srinivasa Rao and Professor A.V. Gopala Rao when I was a student. Though
I did not keep my notes, Prof. A.R. Usha Devi provided these notes and I have
used them extensively here. In addition I have also relied on the lecture notes by my
collaborator Prof. Matthias Brack. These lectures were delivered at the University
of Regensburg and the original notes were in German. I have loosely translated
and used them here.
In addition, the following books are generally recommended for additional reading:
L. D. Landau and E. Lifshitz, Mechanics, Pergamon Press (now available in an
Indian edition at a modest price). A classic reference book with everything that
this course needs, in rather terse language.
K.N. Srinivasa Rao, Classical Mechanics: an excellent reference book for this
course in general and in particular for discussions on frames of reference and
rigid body dynamics.
H. Goldstein, C.P. Poole and J.L. Safko, Classical Mechanics: standard reference
for classical mechanics.
David Tong, Classical Dynamics (University of Cambridge Part II of Mathematical Tripos): easy to read and understand. Has nice examples and applications.
A much downloaded set of notes.
The above references cover the traditional material required for any course in
classical mechanics.
For a modern approach to classical dynamics, more relevant to those who wish to pursue
it as an active area of research, with an introduction to nonlinear dynamics and chaos, see the following references.
I. Percival and D. Richards, Introduction to Dynamics, Cambridge University
Press, 1991 (for further reading on classical dynamics; a book that redefined
the teaching and learning of classical dynamics).
J.V. Jose and E.J. Saletan, Classical Dynamics: A Contemporary Approach,
Cambridge University Press, 1998 (a modern textbook which contains everything in an old classical mechanics textbook but written with the modern
perspective of classical dynamics in mind).
Chapter 2
Dynamical Systems: Mathematical
Preliminaries
We start with a very broad definition of a dynamical system and introduce the key
theoretical concepts of phase space and fixed points (or limit cycles). While the
definition may not cover all dynamical systems, it does cover a whole variety of systems
of interest to us. We give examples of systems with very simple dynamics to begin
with and prepare the ground for the treatment of more complicated systems which may
not be analytically solvable.
We note that the systems governed by Newton's laws in physics, the mechanical systems, are only a subset
of the class of dynamical systems. This introduction is meant
to provide a geometrical point of view for analysing dynamical systems in general
and mechanical systems in particular. The emphasis is not always on solving the
equations of motion (good if we can) but on gleaning as much information as possible about the
system even if an exact solution is not available.
2.1
Phase space description was first introduced by Henri Poincare and is widely used in statistical
mechanics after it was adopted by J.Willard Gibbs.
(2.1)
or simply
$$\frac{d\vec r}{dt} = \vec v(\vec r, t),$$
where
$$\vec v \equiv \{v_1, v_2, \ldots, v_n\}$$
is called the velocity function. While we use the notation x and v in analogy
with mechanics, they do not necessarily have the usual meaning of position and
velocity.
See for example Lin and Segel (1988) Mathematics applied to deterministic problems...(SIAM)
2.2
This is the simplest case of a dynamical system. The equation of motion³ is given by
$$\frac{dx}{dt} = v(x, t),$$
where v is the velocity function. For any given v(x, t), x(t) is completely determined
given x(t) at some $t = t_0$. If, in particular, the system is autonomous, that is, v is not
explicitly dependent on time, then the solution can be written as
$$t - t_0 = \int_{x(t_0)}^{x(t)} \frac{dx'}{v(x')}.$$
The solution x(t) depends only on the difference $(t - t_0)$ and therefore the time
evolution of the system depends entirely on the time elapsed no matter where the
origin of time is fixed.
Example 1 - Radioactivity: A classic example of a dynamical system of first
order is the radioactive decay of a nucleus, modelled by the equation
$$\frac{dN}{dt} = -\lambda N,$$
where N is the number of nuclei present at some time t and $\lambda$ is the decay constant. The solution of course is well
known,
$$N(t) = N_0 \exp\{-\lambda (t - t_0)\},$$
where $N_0$ is the number of unstable nuclei present at $t_0$.
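As a quick numerical check of the decay law, the sketch below integrates the rate equation with a classical fourth-order Runge-Kutta scheme and compares the result with the exponential solution. The function names, the symbol `lam` for the decay constant and the sample numbers are illustrative assumptions, not part of the notes.

```python
import math

def decay_exact(n0, lam, t, t0=0.0):
    """Closed-form solution N(t) = N0 * exp(-lam * (t - t0))."""
    return n0 * math.exp(-lam * (t - t0))

def decay_rk4(n0, lam, t, steps=1000):
    """Integrate dN/dt = -lam * N from time 0 to t with classical RK4."""
    h = t / steps
    n = n0
    for _ in range(steps):
        k1 = -lam * n
        k2 = -lam * (n + 0.5 * h * k1)
        k3 = -lam * (n + 0.5 * h * k2)
        k4 = -lam * (n + h * k3)
        n += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return n

# Illustrative values: 1000 nuclei, decay constant 0.3 per unit time.
numeric = decay_rk4(1000.0, 0.3, 5.0)
exact = decay_exact(1000.0, 0.3, 5.0)
```

The two values should agree closely, and after a time ln 2 / lam the exact solution has dropped to half its initial value.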
Example 2 - Spread of epidemics:
Unlike radioactive decay, here the growth is usually exponential, at least in the initial
period. If $\lambda$ is an explicit function of time, as in the case of some diseases, one may obtain
a power-law growth instead of exponential growth. A case which has been studied
in detail is the threat of AIDS, which has devastated many parts of Africa and is
threatening many other countries like India.
In an effort to make a quantitative assessment of the threat, efforts have been made
to look at the reliable data compiled by the Centres for Disease Control in the USA as
a function of time. If I is the number of infected persons in a population of size N,
then the rate of change of I may be given by
$$\frac{dI}{dt} = \lambda I,$$
which gives rise to exponential growth in the initial phases which is usually true in an
epidemic. However, it has been observed that in the case of AIDS the growth shows
³ We call this the equation of motion by habit. A more appropriate name would be the time
evolution equation of the system.
The phase space is one dimensional. The phase flows may be indicated by
a set of arrows, for example, pointing right (left) if the sign of v(x) is positive (negative) and whose length is proportional to the magnitude of v(x), as
shown in the figure below. For reference we have also shown v(x) as a function of
x. The x-axis is the one dimensional phase space of the system.
[Figure: v(x) as a function of x, with arrows along the x-axis indicating the phase flow.]
x2 = 1.
Again one can go through the stability analysis and phase portrait to obtain qualitatively the properties of the system. However, in this case we may actually solve the
system to check if our qualitative analysis is reasonable.
The solution is easily obtained by separating the variables:
$$a = \frac{dy/dt}{y(1-y)} = \frac{dy/dt}{y} + \frac{dy/dt}{1-y}.$$
Integrating over time, we have
$$\int a\,dt = \int \left[\frac{dy/dt}{y} + \frac{dy/dt}{1-y}\right] dt, \qquad\text{that is,}\qquad at + C = \ln|y| - \ln|1-y|.$$
At time t = 0, we have
$$C = \ln\left|\frac{y_0}{1-y_0}\right|.$$
Exponentiating,
$$\left|\frac{y-1}{y}\right| = \exp(-at - C) = \exp(-at)\left|\frac{y_0-1}{y_0}\right|.$$
Since the quantities in absolute signs always agree in sign, we can write the solution
as
$$y(t) = \frac{y_0}{y_0 + (1-y_0)\exp(-at)}.$$
We may now look at the asymptotic behaviour of the system:
The two fixed points at y = 0 and y = 1 obviously yield constant solutions
y(t) = 0, 1, that is if y0 is equal to any of these values then the dynamical
evolution is trivial.
For any other value of $y_0$, the system goes to y = 1 as $t \to \infty$. We have
considered here only positive times and positive values of y since that is the
physical situation in population growth problem. However the solution can be
analysed in more general terms without these restrictions.
Putting back x = y/, where x denotes the population, it is clear that the fixed
point at y = 1 actually denotes the saturation in the population with
1/ as the limiting population.
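The asymptotic statements above can be checked numerically. The following sketch assumes the rescaled growth equation dy/dt = a y (1 - y), which is the form implied by the separation of variables, and evaluates the closed-form solution; the function names and sample values are illustrative assumptions.

```python
import math

def logistic(y0, a, t):
    """Closed-form solution y(t) = y0 / (y0 + (1 - y0) * exp(-a t))."""
    return y0 / (y0 + (1.0 - y0) * math.exp(-a * t))

def residual(y0, a, t, h=1e-6):
    """|dy/dt - a y (1 - y)|, with dy/dt from a central difference."""
    dydt = (logistic(y0, a, t + h) - logistic(y0, a, t - h)) / (2 * h)
    y = logistic(y0, a, t)
    return abs(dydt - a * y * (1.0 - y))

# Illustrative values: growth rate a = 2, initial fraction y0 = 0.1.
late_value = logistic(0.1, 2.0, 40.0)   # approaches the fixed point y = 1
check = residual(0.1, 2.0, 0.7)         # should be close to zero
```

Starting exactly at y0 = 0 or y0 = 1 returns the constant solutions, while any other positive y0 tends to 1 for a > 0.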
Uniqueness of the solution :
In all the examples above, we have assumed that the solution exists and is
unique for the system $\dot x = v(x)$. However, we should be aware of some pathological
cases. For example consider the equation
$$\dot x = x^{1/2}.$$
The obvious solution is x(t) = 0 since x = 0 is a fixed point. One may also obtain a
solution by integrating, which gives
$$2x^{1/2} = t + C.$$
Imposing the initial condition x(0) = 0 yields C = 0 so that we have
$$x(t) = (t/2)^2.$$
Two distinct solutions thus satisfy the same initial condition. The source of the problem is that the derivative $v'(x)$ is not
finite at x = 0. Therefore, to ensure uniqueness of the solution, we demand that both
v(x) and $v'(x)$ are continuous on an open interval along the x-axis and the fixed point
is a point in this interval (as stated in the beginning).
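The non-uniqueness can be made concrete with a small check. The sketch below evaluates, for t > 0, how well each of the two candidate solutions satisfies the equation dx/dt = sqrt(x); the helper names are illustrative assumptions and the derivative is estimated with a central difference.

```python
import math

def trivial(t):
    """The constant solution sitting at the fixed point x = 0."""
    return 0.0

def parabola(t):
    """The second solution x(t) = (t/2)^2, valid for t >= 0."""
    return (t / 2.0) ** 2

def residual(x, t, h=1e-6):
    """|dx/dt - sqrt(x)|, with dx/dt from a central difference."""
    dxdt = (x(t + h) - x(t - h)) / (2 * h)
    return abs(dxdt - math.sqrt(x(t)))

# Both functions start from x(0) = 0 and both satisfy the equation.
residuals = [(residual(trivial, t), residual(parabola, t)) for t in (0.5, 1.0, 3.0)]
```

Both residuals are zero up to rounding, although the two functions differ for every t > 0.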
The set of zeros of the function v(x) are called the fixed points of the system.
The fixed points divide the phase space into several regions.
$x_k$ is a stable fixed point or sink if the flow is directed towards the fixed
point, otherwise $x_k$ is an unstable fixed point or repellor. In the above
example obviously $x_k = 0$ is a stable fixed point whereas $x_k = \pm 1$ are unstable.
It is easy to see that the system evolves towards a stable fixed point and away
from an unstable fixed point. That is, if $x_0$ is a fixed point then define
$$\lambda = \left.\frac{dv}{dx}\right|_{x_0}.$$
2.3
The phase space is two dimensional and each point in the phase space is characterised
by two real numbers (x, y).
$$\vec r(t) \equiv (x(t), y(t)),$$
$$\frac{d\vec r}{dt} = \vec v(x, y, t).$$
The system of equations defined by the velocity vector $\vec v$ has a unique solution
for all time with the initial condition $\vec r(t_0) = (x_0, y_0)$. If the system is autonomous
then of course there is no explicit dependence on time in the velocity function.
The solution $\vec r(t)$ obtained with a particular initial condition defines a continuous
curve called the phase curve. The set of all phase curves tracing the actual motion
is called the phase flow. Note that phase curves exist only for $d \ge 2$. In d = 1 there are only
phase flows. The equation of the phase curve for an autonomous system of order 2 is
given by
$$\frac{dy}{dx} = \frac{v_y(x, y)}{v_x(x, y)}.$$
Example: Falling body in a gravitational field:
Let x denote the height at some time t. The force equation may be written as
two first order equations:
$$\frac{dx}{dt} = v_x(x, y) = y,$$
$$\frac{dy}{dt} = v_y(x, y) = -g,$$
where g is the acceleration due to gravity. Thus the velocity field is given by
$$\vec v = (y, -g).$$
Since g is never zero there are no fixed points in this system. The equation of the
phase curve and its solution is
$$\frac{dy}{dx} = -\frac{g}{y} \;\Rightarrow\; x = x_0 - \frac{y^2}{2g},$$
where $x_0$ is the height at t = 0. The phase curves are therefore parabolas as shown
in the figure below.
[Figure: parabolic phase curves in the (x, y) plane.]
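The parabolic phase curves can be verified directly from the motion. The sketch below assumes a body released from rest at height x0 at t = 0, so that x(t) = x0 - g t^2 / 2 and y(t) = -g t, and checks that the combination x + y^2 / (2g) stays equal to x0 along the trajectory. The names and the value of g are illustrative assumptions.

```python
G = 9.8  # illustrative value of the acceleration due to gravity

def state(t, x0, g=G):
    """Height x and velocity y at time t for release from rest at height x0."""
    return x0 - 0.5 * g * t * t, -g * t

def phase_invariant(x, y, g=G):
    """The quantity x + y^2 / (2 g), constant along each phase curve."""
    return x + y * y / (2.0 * g)

# Every point of the trajectory lies on the parabola x = x0 - y^2 / (2 g).
values = [phase_invariant(*state(t, 100.0)) for t in (0.0, 1.0, 2.5, 4.0)]
```

Each entry of `values` should equal the initial height, which labels the phase curve.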
2.3.1
We shall confine our analysis to 2nd order autonomous systems, for example, motion
of a particle in one dimension or systems with one degree of freedom (dof). The
motivation, as we shall see below, for a general stability analysis is to derive the
local and where possible global properties of a dynamical system qualitatively, that
is, without actually solving the evolution equations.
For an autonomous system of order 2 we have
$$\frac{dx}{dt} = v_x(x, y),$$
$$\frac{dy}{dt} = v_y(x, y),$$
where (x, y) define the state of the system in the phase space. We shall assume that
$\vec v$ is some function of x, y which may be non-linear and therefore have many roots.
The fixed point is defined through
$$v_x(x_k, y_k) = 0 = v_y(x_k, y_k)$$
and there may be many solutions. The nature of fixed points and the phase flows in
the neighbourhood will be determined by the derivatives evaluated at the fixed point.
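Consider one such fixed point $(x_0, y_0)$. The stability around this fixed point may
be obtained by giving a small displacement around the fixed point:
$$x(t) = x_0 + \delta x(t),$$
$$y(t) = y_0 + \delta y(t).$$
Now Taylor expand the velocity function $\vec v$ around the fixed point:
$$v_x(x, y) = v_x(x_0, y_0) + \left.\frac{\partial v_x}{\partial x}\right|_{x_0,y_0}\delta x + \left.\frac{\partial v_x}{\partial y}\right|_{x_0,y_0}\delta y + \ldots$$
$$v_y(x, y) = v_y(x_0, y_0) + \left.\frac{\partial v_y}{\partial x}\right|_{x_0,y_0}\delta x + \left.\frac{\partial v_y}{\partial y}\right|_{x_0,y_0}\delta y + \ldots$$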
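By definition the first term is zero since $(x_0, y_0)$ is a fixed point. For an infinitesimal
variation in (x, y) we may linearise the equations of motion:
$$\begin{pmatrix} dx/dt \\ dy/dt \end{pmatrix} = \begin{pmatrix} d\,\delta x/dt \\ d\,\delta y/dt \end{pmatrix} = \begin{pmatrix} V_{xx} & V_{xy} \\ V_{yx} & V_{yy} \end{pmatrix} \begin{pmatrix} \delta x \\ \delta y \end{pmatrix},$$
where the stability matrix is
$$A = \begin{pmatrix} V_{xx} & V_{xy} \\ V_{yx} & V_{yy} \end{pmatrix} = \begin{pmatrix} \partial v_x/\partial x & \partial v_x/\partial y \\ \partial v_y/\partial x & \partial v_y/\partial y \end{pmatrix}_{x_0, y_0}.$$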
The local stability analysis is best done in the eigenbasis or in some other convenient
basis which we shall call the Standard Basis.
If the system is already linear the above analysis is globally, not just locally, valid
since
$$\frac{dx}{dt} = v_x(x, y) = ax + by,$$
$$\frac{dy}{dt} = v_y(x, y) = cx + dy,$$
and the matrix A is given by
$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix},$$
where a, b, c, d are constants. However, we need not restrict the analysis only to linear
systems.
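Consider the change of basis
$$\begin{pmatrix} X \\ Y \end{pmatrix} = M \begin{pmatrix} x \\ y \end{pmatrix}$$
such that
$$B = M A M^{-1} = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix},$$
$$\lambda_1 = \frac{1}{2}\left(\tau - \sqrt{\tau^2 - 4\Delta}\right),$$
$$\lambda_2 = \frac{1}{2}\left(\tau + \sqrt{\tau^2 - 4\Delta}\right),$$
where
$$\tau = V_{xx} + V_{yy},$$
$$\Delta = V_{xx} V_{yy} - V_{xy} V_{yx}$$
are respectively the trace and the determinant of the stability matrix. The eigenvalues
are real or complex depending on whether $\tau^2 \ge 4\Delta$ or $\tau^2 < 4\Delta$. We shall consider
these cases separately.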
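The eigenvalue formulae translate directly into a few lines of code. The sketch below computes the two eigenvalues from the trace and determinant of a 2x2 stability matrix and attaches a label based on their signs. The function names, the tolerance and the label strings are illustrative assumptions; as the notes remark, naming conventions for fixed points differ between books.

```python
import cmath

def eigenvalues(vxx, vxy, vyx, vyy):
    """Eigenvalues of [[vxx, vxy], [vyx, vyy]] from its trace and determinant."""
    trace = vxx + vyy
    det = vxx * vyy - vxy * vyx
    root = cmath.sqrt(trace * trace - 4 * det)  # complex if the discriminant is negative
    return (trace - root) / 2, (trace + root) / 2

def classify(vxx, vxy, vyx, vyy, tol=1e-12):
    """Label the fixed point of the linearised flow (illustrative names)."""
    l1, l2 = eigenvalues(vxx, vxy, vyx, vyy)
    if abs(l1.imag) > tol:                      # complex-conjugate pair
        if abs(l1.real) <= tol:
            return "elliptic"
        return "stable spiral" if l1.real < 0 else "unstable spiral"
    if l1.real * l2.real < 0:
        return "saddle"
    if l1.real < 0 and l2.real < 0:
        return "stable node"
    if l1.real > 0 and l2.real > 0:
        return "unstable node"
    return "degenerate"

label = classify(-1.0, 1.0, -1.0, -1.0)  # trace -2, determinant 2
```

Working from the trace and determinant avoids a general eigen-solver, which is sufficient for second order systems.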
2.3.2
We use the properties of the eigenvalues to classify the fixed points. While in
first order systems the motion is either towards or away from the fixed point, second
order systems are richer in the sense that there is much more variety in the nature of fixed
points.⁴
1. Stable Node, $\lambda_1, \lambda_2 < 0$.
$$X, Y \to 0 \quad\text{as}\quad t \to \infty.$$
If in particular $\lambda_1 = \lambda_2 = \lambda < 0$ and the equation is separable, $A = \lambda I$, it is called
a stable star (see figure). If not, a change of basis may be induced such that
the matrix
$$B = \begin{pmatrix} \lambda & 0 \\ c & \lambda \end{pmatrix}$$
(or equivalently c = 0 and $b \neq 0$). In this case
$$\dot X = \lambda X \;\Rightarrow\; X(t) = C_1 \exp(\lambda t),$$
$$\dot Y = cX + \lambda Y \;\Rightarrow\; Y(t) = (C_2 + C_1 c t) \exp(\lambda t).$$
For $\lambda < 0$ this is called an improper node.
2. Unstable Node, $\lambda_1, \lambda_2 > 0$.
$$X, Y \to \infty \quad\text{as}\quad t \to \infty.$$
If in particular $\lambda_1 = \lambda_2 = \lambda > 0$ and $A = \lambda I$ it is called an unstable star
(see figure).
⁴ The names used for the fixed points are non-standard. Different books use different names.
[Figure: phase portraits of the stable node, unstable node, unstable star and stable star.]
Y = C2 etit .
Y = C2 etit .
The fixed point is unstable since for large times the system moves away from
the fixed point.
6. Elliptic fixed point, $\lambda_1 = i\omega = -\lambda_2$. Correspondingly we have,
$$X = C_1 e^{i\omega t},$$
$$Y = C_2 e^{-i\omega t}.$$
The system is confined to ellipses around the fixed point; each ellipse corresponds to a given initial condition.
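[Figure: classification of fixed points in the plane of the trace and determinant of the stability matrix, showing the regions of saddle (hyperbolic) points, unstable nodes, unstable spirals, stable spirals and stable nodes; nodes and spirals are separated by the curve on which the discriminant vanishes.]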
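$$\dot\theta = \frac{x\dot y - y\dot x}{R} = 1,$$
$$R(t) = \frac{C_1 e^{2t}}{C_1 e^{2t} + C_2},$$
$$\theta(t) = t + C_3.$$
[Figure: phase curves approaching the limit cycle.]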
The result may be seen even more easily: think of the equation for R(t) as a first
order system with fixed points at R = 0, 1. It is easy to see that the fixed point at
R = 0 is unstable and R = 1 is stable; in the two dimensional phase space the latter is the
limit cycle.
The result may be generalised by considering systems which are separable in
polar coordinates. For example
$$\dot x = v_x = -y + x f(r),$$
$$\dot y = v_y = x + y f(r),$$
where $r^2 = x^2 + y^2$ and f(r) is any function of r. The existence and stability of limit
cycles then depends on the zeros of the function f(r).
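A short simulation illustrates how a zero of f(r) produces a limit cycle. The sketch below takes the planar system with velocity (-y + x f(r), x + y f(r)) and, as an assumption made only for this example, the choice f(r) = 1 - r^2, which vanishes at r = 1. It integrates with classical RK4 from inside and outside the unit circle; all names are illustrative.

```python
import math

def velocity(x, y, f):
    """Velocity field (-y + x f(r), x + y f(r)) with r = sqrt(x^2 + y^2)."""
    r = math.hypot(x, y)
    return -y + x * f(r), x + y * f(r)

def rk4_step(x, y, h, f):
    """One classical Runge-Kutta step of size h."""
    k1x, k1y = velocity(x, y, f)
    k2x, k2y = velocity(x + 0.5 * h * k1x, y + 0.5 * h * k1y, f)
    k3x, k3y = velocity(x + 0.5 * h * k2x, y + 0.5 * h * k2y, f)
    k4x, k4y = velocity(x + h * k3x, y + h * k3y, f)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6,
            y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6)

def final_radius(x, y, f, t_end=20.0, h=0.01):
    """Radius reached after integrating from (x, y) up to time t_end."""
    for _ in range(int(round(t_end / h))):
        x, y = rk4_step(x, y, h, f)
    return math.hypot(x, y)

f = lambda r: 1.0 - r * r            # assumed example: stable limit cycle at r = 1
inner = final_radius(0.1, 0.0, f)    # starts inside the unit circle
outer = final_radius(2.0, 0.0, f)    # starts outside the unit circle
```

Both trajectories end on the circle r = 1, while a start exactly at the origin stays there, consistent with an unstable fixed point surrounded by a stable limit cycle.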
Example 2:
The Lotka-Volterra equations, also known as the predator-prey equations, are
a pair of first order, non-linear, differential equations. They are frequently used to
describe the dynamics of biological systems (proposed independently by Alfred J.
Lotka in 1925 and Vito Volterra in 1926). The Lotka-Volterra model is an example of
a second order autonomous system from population dynamics in biological systems.
The model involves two populations, the prey denoted by x and the predator denoted by
y. The equations governing their populations are coupled:
$$\frac{dx}{dt} = v_x(x, y) = Ax - Bxy,$$
$$\frac{dy}{dt} = v_y(x, y) = Dxy - Cy,$$
where A, B, C, D > 0. It is easy to see how the equations come about: the prey
population x will grow on its own but is diminished by the predator population y.
On the other hand, the predator would starve if left alone and grows by feeding on
its prey. We take x, y to be positive always. We also assume that there is no time
gap between the causes and effects.
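$$y = y_0 e^{-Ct}$$
[Figure: phase portrait of the Lotka-Volterra system in the (x, y) plane, with an elliptic fixed point at (C/D, A/B) and a hyperbolic fixed point at the origin.]
It is also easy to verify that there is a constant of motion in the system, namely
$$E = x^C e^{-Dx}\, y^A e^{-By}.$$
Obviously the phase curves are given by constant E curves.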
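The constant of motion can be checked numerically. The sketch below integrates the predator-prey equations dx/dt = Ax - Bxy, dy/dt = Dxy - Cy with classical RK4 and monitors E = x^C e^(-Dx) y^A e^(-By), the reading of the expression above that is consistent with these equations. The parameter values, the starting point and the function names are illustrative assumptions.

```python
import math

A, B, C, D = 1.0, 0.5, 0.75, 0.25  # illustrative positive constants

def velocity(x, y):
    """Right-hand side of the predator-prey equations."""
    return A * x - B * x * y, D * x * y - C * y

def rk4_step(x, y, h):
    """One classical Runge-Kutta step of size h."""
    k1x, k1y = velocity(x, y)
    k2x, k2y = velocity(x + 0.5 * h * k1x, y + 0.5 * h * k1y)
    k3x, k3y = velocity(x + 0.5 * h * k2x, y + 0.5 * h * k2y)
    k4x, k4y = velocity(x + h * k3x, y + h * k3y)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6,
            y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6)

def invariant(x, y):
    """E = x^C e^(-D x) y^A e^(-B y)."""
    return x ** C * math.exp(-D * x) * y ** A * math.exp(-B * y)

def relative_drift(x0=2.0, y0=1.0, t_end=20.0, h=1e-3):
    """Relative change of E between the start and end of the integration."""
    x, y = x0, y0
    e0 = invariant(x, y)
    for _ in range(int(round(t_end / h))):
        x, y = rk4_step(x, y, h)
    return abs(invariant(x, y) - e0) / e0

drift = relative_drift()
```

The drift stays at the level of the integration error, and the velocity vanishes at (C/D, A/B), the elliptic fixed point.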
Chapter 3
Review of Newtonian Mechanics
This is a quick review of the mechanics of single and many particle systems, inertial and
non-inertial frames etc. Newton's profound discovery that mechanical systems
are described by laws which may be written in the form of differential equations laid
the foundation of modern theoretical physics: a method of describing and predicting
the deterministic evolution of any physical system.
3.1
A particle by definition is one whose size is insignificant to the dynamics. For example,
the size of a planet is irrelevant when we are talking about its motion around the sun,
or it could be a cricket ball (ignoring spin) whose size may not be relevant to compute
its trajectory when hit by a cricket bat. The central law of mechanics is Newton's second
law: the rate of change of momentum of a mechanical system is equal to the
applied force acting on it. That is,
$$\frac{d\vec p}{dt} = \vec F. \tag{3.1}$$
For a particle of constant mass, $\vec p = m\vec v$ and this becomes
$$m\vec a = \vec F, \tag{3.2}$$
where $\vec a$ is the acceleration and m denotes the mass of the particle. The second form
is not always valid: for a rocket burning fuel, for example, the
mass is also a variable. Because this is a second order differential equation for $\vec r(t)$,
the solution can be computed for all times t by specifying $\vec r(t)$ and the velocity $\vec v(t)$ at some time
$t = t_0$, provided the force $\vec F$ remains finite. The main goal of classical
mechanics is to determine these solutions (analytically if possible, or using computers
otherwise). We shall first define various dynamical quantities for a single particle and
extend the definitions to the many particle systems.
Angular Momentum: The angular momentum $\vec L$ of a particle is defined as
$$\vec L = \vec r \times \vec p \;\Rightarrow\; L_i = \epsilon_{ijk} x_j p_k, \tag{3.3}$$
where the component form is written for vectors in three dimensions. In particular
the angular momentum depends on where we choose the origin. Taking the derivative
on both sides and noting that $\dot{\vec r}$ is parallel to $\vec p$, we have
$$\vec\tau = \frac{d\vec L}{dt} = \vec r \times \vec F, \tag{3.4}$$
where $\vec\tau$ is the torque. Angular momentum is conserved when the torque is zero. This
requires that the force $\vec F$ must be parallel to the position vector $\vec r$, that is
$$\vec F = |F|\,\hat r.$$
This is indeed the definition of a central force; gravitational attraction and the force of
attraction or repulsion between two charges are examples of central forces.
Kinetic and potential energy:
The work done when a particle moves from a point A to B in the presence of an
applied force $\vec F$ is given by
$$W_{AB} = \int_A^B \vec F\cdot d\vec r = \int_A^B \vec F\cdot\dot{\vec r}\,dt. \tag{3.5}$$
If the equation of motion of the particle is given by Newton's second law, then $\vec F = m\dot{\vec v}$. Substituting in the above equation we have
$$W_{AB} = m\int_A^B \dot{\vec v}\cdot\vec v\,dt = m\int_A^B \frac{1}{2}\frac{d(\vec v\cdot\vec v)}{dt}\,dt = \frac{1}{2} m v^2\Big|_A^B = T_B - T_A, \tag{3.6}$$
where T denotes the kinetic energy of the particle with a velocity $\vec v$. Thus the work
done in moving a particle from point A to point B is just the difference in kinetic
energies at these two points.
Consider now a force $\vec F(\vec r)$ which depends only on the position of the particle
but not on its velocity at each point. This is a very special type of force known as a
conservative force. The meaning of the name will become clear soon. For such a force
the work done is independent of the path taken. The following theorem states an
important property of such a force¹:
Theorem: If a vector point function $\vec F(\vec r)$ is the gradient of a scalar function $V(\vec r)$,
then the line integral $\int \vec F\cdot d\vec r$ between any two points in the region is independent
of the path. Conversely, if the line integral between any two points in the region is
independent of the path, then the force must be the gradient of some scalar function.
¹ For more details and the proof of the theorem, see K.N. Srinivasa Rao, Classical Mechanics,
(Universities Press) p.149
(3.7)
$$\vec F = -\nabla V(\vec r). \tag{3.8}$$
We refer to the function $V(\vec r)$ as the potential. Note that the potential V is determined only up to an additive constant. Many systems that we study in
physics admit such a potential, for example gravitational, Coulomb and inter-atomic forces.
Substituting Eq.(3.8) in Eq.(3.5), we obtain
$$W_{AB} = \int_A^B \vec F\cdot d\vec r = -\int_A^B \nabla V(\vec r)\cdot d\vec r = -(V_B - V_A). \tag{3.9}$$
Comparing with Eq.(3.6), we have
$$T_A + V_A = T_B + V_B = E. \tag{3.10}$$
This important result, which states that the sum of potential and kinetic energies
remains a constant during the motion of a material point in the presence of a potential
field, is known as the principle of conservation of energy. We call E the total energy
of the material point and the potential V defined as before is called a conservative
potential field. In general when we consider the energy as a function of the position
$\vec r$ and momentum $\vec p$ (since points A and B are arbitrary), it is referred to as the
Hamiltonian function H.
Though we have stated the principle of conservation of energy in the context
of mechanics, like momentum conservation it is a much more general principle valid
beyond classical mechanics.
3.2
We will briefly discuss many particle systems. The generalisation is straightforward,
but we have to consider inter-particle interactions in addition to the applied
external force. Let $\vec r_i$ (i = 1, …, n) denote the position coordinates of the particles,
whose masses are $m_i$. Let us denote the external force acting on the i-th particle by
$\vec F_i^{ext}$. Furthermore the interaction between the particles is denoted by $\vec F_{ij}$, the force
on the i-th particle due to the j-th particle, such that the internal force on the i-th
particle is
$$\vec F_i^{int} = \sum_{j(\neq i)=1}^{n} \vec F_{ij},$$
where we have assumed that $\vec F_{ii} = 0$, that is, no self interaction. The equation of
motion of the i-th particle is then given by
$$\vec F_i = \dot{\vec p}_i = \vec F_i^{int} + \vec F_i^{ext}. \tag{3.11}$$
$$\sum_{i=1}^{n} \vec F_i = \sum_{i=1}^{n} (\vec F_i^{int} + \vec F_i^{ext}). \tag{3.12}$$
Note that
$$\sum_{i=1}^{n} \vec F_i^{int} = \sum_{j\neq i=1}^{n} \vec F_{ij} = \sum_{i<j} (\vec F_{ij} + \vec F_{ji}) = 0.$$
The last term is zero by virtue of Newton's third law. Therefore we have
$$\dot{\vec P} = \sum_i \vec F_i^{ext} = \vec F^{ext}, \tag{3.13}$$
where $\vec P = \sum_i \vec p_i$ is the total momentum of all the particles. Thus the rate of change
of total linear momentum of the system is given by the total applied external force.
Let us now define the total mass of the system as $M = \sum_i m_i$ and the centre of
mass coordinate $\vec R$ by
$$\vec R = \frac{\sum_i m_i \vec r_i}{M}. \tag{3.14}$$
The equation of motion for total momentum then takes the form
$$\vec F^{ext} = M\ddot{\vec R} = \dot{\vec P}, \tag{3.15}$$
which is like the equation of motion of a single particle: it says that the centre of
mass of a system of particles behaves as though all the mass of the system were
concentrated at one point.
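The cancellation of internal forces is easy to see in a small simulation. The sketch below, a minimal one dimensional example under stated assumptions, follows two particles joined by a spring (the internal force) in a uniform gravitational field (the external force) and compares the total momentum with the impulse of the external force alone. The masses, spring constant, step size and function name are illustrative.

```python
def simulate(t_end=2.0, h=1e-3, g=9.8, k=50.0, rest=1.0):
    """Two particles on a vertical line, coupled by a spring, falling under gravity.

    Returns (total momentum at t_end, impulse of the external force),
    starting from rest.
    """
    m = [1.0, 3.0]
    x = [0.0, 1.5]
    v = [0.0, 0.0]
    steps = int(round(t_end / h))
    for _ in range(steps):
        # Internal force on particle 0 due to particle 1; particle 1 feels the opposite.
        f01 = k * ((x[1] - x[0]) - rest)
        forces = [f01 - m[0] * g, -f01 - m[1] * g]
        for i in range(2):
            v[i] += h * forces[i] / m[i]   # update velocity first,
            x[i] += h * v[i]               # then position (semi-implicit Euler)
    total_momentum = sum(mi * vi for mi, vi in zip(m, v))
    external_impulse = -sum(m) * g * steps * h
    return total_momentum, external_impulse

p_total, impulse = simulate()
```

The spring makes the individual particles oscillate, but the total momentum matches the external impulse, so the centre of mass falls like a single particle of mass M.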
(3.17)
(3.18)
(3.19)
i<j
but the term $\sum_{i<j} (\vec r_i - \vec r_j) \times \vec F_{ij}$ vanishes only if we assume that the force $\vec F_{ij}$ is
parallel to the relative vector $\vec r_i - \vec r_j$. If this is true for some interaction then we have
$$\dot{\vec L} = \vec\tau^{ext} = \sum_i \vec r_i \times \vec F_i^{ext}. \tag{3.20}$$
The total angular momentum is then conserved if the external torque vanishes. Most interactions, like the gravitational interaction and electrostatic forces, have this property, so that the
total momentum and total angular momentum are conserved. An exception is the
Lorentz force between two moving particles with electric charge Q. However the total
linear and angular momentum are conserved in this case too if we take into account the
fact that the electromagnetic field itself carries linear and angular momentum.
The kinetic energy of a system of particles is given by
$$T = \frac{1}{2}\sum_i m_i \dot{\vec r}_i^{\,2}. \tag{3.21}$$
(3.22)
which is a decomposition of the total kinetic energy into a sum of the kinetic energy
of the centre of mass and the total internal energy which describes the motion of the
particles around the centre of mass.
As in the case of the single particle the work done is the difference in the kinetic
energies at two instants of time,
$$T(t_2) - T(t_1) = \sum_i \int \vec F_i^{ext}\cdot d\vec r_i + \sum_{i\neq j} \int \vec F_{ij}\cdot d\vec r_i. \tag{3.23}$$
Once again energy conservation requires both external forces and the internal forces
to be written as gradients of scalar potentials, that is
$$\vec F_i^{ext} = -\vec\nabla_i V_i(\{\vec r_i\}), \tag{3.24}$$
(3.25)
(3.26)
where the force depends only on the position of the i-th particle and not on the
positions of other particles.
If we now define the total potential as $V = \sum_i V_i + \sum_{i<j} V_{ij}$, then the total energy
E = T + V is conserved.
3.3
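[Figure: the frames S and S*, the latter with origin O* and axes $x^*_1$, $x^*_2$, $x^*_3$; the point P is shown with the position vectors $\vec r$, $\vec r^*$ and $\vec r_0$.]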
In Newtonian mechanics we regard time as absolute and therefore $t = t^*$. Differentiating both sides of Eq.(3.27) we have
$$\frac{d\vec r}{dt} = \frac{d\vec r_0}{dt} + \frac{d\vec r^*}{dt}. \tag{3.28}$$
i = 1, 2, 3.
(3.29)
The unit vectors form an orthogonal triad of vectors. The position vectors of the
point P in S and S* are then given by
$$\vec r = \sum_i \hat e_i x_i; \qquad \vec r^* = \sum_i \hat e^*_i x^*_i. \tag{3.30}$$
(3.31)
where $d^*\vec r^*/dt$ is the time rate of change of $\vec r^*$ with respect to S* while the second term
takes into account the changes in orientation of the frame S* with respect to the
frame S. We have assumed the summation convention over repeated indices.
Observe that $d\hat e^*_i/dt$ is itself a time dependent vector as viewed from S. Therefore
we may write,
$$\frac{d\hat e^*_i}{dt} = a_{ij}\,\hat e^*_j. \tag{3.32}$$
Using the orthonormality of the unit vectors, we have
$$\frac{d(\hat e^*_i\cdot\hat e^*_j)}{dt} = 0 = \hat e^*_i\cdot\frac{d\hat e^*_j}{dt} + \hat e^*_j\cdot\frac{d\hat e^*_i}{dt}. \tag{3.33}$$
(3.34)
where we have made use of the orthogonality property of the unit vectors. It is clear
that the matrix $A = \{a_{ij}\}$ is an antisymmetric matrix since $a_{ii} = 0$ and $a_{ij} = -a_{ji}$.
As a matter of convention we re-parametrise the matrix A to define another quantity $\vec\omega$ through
$$a_{12} = -a_{21} = \omega_3; \qquad a_{31} = -a_{13} = \omega_2; \qquad a_{23} = -a_{32} = \omega_1. \tag{3.35}$$
(3.36)
Furthermore, we have
$$x^*_i \frac{d\hat e^*_i}{dt} = x^*_i\,\epsilon_{ijk}\,\hat e^*_j\,\omega_k = \vec\omega\times\vec r^*. \tag{3.37}$$
(3.37)
(3.38)
where
~r = ei xi = ei xi
represented in S and S*. Indeed this is true for any arbitrary vector and is the most
important equation which enables one to translate the description of motion from one
frame to another. In particular notice that
$$\frac{d\vec\omega}{dt} = \frac{d^*\vec\omega}{dt}. \tag{3.39}$$
Since this equation is true for arbitrary time dependent vectors we may simply
write an operator identity which relates the time derivatives in both frames:
$$\frac{d}{dt} = \frac{d^*}{dt} + \vec\omega\times \tag{3.40}$$
The vector $\vec\omega$ is called the instantaneous angular velocity and its components $\omega_i$ are
measured with respect to the frame S*. The meaning of this equation is clear: the change of any
vector is made up of the change measured in the rotating frame and the
change associated with the motion of the frame itself.
Meaning of $\vec\omega$: Consider a point P which is fixed in the frame S*. Then its position
vector relative to S* is a constant vector and therefore $d^*\vec r^*/dt = 0$. Consider the
rotation of the point P by an infinitesimal amount $d\phi$ about an axis $\hat n$. It is obvious
that
$$|d\vec r^*| = |\vec r^*|\,d\phi\,\sin\theta, \tag{3.41}$$
where $\theta$ is the angle between $\hat n$ and $\vec r^*$. This displacement is perpendicular to the vector $\vec r^*$. We have therefore,
$$d\vec r^* = d\vec\phi\times\vec r^*, \tag{3.42}$$
where $d\vec\phi = \hat n\,d\phi$ and $\vec r^*$ is a vector of fixed length. We have therefore the result
$$\frac{d\vec r^*}{dt} = \frac{d\vec\phi}{dt}\times\vec r^* = \vec\omega\times\vec r^*. \tag{3.43}$$
Thus
$$\vec\omega = \frac{d\vec\phi}{dt}$$
is the instantaneous angular velocity of S* relative to the frame S. In general both
the axis and the rate of rotation may be time dependent.
r*
x*
1
x 2*
Combining the above results with Eq.(3.28) we have
$$\frac{d\vec r}{dt} = \frac{d\vec r_0}{dt} + \frac{d^*\vec r^*}{dt} + \vec\omega\times\vec r^* \qquad (3.44)$$
or
$$\vec v = \vec v^* + \vec v_t, \qquad (3.45)$$
where $\vec v$ is the absolute velocity of the particle as seen from S, while $\vec v^*$ is the relative velocity of the particle with respect to the frame S*. The velocity of transport is defined as
$$\vec v_t = \vec v_0 + \vec\omega\times\vec r^* \qquad (3.46)$$
which is the velocity of the particle fixed in S* as seen from the frame S. The velocity of transport is the absolute velocity of the point P when it is held fixed in the moving frame S*.
If $\vec\omega = 0$ then S* is said to be having translation motion or advancing motion. The coordinate axes remain parallel to themselves during the motion.
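Relations among accelerations : Differentiating the velocities Eq.(3.45) once again we have
$$\frac{d\vec v}{dt} = \frac{d\vec v_0}{dt} + \frac{d}{dt}\left[\vec v^* + \vec\omega\times\vec r^*\right] \qquad (3.47)$$
We now use the operator equation Eq.(3.38) again to get
$$\frac{d}{dt}\left[\vec v^* + \vec\omega\times\vec r^*\right] = \left(\frac{d^*}{dt} + \vec\omega\times\right)\left(\vec v^* + \vec\omega\times\vec r^*\right) \qquad (3.48)$$
$$= \frac{d^*\vec v^*}{dt} + \frac{d\vec\omega}{dt}\times\vec r^* + 2\,\vec\omega\times\vec v^* + \vec\omega\times(\vec\omega\times\vec r^*) \qquad (3.49)$$
where the first term in Eq.(3.47) is the acceleration of S* relative to S, which we denote by $\vec a_0$, and $d^*\vec v^*/dt$ is the acceleration of P relative to S*, denoted by $\vec a^*$. Note that the time dependence of the vector $\vec\omega$ is the same in S and S*.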
Combining all the terms we may write
$$\vec a = \vec a^* + \vec a_t + \vec a_c. \qquad (3.50)$$
We shall consider the physical meaning of all these terms: The relative acceleration $\vec a^*$, given by
$$\vec a^* = \frac{d^*\vec v^*}{dt} \qquad (3.51)$$
is the acceleration of the point P with respect to the moving frame S*. The acceleration of transport is given by
$$\vec a_t = \vec a_0 + \frac{d\vec\omega}{dt}\times\vec r^* + \vec\omega\times(\vec\omega\times\vec r^*), \qquad (3.52)$$
where $\vec a_0 = d\vec v_0/dt$ and $\vec a_t$ is the acceleration when the point P is held fixed in frame S*, that is when $\vec v^* = 0$. Furthermore note that $\vec a_0$ is the linear acceleration of the frame S*. The second and third term respectively refer to angular acceleration and the centrifugal acceleration.
Finally the acceleration of Coriolis is given by
$$\vec a_c = 2(\vec\omega\times\vec v^*). \qquad (3.53)$$
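As an illustration of the decomposition into relative, transport and Coriolis accelerations, consider a particle that moves radially outward at constant speed in a uniformly rotating frame, so that the relative acceleration vanishes and only the centrifugal-type term and the Coriolis term survive. This is a minimal sketch under assumed values (OMEGA, U) and with hypothetical function names; it compares the formula with a finite-difference second derivative of the inertial-frame position.

```python
import math

OMEGA = 2.0  # assumed constant rotation rate of S* about the z-axis (rad/s)
U = 0.3      # assumed constant radial speed of the particle in S* (m/s)


def position_in_S(t):
    """Inertial-frame (x, y) of a particle at r* = (U t, 0) in the rotating frame."""
    r = U * t
    return (r * math.cos(OMEGA * t), r * math.sin(OMEGA * t))


def numerical_acceleration(t, h=1e-4):
    """Second derivative of the inertial position by central differences."""
    p0, p1, p2 = position_in_S(t - h), position_in_S(t), position_in_S(t + h)
    return tuple((a - 2.0 * b + c) / h ** 2 for a, b, c in zip(p0, p1, p2))


def predicted_acceleration(t):
    """a = a* + a_t + a_c with a* = 0, expressed in the axes of S."""
    # In S*: omega x (omega x r*) = -OMEGA^2 U t along x*, 2 omega x v* = 2 OMEGA U along y*.
    ax_star = -OMEGA ** 2 * U * t
    ay_star = 2.0 * OMEGA * U
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return (c * ax_star - s * ay_star, s * ax_star + c * ay_star)
```

The two functions agree to within the discretisation error, even though the particle feels no acceleration at all in its own (rotating) description of the motion.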
(3.55)
Within the framework of Newtonian mechanics, time t, mass m and the force $\vec F$ are invariants which do not change from S to S*. However, the equation of motion in the moving frame S* is not the same as in S due to additional terms coming from the acceleration of transport and the Coriolis acceleration. By definition therefore S* is not an inertial frame. We therefore come to the important conclusion that the frame S* is inertial if and only if the acceleration of transport and the acceleration of Coriolis are both identically zero. When this condition is satisfied we have
$$\vec F = m\vec a = m\vec a^*. \qquad (3.56)$$
since $\vec a = \vec a^*$. This is the definition of an inertial frame as we started out. For all times this is true only if $\vec\omega = 0$ and $\vec a_0 = 0$; these are then the necessary and sufficient conditions for S* to be an inertial frame.
(3.57)
$$\vec F_t = -m\,\vec a_t\,; \qquad \vec F_c = -m\,\vec a_c,$$
(3.58)
(3.59)
where $\vec F_t$ is the force of transport and $\vec F_c$ is the Coriolis force. Both have the physical dimensions of force and are called inertial forces (or pseudo forces or fictitious forces) in the literature. These inertial forces vanish if S* is an inertial frame, whereas the applied force $\vec F$ remains the same in all frames of reference. We may note that while the physical force $\vec F$ is a Cartesian vector, the inertial forces are not Cartesian (polar) vectors at all, although the vector notation is usually adopted to denote these quantities also.
The point to emphasise is that even though the name fictitious force is used, the force does have a real effect. For example, a person standing in a stationary train or bus feels the fictitious force when the vehicle starts moving and the feet are dragged forward due to frictional contact with the floor. The body tends to remain in the same position, due to inertia, so that the force is felt effectively in the direction opposite to the applied force. Hence the name inertial forces.
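Example 1 Consider a point particle of mass m which is falling vertically along the z-axis under the action of gravity in an inertial frame S. The equation of motion is
$$m\ddot z = -mg\,; \qquad m\ddot x = 0\,; \qquad m\ddot y = 0. \qquad (3.60)$$
For a particle starting at height $z_0$ with initial vertical velocity u the solution is
$$z = z_0 + ut - gt^2/2 \qquad (3.61)$$
Now let us observe this motion from another frame S* which is moving uniformly at a constant velocity v along the y-direction without rotating. At t = 0 both the origins coincide. Since both the acceleration of transport and the acceleration of Coriolis are zero, the equation of motion in S* is given by
$$m\ddot z^* = -mg\,; \qquad m\ddot x^* = 0\,; \qquad m\ddot y^* = 0. \qquad (3.62)$$
with the solution
$$z^* = z_0 + ut - gt^2/2\,; \qquad \dot y^* = -v\,; \qquad y^* = -vt \qquad (3.63)$$
Eliminating t, the trajectory as seen from S* is the parabola
$$z^* = z_0 - \frac{u}{v}\,y^* - \frac{g}{2v^2}\,(y^*)^2 \qquad (3.64)$$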
below. If $\omega$ is a constant the motion is uniform circular motion and the acceleration is given by
$$\vec a = \frac{d^2\vec r}{dt^2} = -\omega^2\,\vec r$$
where $\omega = |\vec\omega|$ with $\vec\omega$ along the z-direction. Thus the acceleration in any frame of a particle executing circular motion is centripetal and the equation of motion is given by
$$\vec F = -m\,\omega^2\,\vec r$$
Consider now the same material point in uniform circular motion with respect to a frame S* which has the same origin as S and rotating at the same angular velocity
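where $\lambda$ is the latitude and $\alpha$ is the angle between the true vertical and the direction of the force of attraction A which is radially inwards. d denotes the perpendicular distance from the particle to the rotation axis, so that the centrifugal force has magnitude $m\omega^2 d$. From the triangle of forces,
$$\sin\alpha = \frac{\omega^2 d}{g}\,\sin(\lambda - \alpha)$$
Since $d \le R$, where R is the radius of the earth, we have the following limit on the angle $\alpha$,
$$\sin\alpha \le \frac{\omega^2 R}{g} = 3.23\times 10^{-3}\ \text{radians}$$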
from a tall ceiling supporting a heavy bob. The most famous and earliest of such
pendula is still present in Paris Observatory. Apart from the usual oscillations of
the pendulum, the plane of the pendulum rotates making a full circle (provided the
amplitude remains the same over this period). The period of rotation of the pendulum
is related to the Earth's daily rotation frequency and also depends on the latitude of the place. Here is what Foucault famously said in the conclusion of his paper (translated from the French original):
Poisson, treating of the motion of projectiles in the air, and taking into
consideration the diurnal movement of the earth shows, by calculation
that in our latitude, projectiles thrown towards any point, experience a
deviation which takes place constantly towards the right of the observer,
standing at the point of departure and looking towards the trajectory.
It appears to me that the mass of the pendulum may be compared to
the projectile, which deviates towards the right while departing from the
observer, and necessarily in the opposite direction in returning towards its
mean plane of oscillation, and indicates its direction. But the pendulum possesses the advantage of accumulating the effects, and allowing them to pass from the domain of theory into that of observation.
(Comptes Rendus de l'Acad. de Sciences de Paris, 3 Fevrier 1851.)
Consider a point P on earth at latitude $\lambda$. Let us choose the moving frame S* such that $z^*$ is along the local vertical, $x^*$ due south and $y^*$ due east. Let a pendulum of length l be suspended by a rigid string at the point P. We take the origin of S* as O* at the point of suspension. The position of the bob of mass m at rest is given by $\vec r^* = (0, 0, -l)$.
At time t, the position of the bob is given by $\vec r^*(t) = (x^*(t), y^*(t), z^*(t))$ in general. The velocity is $\vec v^* = (\dot x^*, \dot y^*, \dot z^*)$ and the acceleration is $\vec a^* = (\ddot x^*, \ddot y^*, \ddot z^*)$. The applied force on the mass m is given by
$$\vec F = \vec A + \vec T,$$
where $\vec A$ is the force of gravitational attraction and $\vec T$ is the tension in the string as before. The equation of motion of the bob in the non-inertial frame S* is given by
$$m\vec a^* = m\vec a + \vec F_t + \vec F_c = \vec F + \vec F_t + \vec F_c = \vec T + \vec W + \vec F_c,$$
where $\vec F_t$, $\vec F_c$ are the force of transport and force of Coriolis respectively. In particular with our choice of the axes in S* we have
$$\vec W = \vec A + \vec F_t = (0, 0, -mg)$$
along the local vertical. It is also evident that the tension acts along the length of the string and hence
$$\vec T = -m\,\kappa\,\vec r^*,$$
with $\kappa$ a positive scalar.
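[Figure: the frame S* at the point O* at latitude $\lambda$, with $z^*$ along the local vertical, $x^*$ due south and $y^*$ due east; and the path of the bob in the $x^*$ (South), $y^*$ (East) plane, with the turning points at t = 0, t = $\tau$/2 and t = $\tau$ marked.]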
(3.65)
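For small amplitude oscillation, we may neglect the effect of $\omega$ in the equation for $z^*$ since $z^* \approx -l$. Consequently we have $\dot z^* = 0 = \ddot z^*$. Therefore
$$\kappa = \frac{g}{l}$$
$$\ddot x^* = -\kappa\,x^* + 2\omega\,\dot y^*\sin\lambda = -\kappa\,x^* + 2u\,\dot y^*$$
$$\ddot y^* = -\kappa\,y^* - 2\omega\,\dot x^*\sin\lambda = -\kappa\,y^* - 2u\,\dot x^* \qquad (3.66)$$
where $u = \omega\sin\lambda$.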
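It is easier to solve the coupled equations for $x^*$, $y^*$ by making the change of variable to
$$\zeta(t) = x^*(t) + i\,y^*(t).$$
In terms of the complex variable $\zeta(t)$ the equation of motion can be written as
$$\ddot\zeta = -\kappa\,\zeta - 2ui\,\dot\zeta.$$
Choosing a solution of the form $\zeta = \exp(-i\alpha t)$ we have the characteristic equation
$$\alpha^2 = \kappa + 2u\alpha$$
with the roots
$$\alpha_{1,2} = u \pm \sqrt{\kappa + u^2}.$$
The general solution is $\zeta(t) = A\exp(-i\alpha_1 t) + B\exp(-i\alpha_2 t)$. If the bob is released from rest, $\dot\zeta(0) = 0$, we have
$$A\,\alpha_1 + B\,\alpha_2 = 0.$$
Observe that the turning points are characterised by zero velocity, $\dot\zeta = 0 \Rightarrow \sin(t\sqrt{\kappa + u^2}) = 0$. The turning points therefore occur at
$$t\,\sqrt{\kappa + u^2} = n\pi, \qquad n = 0, 1, 2, 3, \dots$$
The period of the pendulum is given by
$$\tau = \frac{2\pi}{\sqrt{\kappa + u^2}} = \frac{2\pi}{\sqrt{\omega^2\sin^2\lambda + g/l}}. \qquad (3.67)$$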
Note that even though we denote $\tau$ as the period in Eq.(3.67), it is not the period in the conventional sense since the bob does not return to the same turning point after one period as is assumed in the case of a simple pendulum. To see this consider the value of $\zeta$:
$$\zeta(t) = \exp(-iut)\left(A\exp(i2\pi t/\tau) + B\exp(-i2\pi t/\tau)\right)$$
$$\zeta(t = 0) = A + B = a$$
$$\zeta(t = \tau) = a\exp(-i\omega\tau\sin\lambda). \qquad (3.68)$$
Thus the turning point has shifted around a circle of radius a by an angle $u\tau$ in the clockwise direction in one period of oscillation. This would mean that the plane of oscillation of the Foucault pendulum itself has moved through an angle $u\tau$ radians in $\tau$ seconds. Therefore the plane rotates through an angle $2\pi$ radians in $2\pi/(\omega\sin\lambda) = 24\ \text{hours}/\sin\lambda$. Evidently there would be no deviation at the equator while at the north pole the period of rotation of the plane of the pendulum is exactly 24 hours, which is indeed the shortest period of rotation of the plane of the pendulum.
The deflection of the plane of the pendulum is entirely due to the term $\omega\sin\lambda$ which has its origin in the Coriolis force, which arises because the earth is a non-inertial (rotating) frame of reference. Foucault carried out the first experiments in 1851 in Paris which confirmed the rotation of the earth.
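The latitude dependence of the rotation of the plane of oscillation is easy to tabulate. The sketch below simply evaluates the result 24 hours / sin(latitude) quoted above; the function name, the default day length of 24 hours and the sample latitude used for Paris (about 48.85 degrees) are assumptions made for illustration.

```python
import math


def precession_period_hours(latitude_deg, day_hours=24.0):
    """Time for the plane of a Foucault pendulum to turn through 2*pi.

    Uses period = day_hours / |sin(latitude)|; at the equator the plane
    does not rotate, which is represented here by an infinite period.
    """
    s = abs(math.sin(math.radians(latitude_deg)))
    if s == 0.0:
        return math.inf
    return day_hours / s


def rotation_per_swing_rad(latitude_deg, swing_period_s, day_hours=24.0):
    """Angle u*tau turned by the plane during one swing of duration swing_period_s."""
    omega = 2.0 * math.pi / (day_hours * 3600.0)  # rotation rate of the earth (rad/s)
    return omega * math.sin(math.radians(latitude_deg)) * swing_period_s
```

For example, precession_period_hours(90.0) returns the 24 hours expected at the pole, while a latitude close to that of Paris gives a period of roughly 32 hours.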
Coriolis force and its implication: As we have seen in the case of the Foucault pendulum, the deflection of the plane of the pendulum is entirely due to the Coriolis force. It is entirely due to the rotation of the earth and has nothing to do with the curvature, but the magnitude of the force varies with the latitude. The effect is named after Gaspard-Gustave Coriolis who analysed it in 1835. The Coriolis acceleration, as gleaned from the formula, is perpendicular both to the velocity of the moving particle and to the axis of rotation.
Obviously the Coriolis force exists only in a rotating frame and should not be confused with the centrifugal force. The centrifugal force is always there in a rotating frame, whereas the Coriolis force requires the particle to be in motion relative to the rotating frame.
The most important effect due to the Coriolis force is in the large scale flows of ocean currents and in the atmosphere. Since the earth is rotating, both centrifugal and Coriolis forces are present. However, if we use a rotating frame in which the earth is stationary, the centrifugal force is cancelled as we have seen before. In such a frame only the Coriolis force has a significant effect. To see how this occurs consider a low pressure region in the atmosphere. The surrounding air mass will tend to flow in toward the low pressure area. The Coriolis force will then deflect it perpendicular to the velocity, leading to a circulation or a cyclonic flow. In the northern hemisphere the direction of movement is counter-clockwise, whereas in the southern hemisphere the direction is clockwise. Since the effect depends on the latitude, cyclones cannot form near the equator, where the Coriolis force is small, or exactly zero on the equator. The Coriolis force also plays a major role in the erosion of the right banks of rivers flowing south to north, or even in the wearing out of rails because of moving trains. You can get more information from Wikipedia on the Coriolis effect.
Chapter 4
Variational Principle and Lagrange
equations
The Lagrangian formulation of classical mechanics was introduced by Joseph-Louis Lagrange (1736-1813). It is now recognised that the method used to derive the equations of motion is much more general and is applicable over a broad range of fields, not necessarily restricted to classical mechanics. The main advantage of the method is that the equations of motion are not dependent on any one coordinate system or set of variables. Any set of independent variables may be used to formulate the equations of motion, whose form itself remains the same independent of the choice of the set. The equations of motion are derived using an Action Principle. We shall discuss this in detail here.
4.1
The word bilateral constraint comes from the fact that $\phi = 0$ may be looked upon as a combination of two unilateral constraints $\phi \ge 0$ and $\phi \le 0$.
where $j \in \{1, \dots, k\}$
4.2
Calculus of variations
The basic problem in classical mechanics is to determine the physical trajectories using the equations of motion, given the masses, forces and initial conditions. Given the 2N positions and velocities at $t = t_0$, the system evolves according to the laws of mechanics.
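$$v = \sqrt{2gy},$$
where g is acceleration due to gravity and y is the height at some point along the actual path. Since the motion is in a plane, xy, where x denotes horizontal distance and y vertical distance, we may parametrise the path by y = y(x). Therefore,
$$ds = \sqrt{dx^2 + dy^2} = \sqrt{1 + \left(\frac{dy}{dx}\right)^2}\,dx.$$
The time taken to go from A to B along the path is
$$t = \int_A^B \frac{ds}{v} = \int_A^B dx\,\sqrt{\frac{1 + y'^2}{2gy}}.$$
The path of least time makes this integral stationary,
$$\delta\int_A^B dx\,F(y, y', x) = 0$$
which leads to the Euler equation
$$\frac{d}{dx}\frac{\partial F}{\partial y'} - \frac{\partial F}{\partial y} = 0. \qquad (4.1)$$
In the present case
$$F = \sqrt{\frac{1 + y'^2}{2gy}}$$
and using the Euler equation we obtain the equation of the curve along which the time taken is the least. Thus
$$F - y'\,\frac{\partial F}{\partial y'} = C$$
where C is a constant (Beltrami identity).
The solution of the above equation is a cycloid which is given by the following parametric equations
$$x = \frac{1}{2}k^2(\theta - \sin\theta), \qquad y = \frac{1}{2}k^2(1 - \cos\theta)$$
where $k^2 = 1/(2gC^2)$ and $\theta$ is the angle of the trajectory with respect to vertical. Note that the shape of the brachistochrone does not depend on the mass or on the strength of gravity.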
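To see that the cycloid really beats an obvious competitor, one can compare the descent time along the cycloid with that along the straight chord joining the same end points. The following is a minimal sketch: the value of G, the function names and the sample end point are assumptions for illustration. Along the cycloid one finds $ds/v = (k/\sqrt{2g})\,d\theta$, which the code uses in closed form and also cross-checks by a midpoint quadrature.

```python
import math

G = 9.81  # assumed acceleration due to gravity (m/s^2)


def cycloid_point(k2, theta):
    """Point (x, y) on the cycloid, with y measured downward from the start."""
    return 0.5 * k2 * (theta - math.sin(theta)), 0.5 * k2 * (1.0 - math.cos(theta))


def cycloid_time(k2, theta_end):
    """Closed form: dt = ds / v = k / sqrt(2 g) dtheta along the cycloid."""
    return math.sqrt(k2 / (2.0 * G)) * theta_end


def cycloid_time_numeric(k2, theta_end, n=2000):
    """Midpoint-rule evaluation of the integral of ds / sqrt(2 g y)."""
    dth = theta_end / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * dth
        dx = 0.5 * k2 * (1.0 - math.cos(th))   # dx/dtheta
        dy = 0.5 * k2 * math.sin(th)           # dy/dtheta
        speed = math.sqrt(2.0 * G * 0.5 * k2 * (1.0 - math.cos(th)))
        total += math.hypot(dx, dy) * dth / speed
    return total


def straight_line_time(x_end, y_end):
    """Time to slide from rest down a straight frictionless incline to (x_end, y_end)."""
    length = math.hypot(x_end, y_end)
    return math.sqrt(2.0 * length ** 2 / (G * y_end))
```

With k2 = 2 and theta_end = pi the end point is (pi, 2); the cycloid time comes out shorter than the straight-line time, as the variational argument requires.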
4.3
Now we return to dealing with the description of the motion of a system of n particles. To describe the motion of the system, we also require N generalised velocities, $\dot q_i = \frac{d}{dt}q_i(t)$. We often use the shorthand notation:
$$q(t) = \{q_i(t)\}, \qquad \dot q(t) = \{\dot q_i(t)\}, \qquad (i = 1, 2, 3, \dots N) \qquad (4.2)$$
Suppose we want to describe the motion of this many particle system between a point A at time $t_1$ and a point B at a later time $t_2$ with the end points fixed. We assume that there exists a function, $L = L(q, \dot q, t)$ called the Lagrangian, which has all the necessary information about the dynamical evolution of the system.
In order to derive Lagrange's equations of motion, we first define Hamilton's Principle Function (also called the action integral) as:
$$R = \int_{t_1}^{t_2} L[q(t), \dot q(t), t]\,dt. \qquad (4.3)$$
Note that R is a functional of q(t): it depends not just on one value of t, but on the function q over the whole interval $t_1$ to $t_2$. Hamilton's variational principle states that the physical trajectory is one for which the action R is a minimum (extremum in general). This is also sometimes referred to as the principle of least action, which is a slight misnomer since we are only extremising the action and not necessarily minimising it. This principle is the most general formulation of the law governing the time evolution of mechanical systems. Extremising the action leads to the Lagrange equations of motion.
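The equations of motion are derived by infinitesimal variation of the path defined by the set q(t) with the initial and end points fixed, that is
$$q(t) \to q(t) + \delta q(t), \qquad \delta q(t_1) = \delta q(t_2) = 0, \qquad (4.4)$$
such that the action integral is a minimum, or equivalently it is stationary under the variation $\delta q_i$. The change in the action is given by
$$\delta R = \delta\int_{t_1}^{t_2} L\,dt = \int_{t_1}^{t_2}\delta L\,dt \qquad (4.5)$$
$$= \int_{t_1}^{t_2}\left[\frac{\partial L}{\partial q_i}\,\delta q_i + \frac{\partial L}{\partial\dot q_i}\,\delta\dot q_i\right]dt \qquad (4.6)$$
$$= \int_{t_1}^{t_2}\left[\frac{\partial L}{\partial q_i} - \frac{d}{dt}\frac{\partial L}{\partial\dot q_i}\right]\delta q_i\,dt + \left[\frac{\partial L}{\partial\dot q_i}\,\delta q_i\right]_{t_1}^{t_2} \qquad (4.7)$$
The last term vanishes since the end points are held fixed. Since the variations $\delta q_i$ are arbitrary, $\delta R = 0$ requires
$$\frac{\partial L}{\partial q_i} - \frac{d}{dt}\frac{\partial L}{\partial\dot q_i} = 0,$$
which are the Lagrange equations of motion.
Note that the variational principle is a global statement which leads to the Lagrange equations which are local differential equations.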
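The stationarity of the action can be made concrete with a small numerical experiment. The sketch below is an illustration under assumed values (M, G, T_TOTAL) and hypothetical function names: for a particle in uniform gravity with $L = \frac{1}{2}m\dot x^2 - mgx$ and fixed end points x(0) = x(T) = 0, it evaluates a discretised action on the true parabolic path and on neighbouring paths that differ by eps sin(pi t / T).

```python
import math

M = 1.0        # assumed mass (kg)
G = 9.81       # assumed acceleration due to gravity (m/s^2)
T_TOTAL = 1.0  # assumed time of flight between the fixed end points (s)


def path(t, eps):
    """True path x = g t (T - t) / 2 plus a perturbation vanishing at both ends."""
    true_path = 0.5 * G * t * (T_TOTAL - t)
    return true_path + eps * math.sin(math.pi * t / T_TOTAL)


def action(eps, n=1000):
    """Discretised action: sum of (kinetic - potential) * dt over n segments."""
    dt = T_TOTAL / n
    total = 0.0
    for i in range(n):
        x0, x1 = path(i * dt, eps), path((i + 1) * dt, eps)
        velocity = (x1 - x0) / dt
        x_mid = 0.5 * (x0 + x1)
        total += (0.5 * M * velocity ** 2 - M * G * x_mid) * dt
    return total
```

The action at eps = 0 is smaller than at eps = +0.1 or eps = -0.1, and the two perturbed values agree with each other, showing that the first order change vanishes and only the second order term survives.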
Remarks :
The Lagrange equations for a system with N degrees of freedom are a set of N differential equations of second order. They are therefore solved by specifying 2N initial conditions such as $q_i(0), \dot q_i(0)$ at some time t = 0, say. This will completely specify the time evolution of the system. On the other hand one may also specify N coordinates at two different times: $q_i(t_1), q_i(t_2)$.
q(t) denotes the trajectory or the path followed by the system.
A constant of motion is a function f which satisfies
$$f[q(t), \dot q(t)] = \text{const.}, \qquad \frac{d}{dt}f = 0. \qquad (4.9)$$
The equations of motion derived from the variational principle do not correspond to a unique Lagrangian. In particular suppose we effect a change in the Lagrangian by a total derivative, then
$$L'[q(t), \dot q(t), t] = L[q(t), \dot q(t), t] + \frac{d}{dt}f(q, t). \qquad (4.10)$$
The action
$$R' = \int_{t_1}^{t_2} L'[q(t), \dot q(t), t]\,dt = \int_{t_1}^{t_2} L[q(t), \dot q(t), t]\,dt + f(q_2, t_2) - f(q_1, t_1) \qquad (4.11)$$
differs by the change induced by the function f at the end points. This change however drops out since the end points are held fixed against any variation. That is, the equations of motion are the same for any two Lagrangians which differ by a function which can be written as a total time derivative.
More generally we now understand that not only classical mechanics but all fundamental laws of physics can be written in terms of an action principle, all the way from electromagnetic theory to the standard model of particle physics or even string theory. A generalisation of the action principle to quantum mechanics by Feynman is known by the name path integral method in quantum mechanics.
The real power of Lagrangian mechanics lies in the description of the mechanical system in a coordinate independent way and can be broadened to accommodate a large class of other types of systems. For example in Newtonian mechanics, for a conservative system in 1 dimension, the equation of motion is of course given by,
$$m\ddot x = F(x) = -V'(x),$$
where the prime denotes the derivative with respect to x. A coordinate transformation of the form x = f(y) will transform the equation of motion in y as
$$m\left(f'(y)\,\ddot y + f''(y)\,\dot y^2\right) = F(f(y))$$
whose structure is different from the original equation of motion.
Lagrangian formulation of mechanics, however, provides a method for describing the mechanical systems such that coordinate transformations leave the structure of the equations of motion invariant even in the presence of constraints.
4.3.1
4.3.2
Coordinate Transformation
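The Lagrangian formulation is independent of the choice of a specific set of generalised coordinates. What this means is that, if a set $\{q_i\}$ is used to write the Lagrangian, then another set $\{\tilde q_i\}$ also fits the prescription, given that the map
$$\tilde q_i = \tilde q_i(q_1, \dots, q_N)$$
is invertible, so that the old coordinates can be written as functions $q_j(\tilde q, t)$ of the new ones (allowing for an explicit time dependence) and
$$\dot q_j = \frac{\partial q_j}{\partial\tilde q_i}\,\dot{\tilde q}_i + \frac{\partial q_j}{\partial t}.$$
Then
$$\frac{\partial L}{\partial\tilde q_i} = \frac{\partial L}{\partial q_j}\frac{\partial q_j}{\partial\tilde q_i} + \frac{\partial L}{\partial\dot q_j}\left[\frac{\partial^2 q_j}{\partial\tilde q_i\,\partial\tilde q_k}\,\dot{\tilde q}_k + \frac{\partial^2 q_j}{\partial t\,\partial\tilde q_i}\right] \qquad (4.13)$$
and
$$\frac{\partial L}{\partial\dot{\tilde q}_i} = \frac{\partial L}{\partial\dot q_j}\frac{\partial\dot q_j}{\partial\dot{\tilde q}_i} = \frac{\partial L}{\partial\dot q_j}\frac{\partial q_j}{\partial\tilde q_i}. \qquad (4.14)$$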
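Taking the derivative of the above equation with respect to time we have
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{\tilde q}_i} = \left(\frac{d}{dt}\frac{\partial L}{\partial\dot q_j}\right)\frac{\partial q_j}{\partial\tilde q_i} + \frac{\partial L}{\partial\dot q_j}\left[\frac{\partial^2 q_j}{\partial\tilde q_i\,\partial\tilde q_k}\,\dot{\tilde q}_k + \frac{\partial^2 q_j}{\partial t\,\partial\tilde q_i}\right]. \qquad (4.15)$$
Combining Eq.(4.13) and Eq.(4.15) we have
$$\frac{\partial L}{\partial\tilde q_i} - \frac{d}{dt}\frac{\partial L}{\partial\dot{\tilde q}_i} = \left[\frac{\partial L}{\partial q_j} - \frac{d}{dt}\frac{\partial L}{\partial\dot q_j}\right]\frac{\partial q_j}{\partial\tilde q_i} = 0 \qquad (4.16)$$
Since this is true for arbitrary coordinate transformations, Lagrange's equations are valid in both coordinate systems provided the Jacobian is non-vanishing.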
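Example : As an example of the coordinate transformation consider the case of a free particle in the inertial frame S. The Lagrangian is given by
$$L = \frac{1}{2}m\,\dot{\vec r}\cdot\dot{\vec r}, \qquad (4.17)$$
In a frame S* rotating about the common 3-axis with constant angular velocity $\omega$, the coordinates are related by
$$x_1 = x_1^*\cos\omega t - x_2^*\sin\omega t\,; \qquad x_2 = x_1^*\sin\omega t + x_2^*\cos\omega t\,; \qquad x_3 = x_3^* \qquad (4.18)$$
(4.19)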
The Lagrangian equation of motion however is of the same form and we have
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{\vec r}^*} - \frac{\partial L}{\partial\vec r^*} = m\left[\ddot{\vec r}^* + \vec\omega\times(\vec\omega\times\vec r^*) + 2\,\vec\omega\times\dot{\vec r}^*\right] = 0 \qquad (4.20)$$
which is the familiar equation of motion for a free particle in a rotating coordinate system. Since there is no translation, the second term is simply the centrifugal force and the third term is the Coriolis force: fictitious forces which arise only in a non-inertial frame such as the one we have chosen.
4.3.3
xi = 0 .
2
dt v x i
v 2
which is indeed the law of inertia.
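Thus we have the necessary condition that the Lagrangian has to be a function of the modulus of the velocity. We may thus write, for example,
$$L = C\,v^{2\alpha}.$$
Consider now two inertial frames $S_1$ and $S_2$ in which the velocity of the particle is $\vec v$ and $\vec u = \vec v + \vec V$ respectively, where $\vec V$ is the constant relative velocity of the frames:
$$S_1:\ L = L(v^2)\,; \qquad S_2:\ L' = L(u^2)$$
with
$$u^2 = v^2 + \frac{d}{dt}\left(2\,\vec r\cdot\vec V + tV^2\right).$$
Thus
$$L' = C\,u^{2\alpha} = C\left[v^2 + \frac{d}{dt}\left(2\,\vec r\cdot\vec V + tV^2\right)\right]^{\alpha}.$$
Thus the Lagrangians in two different frames may be considered equivalent (describing the same system) only when $\alpha = 1$ since then they would differ by a total derivative. Thus
$$L(v^2) \to L' = C\,u^2 = C\left[v^2 + \frac{d}{dt}\left(2\,\vec r\cdot\vec V + tV^2\right)\right]$$
The constant C however remains arbitrary but the action has a minimum only when C > 0 since $v^2$ is positive definite. In particular we may choose,
$$L = \frac{1}{2}mv^2$$
where m is the mass of the particle and the free particle Lagrangian is simply its kinetic energy.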
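Newtonian systems : For systems obeying Newtonian mechanics, Lagrange's equations are equivalent to Newton's second law. For particle motion in one dimension we have,
$$F = -\frac{dV}{dx} = m\ddot x\,; \qquad V = V(x).$$
Assuming m to be a constant we can write the above equation as,
$$-\frac{dV}{dx} - \frac{d}{dt}(m\dot x) = 0$$
or
$$-\frac{dV}{dx} - \frac{d}{dt}\left(\frac{d}{d\dot x}\frac{m\dot x^2}{2}\right) = 0.$$
Adding terms whose partial derivatives are equal to zero, we have
$$\frac{\partial}{\partial x}\left[\frac{m\dot x^2}{2} - V(x)\right] - \frac{d}{dt}\left(\frac{\partial}{\partial\dot x}\frac{m\dot x^2}{2}\right) = 0.$$
We can add the potential term in the second term without altering the nature of the equation,
$$\frac{\partial}{\partial x}\left[\frac{m\dot x^2}{2} - V(x)\right] - \frac{d}{dt}\frac{\partial}{\partial\dot x}\left[\frac{m\dot x^2}{2} - V(x)\right] = 0.$$
In general therefore for systems with kinetic energy T and potential energy V, we define
$$L = T - V$$
and we recover the Lagrangian equation of motion for the Newtonian particle,
$$\frac{\partial L}{\partial x} - \frac{d}{dt}\frac{\partial L}{\partial\dot x} = 0 \qquad (4.21)$$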
The pendulum : Consider the example of a plane pendulum. In Cartesian coordinates the Lagrangian is given by
$$L = \frac{1}{2}m[\dot x^2 + \dot z^2] + mgz, \qquad (4.22)$$
where we have chosen the z-axis along the vertical (pointing downwards, so that the potential energy is $-mgz$). Since the length of the pendulum is constant we also have a constraint
$$l^2 = x^2 + z^2 \qquad (4.23)$$
One way to directly take into account the constraint is to rewrite the Lagrangian including the constraint by introducing a Lagrange multiplier $\lambda$:
$$L = \frac{1}{2}m[\dot x^2 + \dot z^2] + mgz + \frac{1}{2}\lambda(x^2 + z^2 - l^2) \qquad (4.24)$$
A more direct approach is to define the generalised coordinate for the system using the constraint and write the Lagrangian in terms of the generalised coordinate. For example by introducing the polar coordinate
$$x = l\sin\theta, \qquad z = l\cos\theta$$
the constraint is automatically satisfied and $\theta$ is the generalised coordinate for the system. The Lagrangian of a plane pendulum is therefore given by
$$L(\theta, \dot\theta) = \frac{1}{2}ml^2\dot\theta^2 + mgl\cos\theta.$$
The equation of motion is given by
$$\frac{d}{dt}\frac{\partial L}{\partial\dot\theta} - \frac{\partial L}{\partial\theta} = 0 = ml^2\ddot\theta + mgl\sin\theta \qquad (4.25)$$
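The pendulum equation (4.25) has no elementary closed-form solution for large angles, but it is easy to integrate numerically. The sketch below is one possible illustration, with assumed parameter values and hypothetical function names: it advances $\ddot\theta = -(g/l)\sin\theta$ with a fourth-order Runge-Kutta step and monitors the energy $E = \frac{1}{2}ml^2\dot\theta^2 - mgl\cos\theta$, which should stay constant.

```python
import math

G = 9.81      # assumed acceleration due to gravity (m/s^2)
L_PEND = 1.0  # assumed pendulum length (m)
M = 1.0       # assumed mass of the bob (kg)


def deriv(theta, omega):
    """Right-hand side of the first-order system (theta', omega')."""
    return omega, -(G / L_PEND) * math.sin(theta)


def rk4_step(theta, omega, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = deriv(theta, omega)
    k2 = deriv(theta + 0.5 * dt * k1[0], omega + 0.5 * dt * k1[1])
    k3 = deriv(theta + 0.5 * dt * k2[0], omega + 0.5 * dt * k2[1])
    k4 = deriv(theta + dt * k3[0], omega + dt * k3[1])
    theta += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
    omega += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return theta, omega


def energy(theta, omega):
    """E = T + V for the Lagrangian L = m l^2 omega^2 / 2 + m g l cos(theta)."""
    return 0.5 * M * L_PEND ** 2 * omega ** 2 - M * G * L_PEND * math.cos(theta)


def simulate(theta0, t_end, dt=1e-3):
    """Release the bob from rest at theta0 and integrate up to t_end."""
    theta, omega = theta0, 0.0
    for _ in range(int(round(t_end / dt))):
        theta, omega = rk4_step(theta, omega, dt)
    return theta, omega
```

Comparing energy(theta0, 0.0) with the energy of the state returned by simulate gives a simple check on the integration; the amplitude never exceeds the release angle.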
4.4
Noether's Theorem :
If a Lagrangian is invariant under a family of transformations, its dynamical system possesses a constant of motion and it can be found from a knowledge of the Lagrangian and the transformation(s).
4.4.1
Previously, we have argued that the Lagrangian for a free particle is proportional to $v^2$, using properties of space-time and the principle of relativity. For a system of particles we generalise this to
$$L = \frac{1}{2}m\sum_i^N \dot q_i^2$$
consistent with the equations of motion. Beyond this we cannot determine the Lagrangian for systems which are in general not free, and may be confined and interacting. Invoking experience and known experimental facts we deduce the correct form of L, modulo the total derivative. For dynamical systems which have kinetic energy T and potential energy V, the form of the Lagrangian is given by
$$L[q(t), \dot q(t), t] = T - V. \qquad (4.26)$$
(4.27)
(4.28)
The mass tensor in general can be quite complicated but may sometimes be written as $m_{ij} = m_i\,\delta_{ij}$.
The equations of motion are:
$$m_i\,\ddot q_i = -\frac{\partial V}{\partial q_i} = F_i\,;$$
$$E = \left[\sum_i \frac{\partial L}{\partial\dot q_i}\,\dot q_i - L\right]$$
to be a constant of time and we call this the energy of the system. For a mechanical system (see footnote 2), since $L = T - V$,
$$\sum_i \frac{\partial L}{\partial\dot q_i}\,\dot q_i = 2T.$$
Therefore E = T + V and we are justified in calling E the total energy of the system. Note that the energy is additive by definition and its conservation results from time translation invariance.
[Footnote 2: For the equation that we write in the next step, it is important that the term T is homogeneous in $\{\dot q\}$. This is possible only when the original coordinates are functions of the $q_i$'s only, without any explicit dependence on time, i.e. $\partial x_j/\partial t = 0$.]
Conservation of momentum - homogeneity of space :
Consider a system that is bodily displaced by an infinitesimal amount $\vec\epsilon$. Let us assume that this will imply changing each of the position vectors $\vec r_i$ by an infinitesimal amount, $\vec r_i \to \vec r_i + \vec\epsilon$. If the Lagrangian is invariant under this displacement then, using the equations of motion,
$$\sum_i \frac{\partial L}{\partial\vec r_i} = \frac{d}{dt}\left[\sum_i \frac{\partial L}{\partial\dot{\vec r}_i}\right] = 0.$$
The conserved quantity is
$$\sum_i \vec p_i = \vec P$$
which we call the total momentum of the system. If the motion is described in terms of generalised coordinates $q_i$, then
$$p_i := \frac{\partial L}{\partial\dot q_i}$$
are the generalised momenta and
$$F_i := \frac{\partial L}{\partial q_i}$$
are generalised forces. In general the relation between conjugate momentum and velocity vector may be nontrivial and may not reduce to mass times velocity.
Thus the property of homogeneity of space implies translation invariance of the Lagrangian which leads to the law of conservation of total linear momentum.
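Conservation of angular momentum - isotropy of space :
Consider a system that is bodily rotated, that is all the vectors are rotated by the same infinitesimal amount $\delta\phi$. Let us assume that this will imply changing each of the position vectors $\vec r_i$ by an amount $\delta\vec r_i$,
$$\vec r_i \to \vec r_i + \delta\phi\,\hat n\times\vec r_i,$$
where $\delta\phi$ is taken to be small and $\hat n$ is chosen arbitrarily. The Lagrangian is invariant under this transformation if
$$L(\vec r_i, \dot{\vec r}_i, t) = L(\vec r_i + \delta\vec r_i, \dot{\vec r}_i + \delta\dot{\vec r}_i, t).$$
Notice that both the position and velocities change under infinitesimal rotation. The change in the Lagrangian is
$$\delta L = \sum_i \frac{\partial L}{\partial\vec r_i}\cdot\delta\vec r_i + \sum_i \frac{\partial L}{\partial\dot{\vec r}_i}\cdot\delta\dot{\vec r}_i.$$
Using the equations of motion this is the total time derivative of
$$\sum_i \frac{\partial L}{\partial\dot{\vec r}_i}\cdot\delta\vec r_i = \sum_i \vec p_i\cdot(\delta\phi\,\hat n\times\vec r_i) = \delta\phi\sum_i \hat n\cdot(\vec r_i\times\vec p_i) = \delta\phi\,\hat n\cdot\sum_i \vec l_i.$$
Since $\delta L = 0$ for arbitrary $\hat n$, the total angular momentum $\sum_i \vec l_i$ is conserved.
Thus the property of isotropy of space implies rotational invariance of the Lagrangian which leads to the law of conservation of total angular momentum.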
Gauge transformations :
Noether's theorem can be generalised. We have used the equations of motion which remain the same, for example, when the Lagrangian is changed by a total derivative of some function f = f(q):
$$L' = L + df/dt$$
While the equations of motion are unchanged under such a change of the Lagrangian, the conjugate momentum does change. For example
$$p = \frac{\partial L}{\partial\dot q}$$
and
$$p' = \frac{\partial L'}{\partial\dot q} = p + \frac{\partial f}{\partial q}$$
L = ( )2 T k V.
qi dt qi
(4.30)
The co-ordinate qi is called a cyclic co-ordinate. There may be more than one such
coordinate in the system. This result is an important tool in simplifying the equations
of motion.
4.5
4.5.1
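Along with the free particle problem this is perhaps the simplest case one can consider. The Lagrangian is given by
$$L = \frac{1}{2}m\dot x^2 - V(x)$$
and the equation of motion is
$$m\ddot x = -\frac{dV}{dx}.$$
Instead of solving this second order differential equation, we directly integrate from the energy which is specified once and for all for a given initial condition
$$E = \frac{1}{2}m\dot x^2 + V(x) \ \Rightarrow\ \dot x = \pm\sqrt{(2/m)(E - V)}$$
so that
$$t = \pm\sqrt{\frac{m}{2}}\int\frac{dx}{\sqrt{E - V}} + C$$
[Figure: the potential V(x) and the energy line V = E, with the turning points marked.]
For bounded motion between two turning points $x_1(E)$ and $x_2(E)$, at which V = E, the period is
$$T(E) = \sqrt{2m}\int_{x_1(E)}^{x_2(E)}\frac{dx}{\sqrt{E - V}}$$
and the constant drops out.
Region $x_5 \le x < \infty$: Motion is unbounded and the particle eventually goes off to infinity since V < E over the whole range.
For different choices of energy, we may have different bounded and unbounded regions, which is to be expected.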
4.5.2
Consider a two particle system with position vectors $\vec r_1$, $\vec r_2$ in three dimensions with mass $m_1$ and $m_2$ respectively. The Lagrangian of the system may be written as
$$L = T - V = \frac{1}{2}m_1\dot{\vec r}_1^{\,2} + \frac{1}{2}m_2\dot{\vec r}_2^{\,2} - V(\vec r_1, \vec r_2).$$
Quite often it is convenient to use the so called centre-of-mass coordinate system, where
$$\vec r = \vec r_1 - \vec r_2$$
is the relative coordinate between the two particles and
$$\vec R = \frac{m_1\vec r_1 + m_2\vec r_2}{m_1 + m_2}$$
is the position vector of the centre of mass of the system.
In terms of these new coordinates the Lagrangian may be written as
$$L = \frac{1}{2}M\dot{\vec R}^{2} + \frac{1}{2}m\,\dot{\vec r}^{\,2} - V(\vec R, \vec r),$$
where
$$M = m_1 + m_2$$
and
$$m = \frac{m_1 m_2}{m_1 + m_2}$$
which is called the reduced mass of the system. The importance of this separation lies in the fact that often the potential energy term of the Lagrangian splits into two parts,
$$V(\vec R, \vec r) = V_R(\vec R) + V_r(\vec r).$$
This then allows us to write the Lagrangian in a separable form as
$$L = L_R + L_r$$
where
$$L_R = \frac{1}{2}M\dot{\vec R}^{2} - V_R(\vec R)$$
and
$$L_r = \frac{1}{2}m\,\dot{\vec r}^{\,2} - V_r(\vec r).$$
Such a decomposition often simplifies the solution of the problem. Typical examples are the Earth moving around the Sun and the electron orbiting around the nucleus. Typical of such problems is also the fact that the two-body potential depends only on the relative distance between the two particles, that is
$$V(\vec R, \vec r) = V_r(r)$$
which is not only independent of the centre-of-mass coordinate, but also independent of the relative orientation of the two particles. The first property leads to translation invariance while the second property implies conservation of angular momentum; this is often referred to as the central force problem.
Hereafter we shall assume that the system is translationally invariant. Therefore it is of little interest to consider the motion of the centre of mass. Properties of the system will not be affected by the actual position of the centre of mass. We shall also assume that the force field is central and concentrate on the relative Lagrangian $L_r$.
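The problem is analysed more easily in spherical polar coordinates:
$$x = r\sin\theta\cos\phi, \qquad y = r\sin\theta\sin\phi, \qquad z = r\cos\theta, \qquad (4.31)$$
where $r, \theta, \phi$ are the new generalised co-ordinates. The Lagrangian in spherical polar coordinates is then given by
$$L_r = \frac{1}{2}m[\dot r^2 + r^2\dot\theta^2 + r^2\sin^2\theta\,\dot\phi^2] - V(r) \qquad (4.32)$$
(4.33)
$$p_\phi = l = 2m\dot A,$$
where $\dot A$ is the rate at which the radius vector sweeps out area. Conservation of angular momentum then implies equal areas are swept in equal times: Kepler's second law.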
The effective equation of motion for the system may therefore be written as
$$m\ddot r = mr\dot\phi^2 - V_r'(r) = -\frac{d}{dr}\left[\frac{l^2}{2mr^2} + V_r(r)\right] = -V_{eff}'(r) \qquad (4.34)$$
where the effective potential $V_{eff}(r)$ is given by
$$V_{eff}(r) = V_r(r) + \frac{l^2}{2mr^2}$$
with a centrifugal term which is l dependent. Thus the central force problem is essentially reduced to describing one-dimensional radial motion in an effective potential that depends on the value of the angular momentum.
The conserved energy for the motion in a central force field may then be written as
$$E = \frac{1}{2}m[\dot r^2 + r^2\dot\phi^2] + V_r(r) = \frac{1}{2}m\dot r^2 + V_{eff}(r)$$
so that
$$\dot r = \pm\sqrt{\frac{2}{m}[E - V_{eff}(r)]}.$$
Integrating, we have
$$t = \int\frac{dr}{\sqrt{\frac{2}{m}[E - V_{eff}(r)]}} + \text{constant}.$$
$$\phi = \int\frac{l\,dr/r^2}{\sqrt{2m[E - V_{eff}(r)]}} + \text{constant}.$$
The equations for t and $\phi$ above give the general solution of the central force field problem. The solution to the equation for $\phi$ provides the equation for the path. It should be noted that $\phi$ varies monotonically in time. As a result $\dot\phi$ does not change sign. The implication of the symmetry is that the radial motion may be regarded effectively as a one-dimensional motion problem.
There are exceptional situations in which the motion takes place between two limits or turning points $r_1$ and $r_2$. The turning points are defined as those points where the radial velocity goes to zero. Such motions are called finite. Then we have
$$\Delta\phi = 2\int_{r_1}^{r_2}\frac{l\,dr/r^2}{\sqrt{2m[E - V_{eff}(r)]}}.$$
The path is closed if
$$\Delta\phi = 2\pi\,\frac{m}{n},$$
where m, n are integers, that is after n periods the radius vector would have made m revolutions around the origin and returns to the original position. There are only two types of potentials where all finite motions take place in closed orbits: these are the 1/r and $r^2$ potentials.
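For the attractive potential $V_r(r) = -\alpha/r$ the effective potential is
$$V_{eff}(r) = -\frac{\alpha}{r} + \frac{l^2}{2mr^2}$$
which has a minimum at $r_0 = l^2/(m\alpha)$ with the value
$$V_{min} = -V_0 = -m\alpha^2/2l^2.$$
[Figure: the effective potential $V_{eff}(r)$, with the minimum $V_{min}$ at $r_0$ marked.]
From the form of the potential it is easy to see that the motion is bounded when E < 0 and unbounded for positive energies.
The solution of the parametric equation for $\phi$ is given by
$$\phi = \cos^{-1}\left[\frac{l/r - m\alpha/l}{\sqrt{2mE + (m\alpha/l)^2}}\right] = \cos^{-1}\left[\frac{r_0/r - 1}{\sqrt{1 + E/V_0}}\right]$$
Defining $\epsilon = \sqrt{1 + E/V_0}$, the path is given by
$$\frac{r_0}{r} = 1 + \epsilon\cos\phi$$
which is the equation of a conic section with one focus at the origin (location of the centre-of-mass), and $\epsilon$ is called the eccentricity. The two turning points correspond to the minimum and maximum values of the radius r,
$$r_1 = \frac{r_0}{1 + \epsilon}\,; \qquad r_2 = \frac{r_0}{1 - \epsilon}$$
where the maximum value is physical only when $\epsilon < 1$.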
The dynamics of relative motion may be analysed in the following limits:
E < 0 ( < 1): In this case the motion is bounded and finite and the trajectory
is an ellipse. From analytical geometry we can work out the semi-major and
semi-minor axes of the ellipse and are given by
a=
r0
=
;
2
(1 )
2|E|
b=
l
r0
=p
1 2
2m|E|
67
p
2m
ab = 2a3/2 m/
l
which shows the proportionality between square of the period to the cube of
the linear dimension of the orbit.
$E \ge 0$ ($\epsilon \ge 1$): In this case the motion is unbounded and the trajectory is a hyperbola for $E > 0$. The distance to perihelion from the focus is given by
$$r_1 = r_0/(1+\epsilon) = a(\epsilon - 1).$$
At exactly $E = 0$, the eccentricity $\epsilon = 1$ and the trajectory is a parabola. This happens when a particle starts at infinity at rest.
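The relations between the energy, the angular momentum and the shape of the orbit can be checked numerically. The following is a minimal sketch, assuming the attractive potential $V(r) = -\alpha/r$ and the definitions $r_0 = l^2/(m\alpha)$, $V_0 = m\alpha^2/(2l^2)$ and $\epsilon = \sqrt{1 + E/V_0}$; the function name `kepler_orbit` and the sample values are illustrative only.

```python
import math

def kepler_orbit(E, l, m, alpha):
    """Orbit parameters for V(r) = -alpha/r at energy E and angular momentum l.

    Requires E >= -V0, the minimum of the effective potential.
    """
    r0 = l**2 / (m * alpha)              # radius of the circular orbit
    V0 = m * alpha**2 / (2 * l**2)       # depth of the effective potential
    eps = math.sqrt(1 + E / V0)          # eccentricity
    params = {"r0": r0, "eps": eps, "r_min": r0 / (1 + eps)}
    if E < 0:                            # bound motion: ellipse
        a = r0 / (1 - eps**2)            # semi-major axis
        b = r0 / math.sqrt(1 - eps**2)   # semi-minor axis
        params.update(
            a=a,
            b=b,
            r_max=r0 / (1 - eps),
            period=2 * math.pi * a**1.5 * math.sqrt(m / alpha),
        )
    return params

# Sample bound orbit in units with m = l = alpha = 1 (illustrative values).
orbit = kepler_orbit(E=-0.25, l=1.0, m=1.0, alpha=1.0)
```

For these sample values the semi-major axis equals $\alpha/2|E| = 2$, and the period agrees with the area-law expression $(2m/l)\pi ab$. For $E \ge 0$ only the perihelion distance is returned, since the motion is unbounded.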
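[Figure: elliptic ($\epsilon < 1$) and hyperbolic ($\epsilon > 1$) orbits, with $r_0$, the axes $2a$ and $2b$, and the distances $a(1-\epsilon)$ and $a(\epsilon-1)$ marked.]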
$$m\frac{d\vec{v}}{dt} = -\frac{\alpha\,\vec{r}}{r^3}, \qquad \frac{d\vec{R}}{dt} = 0.$$
4.5.3 A different pendulum
The period of a pendulum is independent of the amplitude only for very small oscillations. However, the amplitude dependence of the period for large amplitudes may be eliminated by wrapping the string of the pendulum around a limiting curve.
Consider a particle of mass $m$ constrained to move in a vertical plane, as in the case of a simple pendulum, but along a smooth cycloid under the influence of gravity. The cycloid is given by the parametric equations,
$$x = A(\phi + \sin\phi), \qquad z = A(1 - \cos\phi),$$
where $-\pi < \phi < \pi$, and $A$ is a positive constant, with the $z$-axis pointing vertically upwards. (A cycloid is the curve described by a point on the rim of a wheel as it rolls along a flat surface.)
The kinetic energy of motion is
$$T = \frac{1}{2}m(\dot{x}^2 + \dot{z}^2) = 2mA^2\cos^2(\phi/2)\,\dot\phi^2$$
and therefore the Lagrangian is given by
$$L = T - V = 2mA^2\cos^2(\phi/2)\,\dot\phi^2 - mgA(1 - \cos\phi).$$
It is more convenient to express this in terms of the arc length $s$ along the cycloid. Note that
$$ds^2 = dx^2 + dz^2 = 4A^2\cos^2(\phi/2)\,d\phi^2 \;\Rightarrow\; ds = \pm 2A\cos(\phi/2)\,d\phi.$$
Let us make $ds$ positive by choosing the $+$ sign without loss of generality. Then the arc length measured from $x = -\pi A$, $z = 2A$ is
$$s = 4A[1 + \sin(\phi/2)].$$
If the arc length is measured from $x = 0$, that is from the bottom of the curve, then
$$s = 4A\sin(\phi/2), \qquad \dot{s} = 2A\cos(\phi/2)\,\dot\phi,$$
and the potential energy becomes
$$V = mgA(1 - \cos\phi) = \frac{mgs^2}{8A}.$$
Thus the Lagrangian using the arc length as the dynamical parameter is given by,
$$L = \frac{1}{2}m\dot{s}^2 - \frac{mgs^2}{8A}.$$
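To find the period use the equation of motion. We have
$$m\ddot{s} + \frac{mgs}{4A} = 0$$
or equivalently
$$\ddot{s} + \frac{g}{4A}s = 0.$$
This is simple harmonic motion with angular frequency $\omega = \sqrt{g/4A}$, so that
$$T = \frac{2\pi}{\omega} = 4\pi\sqrt{A/g},$$
which is independent of the amplitude of oscillation, with the period identical to that of a simple pendulum of length $4A$.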
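The amplitude independence can be checked without using the arc length at all. The sketch below integrates the Euler-Lagrange equation in the cycloid parameter $\phi$, which for the Lagrangian $L = 2mA^2\cos^2(\phi/2)\dot\phi^2 - mgA(1-\cos\phi)$ reads $\ddot\phi = \frac{1}{2}\tan(\phi/2)(\dot\phi^2 - g/A)$. The fixed-step Runge-Kutta scheme, the step size and the function name `cycloid_period` are choices made here for illustration, not part of the notes.

```python
import math

def cycloid_period(phi0, A=1.0, g=9.81, dt=1e-4):
    """Period of a bead released from rest at parameter phi0 on the cycloid.

    Integrates phi'' = tan(phi/2) * (phi'^2 - g/A) / 2 with classical RK4
    until phi first crosses zero, which takes a quarter of the period.
    """
    if not 0.0 < phi0 < math.pi:
        raise ValueError("phi0 must lie in (0, pi)")

    def rhs(phi, omega):
        return omega, 0.5 * math.tan(phi / 2) * (omega**2 - g / A)

    phi, omega, t = phi0, 0.0, 0.0
    while True:
        k1 = rhs(phi, omega)
        k2 = rhs(phi + 0.5 * dt * k1[0], omega + 0.5 * dt * k1[1])
        k3 = rhs(phi + 0.5 * dt * k2[0], omega + 0.5 * dt * k2[1])
        k4 = rhs(phi + dt * k3[0], omega + dt * k3[1])
        new_phi = phi + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        new_omega = omega + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        if new_phi <= 0.0:
            # Linear interpolation for the time at which phi = 0.
            frac = phi / (phi - new_phi)
            return 4 * (t + frac * dt)
        phi, omega, t = new_phi, new_omega, t + dt

# Analytic result 4*pi*sqrt(A/g) for the default A and g.
isochronous_period = 4 * math.pi * math.sqrt(1.0 / 9.81)
```

Calling `cycloid_period` with a small, a moderate and a large release angle should return values that agree with `isochronous_period` to within the integration error, in contrast to the simple pendulum, whose period grows with amplitude.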
4.5.4 Small oscillations
As a further application of the Lagrangian formulation, consider the example of mechanical vibration of an atomic lattice, such as in a crystal. The mean positions of the atoms are given according to the crystal structure. However, atoms may undergo small displacements about the mean position if the crystal is not infinitely rigid. The mechanical vibrations of the atoms are in general complicated but, as we show below, may be resolved into the so-called normal modes of vibration, which consist of a finite number of simple harmonic motions. The method also finds widespread application in acoustics, coupled electrical circuits, etc.
Consider a system of mutually interacting and vibrating particles about a stable
equilibrium. The stable equilibrium of the system is given by minimising the potential
energy for an arrangement of the atoms. The only motion possible for individual
particles is a small displacement about this equilibrium position.
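Let $V(q_1, \dots, q_f)$ be the potential energy of a system with $f$ degrees of freedom. If $q_i$ are the generalised coordinates of the particles and $q_{0i}$ are their values at equilibrium, we write
$$\eta_i = q_i - q_{0i},$$
where $\eta_i$ is the small displacement around the equilibrium value of $q_i$. Thus the $\eta_i$ now play the role of generalised coordinates. The Taylor expansion of the potential about equilibrium may be written as
$$V(\eta_1, \dots, \eta_f) = V(0) + \sum_i \left(\frac{\partial V}{\partial\eta_i}\right)_0 \eta_i + \frac{1}{2}\sum_i\sum_j \left(\frac{\partial^2 V}{\partial\eta_i\,\partial\eta_j}\right)_0 \eta_i\eta_j + \cdots$$
At equilibrium, by definition,
$$\left(\frac{\partial V}{\partial\eta_i}\right)_0 = 0$$
and $V(0)$ is an arbitrary constant which may be set equal to zero without loss of generality since the potential energy is indeterminate up to a constant. Thus we may write
$$V = \frac{1}{2}\sum_i\sum_j a_{ij}\,\eta_i\eta_j + \cdots,$$
where
$$a_{ij} = \left(\frac{\partial^2 V}{\partial\eta_i\,\partial\eta_j}\right)_0.$$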
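The Cartesian velocities of the particles are
$$\dot{x}_i = \sum_j \frac{\partial x_i}{\partial q_j}\dot{q}_j = \sum_j \frac{\partial x_i}{\partial\eta_j}\dot\eta_j,$$
so that the kinetic energy is
$$T = \frac{1}{2}\sum_i m_i \left(\sum_j \left(\frac{\partial x_i}{\partial\eta_j}\right)_0 \dot\eta_j\right)\left(\sum_k \left(\frac{\partial x_i}{\partial\eta_k}\right)_0 \dot\eta_k\right),$$
where we have assumed
$$\frac{\partial x_i}{\partial\eta_j} = \left(\frac{\partial x_i}{\partial\eta_j}\right)_0$$
for small amplitude motion. Once again we may write the kinetic energy in the matrix form
$$T = \frac{1}{2}\sum_j\sum_k T_{jk}\,\dot\eta_j\dot\eta_k = \dot{Q}^T B\,\dot{Q}, \qquad T_{jk} = \sum_i m_i \left(\frac{\partial x_i}{\partial\eta_j}\right)_0 \left(\frac{\partial x_i}{\partial\eta_k}\right)_0,$$
where $\dot{Q}^T = (\dot\eta_1, \dot\eta_2, \dots, \dot\eta_f)$ is a row matrix and $\dot{Q}$ is the corresponding column matrix. Since the kinetic energy is a non-negative quantity, the matrix $B$ is a positive symmetric matrix. Like $A$ it is also a constant matrix since its elements are evaluated at the equilibrium point.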
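The conservative Lagrangian to this order is
$$L = T - V = \frac{1}{2}\sum_j\sum_k [T_{jk}\,\dot\eta_j\dot\eta_k - V_{jk}\,\eta_j\eta_k].$$
The equations of motion are
$$\sum_k [T_{jk}\,\ddot\eta_k + V_{jk}\,\eta_k] = 0; \qquad j = 1, \dots, f.$$
These are decoupled by a linear transformation to normal coordinates $\zeta_j$,
$$\eta_i = \sum_{j=1}^{f} p_{ij}\,\zeta_j; \qquad P = \{p_{ij}\},$$
each of which executes simple harmonic motion,
$$\zeta_k = \alpha_k\cos(\sqrt{\lambda_k}\,t + \delta_k),$$
where the amplitude $\alpha_k$ and phase $\delta_k$ are constants and $\sqrt{\lambda_k}$ is the frequency of the $k$-th normal mode. The generalised coordinates representing the original system are given by
$$\eta_k = \sum_l p_{kl}\,\alpha_l\cos(\sqrt{\lambda_l}\,t + \delta_l).$$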
Two theorems: Every real symmetric matrix can be diagonalised by an orthogonal transformation. The product of two symmetric matrices, one of which is positive definite, can be diagonalised
by a coordinate transformation.
For the first pendulum the kinetic and potential energies are the same as in the pendulum example which we considered before, namely,
$$T_1 = \frac{1}{2}m_1 l_1^2\dot\theta_1^2; \qquad V_1 = -m_1 g l_1\cos\theta_1 \qquad (4.35)$$
$$V_2 = m_2 g y_2, \qquad (4.36)$$
$$T = \frac{1}{2}\left[(m_1 + m_2)l_1^2\dot\theta_1^2 + m_2 l_2^2\dot\theta_2^2 + 2m_2 l_1 l_2\dot\theta_1\dot\theta_2\right]$$
$$V = 2(m_1 + m_2)g l_1\left(\frac{\theta_1}{2}\right)^2 + 2m_2 g l_2\left(\frac{\theta_2}{2}\right)^2 \qquad (4.38)$$
We shall make life simple by assuming $m_1 = m_2 = m$; $l_1 = l_2 = l$. From the form of the kinetic and potential energies we have
$$B = \begin{pmatrix} ml^2 & \frac{1}{2}ml^2 \\ \frac{1}{2}ml^2 & \frac{1}{2}ml^2 \end{pmatrix} \qquad (4.39)$$
and
$$A = \begin{pmatrix} mgl & 0 \\ 0 & \frac{1}{2}mgl \end{pmatrix}$$
which is already diagonal. As stated earlier the characteristic equation is
$$\det(A - \lambda B) = 0 \qquad (4.40)$$
and the corresponding eigenvectors and eigenvalues are given by
$$X_1^T = (1, -\sqrt{2}), \qquad \lambda_1 = \frac{g}{l}(2 + \sqrt{2})$$
$$X_2^T = (1, +\sqrt{2}), \qquad \lambda_2 = \frac{g}{l}(2 - \sqrt{2}) \qquad (4.41)$$
where $X^T$ denotes the transpose of the eigenvector $X$.
The two modes of oscillation correspond to the cases when $\theta_1$ and $\theta_2$ are varying in opposite directions and in the same direction respectively. Note that the former frequency is larger than the frequency when both are moving in the same way.
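The eigenvalues and eigenvectors of the double pendulum can be reproduced by solving the characteristic equation $\det(A - \lambda B) = 0$ directly. For $2\times2$ symmetric matrices this is a quadratic in $\lambda$, so no linear-algebra library is needed. The helper `normal_modes_2x2` below is a hypothetical name, and the numerical values of $m$, $l$, $g$ are illustrative; the sketch assumes the off-diagonal combination $A_{12} - \lambda B_{12}$ does not vanish.

```python
import math

def normal_modes_2x2(A, B):
    """Solve det(A - lam*B) = 0 for 2x2 symmetric A, B.

    Returns [(lam, (1, x2)), ...] with the larger eigenvalue first.
    """
    (a11, a12), (_, a22) = A
    (b11, b12), (_, b22) = B
    # Coefficients of the quadratic qa*lam^2 + qb*lam + qc = 0.
    qa = b11 * b22 - b12**2
    qb = -(a11 * b22 + a22 * b11 - 2 * a12 * b12)
    qc = a11 * a22 - a12**2
    disc = math.sqrt(qb**2 - 4 * qa * qc)
    modes = []
    for lam in ((-qb + disc) / (2 * qa), (-qb - disc) / (2 * qa)):
        # First row of (A - lam*B) X = 0 with X normalised to (1, x2).
        x2 = -(a11 - lam * b11) / (a12 - lam * b12)
        modes.append((lam, (1.0, x2)))
    return modes

m, l, g = 1.0, 1.0, 9.81   # illustrative values
B = [[m * l**2, 0.5 * m * l**2], [0.5 * m * l**2, 0.5 * m * l**2]]
A = [[m * g * l, 0.0], [0.0, 0.5 * m * g * l]]
modes = normal_modes_2x2(A, B)
```

The first entry of `modes` is the out-of-phase mode with $\lambda = (g/l)(2+\sqrt{2})$ and eigenvector $(1, -\sqrt{2})$; the second is the in-phase mode with $\lambda = (g/l)(2-\sqrt{2})$ and eigenvector $(1, +\sqrt{2})$.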
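The potential energy matrix is
$$A = \frac{1}{2}\begin{pmatrix} k & -k & 0 \\ -k & 2k & -k \\ 0 & -k & k \end{pmatrix} \qquad (4.42)$$
and the kinetic energy is
$$T = \frac{1}{2}[m(\dot\eta_1^2 + \dot\eta_3^2) + M\dot\eta_2^2] = \dot{Q}^T B\,\dot{Q}$$
where $B$ is a diagonal matrix given by
$$B = \frac{1}{2}\begin{pmatrix} m & 0 & 0 \\ 0 & M & 0 \\ 0 & 0 & m \end{pmatrix} \qquad (4.43)$$
The Lagrangian is therefore
$$L = \frac{1}{2}[m(\dot\eta_1^2 + \dot\eta_3^2) + M\dot\eta_2^2] - \frac{1}{2}k[(\eta_1 - \eta_2)^2 + (\eta_2 - \eta_3)^2].$$
The equations of motion are given by
$$m\ddot\eta_1 = k(\eta_2 - \eta_1),$$
$$M\ddot\eta_2 = k(\eta_3 - \eta_2) - k(\eta_2 - \eta_1),$$
$$m\ddot\eta_3 = -k(\eta_3 - \eta_2). \qquad (4.44)$$
The eigenvalues of
$$B^{-1}A = \begin{pmatrix} k/m & -k/m & 0 \\ -k/M & 2k/M & -k/M \\ 0 & -k/m & k/m \end{pmatrix}$$
are
$$\lambda_1 = 0, \qquad \lambda_2 = k/m, \qquad \lambda_3 = k(2m + M)/mM,$$
and the oscillating normal coordinates are
$$\zeta_2 = \alpha_2\cos(\sqrt{\lambda_2}\,t + \delta_2), \qquad \zeta_3 = \alpha_3\cos(\sqrt{\lambda_3}\,t + \delta_3).$$
The original coordinates are then
$$\eta_k = \sum_l P_{kl}\,\zeta_l = \sum_l P_{kl}\,\alpha_l\cos(\sqrt{\lambda_l}\,t + \delta_l); \qquad l = 1, 2, 3,$$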
[Figure: the three normal modes of the linear triatomic system, labelled $\lambda = 0$, $\lambda = k/m$ and $\lambda = k(2m+M)/mM$.]
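At equilibrium all the beads are equidistant with a distance $d$ separating them. In general the distance between two neighbouring beads at arbitrary times is given by
$$d_{ij}^2 = d^2 + (q_i - q_j)^2.$$
Assuming the vertical displacement is small compared to the distance of separation we have
$$d_{ij} = d + \frac{1}{2d}(q_i - q_j)^2.$$
The potential energy of the system is then given by
$$V = \frac{S}{2d}[q_1^2 + (q_2 - q_1)^2 + (q_3 - q_2)^2 + \cdots + (q_n - q_{n-1})^2 + q_n^2] = Q^T A Q,$$
where the matrix $A$ is given by
$$A = \frac{S}{2d}\begin{pmatrix} 2 & -1 & 0 & 0 & \dots & 0 \\ -1 & 2 & -1 & 0 & \dots & 0 \\ 0 & -1 & 2 & -1 & \dots & 0 \\ \vdots & & \ddots & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \dots & -1 & 2 \end{pmatrix}.$$
With the kinetic energy $T = \frac{1}{2}m\sum_i \dot{q}_i^2 = \dot{Q}^T B\,\dot{Q}$, that is $B = \frac{m}{2}E$ with $E$ the unit matrix, we have
$$B^{-1}A = \gamma\begin{pmatrix} 2 & -1 & 0 & 0 & \dots & 0 \\ -1 & 2 & -1 & 0 & \dots & 0 \\ 0 & -1 & 2 & -1 & \dots & 0 \\ \vdots & & \ddots & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \dots & -1 & 2 \end{pmatrix}, \qquad \gamma = \frac{S}{md}.$$
If we define
$$\Delta_n = \det[B^{-1}A - \lambda E]$$
then it is easy to see from the structure of the matrix that the following recursion relation is obeyed by the determinant
$$\Delta_n = (2\gamma - \lambda)\Delta_{n-1} - \gamma^2\Delta_{n-2},$$
with
$$\Delta_1 = 2\gamma - \lambda, \qquad \Delta_2 = (2\gamma - \lambda)^2 - \gamma^2.$$
To find the solution of the eigenvalue equation let $\Delta_n = \rho^n$. Using the recursion relation above, we have
$$\rho^n - (2\gamma - \lambda)\rho^{n-1} + \gamma^2\rho^{n-2} = 0$$
which immediately yields the solution
$$\rho_{1,2} = \frac{1}{2}\left[(2\gamma - \lambda) \pm \sqrt{(2\gamma - \lambda)^2 - 4\gamma^2}\right].$$
Writing $2\gamma - \lambda = 2\gamma\cos\theta$, these roots are $\rho_{1,2} = \gamma\exp(\pm i\theta)$, and the general solution is $\Delta_n = A_0\rho_1^n + B_0\rho_2^n$ with
$$A_0 = \frac{\exp(i\theta)}{2i\sin\theta}; \qquad B_0 = -\frac{\exp(-i\theta)}{2i\sin\theta}$$
using the values of $\Delta_{1,2}$. Substituting these values in the determinant we have
$$\Delta_n = \gamma^n\,\frac{\sin(n+1)\theta}{\sin\theta}.$$
The eigenvalues are the zeros of $\Delta_n$,
$$\theta = \theta_k = \frac{k\pi}{n+1}, \qquad \lambda = \lambda_k = 4\gamma\sin^2(\theta_k/2); \qquad k = 1, \dots, n,$$
and the corresponding frequencies are
$$\nu_k = \frac{1}{2\pi}\sqrt{\lambda_k} = \frac{1}{\pi}\sqrt{\frac{S}{md}}\,\sin\frac{k\pi}{2(n+1)}.$$
The equations of motion for the normal coordinates are given by
$$\ddot{q}'_k + \lambda_k q'_k = 0$$
whose solutions are
$$q'_k = A_k\cos(2\pi\nu_k t + \delta_k).$$
Let us compute the eigenvectors corresponding to these normal modes. The transformation to normal modes is given by
$$Q = P\,Q',$$
where the columns of $P$ are the eigenvectors $X = (x_1, x_2, \dots, x_n)^T$ of $B^{-1}A$:
$$P = \begin{pmatrix} \sin\frac{\pi}{n+1} & \sin\frac{2\pi}{n+1} & \dots & \sin\frac{n\pi}{n+1} \\ \sin\frac{2\pi}{n+1} & \sin\frac{4\pi}{n+1} & \dots & \sin\frac{2n\pi}{n+1} \\ \sin\frac{3\pi}{n+1} & \sin\frac{6\pi}{n+1} & \dots & \sin\frac{3n\pi}{n+1} \\ \vdots & \vdots & & \vdots \\ \sin\frac{n\pi}{n+1} & \sin\frac{2n\pi}{n+1} & \dots & \sin\frac{n^2\pi}{n+1} \end{pmatrix},$$
where each column denotes one of the eigenvectors $X_1, \dots, X_n$. The transition from normal coordinates to the original transverse displacement of beads is then given by
$$q_j = \sum_k C_{jk}\sin\left(\frac{jk\pi}{n+1}\right)\cos(2\pi\nu_k t + \delta_k),$$
where the $C_{jk}$ are constants. This is the complete solution of the problem.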
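The closed-form eigenvalues of the loaded string can be checked against the determinant recursion. The sketch below assumes the tridiagonal form $B^{-1}A = \gamma\,\mathrm{tridiag}(-1, 2, -1)$ with a single constant $\gamma$ (written `gamma`, an assumed name for the common factor $S/md$), the convention $\Delta_0 = 1$, and eigenvalues $\lambda_k = 4\gamma\sin^2[k\pi/2(n+1)]$; the function names are illustrative.

```python
import math

def char_det(n, lam, gamma=1.0):
    """Delta_n = det(B^{-1}A - lam*E) from the three-term recursion."""
    d_prev, d_curr = 1.0, 2 * gamma - lam        # Delta_0 = 1, Delta_1
    for _ in range(n - 1):
        d_prev, d_curr = d_curr, (2 * gamma - lam) * d_curr - gamma**2 * d_prev
    return d_curr

def mode_eigenvalues(n, gamma=1.0):
    """Closed-form eigenvalues lambda_k, k = 1..n."""
    return [4 * gamma * math.sin(k * math.pi / (2 * (n + 1)))**2
            for k in range(1, n + 1)]

def mode_shape(n, k):
    """Unnormalised eigenvector of mode k: x_j = sin(j*k*pi/(n+1))."""
    return [math.sin(j * k * math.pi / (n + 1)) for j in range(1, n + 1)]

n = 5
lams = mode_eigenvalues(n)
# Pad the k = 2 mode shape with the fixed end points x_0 = x_{n+1} = 0.
x = [0.0] + mode_shape(n, 2) + [0.0]
```

Each closed-form $\lambda_k$ should make `char_det` vanish to rounding error, and the padded mode shape should satisfy $2x_j - x_{j-1} - x_{j+1} = \lambda_k x_j$ (for $\gamma = 1$) at every bead.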
4.6
Until now we have dealt with particle dynamics in the Lagrangian formalism. When the number of particles, or equivalently the number of degrees of freedom, is large, the dynamics is best described by Lagrangian field theory, that is by equations in the continuum. Many systems, such as deformable bodies and fluids, cannot be described by finitely many coordinates. The meaningful observables in such systems, and their evolution, may not simply be the particle coordinates and velocities.
We will show this through an example and write the equations of motion for the field. (The treatment given here follows closely the account given in the book Classical Dynamics by Jose and Saletan.)
Consider a series of plane pendula of length $l$ and mass $m$ such that:
The pendula are attached to nuts which move along a horizontal screw in the $x$ direction. Denoting the pitch of the screw by $\beta$, we have
$$x = \beta\theta,$$
corresponding to an angular displacement $\theta$ of a pendulum. As the pendulum swings the displacement $x$ changes back and forth.
The nuts are in turn coupled to springs with a spring constant $k$. As the pendulum swings the nuts compress the springs by an amount proportional to $\theta$.
At equilibrium $\theta_j = 0$ for all $j$ and the distance between pivots is $a$.
The kinetic energy of the $j$-th pendulum is given by
$$T_j = \frac{1}{2}m(l^2\dot\theta_j^2 + \dot{x}_j^2) = \frac{1}{2}m(l^2 + \beta^2)\dot\theta_j^2 = \frac{1}{2}m\lambda^2\dot\theta_j^2.$$
Let us assume there are $(2n + 1)$ pendula with indices $-n, \dots, n$. The Lagrangian of the system is then given by
$$L = \frac{1}{2}m\lambda^2\sum_{j=-n}^{n}\dot\theta_j^2 - mgl\sum_{j=-n}^{n}(1 - \cos\theta_j) - \frac{1}{2}k\beta^2\sum_{j=-n}^{n-1}(\theta_{j+1} - \theta_j)^2 = T - V_{ext} - V_{int},$$
where $V_{ext}$ denotes the external gravitational potential and $V_{int}$ is the interaction potential arising from the springs coupling the pendula with a spring constant $k$ which is assumed to be the same for all the springs. Clearly we are not interested in what happens at the edges since we take the limit $n \to \infty$.
The Euler-Lagrange equation for the system is given by
$$\ddot\theta_j - \omega^2[\theta_{j+1} - 2\theta_j + \theta_{j-1}] + \Omega^2\sin\theta_j = 0,$$
where $\omega^2 = k\beta^2/m\lambda^2$ and $\Omega^2 = gl/\lambda^2$. These are a set of coupled non-linear second-order differential equations.
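We take the large $n$ limit by replacing each pendulum by $s$ other pendula with the same length $l$, but with a mass $m/s = \Delta m$, distributed in the interval $a$ (distance $\Delta x = a/s$ apart, the correspondingly shorter springs having spring constant $sk$). In terms of the mass per unit length $\rho = m/a = \Delta m/\Delta x$, the Lagrangian becomes
$$L = \sum \Delta x\left[\frac{1}{2}\rho\lambda^2\dot\theta(x)^2 - \rho gl\,(1 - \cos\theta(x)) - \frac{1}{2}ak\beta^2\left(\frac{\theta(x + \Delta x) - \theta(x)}{\Delta x}\right)^2\right].$$
The Euler-Lagrange equation now takes the form
$$\rho\lambda^2\ddot\theta(x)\,\Delta x - ak\beta^2\,\frac{\theta(x + \Delta x) - 2\theta(x) + \theta(x - \Delta x)}{\Delta x^2}\,\Delta x + \rho gl\sin\theta(x)\,\Delta x = 0.$$
Note that $\Delta x$ falls out and there is no difficulty in taking the limit $s \to \infty$ or equivalently $\Delta x \to 0$. Replacing the differences by appropriate derivatives in this limit we have the Euler-Lagrange equation in the continuum limit given by
$$\frac{\partial^2\theta}{\partial t^2} - v^2\frac{\partial^2\theta}{\partial x^2} + \Omega^2\sin\theta = 0,$$
where
$$v^2 = \frac{ak\beta^2}{\rho\lambda^2}.$$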
A simple dimensional analysis tells us that $v$ above has the dimensions of velocity and $\Omega$ has the dimension of frequency.
This is the famous one-dimensional sine-Gordon equation. The function $\theta(x, t)$ is sometimes called the wave function. It is important to realise what we have done: we began with the generalised coordinate $\theta_j(t)$ corresponding to each pendulum. In the continuum limit this is replaced by a function $\theta(x, t)$ which is defined at every point $x$ and whose dynamics is determined by the equation of motion given above in terms of the variables $x, t$. Often the function $\theta(x, t)$ is called the field and its equation of motion is called the field equation.
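The Lagrangian in the continuum limit may be written as an integral by replacing the sum by an integral:
$$L = \int dx\,\rho\lambda^2\left[\frac{1}{2}\left(\frac{\partial\theta}{\partial t}\right)^2 - \frac{1}{2}v^2\left(\frac{\partial\theta}{\partial x}\right)^2 - \Omega^2(1 - \cos\theta)\right].$$
The integrand defines the Lagrangian density
$$\mathcal{L}(\theta, \partial\theta/\partial x, \partial\theta/\partial t) = \rho\lambda^2\left[\frac{1}{2}\left(\frac{\partial\theta}{\partial t}\right)^2 - \frac{1}{2}v^2\left(\frac{\partial\theta}{\partial x}\right)^2 - \Omega^2(1 - \cos\theta)\right].$$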
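Special cases: Suppose there is no external potential; then let $\phi(x, t) = \lambda\sqrt{\rho}\,\theta(x, t)$. The Lagrangian density becomes
$$\mathcal{L} = \frac{1}{2}\left(\frac{\partial\phi}{\partial t}\right)^2 - \frac{1}{2}v^2\left(\frac{\partial\phi}{\partial x}\right)^2.$$
The corresponding Euler-Lagrange equation is given by
$$\frac{\partial^2\phi}{\partial t^2} - v^2\frac{\partial^2\phi}{\partial x^2} = 0$$
which is the one-dimensional wave equation with wave velocity $v$. It is the continuum limit of a chain of one-dimensional oscillators.
Suppose the external force is provided not by gravity but by an elastic spring with spring constant $k$. Then
$$\mathcal{L} = \frac{1}{2}\left(\frac{\partial\phi}{\partial t}\right)^2 - \frac{1}{2}v^2\left(\frac{\partial\phi}{\partial x}\right)^2 - \frac{1}{2}\Omega^2\phi^2.$$
The corresponding Euler-Lagrange equation is given by
$$\frac{\partial^2\phi}{\partial t^2} - v^2\frac{\partial^2\phi}{\partial x^2} + \Omega^2\phi = 0$$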
4.6.1 Variational principle
Though we have derived the field equations in the continuum limit of the particle dynamics, we would like to derive them directly from a variational principle. For example, the wave equation can be derived from the Lagrangian density using the equation
$$\frac{\partial}{\partial t}\frac{\partial\mathcal{L}}{\partial(\partial\phi/\partial t)} + \frac{\partial}{\partial x}\frac{\partial\mathcal{L}}{\partial(\partial\phi/\partial x)} = 0$$
which is reminiscent of the Euler-Lagrange equation in particle dynamics with $L$ replaced by $\mathcal{L}$ with all its dependencies.
Therefore as in the case of particle dynamics we may write for the action
$$S = \int L\,dt.$$
Consider now a general field function $\phi(\vec{x}, t) = \phi(x)$ where $x = (t, x_1, x_2, x_3) = (x_0, x_1, x_2, x_3)$, treating $t$ also as one of the coordinates. We denote the partial derivatives with respect to space-time variables as
$$\partial_\mu\phi = \partial\phi/\partial x_\mu: \qquad \mu = 0, 1, 2, 3,$$
and write the action as
$$S = \int_R \mathcal{L}(\phi, \partial_\mu\phi, x)\,d^4x.$$
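The problem in field theory is to find $\phi$ in the four-dimensional region $R$ by fixing its values on a chosen boundary. This is equivalent to fixing the end points at two different times in particle mechanics. The variational principle then states that of all the possible values of $\phi(x)$ in the region bounded by the end-surface, the physical ones are those that make the action $S$ stationary. That is
$$\delta S = \delta\int_R \mathcal{L}(\phi, \partial_\mu\phi, x)\,d^4x = 0.$$
We use the methods of calculus of variation as before:
$$\delta\mathcal{L} = \frac{\partial\mathcal{L}}{\partial\phi}\delta\phi + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\delta(\partial_\mu\phi) = \left[\frac{\partial\mathcal{L}}{\partial\phi} - \partial_\mu\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\right]\delta\phi + \partial_\mu\left[\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\delta\phi\right].$$
Therefore
$$\delta S = \int_R d^4x\left[\frac{\partial\mathcal{L}}{\partial\phi} - \partial_\mu\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\right]\delta\phi + \int_R d^4x\,\partial_\mu\left[\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\delta\phi\right] = 0.$$
The second term is a total divergence; it reduces to a surface term, which vanishes because $\delta\phi = 0$ on the boundary. Since the result should be true for arbitrary variations, we have for the functional derivative of $\mathcal{L}$
$$\frac{\delta\mathcal{L}}{\delta\phi} = \frac{\partial\mathcal{L}}{\partial\phi} - \partial_\mu\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)} = 0$$
which are the Euler-Lagrange field equations valid locally at every point in the four-dimensional space.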
4.6.2
the solutions scatter but emerge in the original form away from the scattering
region.
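An example of such a solution is given below in one dimension: let the solution be of the form
$$\theta(x, t) = \theta(\xi),$$
where $\xi = x - vt$ and $v$ is some constant velocity. Making the change of variables
$$\partial/\partial t = -v\,d/d\xi, \qquad \partial/\partial x = d/d\xi,$$
the sine-Gordon equation, written in units in which the wave velocity and $\Omega$ are unity, becomes
$$-\gamma^{-2}\frac{d^2\theta}{d\xi^2} + \sin\theta = 0$$
or
$$\frac{d^2\theta}{d\xi^2} = \gamma^2\sin\theta,$$
where $\gamma = 1/\sqrt{1 - v^2}$. This may be obtained from a Lagrangian of the form
$$L = \frac{1}{2}\left(\frac{d\theta}{d\xi}\right)^2 - \gamma^2\cos\theta.$$
Treating $\xi$ as the time variable, the conserved energy is $E = \frac{1}{2}(d\theta/d\xi)^2 + \gamma^2\cos\theta$, and a further integration gives
$$\xi = \pm\frac{1}{\sqrt{2}}\int_{\theta(0)}^{\theta(\xi)}\frac{d\theta}{\sqrt{E - \gamma^2\cos\theta}}.$$
[Figure: the kink and antikink solutions.]
The two solutions are shown in the figure. The solution with the positive sign is called the kink while the one with the negative sign is called the anti-kink solution. They represent the disturbance modelled by $\theta$, which moves to the right or to the left without altering its shape.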
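A quick numerical check of the travelling-wave equation is possible using the standard closed form of the kink, $\theta(\xi) = 4\arctan\exp(\pm\gamma\xi)$, which interpolates between $0$ and $2\pi$. This closed form is a well-known result that is not derived in the text above, and the sketch assumes units in which the wave velocity and $\Omega$ equal one, so that the equation to verify is $d^2\theta/d\xi^2 = \gamma^2\sin\theta$ with $\gamma = 1/\sqrt{1-v^2}$. The function names and the finite-difference step are illustrative choices.

```python
import math

def kink(xi, v, sign=+1):
    """Kink (sign=+1) or anti-kink (sign=-1) profile 4*atan(exp(sign*gamma*xi))."""
    gamma = 1.0 / math.sqrt(1.0 - v**2)
    return 4.0 * math.atan(math.exp(sign * gamma * xi))

def residual(xi, v, sign=+1, h=1e-3):
    """theta'' - gamma^2 sin(theta), with theta'' from a central difference."""
    gamma2 = 1.0 / (1.0 - v**2)
    second = (kink(xi + h, v, sign) - 2 * kink(xi, v, sign)
              + kink(xi - h, v, sign)) / h**2
    return second - gamma2 * math.sin(kink(xi, v, sign))

# Residuals of the kink at a few sample points for v = 0.5 (illustrative).
sample_residuals = [residual(xi, 0.5) for xi in (-3.0, -0.7, 0.0, 1.2, 4.0)]
```

The residuals are limited only by the finite-difference error, and the profile tends to $0$ and $2\pi$ far to the left and right of the centre, which is the shape-preserving step described above.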
Chapter 5
The Hamiltonian Formulation
Another reformulation of classical mechanics is Hamiltonian mechanics, introduced by William Rowan Hamilton in 1833. The equations of motion in Lagrangian mechanics are formulated as second-order differential equations in the $n$-dimensional coordinate space. In the Hamiltonian method the equations of motion are first-order differential equations in a $2n$-dimensional phase space. Both methods are equivalent and in general do not provide a more convenient way of solving the equations of motion. Rather, they provide deeper insights into the general structure of classical mechanics. In particular, Hamiltonian mechanics is useful because of its connection to quantum mechanics, whereas the Lagrangian method provides the route towards field theories. Furthermore, Hamiltonian mechanics is useful in understanding the structure of the phase space flows when the equations are non-linear, as we shall see in this chapter.
5.1
The transition from the Lagrange to the Hamilton formalism is done by first defining the momentum $p_i$ (which is canonically conjugate to the configuration coordinate $q_i$):
$$p_i := \frac{\partial L}{\partial\dot{q}_i}, \qquad (i = 1, 2, 3, \dots, N) \qquad (5.1)$$
Note that the simple relation between the momentum $p_i$ and the velocity $\dot{q}_i$ for Newtonian systems in Cartesian coordinates is rather deceptive. We shall now discuss the dynamics of the system in terms of $p_i$, eliminating the velocities.
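An important property of momenta is that they may be expressed as the gradient of the action. Consider the variation of the action $S$ with only one end point fixed, say the initial point:
$$\delta S = \left[\frac{\partial L}{\partial\dot{q}}\delta q\right]_{t_1}^{t_2} + \int_{t_1}^{t_2}\left(\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial\dot{q}}\right)\delta q\,dt.$$
Using the equation of motion,
$$\delta S = p\,\delta q \;\Rightarrow\; p = \frac{\partial S}{\partial q},$$
where $p$ is now the gradient of the action. This is, in general, not true of the velocities.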
The Hamiltonian is then obtained through the Legendre transformation:
$$H(p, q, t) := \sum_{i=1}^{N} p_i\dot{q}_i - L(q, \dot{q}, t). \qquad (5.2)$$
For a Newtonian system in Cartesian coordinates this gives
$$H = \frac{1}{2m}\sum_i p_i^2 + V(q_i)$$
where $p_i = m\dot{q}_i$. But the simple relation between the momentum and velocity cannot always be taken for granted, especially when the system moves in the presence of constraints.
Example: A simple example where the relation between momentum and velocity is not simple is the case of a particle sliding on a wire of shape $z = f(x)$, say, under gravity. Here
$$T = \frac{m}{2}(\dot{x}^2 + \dot{z}^2) = \frac{m}{2}\dot{x}^2(1 + (df/dx)^2)$$
and
$$V = mgz = mgf(x).$$
The generalised momentum is $p = m\dot{x}(1 + f'(x)^2)$ and
$$H = \frac{p^2}{2m(1 + f'(x)^2)} + mgf(x).$$
Remarks:
$p_i$ and $q_i$ form a set of conjugate variables; the product $p_i\dot{q}_i$ has the dimension of work.
1
The passage from one set of independent variables to another is effected by the Legendre transformation in mathematics. To see how it is done, consider an arbitrary function $f(x, y)$. Define another function $g(x, y, u) = ux - f(x, y)$, which is in general a function of all three variables $x, y, u$. However, if $u(x, y) = \partial f/\partial x$, then it is easy to show that
$$g(u, y) = ux - f(x, y)$$
where $g$ now depends only on $u, y$ and $x = x(u, y)$. This is the Legendre transform which takes one function $f(x, y)$ to a different function $g(u, y)$. The information content of $f$ and $g$ is the same and we have not lost anything.
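The total differential of $H$ is
$$dH = \sum_i (dp_i\,\dot{q}_i + p_i\,d\dot{q}_i) - \sum_i \frac{\partial L}{\partial q_i}dq_i - \sum_i \frac{\partial L}{\partial\dot{q}_i}d\dot{q}_i - \frac{\partial L}{\partial t}dt$$
$$= \sum_i (dp_i\,\dot{q}_i + p_i\,d\dot{q}_i) - \sum_i \frac{d}{dt}\frac{\partial L}{\partial\dot{q}_i}dq_i - \sum_i \frac{\partial L}{\partial\dot{q}_i}d\dot{q}_i - \frac{\partial L}{\partial t}dt$$
$$= \sum_i (dp_i\,\dot{q}_i + p_i\,d\dot{q}_i - \dot{p}_i\,dq_i - p_i\,d\dot{q}_i) - \frac{\partial L}{\partial t}dt$$
$$= \sum_i (dp_i\,\dot{q}_i - \dot{p}_i\,dq_i) - \frac{\partial L}{\partial t}dt. \qquad (5.3)$$
Comparing with $dH = \sum_i (\partial H/\partial p_i\,dp_i + \partial H/\partial q_i\,dq_i) + \partial H/\partial t\,dt$ we obtain the Hamilton equations
$$\frac{\partial H}{\partial p_i} = \dot{q}_i, \qquad \frac{\partial H}{\partial q_i} = -\dot{p}_i, \qquad (5.4)$$
together with
$$\frac{\partial L}{\partial t} = -\frac{\partial H}{\partial t}.$$
The Hamilton equations are a set of $2N$ first order differential equations, ideally suited for phase space description of the dynamical system.
If $\partial L/\partial t = 0$, then it follows that $\partial H/\partial t = 0$ and $H$ is conserved.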
5.2
For conservative systems it is often more convenient to discuss the behaviour of the system in the phase space where the equations of motion are given by Eq.(5.4).
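$$\dot{q} = \frac{p}{m} = \frac{d(p^2/2m)}{dp}, \qquad (5.7)$$
$$\dot{p} = F(q) = -\frac{dV}{dq}, \qquad V(q) = -\int_0^{q} dq'\,F(q'). \qquad (5.8)$$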
Here we have transformed the second order differential equation into a set of two first order differential equations. This is an example of a more general class of dynamical systems of order 2. The solutions for $p, q$ describe the motion of the system in a two-dimensional phase space or state space. More about the treatment of dynamical systems of order $n$ is given in chapter 1. A treatment of modern dynamics starts from the mathematical foundations given in this chapter.
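We may also define the phase velocity in the phase space as
$$\vec{v}_s = \left(\frac{\partial H}{\partial p}, -\frac{\partial H}{\partial q}\right),$$
which is orthogonal to the gradient of $H$:
$$\vec{v}_s\cdot\vec\nabla H = 0.$$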
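$$H = \frac{p^2}{2m} + \frac{m}{2}\omega^2q^2$$
where
$$q(t) = A\cos(\omega t + C), \qquad p(t) = -m\omega A\sin(\omega t + C)$$
[Figure: the potential $V(q)$ and the phase portrait in the $(q, p)$ plane, with the energy $E_0$ and the turning point $q_0$ marked.]
$$V(q) = -\frac{m}{2}\omega^2q^2 \qquad (5.10)$$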
The origin $(0,0)$ is the fixed point and the eigenvalues of the stability matrix around the origin are equal in magnitude and opposite in sign, $\pm\omega$. Therefore the origin is a hyperbolic fixed point.
The equation of the phase curves is obtained from the energy expression itself since it is a conservative system ($E$ is conserved):
$$2mE = p^2 - m^2\omega^2q^2 = (p - m\omega q)(p + m\omega q).$$
The phase curves are hyperbolas in general, corresponding to constant energy $E \neq 0$. When $E = 0$ the phase curves are straight lines corresponding to $p = \pm m\omega q$. These obviously pass through the fixed point at the origin. Such phase curves which pass through a hyperbolic fixed point are called separatrices. They mark boundaries between phase curves which are distinct and cannot be deformed continuously into one another.
Example 4: We can consider many variations on the oscillator theme, for example the quartic oscillator,
$$V(q) = q^4/4 \qquad (5.11)$$
and the equations of motion are
$$dq/dt = v_q(q, p) = p/m,$$
$$dp/dt = v_p(q, p) = -q^3.$$
Again the origin is the only fixed point of the system. The phase curves correspond to the constant energy contours. The phase portrait is given below.
[Figure: the potential $V(q)$ and the phase portrait of the quartic oscillator, with $E_0$ and $q_0$ marked.]
Example 5: The quartic oscillator with a saddle point has a potential given by
$$V(q) = \frac{1}{4}q^4 - \frac{1}{2}q^2. \qquad (5.12)$$
Note that the system has three fixed points in the $(q, p)$ phase space: two stable elliptic fixed points at $(q, p) = (\pm1, 0)$, and one hyperbolic fixed point at the origin, $(q, p) = (0, 0)$, as shown in the figure. The nature of the motion for $E < E^*$, $E = E^*$ and $E > E^*$ ($E^*$ = energy at the local maximum $q = 0$) is very different.
[Figure: the double-well potential $V(q)$ and its phase portrait, with $E_0$ and $q_0$ marked.]
Note that the above examples, except example 3, correspond to cases where the phase curves are compact. Now consider a case in which the behaviour is mixed:
(5.13)
The fixed points of the system are given by $(q, p) = (0, 0), (1, 0)$. While the first one is an elliptic fixed point, the second one is a hyperbolic fixed point. The phase space is not compact. For energies $E > E^*$ ($E^*$ = energy at the local maximum) the particle goes off to infinity at large times.
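[Figure: the potential $V(q)$ and the corresponding phase portrait, with $E_0$ and $q_0$ marked.]
$$\ddot{q} + c\dot{q} + \omega^2q = 0 \qquad (5.14)$$
or equivalently,
$$\dot{q} = p, \qquad \dot{p} = -cp - \omega^2q. \qquad (5.15)$$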
The fixed point is of course the origin. The nature of the motion depends on the value of the damping coefficient $c$. The characteristic equation is
$$\lambda^2 + c\lambda + \omega^2 = 0$$
whose roots are
$$\lambda_{1,2} = \frac{1}{2}\left[-c \pm \sqrt{c^2 - 4\omega^2}\right].$$
We shall first consider the case when $c$ is positive. The solution for $\omega > c/2$ is given by
$$q(t) = q_0e^{-ct/2}\sin(\omega't + \delta) \qquad (5.16)$$
where $\omega' = \sqrt{\omega^2 - c^2/4}$. This is the case of weak damping. The fixed point is a stable spiral fixed point for this case.
If $\omega < c/2$, then strong damping occurs since both the eigenvalues are real and negative, with the solution
$$q(t) = A\exp(\lambda_1t) + B\exp(\lambda_2t)$$
where $A$ and $B$ are determined by the initial conditions. The phase curves do not show any oscillatory behaviour and tend to the origin as $t$ increases. The fixed point is therefore a stable node.
If $c$ is negative, we have the opposite situation, with an unstable spiral for $\omega > |c|/2$ and an unstable node for $\omega < |c|/2$.
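The case analysis above amounts to reading off the sign of $c$ and of the discriminant $c^2 - 4\omega^2$. A minimal sketch is given below, with the system written as $\dot q = p$, $\dot p = -cp - \omega^2 q$; the function name `classify_fixed_point` and the label strings are illustrative, and the undamped case $c = 0$ is reported as a centre.

```python
import cmath

def classify_fixed_point(c, omega):
    """Classify the origin of q' = p, p' = -c*p - omega^2*q.

    Returns (label, (lam1, lam2)) where lam1, lam2 are the roots of
    lam^2 + c*lam + omega^2 = 0.
    """
    disc = c * c - 4 * omega * omega
    root = cmath.sqrt(disc)
    lam1, lam2 = (-c + root) / 2, (-c - root) / 2
    if c == 0:
        kind = "centre (elliptic)"
    else:
        stability = "stable" if c > 0 else "unstable"
        shape = "spiral" if disc < 0 else "node"   # disc == 0: critical damping
        kind = f"{stability} {shape}"
    return kind, (lam1, lam2)

weak = classify_fixed_point(0.5, 1.0)      # weak damping
strong = classify_fixed_point(3.0, 1.0)    # strong damping
```

For strong positive damping both roots come out real and negative, which is why the trajectories decay to the origin without oscillating.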
Example 8: Consider free rotations about an axis. The variable is the angle $\theta(t)$. For free rotations,
$$\dot\theta = p_\theta/I$$
which is a constant, and therefore the angular momentum obeys
$$\dot{p}_\theta = 0.$$
The Hamiltonian is
$$H = \frac{p_\theta^2}{2I}$$
where $I$ is the moment of inertia of the freely rotating body. The phase space trajectories are straight lines parallel to the $\theta$ axis.
Example 9: The mathematical pendulum is the prototype of one-dimensional nonlinear systems. The actual solutions are complicated, involving elliptic integrals and Jacobian elliptic functions.
[Figure: the potential $V(\theta)$ of the pendulum.]
Since the length of the pendulum $l$ = const., the system has only one degree of freedom, namely the angle $\theta$.
$$p_\theta = ml^2\dot\theta, \qquad (5.17)$$
$$H(p_\theta, \theta) = \frac{p_\theta^2}{2ml^2} - mgl\cos\theta. \qquad (5.18)$$
The phase space is defined by the variables $(p_\theta, \theta)$. Note that $I = ml^2$ is the moment of inertia of the system and further the angular momentum is $L = I\omega = I\dot\theta = p_\theta$, both defined with respect to an axis of rotation. In this case it is the line passing through the support, perpendicular to the plane of vibration (rotation) of the pendulum.
While the Newtonian equation of motion is of second order,
$$l\ddot\theta + g\sin\theta = 0 \qquad (5.19)$$
the first order evolution equations in the phase space are given by
$$\dot\theta = \frac{p_\theta}{ml^2}, \qquad \dot{p}_\theta = -mgl\sin\theta. \qquad (5.20)$$
Phase portrait: As mentioned before, when the motion is periodic, we can limit the angle to $\theta \in [-\pi, \pi]$ with $-\infty < p_\theta < \infty$. The phase space corresponds to a cylinder of infinite extent in the $p_\theta$ direction.
The phase portrait of a pendulum is given below:
1. Elliptic fixed point: For small $\theta$ the pendulum just oscillates about the fixed point at the origin. Close to the origin linear stability analysis shows that this is an elliptic fixed point, which corresponds to $E = -mgl$. The motion in the region $-mgl < E < mgl$ is usually referred to as libration, which is characterised by a vanishing average momentum over one period: $\langle p_\theta\rangle = 0$. The phase curves are approximate ellipses around the fixed point.
2. Unstable hyperbolic fixed point: The hyperbolic point corresponds to $\theta = \pm\pi$, the point where the pendulum is held vertically upwards, corresponding to $E = mgl$. This is an unstable situation since a small displacement forces the mass to move away from the fixed point, unlike the stable case. The $E = mgl$ curve in the phase space passing through the hyperbolic fixed point is called the separatrix, whose equation is given by
$$2mgl\cos^2(\theta/2) = \frac{p_\theta^2}{2ml^2}.$$
3. The motion in the region $E > mgl$ is rotation. Here $p_\theta$ does not change sign and $\langle p_\theta\rangle \neq 0$.
4. The separatrix divides the phase space into two disconnected regions: inside the separatrix the phase curves are closed and may be continuously deformed into one another by changing the energy. Outside the separatrix the motion is characterised by rotations, with $p_\theta$ having a definite sign throughout the phase curve, which is hence open. Thus there are two distinct homotopy classes separated by the separatrix.
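The time taken to reach the angle $\theta$ follows from energy conservation,
$$t(\theta) = l\sqrt{\frac{m}{2}}\int_0^{\theta}\frac{d\theta'}{\sqrt{E + mgl\cos\theta'}}, \qquad (5.22)$$
$$E = -mgl\cos\theta_0. \qquad (5.23)$$
Obviously the angle $\theta_0$ corresponds to the maximum deviation from the mean position, when the velocity is zero. The integral in (5.22) can then be converted into the standard form as follows:
$$t(\theta) = \sqrt{\frac{l}{2g}}\int_0^{\theta}\frac{d\theta'}{\sqrt{\cos\theta' - \cos\theta_0}}. \qquad (5.24)$$
Let
$$\cos\theta - \cos\theta_0 = 2[\cos^2(\theta/2) - \cos^2(\theta_0/2)] = 2[\sin^2(\theta_0/2) - \sin^2(\theta/2)]$$
and further
$$\sin\psi(\theta) = \frac{\sin(\theta/2)}{\sin(\theta_0/2)}. \qquad (5.25)$$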
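Substituting we get
$$t(\theta) = \sqrt{\frac{l}{g}}\,\frac{1}{2\sin(\theta_0/2)}\int_0^{\theta}\frac{d\theta'}{\sqrt{1 - \sin^2\psi(\theta')}} \qquad (5.26)$$
$$= \sqrt{\frac{l}{g}}\int_0^{\psi}\frac{d\psi'}{\sqrt{1 - k^2\sin^2\psi'}} = \sqrt{\frac{l}{g}}\,F(\psi, k), \qquad (5.27)$$
$$k = \sin(\theta_0/2). \qquad (5.28)$$
The integral in (5.27) is usually known as the elliptic integral of the first kind,
$$F(\psi, k) := \int_0^{\psi}\frac{d\psi'}{\sqrt{1 - k^2\sin^2\psi'}}. \qquad (5.29)$$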
The period of the pendulum $T$ corresponds to a change in angle such that the pendulum returns to its original position. A quarter of the period is the time taken to reach $\theta = \theta_0$, corresponding to $\sin\psi = 1 \Rightarrow \psi = \pi/2$. In this limit we have
$$K(k) := F\left(\frac{\pi}{2}, k\right), \qquad (5.30)$$
where $K(k)$ is the complete elliptic integral of the first kind. The period of the pendulum is given by
$$T = 4\sqrt{\frac{l}{g}}\,K(k) = 2\pi\sqrt{\frac{l}{g}}\sum_{n=0}^{\infty}\left[\frac{(2n)!}{(n!\,2^n)^2}\right]^2\sin^{2n}(\theta_0/2). \qquad (5.31)$$
The factor 4 comes from the fact that a single period $T$ consists of four traversals $0 \to \theta_0$. The leading term gives the period of the simple pendulum whereas the actual period is given by an infinite series.
The following remarks are useful:
1. When the energy $E \to mgl$ the parameter $k \to 1$. In this limit $K(k)$ diverges and the period $T \to \infty$.
2. In the limit of small energies, $E \to -mgl$, we have $\theta_0 \to 0 \Rightarrow k \to 0$ and $K(0) = \pi/2$; the period of the pendulum is then the usual period of a simple pendulum undergoing small amplitude vibration: $T = 2\pi\sqrt{l/g}$.
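The amplitude dependence of the period can be evaluated numerically in two independent ways: from the complete elliptic integral $K(k)$ with $k = \sin(\theta_0/2)$, and from the power series in $\sin^2(\theta_0/2)$. The sketch below computes $K(k)$ with the arithmetic-geometric mean, $K(k) = \pi / [2\,\mathrm{AGM}(1, \sqrt{1-k^2})]$, a standard identity that is not used in the text; the function names, the default values of $l$ and $g$, and the number of series terms are illustrative assumptions.

```python
import math

def complete_elliptic_K(k):
    """Complete elliptic integral of the first kind for 0 <= k < 1 (AGM method)."""
    a, b = 1.0, math.sqrt(1.0 - k * k)
    for _ in range(40):                 # AGM converges quadratically
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)

def pendulum_period(theta0, l=1.0, g=9.81):
    """Exact period T = 4 sqrt(l/g) K(sin(theta0/2)) for amplitude theta0 < pi."""
    return 4.0 * math.sqrt(l / g) * complete_elliptic_K(math.sin(theta0 / 2))

def pendulum_period_series(theta0, l=1.0, g=9.81, terms=60):
    """Same period from the series in sin^2(theta0/2), truncated after `terms`."""
    k2 = math.sin(theta0 / 2) ** 2
    total, coeff = 0.0, 1.0             # coeff = [(2n)! / (2^n n!)^2]^2
    for n in range(terms):
        total += coeff * k2 ** n
        coeff *= ((2 * n + 1) / (2 * n + 2)) ** 2
    return 2.0 * math.pi * math.sqrt(l / g) * total

small_amplitude_period = 2.0 * math.pi * math.sqrt(1.0 / 9.81)
```

For small amplitudes both functions reduce to $2\pi\sqrt{l/g}$; as $\theta_0$ approaches $\pi$ the period grows without bound, and the series converges more and more slowly, which is why the AGM form is the more practical of the two.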
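Case $E > mgl$: Defining
$$\epsilon = \frac{mgl}{E} \qquad (\epsilon < 1) \qquad (5.32)$$
we have
$$\int_0^{\theta}\frac{d\theta'}{\sqrt{1 + \epsilon\cos\theta'}} = \frac{1}{\sqrt{1+\epsilon}}\int_0^{\theta}\frac{d\theta'}{\sqrt{1 - q^2\sin^2(\theta'/2)}} = \frac{2}{\sqrt{1+\epsilon}}\,F\left(\frac{\theta}{2}, q\right), \qquad (5.33)$$
where
$$q^2 = \frac{2\epsilon}{1+\epsilon} = \frac{2mlg}{E + mlg} < 1. \qquad (5.34)$$
We therefore get
$$t(\theta) = \sqrt{\frac{2ml^2}{E + mlg}}\,F\left(\frac{\theta}{2}, q\right). \qquad (5.35)$$
A full rotation, with $\theta$ going from $0$ to $2\pi$, takes the time
$$T = 2\sqrt{\frac{2ml^2}{E + mlg}}\,K(q). \qquad (5.36)$$
$\omega = \sqrt{g/l}$; the period is $T = 2\pi/\omega = 2\pi\sqrt{l/g}$, as in the case of the simple pendulum.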
5.3 Poisson Brackets
We will now discuss the Poisson Bracket description of Hamiltonian dynamics. This is an elegant representation and provides an axiomatic foundation of mechanics. The geometry of this representation, namely symplectic geometry, is not only elegant but has far-reaching implications.
Hamilton's equations of motion for the phase space variables are given in terms of the partial derivatives of a single function, namely the Hamiltonian, which plays a special role in dynamics. We may ask how other functions of the phase space variables vary in time along the flows in phase space. Let $f(q, p, t)$ be such a function, differentiable and defined on the phase space. The total time derivative of $f$ is given by
$$\frac{df}{dt} = \frac{\partial f}{\partial t} + \sum_i\left(\frac{\partial f}{\partial q_i}\dot{q}_i + \frac{\partial f}{\partial p_i}\dot{p}_i\right) = \frac{\partial f}{\partial t} + [f, H]. \qquad (5.37)$$
Here we have made use of the Hamilton equations (5.4). The symbol $[f, H]$ is referred to as the Poisson Bracket or simply PB. If $df/dt$ is zero then the function $f$ is a constant of motion along the phase space trajectories. In particular, if $f$ does not explicitly depend on time then
$$\frac{df}{dt} = [f, H]. \qquad (5.38)$$
Evidently, if $[f, H] = 0$ then $f(p, q)$ is a constant of motion along the phase space trajectories for autonomous systems; we may say that it commutes with the Hamiltonian in the PB sense. Sometimes curly brackets are used to denote PBs but here we use square brackets.
We may generalise this notion of PB to any two functions on the phase space. In general for any two functions $f, g$ defined on the phase space the PB is defined as
$$[f, g] := \sum_i\left(\frac{\partial f}{\partial q_i}\frac{\partial g}{\partial p_i} - \frac{\partial f}{\partial p_i}\frac{\partial g}{\partial q_i}\right). \qquad (5.39)$$
The PBs have the following properties:
$$[f, g] = -[g, f]$$
$$[f + g, h] = [f, h] + [g, h]$$
$$[fg, h] = f[g, h] + [f, h]g$$
$$[f, [g, h]] + [g, [h, f]] + [h, [f, g]] = 0, \qquad (5.40)$$
where the last property is known as the Jacobi identity for PB. Note that the above properties define a Lie algebra; that is, the functions on the phase space form a Lie algebra under PBs. The corresponding Lie group is the group of all canonical transformations.
Evidently $q, p$ themselves are functions on phase space. Therefore we have
$$[q_k, H] = \frac{\partial H}{\partial p_k} = \dot{q}_k, \qquad [p_k, H] = -\frac{\partial H}{\partial q_k} = \dot{p}_k.$$
Notice that the PBs have the same sign even though the partial derivatives of $H$ have opposite signs in the above equations. The equation of motion for $q_k$ picks out $p_k$ and vice versa. Therefore $q_k, p_k$ form a conjugate pair.
Furthermore we have the following important PB relations:
$$[q_i, q_j] = [p_i, p_j] = 0, \qquad [q_i, p_j] = \delta_{ij}. \qquad (5.41)$$
The set of relations given above may be used as axioms for describing the mechanics. Consider the following PB:
$$[q^n, p] = nq^{n-1}$$
for some index $k$ which is not shown. By induction we have
$$[q^{n+1}, p] = [q\cdot q^n, p] = q[q^n, p] + [q, p]q^n = (n + 1)q^n$$
since for $n = 1$ this holds as shown above, $[q_k, p_k] = 1$ for any $k$. In fact we may write
$$[q^n, p] = \frac{d(q^n)}{dq},$$
where the rhs is a derivative with respect to $q$. In particular, for a function $f(q, p)$ which may be a polynomial or even an infinite series in the phase space variables, we have
$$[f(q, p), p_k] = \frac{\partial f(q, p)}{\partial q_k}, \qquad [q_k, f(q, p)] = \frac{\partial f(q, p)}{\partial p_k}.$$
Since all the derivatives with respect to $q, p, t$ are given in terms of PBs, the entire dynamics may be described using the relations between PBs.
For the components of angular momentum $\vec{L} = \vec{r}\times\vec{p}$, we have:
$$[L_i, L_j] = \epsilon_{ijk}L_k \qquad (i, j, k = 1, 2, 3) \qquad (5.42)$$
and
$$[L^2, L_i] = 0. \qquad (5.43)$$
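The bracket relations for the angular momentum can be verified numerically from the definition $[f, g] = \sum_i (\partial f/\partial q_i\,\partial g/\partial p_i - \partial f/\partial p_i\,\partial g/\partial q_i)$. The sketch below approximates the partial derivatives by central differences; the function name `poisson_bracket`, the step size and the sample phase space point are illustrative assumptions.

```python
def poisson_bracket(f, g, q, p, h=1e-5):
    """[f, g] at the point (q, p), with derivatives from central differences."""
    def d(func, which, i):
        qp, qm, pp, pm = list(q), list(q), list(p), list(p)
        if which == "q":
            qp[i] += h
            qm[i] -= h
        else:
            pp[i] += h
            pm[i] -= h
        return (func(qp, pp) - func(qm, pm)) / (2 * h)

    return sum(d(f, "q", i) * d(g, "p", i) - d(f, "p", i) * d(g, "q", i)
               for i in range(len(q)))

# Components of L = r x p and its square, as functions of (q, p).
def L1(q, p): return q[1] * p[2] - q[2] * p[1]
def L2(q, p): return q[2] * p[0] - q[0] * p[2]
def L3(q, p): return q[0] * p[1] - q[1] * p[0]
def Lsq(q, p): return L1(q, p)**2 + L2(q, p)**2 + L3(q, p)**2

q0, p0 = [0.3, -1.2, 0.7], [1.1, 0.4, -0.5]   # arbitrary sample point
bracket_12 = poisson_bracket(L1, L2, q0, p0)
```

At the sample point `bracket_12` should agree with `L3(q0, p0)`, and the bracket of `Lsq` with any component should vanish up to the finite-difference error.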
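$$\frac{d[f, g]}{dt} = [\dot{f}, g] + [f, \dot{g}] \qquad (5.44)$$
where $f$ and $g$ are two functions on the phase space. The result follows by first noting that
$$\frac{d[f, g]}{dt} = [[f, g], H] + \frac{\partial[f, g]}{\partial t} = [[f, H], g] + [f, [g, H]] + \left[\frac{\partial f}{\partial t}, g\right] + \left[f, \frac{\partial g}{\partial t}\right] = [\dot{f}, g] + [f, \dot{g}], \qquad (5.45)$$
where the Jacobi identity has been used in the second step.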
q(0) exp(ct)
p(0) + q(0) exp(ct)
p(t) = const.
p(0)
p(0) + q(0) exp(ct)
H
pq
(5.47)
Another simple example is the motion of a free particle. Consider this example in one dimension for simplicity. The Hamiltonian is given by
$$H = \frac{p^2}{2m}$$
and the Hamilton equations are
$$\dot{q} = \frac{p}{m}, \qquad \dot{p} = 0.$$
Therefore $p$ is a constant of motion, as is $H$ itself. However there is only one independent constant of motion. The solution for $q(t)$ is given by
$$q(t) = q_0 + \frac{p}{m}t,$$
so that
$$q_0 = q - \frac{p}{m}t$$
is itself a constant of motion, for arbitrary initial conditions, but is explicitly dependent on time. The extended phase space is three-dimensional $(q, p, t)$ and these two conditions ensure that the motion takes place on a trajectory or line in this phase space, provided at least one constant of motion has a form with explicit time dependence.
5.4 Symplectic Structure
As discussed before, in Lagrangian dynamics the fundamental quantities are the configuration coordinates $q \in Q$, where $Q$ denotes the configuration space. The Lagrange equations are second order and are solved to obtain the trajectories in the configuration space. Often the configuration space is the full Euclidean space, and all the position vectors as well as the velocity vectors are contained in it. This however is not always the case, for example for particles confined to move on a circle $S^1$ or a sphere $S^2$. While the trajectories are contained in some $Q$ embedded in the Euclidean space, the velocity vector lies in the tangent plane. Thus one needs an extension of $Q$, denoted $TQ$, in order to describe the motion. The Lagrangian then is a scalar function on this space; that is, the Lagrangian provides a map from $TQ$ to the real numbers $\mathbb{R}$.
In Hamiltonian dynamics, however, the dynamics is described on the phase
space where the position qi and momentum pi are treated on equal footing, indeed
they can be arbitrarily mixed in a phase space coordinate transformation. In order
to distinguish the two spaces, the phase space is often denoted by T Q.
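If we denote the position space by $R^N$ and the momentum space by $P^N$, the phase space is
\[ T^*Q = R^N \times P^N, \qquad q \in R^N, \quad p \in P^N. \qquad (5.48) \]
With the notation
\[ \xi = \{\xi_i\} = \begin{pmatrix} q \\ p \end{pmatrix} = (q_1, \ldots, q_N, p_1, \ldots, p_N)^T, \qquad (5.49) \]
\[ \nabla_\xi := \left( \frac{\partial}{\partial q_1}, \ldots, \frac{\partial}{\partial q_N}, \frac{\partial}{\partial p_1}, \ldots, \frac{\partial}{\partial p_N} \right)^T, \qquad (5.50) \]
the equations of motion take a simple form
\[ \dot\xi = \begin{pmatrix} \partial H/\partial p \\ -\partial H/\partial q \end{pmatrix} = J\,\nabla_\xi H(\xi, t), \qquad (5.51) \]
where $\nabla_\xi H$ is a column vector of partial derivatives of H in the full phase space. The $2N \times 2N$ matrix J is referred to as the Symplectic Matrix and is given by
\[ J := \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad (5.52) \]
where 1 refers to the unit matrix and 0 is the null matrix of dimension N. Again we note that $J\,\nabla_\xi H(\xi, t)$ denotes the Hamiltonian flow (a vector field) and the $\xi(t)$ form the phase space trajectories (paths).
This symplectic structure allows us to express PBs in a compact form. For example the PB between two functions f, g may be written as
\[ [f, g] = (\nabla_\xi f)^T J\,(\nabla_\xi g), \]
where the PB is now expressed in a form which resembles a scalar product but with a metric given by the symplectic matrix J.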
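The compact form of the PB can be checked numerically. The following is a minimal sketch, assuming one degree of freedom with the phase space point ordered as (q, p) and gradients estimated by central differences; the function names and the test Hamiltonian (an oscillator with m = omega = 1) are illustrative choices, not part of the notes.

```python
# Numerical check of [f, g] = (grad f)^T J (grad g).
# The phase-space point is xi = (q_1..q_n, p_1..p_n); all names are illustrative.

def grad(func, xi, h=1e-6):
    """Central-difference gradient of func at the phase-space point xi."""
    g = []
    for k in range(len(xi)):
        up = list(xi); up[k] += h
        dn = list(xi); dn[k] -= h
        g.append((func(up) - func(dn)) / (2.0 * h))
    return g

def symplectic_matrix(n):
    """Return the 2n x 2n matrix J = [[0, 1], [-1, 0]] in block form."""
    J = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        J[i][n + i] = 1.0
        J[n + i][i] = -1.0
    return J

def poisson_bracket(f, g, xi):
    """Evaluate (grad f)^T J (grad g) at xi."""
    n = len(xi) // 2
    J = symplectic_matrix(n)
    gf, gg = grad(f, xi), grad(g, xi)
    return sum(gf[a] * J[a][b] * gg[b] for a in range(2 * n) for b in range(2 * n))

# Example functions on a 2-dimensional phase space (n = 1).
q_fn = lambda xi: xi[0]
p_fn = lambda xi: xi[1]
H_fn = lambda xi: 0.5 * xi[1] ** 2 + 0.5 * xi[0] ** 2   # assumed test Hamiltonian

point = [0.3, -1.2]
pb_qp = poisson_bracket(q_fn, p_fn, point)   # fundamental bracket [q, p]
pb_qH = poisson_bracket(q_fn, H_fn, point)   # equals dq/dt = dH/dp = p
pb_pH = poisson_bracket(p_fn, H_fn, point)   # equals dp/dt = -dH/dq = -q
```

With this ordering of the phase space vector, the brackets [q, H] and [p, H] reproduce the right hand sides of the Hamilton equations.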
5.5 Canonical Transformations
Consider a coordinate transformation $(q, p) \to (Q, P)$. We call this a Canonical Transformation or CT when
\[ q_i \to Q_i(p, q, t), \qquad p_i \to P_i(p, q, t), \qquad H(p, q, t) \to \tilde H(P, Q, t) \qquad (5.53) \]
\[ \frac{\partial \tilde H}{\partial P_i} = \dot Q_i, \qquad (5.54) \]
(5.55)
\[ \frac{\partial \tilde H}{\partial Q_i} = 0. \qquad (5.56) \]
\[ Q_i(t) = f_i t + Q_i(0) \]
While the set $P_i$, $Q_i(0)$ constitutes a set of 2N constants of integration, in particular the $P_i$ are referred to as constants of motion in involution, that is
\[ [P_i, P_j] = 0, \qquad i, j = 1, \ldots, N. \]
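Example: A simple and well known example that illustrates the use of CTs is the one dimensional harmonic oscillator,
\[ H(p, q) = \frac{p^2}{2m} + \frac{m}{2}\omega^2 q^2. \qquad (5.57) \]
(5.58)
(5.59)
This is an optimal transformation since the integration of the Hamilton equations is trivial:
\[ \frac{\partial \tilde H}{\partial Q} = -\dot P = 0 \;\Rightarrow\; P = \text{const.} = \frac{E}{\omega}, \qquad (5.60) \]
\[ \frac{\partial \tilde H}{\partial P} = \dot Q = \omega = \text{const.} \;\Rightarrow\; Q(t) = \omega t + \alpha. \qquad (5.61) \]
\[ p(t) = \sqrt{2Em}\,\cos(\omega t + \alpha), \qquad (5.62) \]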
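The oscillator example can be checked numerically. The sketch below assumes the standard transformation q = sqrt(2P/(m omega)) sin Q, p = sqrt(2 m omega P) cos Q; this specific form is an assumption, since the transformation equations themselves are not legible above, but it reproduces the quoted p(t). The parameter values are arbitrary.

```python
import math

# Assumed (standard) canonical transformation for the oscillator; the
# corresponding equations are not legible in the text above.
def to_qp(Q, P, m, omega):
    q = math.sqrt(2.0 * P / (m * omega)) * math.sin(Q)
    p = math.sqrt(2.0 * m * omega * P) * math.cos(Q)
    return q, p

def hamiltonian(q, p, m, omega):
    return p * p / (2.0 * m) + 0.5 * m * omega ** 2 * q * q

def bracket_qp_in_QP(Q, P, m, omega, h=1e-6):
    """[q, p]_{Q,P} = dq/dQ dp/dP - dq/dP dp/dQ by central differences."""
    qa, pa = to_qp(Q + h, P, m, omega)
    qb, pb = to_qp(Q - h, P, m, omega)
    dq_dQ, dp_dQ = (qa - qb) / (2 * h), (pa - pb) / (2 * h)
    qa, pa = to_qp(Q, P + h, m, omega)
    qb, pb = to_qp(Q, P - h, m, omega)
    dq_dP, dp_dP = (qa - qb) / (2 * h), (pa - pb) / (2 * h)
    return dq_dQ * dp_dP - dq_dP * dp_dQ

m, omega, E, alpha = 2.0, 1.5, 3.0, 0.4   # arbitrary illustrative values
P = E / omega                              # new momentum, a constant of motion
t = 0.7
Q = omega * t + alpha                      # new coordinate grows linearly in time
q, p = to_qp(Q, P, m, omega)
energy = hamiltonian(q, p, m, omega)       # should equal omega * P = E
pb = bracket_qp_in_QP(Q, P, m, omega)      # should equal 1 for a CT
p_expected = math.sqrt(2.0 * E * m) * math.cos(omega * t + alpha)
```

The energy evaluates to omega P at any Q, which is the statement that the new Hamiltonian is independent of the new coordinate.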
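5.6 Liouville Volume Theorem
The Theorem of Liouville may now be stated in the following equivalent formulations:
The Hamiltonian flows are divergence free:
\[ \nabla \cdot (\dot q, \dot p) = \sum_i \left( \frac{\partial \dot q_i}{\partial q_i} + \frac{\partial \dot p_i}{\partial p_i} \right) = 0 \qquad (5.63) \]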
i=1
i=1
i=1
(5.65)
(5.66)
Q = p
= [q, p]Q,P Q.
+ q
P
P
Therefore we have the result
\[ [q, p]_{Q,P} = 1 \]
and since the inverse exists,
\[ [Q, P]_{q,p} = 1. \]
The PB of (p, q) calculated in any representation formed out of canonical transformations is representation independent. As a result
\[ J = \det \frac{\partial(Q, P)}{\partial(q, p)} = [Q, P] = 1 \]
when $(p, q) \to (P, Q)$ is a canonical transformation.
This can be generalised to arbitrary PB structures since
\[ [f, g]_{P,Q} = [f, g]_{p,q}\,[q, p]_{P,Q} = [f, g]_{p,q} \]
by virtue of the above result.
Remarks:
The canonical transformations are used to identify the so called ideal transformations where the new Hamiltonian may be independent of the new coordinates $Q_i$. Such a system is called integrable, which will be discussed in detail later.
It is easily shown that the Poisson Brackets $[f, g]_{p,q}$ are invariant under Canonical Transformations:
\[ [f, g]_{p,q} = [f, g]_{P,Q}. \qquad (5.67) \]
(5.68)
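\[ q_1 = q(t + \delta t) = q_0 + \frac{dq}{dt}\,\delta t + O(\delta t^2), \]
\[ p_1 = p(t + \delta t) = p_0 + \frac{dp}{dt}\,\delta t + O(\delta t^2). \]
Substituting Hamiltonian equations of motion, we have
\[ q_1 = q_0 + \frac{\partial H}{\partial p}\,\delta t + O(\delta t^2), \]
\[ p_1 = p_0 - \frac{\partial H}{\partial q}\,\delta t + O(\delta t^2). \]
The Jacobian of the transformation $(q_0, p_0) \to (q_1, p_1)$ is therefore
\[ \frac{\partial(q_1, p_1)}{\partial(q_0, p_0)} = \begin{vmatrix} 1 + \frac{\partial^2 H}{\partial q_0 \partial p_0}\,\delta t & \frac{\partial^2 H}{\partial p_0^2}\,\delta t \\ -\frac{\partial^2 H}{\partial q_0^2}\,\delta t & 1 - \frac{\partial^2 H}{\partial p_0 \partial q_0}\,\delta t \end{vmatrix} = 1 + O(\delta t^2). \qquad (5.69) \]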
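The second order deviation of the Jacobian from unity can be seen in a small numerical experiment. The sketch below assumes a pendulum-like Hamiltonian H = p^2/2 - cos q (an illustrative choice) and estimates the Jacobian of one first order time step by central differences; halving the step should reduce the deviation by about a factor of four.

```python
import math

def euler_step(q0, p0, dt):
    """One first-order step of Hamilton's equations for H = p^2/2 - cos(q)."""
    q1 = q0 + p0 * dt             # dq/dt =  dH/dp = p
    p1 = p0 - math.sin(q0) * dt   # dp/dt = -dH/dq = -sin(q)
    return q1, p1

def jacobian_det(q0, p0, dt, h=1e-5):
    """det d(q1, p1)/d(q0, p0), estimated by central differences."""
    qa, pa = euler_step(q0 + h, p0, dt)
    qb, pb = euler_step(q0 - h, p0, dt)
    dq1_dq0, dp1_dq0 = (qa - qb) / (2 * h), (pa - pb) / (2 * h)
    qa, pa = euler_step(q0, p0 + h, dt)
    qb, pb = euler_step(q0, p0 - h, dt)
    dq1_dp0, dp1_dp0 = (qa - qb) / (2 * h), (pa - pb) / (2 * h)
    return dq1_dq0 * dp1_dp0 - dq1_dp0 * dp1_dq0

q0, p0 = 0.5, 0.8
dev_a = jacobian_det(q0, p0, 0.02) - 1.0   # deviation from unity at dt
dev_b = jacobian_det(q0, p0, 0.01) - 1.0   # deviation at dt/2
ratio = dev_a / dev_b                      # close to 4 if the deviation ~ dt^2
```

For this Hamiltonian the mixed second derivative vanishes, so the deviation is exactly cos(q0) dt^2, with no first order term.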
Liouville Equation:
Consider a system which is described by some probability distribution function $\rho(q, p, t)$ in the phase space. By definition
\[ \int dp\,dq\;\rho(q, p, t) = 1. \]
Consider a collection of N such identical systems; then the normalisation is given by
\[ \int \prod_i dp_i\,dq_i\;\rho(q, p, t) = N. \]
Classically the particles are neither created nor destroyed; the number of particles is conserved. As a result we have
\[ \frac{d\rho}{dt} = 0 = \frac{\partial\rho}{\partial q}\dot q + \frac{\partial\rho}{\partial p}\dot p + \frac{\partial\rho}{\partial t}. \]
Using the equations of motion we have
\[ \frac{\partial\rho}{\partial t} = -[\rho, H]. \]
This holds for all Hamiltonian systems, conservative or not, and is called the Liouville equation.
For a time independent system, we may choose
\[ \rho = \rho(H(q, p)) \]
representing a class of systems. An example of such a distribution is given by
\[ \rho = \exp(-H/kT), \]
where k is the Boltzmann constant and T is the temperature. For a free particle system this yields
\[ \rho = \exp(-mv^2/2kT), \]
which is precisely the Maxwell-Boltzmann velocity distribution in statistical mechanics, and which satisfies the Liouville equation.
5.7 Action-Angle Variables
The simplest possible description of a conservative system is provided by the action-angle variables. The original variables (q, p) may not necessarily be the best suited variables to solve (integrate) the system, even if the physics of the problem is best illustrated in terms of these variables. To be specific consider the simple case,
\[ H(p, q) = \frac{p^2}{2m} + V(q), \]
where $p(q, E) = \pm\sqrt{2m(E - V)}$ is a multi valued function. We shall seek a new set of variables, $(I, \theta)$, such that
each phase curve is labelled uniquely by I, called action, which is constant along the phase curve, and
each point on the phase curve is identified by a single variable $\theta$, called angle.
The first requirement gives,
\[ \frac{dI}{dt} = -\frac{\partial H}{\partial\theta} = 0 \;\Rightarrow\; H = H(I), \]
\[ \frac{d\theta}{dt} = \frac{\partial H}{\partial I} = \omega, \]
where $\omega$ is now a constant since H is a function of I only.
For a one-dimensional system $\theta$ may be periodic (though this is not always the case unless the motion is bounded), that is $\theta \to \theta + 2\pi$ after every period.
Consider a general Hamiltonian with one degree of freedom, as given above. The area enclosed by the phase curve at energy E, say, is
\[ A(E) = \iint dp\,dq = \oint_C p(q, E)\,dq = 2\int_{q_1}^{q_2} dq\,\sqrt{2m(E - V(q))} \]
1
I=
dI = 2I.
0
q2
dq
q1
p
2m(E V (q))
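As an illustration of the area formula, the following sketch evaluates the action numerically for a harmonic oscillator. It assumes the usual normalisation I = A/(2 pi), suggested by the relation "= 2 pi I" above, and uses a simple midpoint rule; the function name and parameter values are illustrative.

```python
import math

def action(E, V, q1, q2, m=1.0, n=100000):
    """I = (1/2 pi) * closed integral of p dq = (1/pi) * int_{q1}^{q2} sqrt(2m(E - V)) dq."""
    dq = (q2 - q1) / n
    total = 0.0
    for k in range(n):
        q = q1 + (k + 0.5) * dq   # midpoint rule avoids evaluating at the turning points
        total += math.sqrt(max(0.0, 2.0 * m * (E - V(q)))) * dq
    return total / math.pi

m, omega, E = 1.0, 2.0, 3.0
V = lambda q: 0.5 * m * omega ** 2 * q * q
a = math.sqrt(2.0 * E / (m * omega ** 2))   # turning points at q = -a and q = +a
I_num = action(E, V, -a, a, m)
I_exact = E / omega                          # known oscillator result, H = omega * I
```

The same routine can be used for any potential with two turning points, since only V and the turning points enter.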
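5.7.1 Angle variable
For fixed I, the relation between $\theta$ and q is obtained by considering the area between two neighbouring curves defined by I and $I + \delta I$. The change in the area is
\[ \delta A = \iint_S dp\,dq = \int dq\,[p(q, I + \delta I) - p(q, I)] = \delta I \int dq\,\frac{\partial p}{\partial I} + \ldots \]
Since in the $(I, \theta)$ plane,
\[ \delta A = \delta I \cdot \theta(q, I), \]
we have
\[ \theta(q, I) = \frac{\partial}{\partial I} \int_0^q dq'\,p(q', I). \]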
While we have assumed that $\theta$ is bounded and periodic, there are exceptional systems in one dimension where $\theta$ is not bounded. For example the inverted oscillator in one dimension is described by the Hamiltonian,
\[ H(p, q) = \frac{p^2}{2m} - \frac{1}{2} m\omega^2 q^2, \]
whose solutions are given by
\[ q = \left(\frac{2I}{m\omega}\right)^{1/2} \sinh\theta \quad \text{and} \quad p = (2Im\omega)^{1/2} \cosh\theta. \]
Even though we may write
\[ \tilde H = \omega I, \qquad \theta(t) = \omega t + \theta(0), \]
the angle $\theta$ is unbounded in this case and can not be interpreted as an angle variable as in the case of the harmonic oscillator. A similar situation occurs in the case of a free particle.
5.7.2 Harmonic oscillator in N-dimension
We have already analysed the harmonic oscillator problem in one dimension in terms of action and angle variables. Consider the more general problem of an oscillator in N dimensions. For simplicity let us set $m = \omega = 1$ for all the oscillators, or better still scale the momenta and coordinates by the corresponding oscillator lengths so that they are dimensionless. The Hamiltonian is given by
\[ 2H = \sum_{i=1}^{N} (p_i^2 + q_i^2). \]
i = 1, , N.
H
Ij
=
pi
pi
Ij
H
=
qi
qi
we have
\[ [I_i, H] = 0 \]
for i = 1, ..., N. Further it is easy to show that
\[ [I_i, I_j] = 0, \]
proving the fact that the $I_i$ are indeed constants of motion in involution, that is, their PBs vanish. Thus the Hamiltonian in action variables may be written as
\[ 2\tilde H(I_1, \ldots, I_N) = \sum_i I_i. \]
\[ \theta_i = t + \theta_0; \qquad 0 \le \theta_i \le 2\pi \]
For example, for a two dimensional oscillator the action variables are $I_1 = p_1^2 + q_1^2$ and $I_2 = p_2^2 + q_2^2$. The full phase space is four dimensional $(p_1, q_1; p_2, q_2)$. But the constants $I_1$ and $I_2$ define a two dimensional surface embedded in the 4-dimensional space. In particular $I_1$ and $I_2$ each describe the equation of a circle. Thus the 2d surface on which the motion takes place is a torus.
In general, for an n-dimensional oscillator system the motion takes place on an n-torus embedded in a 2n-dimensional space.
5.8 Integrable systems
Mechanical systems that can be integrated (solved) completely and globally are rare and exceptional. In general the chances of finding a complete solution depend on the existence of integrals of motion.
Having gone through many examples, by now it seems that the existence of n integrals of motion in involution (the same as the number of degrees of freedom of the system) renders a system of 2n canonical equations integrable. By this token, all conservative systems with one degree of freedom, and phase space dimension of two, are integrable. Similarly the central potential problem in 3 dimensions is also integrable since, apart from the energy, we have the angular momentum in the z-direction, $L_z$, and the square of the angular momentum vector, $L^2$, as integrals of motion.
We shall make a general statement below, sometimes also called the Liouville-Arnold integrability theorem, which clarifies the above point, without proof:
Let $I_1, \ldots, I_n$ be dynamical quantities defined on a 2n-dimensional phase space $T^*Q(q_i, p_i)$ of an autonomous Hamiltonian system, H. Let the $I_i$ be in involution,
\[ [I_i, I_j] = 0; \qquad i, j = 1, \ldots, n, \]
and independent. That is, the $I_i$ are independent in the sense that at each point on the n-dimensional surface $M = \{q_i, p_i\}$ the differentials $dI_1, dI_2, \ldots, dI_n$ are linearly independent. Then
M is a smooth surface that stays invariant under the flow (evolution) corresponding to H. If in addition M is compact and connected, then it can be mapped on to an n dimensional torus:
\[ T^n = S^1 \times S^1 \times \cdots \times S^1, \]
repeated n times, where every $S^1$ is a circle.
if and only if ri = 0 for all i, the motion is in general quasi periodic since any trajectory
on T n never closes.
More Examples:
If a system can be written in a separable form after a suitable CT,
\[ H(q_1, \ldots, q_n, p_1, \ldots, p_n) = \sum_i H_i(q_i, p_i) \]
[Figure: billiard enclosures, panels (a), (b), (c)]
action variables are $I_1 = p_x^2$, $I_2 = p_y^2$, which are evidently in involution. Motion inside the enclosure is essentially free, with elastic reflection at the hard wall. The system is integrable. However, the corners are points where the reflection of the orbit can not be defined. Therefore, while the necessary condition, namely the existence of n (2 in this case) constants of motion in involution, is satisfied, the other conditions are not valid. Such systems, where the involution condition alone is satisfied, are called Pseudo-integrable systems.
We may avoid this problem by choosing a circular boundary without corners as shown in (b). Because of the symmetry, the quantities in involution are given by
\[ H = \frac{p_r^2 + p_\theta^2/r^2}{2m} + V, \]
which is the Hamiltonian itself, and the angular momentum in the plane given by
\[ l = (x p_y - y p_x). \]
The orbits are either straight lines along the diameter or polygonal orbits with many reflections.
Interestingly, if we insert a hard circular boundary inside a square (c), it turns out that the system not only has regular (periodic) motion but also chaotic motion for a set of initial conditions with non-zero measure.
Next consider a non-trivial problem in 3 dimensions, the central force problem which we considered before. In this case the Hamiltonian is given by
\[ H = \frac{p^2}{2m} + V(r), \]
\[ H = \frac{p_1^2}{2m_1} + \frac{p_2^2}{2m_2} + V(r_1, r_2) = \frac{P_{cm}^2}{2M} + \frac{p^2}{2m} + V(r), \]
where $\vec r = \vec r_1 - \vec r_2$, $M = m_1 + m_2$, $m = m_1 m_2/M$. The phase space is 12-dimensional, but because of the form of the potential we have 6 constants in involution, namely
\[ P_{cm,x}^2,\; P_{cm,y}^2,\; P_{cm,z}^2,\; H,\; L^2,\; L_z, \]
where H is the total Hamiltonian and L is the relative angular momentum. Therefore the translationally invariant, central force two-body problem is integrable.
These arguments may also be extended to three or more particles. However, even if the potential depends only on the distance between two particles, the many-body problem in general is non-integrable. The only exceptions are the separable problems, but these are trivial.
5.9 Generating function of canonical transformations
where the Poisson bracket is the same as the Jacobian given in the previous section and is unity. (The arguments of this section may be easily extended to systems with more than one degree of freedom.)
In general a transformation from one set of phase space variables to another set requires specifying two functions. However, the constraint emerging from the area preserving property reduces this to just one function.
and
(5.71)
\[ \iint dP\,dQ = \oint P\,dQ, \qquad P = P(Q) \qquad (5.72) \]
That is, out of the four variables (p, q, P, Q) we may choose any two of them to be independent for variation. This implies that, given a CT, in each neighbourhood in the phase space there exists a function F such that
\[ \oint_C [p\,dq - P\,dQ] = \oint_C dF(q, Q) = 0 \qquad (5.74) \]
Since (q, p) or (Q, P) are independent and treated on an equal footing in the Hamiltonian formalism, the canonical transformations may be extended to include a wide variety of transformations through generating functions. Since there are four variables from which to choose, we can define four types of generating functions:
Type I: $F = F_1(q, Q)$.
\[ \oint [p\,dq - P\,dQ] = \oint dF_1(q, Q) = 0 \qquad (5.75) \]
\[ \frac{\partial F_1}{\partial Q} = -P, \qquad \tilde H = H. \qquad (5.76) \]
\[ \frac{\partial^2 F_1}{\partial q\,\partial Q} \neq 0, \qquad (5.77) \]
which is a necessary and sufficient condition that $F_1$ generates a canonical transformation.
(5.78)
\[ \frac{\partial F_2}{\partial P} = Q, \qquad \tilde H = H. \qquad (5.79) \]
Note that the generating functions $F_1$ and $F_2$ may be related by the Legendre transformation
\[ F_2(q, P) := F_1(q, Q) + QP. \]
Type III: $F = F_3(p, Q)$. The other two variables are then given by,
\[ \frac{\partial F_3}{\partial p} = -q, \qquad \frac{\partial F_3}{\partial Q} = -P, \qquad \tilde H = H. \qquad (5.80) \]
\[ \frac{\partial F_4}{\partial P} = Q, \qquad \tilde H = H. \qquad (5.81) \]
\[ p_i = P_i, \qquad Q_i = q_i. \qquad (5.82) \]
\[ Q_i = f_i(q), \qquad p_i = \sum_j P_j \frac{\partial f_j}{\partial q_i}. \qquad (5.83) \]
\[ p_i = Q_i, \qquad P_i = -q_i. \qquad (5.84) \]
5.9.1 Time dependent transformation
(5.85)
(5.86)
where t is held fixed for partial differentiation. Once this is satisfied, one may directly take over the theory of generating functions as outlined before. For example:
F1
= Pi ,
Qi
(5.87)
(5.88)
where we have made use of the invariance of the Poisson Brackets under canonical transformation. Substituting for the PB we have
e
H
H (F1 /Q)
=
P =
Q
t
Q
(5.89)
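\[ \tilde H = H + \frac{\partial F_1}{\partial t}. \]
Type II: $F = F_2(q, P, t)$
\[ \frac{\partial F_2}{\partial q_i} = p_i, \qquad \frac{\partial F_2}{\partial P_i} = Q_i, \qquad (5.90) \]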
P
e = H + F2 .
H
t
Qi Pi
(5.91)
Analogously we obtain Types III and IV: $F_3(p, Q, t)$ and $F_4(p, P, t)$. In all these cases the Hamiltonian is given by,
\[ \tilde H = H + \frac{\partial F_n}{\partial t}. \]
5.9.2 Group of Canonical Transformations
Consider a transformation
\[ \xi' = M\xi, \]
where $\xi$ is a 2n-dimensional column vector with elements $(q_1, \ldots, q_n, p_1, \ldots, p_n)$, and M is a $2n \times 2n$ matrix. This is a canonical transformation provided
\[ \dot\xi = J\,\nabla_\xi H \;\Rightarrow\; \dot\xi' = J\,\nabla_{\xi'} H, \]
where J is the symplectic matrix defined earlier,
\[ J^2 = -1, \qquad J^T = J^{-1} = -J. \]
F2
W
= Pi +
qi
qi
Qi =
W
F2
= qi +
Pi
Pi
W
qi
W
.
Pi
Pi
W
=
Qi
W
Qi
=
.
Pi
Thus if we regard the infinitesimal parameter $\epsilon$ as the time parameter, then these are the Hamilton equations of motion with the generator W playing the role of the Hamiltonian. Thus the Hamiltonian itself is a generator of canonical transformations.
We may use the above analysis to restate Noether's theorem: The variation in the Hamiltonian may be written as
\[ \delta H = \frac{\partial H}{\partial Q_i}\,\delta Q_i + \frac{\partial H}{\partial P_i}\,\delta P_i = \epsilon\,[H, W]. \]
5.10 Hamilton-Jacobi Theory
Consider a general Hamiltonian H(p, q), where p and q may have the physical interpretation of momentum and position variables. Even in the case of an exactly solvable system, these may not necessarily be the best set of variables to solve the system, in spite of the fact that the physics of the problem is best illustrated in terms of these variables. In fact the ideal situation would be to find a set of local coordinates on the phase space such that the Hamiltonian is a constant. The generating function of such a transformation is in general a solution of a non-linear partial differential equation, namely, the Hamilton-Jacobi equation or simply the HJ equation.
The HJ equations are not easy to solve, but they lend themselves to perturbative calculations (for example the way they are used in celestial mechanics). Historically, they played an important role in the development of quantum mechanics.
5.10.1 The Hamilton-Jacobi equation
For a conservative system with N degrees of freedom, an optimal canonical transformation is of the form
\[ H(p, q) \to \tilde H(P, Q) \stackrel{!}{=} \tilde H(P), \qquad (5.92) \]
by which the new Hamiltonian function is a function of the new momenta $P_i$ and is independent of the new coordinates $Q_i$. Such a transformation may be effected using the generator of Type 2, S, given by,
\[ S(q, P) := F_2(q, P). \qquad (5.93) \]
\[ p = \frac{\partial S}{\partial q}, \qquad Q = \frac{\partial S}{\partial P}, \qquad \frac{\partial S}{\partial t} = 0. \qquad (5.94) \]
If such a transformation satisfies the ansatz in (5.92), we immediately get the time independent Hamilton-Jacobi equation:
\[ H\left(\frac{\partial S}{\partial q}, q\right) = E = \tilde H(P), \qquad (5.95) \]
where the RHS is a number for a given set of $P_i$, which are constants of integration.
Remarks:
Thus (5.95) is a partial differential equation for the generating function $S(q_1, \ldots, q_N, P_1, \ldots, P_N)$ in the N coordinates $q_i$, for a given set of $P_i$ which are constants of integration. The equations are in general non-linear.
Once it is shown that a solution of (5.95) exists, the equations of motion are trivially solved:
\[ \frac{\partial \tilde H}{\partial Q_i} = -\dot P_i = 0 \;\Rightarrow\; P_i = \text{const.}, \qquad (5.96) \]
\[ \frac{\partial \tilde H}{\partial P_i} = \dot Q_i = \omega_i = \text{const.} \;\Rightarrow\; Q_i(t) = \omega_i t + \alpha_i. \qquad (5.97) \]
For a given set of $P_i$, S is often referred to as the action integral (also known as the Hamilton Characteristic Function) since it can be written as an integral over coordinates alone (apart from constants):
\[ S = \sum_{i=1}^{N} \int_{q_i(0)}^{q_i(t)} p_i\,dq_i = \int_{q(0)}^{q(t)} p \cdot dq. \qquad (5.98) \]
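In the case of time dependent transformations, the Hamilton principal function (4.3) itself is the generator of the desired time independent transformation. We therefore have
\[ \frac{\partial F_2(q_i, P_i, t)}{\partial q_i} = p_i, \qquad \frac{\partial F_2(q_i, P_i, t)}{\partial P_i} = Q_i, \qquad \tilde H = H + \frac{\partial F_2(q_i, P_i, t)}{\partial t}. \qquad (5.99) \]
Let us choose $S(q_i, P_i, t) = F_2 + C$, where C is some constant, such that $\tilde H = 0$; then
\[ \frac{dP_i}{dt} = 0, \qquad \frac{dQ_i}{dt} = 0. \qquad (5.100) \]
\[ H\left(q_i, \frac{\partial S}{\partial q_i}, t\right) + \frac{\partial S}{\partial t} = 0. \qquad (5.101) \]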
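This is an equation for the generating function S. Once S is determined the transformations may be inverted to obtain the coordinates $(q_i, p_i)$.
We also have
\[ \frac{dS}{dt} = \sum_k \frac{\partial S}{\partial q_k}\dot q_k + \frac{\partial S}{\partial t} = \sum_k p_k \dot q_k - H = L. \qquad (5.102) \]
Therefore
\[ S(q, P, t) = \int_{t_0}^{t} L[q(t'), \dot q(t'), t']\,dt' = \int_{t_0}^{t} \left[\sum_i p_i \dot q_i - H(q, p, t')\right] dt', \qquad (5.103) \]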
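\[ p = \frac{\partial S(q, E)}{\partial q}; \qquad Q = \frac{\partial S(q, E)}{\partial E}. \qquad (5.105) \]
\[ Q(t) = t - t_0. \qquad (5.106) \]
\[ \frac{1}{2m}\left(\frac{\partial S}{\partial q}\right)^2 + V(q) = E, \qquad (5.107) \]
\[ S(q, E) = \int_{q_0}^{q} \sqrt{2m[E - V(q')]}\,dq'. \qquad (5.108) \]
For $Q = t - t_0$ we find
\[ Q = t - t_0 = \frac{\partial S}{\partial E} = \frac{\partial}{\partial E} \int_{q_0}^{q} \sqrt{2m[E - V(q')]}\,dq' = \sqrt{\frac{m}{2}} \int_{q_0}^{q} \frac{dq'}{\sqrt{E - V(q')}}, \qquad (5.109) \]
which upon inversion gives the solution for the variable q(t).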
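The quadrature (5.109) and its inversion can be illustrated numerically. The sketch below assumes a harmonic potential and q_0 = 0 (illustrative choices), evaluates t - t_0 with a midpoint rule, and compares with the known solution q(t) = a sin(omega t).

```python
import math

def elapsed_time(q, E, V, m=1.0, q0=0.0, n=20000):
    """t - t0 = sqrt(m/2) * int_{q0}^{q} dq' / sqrt(E - V(q')), midpoint rule."""
    dq = (q - q0) / n
    total = 0.0
    for k in range(n):
        qk = q0 + (k + 0.5) * dq
        total += dq / math.sqrt(E - V(qk))
    return math.sqrt(m / 2.0) * total

m, omega, E = 1.0, 2.0, 3.0
V = lambda q: 0.5 * m * omega ** 2 * q * q
a = math.sqrt(2.0 * E / (m * omega ** 2))   # turning point

q_target = 0.6 * a                           # stay away from the turning point
t_num = elapsed_time(q_target, E, V, m)
t_exact = math.asin(q_target / a) / omega    # from q(t) = a sin(omega t)
q_back = a * math.sin(omega * t_num)         # inversion of t(q)
```

Close to the turning point the integrand becomes singular (though integrable), so a plain midpoint rule would need many more points there.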
If the Hamiltonian is of the form
\[ H = \sum_{i=1}^{N} h(p_i, q_i), \qquad (5.110) \]
then the system is separable. For example h may be of the form $h(p, q) = p^2/2m + V(q)$. Such a situation, as in (5.110), occurs in the mean field approach to many body problems (Hartree-Fock theory, Density Functional theory). When the Hamiltonian is of the form (5.110) the corresponding generating function S(q, P) also has a separable form:
\[ S(q, P) = \sum_{i=1}^{N} s(q_i, P_i). \qquad (5.111) \]
The problem then reduces to finding the solution for the generating function s(q, P). The HJ equation (5.95) may be broken into N equations given by
\[ h\left(\frac{\partial s}{\partial q_i}, q_i\right) = E_i = \text{const.}, \qquad (i = 1, \ldots, N) \qquad (5.112) \]
where $E_i$ is the energy associated with each degree of freedom. The total energy is of course
\[ E = \sum_i E_i. \qquad (5.113) \]
Vr (r)
V
V
+
+
2
2m
2mr
2mr2 sin2
(5.115)
(5.116)
dS 2
,
) + V () +
d
sin2
(5.119)
where is a constant and finally the radial equation is solved using the ordinary
differential equation
1
dSr 2
E=
(5.120)
(
) + Vr (r) + 2
2m
dr
r
which completes the solution for S.
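Analogy with Quantum Mechanics: Finally it is interesting to draw a comparison between the Hamilton-Jacobi equation and the Schroedinger equation in quantum mechanics. The Schroedinger equation in terms of a complex valued function $\psi(x, t)$ is
\[ i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\frac{\partial^2\psi}{\partial x^2} + V(x)\psi. \]
Let
\[ \psi(x, t) = R(x, t)\exp(iS(x, t)/\hbar), \]
such that R is the modulus and $S/\hbar$ is the phase of the wave function. Substituting this in the Schroedinger equation we find
\[ i\hbar\left[\frac{\partial R}{\partial t} + \frac{iR}{\hbar}\frac{\partial S}{\partial t}\right] = -\frac{\hbar^2}{2m}\left[-\frac{R}{\hbar^2}\left(\frac{\partial S}{\partial x}\right)^2 + \frac{2i}{\hbar}\frac{\partial R}{\partial x}\frac{\partial S}{\partial x} + \frac{iR}{\hbar}\frac{\partial^2 S}{\partial x^2} + \frac{\partial^2 R}{\partial x^2}\right] + V(x)R \]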
Chapter 6
The Classical Perturbation Theory
Much of our discussion in earlier chapters centered on exceptional systems. Completely integrable Hamiltonian systems are exceptional. Real systems are more complicated. So approximate methods are useful, and here we shall discuss a class of problems in which the Hamiltonian may be written in the form
\[ H(p, q) = H_0(p, q) + \epsilon H_1(p, q), \qquad |\epsilon| \ll 1 \qquad (6.1) \]
where $H_0$ is an integrable (solvable) Hamiltonian but the addition of $\epsilon H_1$ makes H non-integrable. H is called the perturbed Hamiltonian, whereas $H_0$ is the unperturbed Hamiltonian and $\epsilon H_1$ is the perturbation. The main aim of perturbation theory is to obtain approximate solutions as a function of $\epsilon$. There are many examples of such systems; the most important one is the solar system. We first treat the Sun-Earth problem as a two-body system, which is solvable, and consider the effect of the other planets (mainly Jupiter) as a perturbation. Of course not every problem is amenable to perturbative solutions. One important factor in using the methods of perturbation theory is the existence of a small parameter in terms of which the solution may be expanded and solved at each order. Thus the central idea of perturbation theory is to expand the solution as a series in this small parameter, similar to a Taylor expansion, around the exact solution of the unperturbed problem. Thus for example the solution may be of the form
\[ q(t) = q_0(t) + \epsilon q_1(t) + \epsilon^2 q_2(t) + \cdots. \qquad (6.2) \]
However, care must be exercised in using perturbation expansions since, as in the case of a Taylor expansion, the series can diverge at some order or indeed may diverge for all values of $\epsilon$.
We first discuss some simple examples of perturbation of algebraic and differential equations.
Regular and irregular Perturbations: Let us start with a simple quadratic equation
\[ x^2 + x - \epsilon = 0, \qquad x = \sum_{n=0}^{\infty} a_n \epsilon^n. \]
Substituting the series and equating the coefficient of each power of $\epsilon$ to zero gives
\[ a_0^2 + a_0 = 0, \qquad a_0 = 0, -1 \]
\[ 2a_0 a_1 + a_1 - 1 = 0, \qquad a_1 = 1, -1 \]
\[ a_1^2 + 2a_0 a_2 + a_2 = 0, \qquad a_2 = -1, 1 \]
where the solutions for $a_1$, $a_2$ are obtained iteratively using the solutions for $a_0$ and $a_1$. The procedure may be continued up to the desired order in $\epsilon$. Therefore we may write the solutions of the algebraic equation obtained in perturbation theory in the form
\[ x_1 = \epsilon - \epsilon^2 + \cdots \]
\[ x_2 = -1 - \epsilon + \epsilon^2 + \cdots \]
These solutions indeed correspond to the binomial expansion of the square root in the exact solution
\[ x_{1,2} = \frac{1}{2}\left[-1 \pm \sqrt{1 + 4\epsilon}\right]. \]
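The agreement between the perturbation series and the exact roots can be checked directly. This is a minimal sketch with an arbitrarily chosen value of epsilon; the function names are illustrative.

```python
import math

def exact_roots(eps):
    """Roots of x^2 + x - eps = 0 from the quadratic formula."""
    s = math.sqrt(1.0 + 4.0 * eps)
    return 0.5 * (-1.0 + s), 0.5 * (-1.0 - s)

def series_roots(eps):
    """Second-order perturbation series built from the coefficients a_0, a_1, a_2."""
    x1 = eps - eps ** 2            # branch a0 = 0,  a1 = 1,  a2 = -1
    x2 = -1.0 - eps + eps ** 2     # branch a0 = -1, a1 = -1, a2 = 1
    return x1, x2

eps = 0.01
e1, e2 = exact_roots(eps)
s1, s2 = series_roots(eps)
err1, err2 = abs(e1 - s1), abs(e2 - s2)   # both are of order eps^3
```

The residual errors scale as the cube of epsilon, as expected for a series truncated at second order.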
However, a slight change in the equation produces what is called singular or irregular perturbations. For example consider the following equation
\[ \epsilon x^2 + x - 1 = 0. \]
This equation is an example of structural instability, since the nature of the solutions is very different for $\epsilon = 0$ and $\epsilon \neq 0$. Again, in perturbation theory we expand the solution as
\[ x = \sum_{n=0}^{\infty} a_n \epsilon^n \]
\[ a_0 - 1 = 0, \qquad a_0 = 1 \]
\[ a_0^2 + a_1 = 0, \qquad a_1 = -1 \]
\[ 2a_0 a_1 + a_2 = 0, \qquad a_2 = 2, \]
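Example-1: To illustrate the method, consider a first order differential equation given by
\[ \frac{dx}{dt} = \dot x = x + \epsilon x^2, \qquad |\epsilon| \ll 1, \]
with x(t = 0) = A as the initial condition. Let the solution be of the form
\[ x(t) = x_0(t) + \epsilon x_1(t) + \cdots = \sum_{n=0}^{\infty} x_n(t)\,\epsilon^n. \]
\[ \sum_{n=0}^{\infty} \dot x_n(t)\,\epsilon^n = \sum_{n=0}^{\infty} x_n(t)\,\epsilon^n + \epsilon \sum_{n=0}^{\infty} x_n(t)\,\epsilon^n \sum_{n'=0}^{\infty} x_{n'}(t)\,\epsilon^{n'}. \]
\[ \dot x_0 = x_0, \qquad x_0(t) = Ae^t \]
\[ \dot x_1 = x_1 + x_0^2, \qquad x_1(t) = A^2 e^t (e^t - 1) \]
\[ \dot x_2 = x_2 + 2x_0 x_1, \qquad x_2(t) = A^3 e^t (e^t - 1)^2 \]
Summing the resulting geometric series gives
\[ x(t) = \frac{Ae^t}{1 - \epsilon A (e^t - 1)}, \]
which is indeed the exact solution of the differential equation. Note that the power series expansion becomes invalid over long times since the denominator vanishes at
\[ t = t_c = \ln\left[\frac{1 + \epsilon A}{\epsilon A}\right]. \]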
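The loss of accuracy of the truncated series near t_c can be seen numerically. The sketch below compares the second order partial sum with the closed form x(t) = A e^t / (1 - eps A (e^t - 1)), which follows from summing the geometric series of the x_n; the parameter values are arbitrary.

```python
import math

def x_exact(t, A, eps):
    """Closed-form solution of dx/dt = x + eps*x^2 with x(0) = A."""
    return A * math.exp(t) / (1.0 - eps * A * (math.exp(t) - 1.0))

def x_series(t, A, eps, order=2):
    """Partial sum of x_n(t) eps^n with x_n = A^(n+1) e^t (e^t - 1)^n."""
    et = math.exp(t)
    return sum(A ** (n + 1) * et * (et - 1.0) ** n * eps ** n for n in range(order + 1))

A, eps = 1.0, 0.05
t_c = math.log((1.0 + eps * A) / (eps * A))   # time at which the denominator vanishes

# Early times: the truncated series is accurate.
err_early = abs(x_exact(0.5, A, eps) - x_series(0.5, A, eps))
# Close to t_c: the expansion parameter eps*A*(e^t - 1) approaches 1 and the series fails.
err_late = abs(x_exact(0.9 * t_c, A, eps) - x_series(0.9 * t_c, A, eps))
```

The effective expansion parameter is eps A (e^t - 1), not eps alone, which is why a small eps does not guarantee accuracy at late times.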
Example-2: Continuing with examples, next let us consider the second order differential equation
\[ \frac{d^2x}{dt^2} + \omega_0^2 x + \epsilon x^3 = 0, \]
where we assume $0 < \epsilon \ll 1$. This equation of motion can be derived from a quartic oscillator system whose Hamiltonian is given by
\[ H = \frac{p^2}{2m} + \frac{1}{2} m\omega_0^2 x^2 + \frac{1}{4}\epsilon x^4. \]
In deriving the equation of motion we have set m = 1. This system has one fixed point at the origin, which is elliptic, and the motion for all energies is bounded and periodic, as seen from the phase curves.
where $H_0$ is the integrable harmonic oscillator Hamiltonian and the quartic term is taken as a perturbation with $\epsilon$ as a small parameter. Once again we expand the solution as a perturbation series
\[ x(t) = \sum_{n=0}^{\infty} x_n(t)\,\epsilon^n \]
and extract the solutions at each order with the boundary conditions $x(0) = A$, $\dot x(0) = 0$ at the initial time t = 0. At $O(\epsilon^0)$ we have
\[ \ddot x_0 + \omega_0^2 x_0 = 0 \]
and the solution is straightforward:
\[ x_0(t) = A\cos(\omega_0 t); \qquad x_0(0) = A. \]
At $O(\epsilon^1)$ we have
\[ \ddot x_1 + \omega_0^2 x_1 = -x_0^3. \]
Substituting the solution for $x_0(t)$, the driving term at this order, we have
\[ \ddot x_1 + \omega_0^2 x_1 = -\frac{1}{4}A^3\,[3\cos\omega_0 t + \cos 3\omega_0 t] \]
and the solution is
\[ x_1(t) = -\frac{A^3}{8\omega_0^2}\left[3(\omega_0 t)\sin\omega_0 t + \frac{1}{4}(\cos\omega_0 t - \cos 3\omega_0 t)\right], \]
so that to first order
\[ x(t) = A\cos\omega_0 t - \epsilon\,\frac{A^3}{8\omega_0^2}\left[3(\omega_0 t)\sin\omega_0 t + \frac{1}{4}(\cos\omega_0 t - \cos 3\omega_0 t)\right]. \]
Obviously the solution satisfies the initial conditions; however, there is a serious problem here. The solution diverges as $t \to \infty$. The linear dependence on time in $x_1(t)$ implies that the amplitude of oscillation will grow without bound for any $\epsilon$, no matter how small. However, the motion in a quartic oscillator is bounded for any energy E. Thus the perturbative solution, as outlined above, fails even in first order.
The reason for this failure of perturbation theory is traced to the fact that the driving force of the undamped oscillator has the same frequency $\omega_0$ as the oscillator itself. This can be easily seen by analysing the following example:
\[ \ddot x_1 + \omega^2 x_1 = \cos\Omega t, \]
which is the equation of motion of a simple harmonic oscillator with a forcing term of frequency $\Omega$. The solution is given by
\[ x_1(t) = a\cos\omega t - \frac{\cos\Omega t}{\Omega^2 - \omega^2}. \]
The solution breaks down at resonance, when $\Omega = \omega$. Consider the limit $\Omega = \omega + \delta$, where $\delta$ is small. In this limit
\[ \cos\Omega t \approx \cos\omega t - \delta\,t\sin\omega t. \]
The second term has the same linear dependence on time as in perturbation theory, leading to unbounded oscillations if interpreted naively.
Poincare's solution: The way around this difficulty was suggested by Poincare in 1892. In actual fact the perturbation changes not only x(t) but also the fundamental frequency. Let us go back to the quartic oscillator where the usual perturbation theory failed,
\[ H = H_0 + \epsilon H_1, \]
where the quartic term is again taken as a perturbation with $\epsilon$ as a small parameter. Let us try a slightly different perturbation series by writing
\[ x = x(\omega t), \]
where
\[ x(\omega t) = \sum_{n=0}^{\infty} x_n(\omega t)\,\epsilon^n \]
and
\[ \omega = \sum_{n=0}^{\infty} \omega_n \epsilon^n, \]
and extract the solutions at each order with the boundary conditions $x(0) = A$, $\dot x(0) = 0$ at the initial time t = 0. Let $x' = dx/d(\omega t) = (1/\omega)\dot x$. The differential equation to solve is then
\[ \omega^2 x'' + \omega_0^2 x + \epsilon x^3 = 0. \]
At $O(\epsilon^0)$, expanding both $\omega$ and x, we have
\[ \omega_0^2 (x_0'' + x_0) = 0; \qquad x_0(0) = A. \]
At $O(\epsilon^1)$ we have
\[ \omega_0^2 x_1'' + 2\omega_0\omega_1 x_0'' + \omega_0^2 x_1 = -x_0^3. \]
Substituting the solution $x_0 = A\cos\omega t$, we have
\[ \omega_0^2 x_1'' + \omega_0^2 x_1 = 2A\omega_0\omega_1\cos\omega t - \frac{1}{4}A^3[3\cos\omega t + \cos 3\omega t] = \left(2\omega_1\omega_0 - \frac{3}{4}A^2\right)A\cos\omega t - \frac{1}{4}A^3\cos 3\omega t. \]
Again we have a driving term in resonance which can lead to divergent amplitude. Since we still have the choice of fixing $\omega_1$ at this order, let us choose
\[ \omega_1 = \frac{3A^2}{8\omega_0}, \]
which removes the resonant term.
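The benefit of shifting the frequency can be illustrated by comparing both approximations with a direct numerical integration. The sketch below uses a hand-written classical Runge-Kutta (RK4) integrator as the reference; the integrator, the parameter values and the function names are illustrative assumptions, and only the lowest order Poincare solution A cos(omega t) with omega = omega_0 + eps omega_1 is used.

```python
import math

def rk4_trajectory(omega0, eps, A, dt, steps):
    """Integrate x'' + omega0^2 x + eps x^3 = 0 with x(0) = A, x'(0) = 0 (classical RK4)."""
    def acc(x):
        return -omega0 ** 2 * x - eps * x ** 3
    x, v, xs = A, 0.0, [A]
    for _ in range(steps):
        k1x, k1v = v, acc(x)
        k2x, k2v = v + 0.5 * dt * k1v, acc(x + 0.5 * dt * k1x)
        k3x, k3v = v + 0.5 * dt * k2v, acc(x + 0.5 * dt * k2x)
        k4x, k4v = v + dt * k3v, acc(x + dt * k3x)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        xs.append(x)
    return xs

def naive_first_order(t, omega0, eps, A):
    """x0 + eps*x1 from ordinary perturbation theory (contains the secular term)."""
    w = omega0
    x1 = -(A ** 3 / (8.0 * w ** 2)) * (3.0 * w * t * math.sin(w * t)
                                       + 0.25 * (math.cos(w * t) - math.cos(3.0 * w * t)))
    return A * math.cos(w * t) + eps * x1

def poincare_zeroth_order(t, omega0, eps, A):
    """A cos(omega t) with the shifted frequency omega = omega0 + eps * 3A^2/(8 omega0)."""
    omega = omega0 + eps * 3.0 * A ** 2 / (8.0 * omega0)
    return A * math.cos(omega * t)

omega0, eps, A = 1.0, 0.1, 1.0
dt, steps = 0.005, 10000                       # integrate up to t = 50
xs = rk4_trajectory(omega0, eps, A, dt, steps)
times = [k * dt for k in range(steps + 1)]
err_naive = max(abs(x - naive_first_order(t, omega0, eps, A)) for x, t in zip(xs, times))
err_poincare = max(abs(x - poincare_zeroth_order(t, omega0, eps, A)) for x, t in zip(xs, times))
```

Over this time interval the naive first order solution drifts away from the bounded numerical trajectory, while the frequency-shifted solution stays close to it.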
6.1
Let us first consider the first order perturbation of a Hamiltonian system with one degree of freedom,
\[ H(p, q) = H_0(p, q) + \epsilon H_1(p, q), \qquad (6.3) \]
where $H_0$ is an integrable Hamiltonian and $\epsilon H_1$ is a perturbation when the parameter $\epsilon$ is small. We should assume that there is a region in the phase space where the phase curves of $H_0$ and H may be continuously deformed into each other. Obviously this can not be done if the phase curves are separated by a separatrix or if the phase curves of each are in different invariant subspaces. We also assume that the motion is bounded and periodic, with a period which is not too large, so that the perturbative method is applicable.
In any given problem $H_1$ has to be chosen with care. In the same problem the choice may differ depending on the limit we are interested in. For example, in the case of a vertical pendulum the Hamiltonian is
\[ H(\theta, p) = \frac{p^2}{2} - \omega^2\cos\theta, \qquad \omega^2 = g/L, \quad m = 1. \]
In the limit of fast rotations we may choose
\[ H_0 = \frac{p^2}{2}; \qquad \epsilon H_1 = -\omega^2\cos\theta; \qquad p^2 \gg \omega^2, \]
whereas in the limit of small oscillations
\[ H_0 = \frac{p^2}{2} + \frac{\omega^2\theta^2}{2}, \qquad \epsilon H_1 = -\omega^2\left[1 + \frac{\theta^4}{24} + \ldots\right], \]
where $\theta$ itself is the small parameter for a given constant $\omega$ and we may ignore the constant term $-\omega^2$. The $H_0$ in either case is chosen such that it is an integrable Hamiltonian.
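Let us now go back to eq. (6.3) to continue with a general analysis. Let us assume that both the unperturbed and the perturbed Hamiltonians are integrable. The action-angle variables of the perturbed Hamiltonian are denoted by $(I, \phi)$. Therefore, the transformation
\[ (p, q) \to (I, \phi) \]
is canonical, with a new Hamiltonian K(I) such that
\[ \dot I = -\frac{\partial K}{\partial\phi} = 0, \qquad \dot\phi = \frac{\partial K}{\partial I} = \omega(I) \;\Rightarrow\; \phi(t) = \omega(I)\,t + \beta. \]
The problem now is to determine K(I) perturbatively.
Suppose $(J, \theta)$ are the action-angle variables of $H_0$ and $(p, q) \to (J, \theta)$ is a CT, then
\[ H_0(p, q) \to H_0(J) \]
and
\[ \dot J = -\frac{\partial H_0}{\partial\theta} = 0, \qquad \dot\theta = \frac{\partial H_0}{\partial J} = \omega_0. \]
Let us now proceed sequentially from $H_0$ to H, relating the action-angle variables of the two Hamiltonians perturbatively.
Consider first the CT $(p, q) \to (J, \theta)$:
\[ H(J, \theta) = H_0(J) + \epsilon H_1(J, \theta) \]
and
\[ \dot J = -\frac{\partial H}{\partial\theta} = -\epsilon\frac{\partial H_1}{\partial\theta}, \qquad \dot\theta = \frac{\partial H}{\partial J} = \frac{\partial H_0(J)}{\partial J} + \epsilon\frac{\partial H_1}{\partial J}, \]
since $(J, \theta)$ are not action-angle variables of H. In fact $J = J(\theta)$ under the action of the full Hamiltonian.
The canonical transformation we are interested in is $(J, \theta) \to (I, \phi)$ such that $H(I, \phi) = K(I)$. Area conservation under CT implies
\[ \oint I\,d\phi = \oint J\,d\theta. \]
In perturbation theory we assume that the variables $(I, \phi)$ are related to $(J, \theta)$, to first order in $\epsilon$, as given by
\[ \theta(I, \phi) = \theta_0(I, \phi) + \epsilon\,\theta_1(I, \phi) + \cdots \]
\[ J(I, \phi) = J_0(I, \phi) + \epsilon\,J_1(I, \phi) + \cdots \]
where $\theta_n$, $J_n$ are independent of $\epsilon$. Implicit here is the fact that even though $\epsilon H_1$ is a first order perturbation, $(J, \theta)$ may in fact involve higher order terms (as in singular perturbation theory).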
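With $\theta_0 = \phi$ and $J_0 = I$, the expansion of the Hamiltonian to first order reads
\[ K(I) = H_0(I + \epsilon J_1) + \epsilon H_1(I, \phi) + \cdots = H_0(I) + \epsilon\left[\frac{\partial H_0}{\partial I}J_1 + H_1(I, \phi)\right] + \cdots \]
which yields
\[ K_0(I) = H_0(I); \qquad K_1(I) = \frac{\partial H_0}{\partial I}J_1 + H_1(I, \phi), \]
where the only unknown is $J_1$. To determine this we use the area conservation property under CT, that is
\[ \oint d\phi\,K_1(I) = 2\pi K_1(I) = \oint d\phi\left[\frac{\partial H_0}{\partial I}J_1 + H_1(I, \phi)\right] = \frac{\partial H_0}{\partial I}\oint d\phi\,J_1 + \oint d\phi\,H_1(I, \phi). \]
Since
\[ \oint d\phi\,I = 2\pi I = \oint d\theta\,J, \]
to order $\epsilon$ we have
\[ 2\pi I = \oint d\theta\,J = \oint d\phi\left[1 + \epsilon\frac{\partial\theta_1}{\partial\phi}\right][I + \epsilon J_1] = 2\pi I + \epsilon\oint d\phi\,J_1 + \epsilon I\oint d\phi\,\frac{\partial\theta_1}{\partial\phi}. \]
If we now assume that $\phi$ and $\theta_1$ are periodic with the same period, and furthermore if the mean of $\theta_1$ is zero (otherwise it is a constant), we may set the last integral on the rhs to zero. Therefore we have
\[ \oint d\phi\,J_1(\phi) = 0. \]
Thus we obtain
\[ K_1(I) = \frac{1}{2\pi}\oint d\phi\,H_1(\phi, I) = \frac{\oint d\phi\,H_1(I, \phi)}{\oint d\phi} = \langle H_1(I, \phi)\rangle. \]
Therefore, the first order correction to the Hamiltonian K(I) is simply the mean of $H_1$ with respect to $\phi$, taken over the unperturbed motion, giving
\[ K(I) = H_0(I) + \epsilon\langle H_1(I, \phi)\rangle. \]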
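The averaging formula can be tested on the quartic oscillator of Example-2. The sketch below assumes m = 1, H_1 = x^4/4 and the unperturbed relation x = sqrt(2I/omega_0) sin(phi) (standard choices that are not spelled out in the text); the resulting frequency shift dK_1/dI should reproduce the value omega_1 = 3A^2/(8 omega_0) obtained from Poincare's method.

```python
import math

def mean_H1(I, omega0, n=2000):
    """Average of H1 = x^4/4 over the unperturbed angle, with x = sqrt(2I/omega0) sin(phi)."""
    total = 0.0
    for k in range(n):
        phi = 2.0 * math.pi * (k + 0.5) / n
        x = math.sqrt(2.0 * I / omega0) * math.sin(phi)
        total += 0.25 * x ** 4
    return total / n

def frequency_shift(I, omega0, h=1e-5):
    """omega_1 = dK1/dI, estimated by a central difference."""
    return (mean_H1(I + h, omega0) - mean_H1(I - h, omega0)) / (2.0 * h)

omega0, A = 1.3, 0.8
I = 0.5 * omega0 * A ** 2                 # unperturbed action for amplitude A (m = 1)
K1 = mean_H1(I, omega0)                   # analytic value: 3 I^2 / (8 omega0^2)
omega1 = frequency_shift(I, omega0)       # analytic value: 3 A^2 / (8 omega0)
```

The agreement of the two routes to omega_1 is a useful consistency check between the canonical method and the direct expansion of the equation of motion.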
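To determine $J_1$ we use
\[ K_1(I) = \frac{\partial H_0(I)}{\partial I}J_1 + H_1(I, \phi) = H_1(I, \phi) + \omega_0(I)J_1, \]
which yields
\[ J_1 = \frac{\langle H_1\rangle - H_1(I, \phi)}{\omega_0(I)}. \]
Note that for perturbation theory to be valid the frequency $\omega_0$ should not be small, as happens when we approach the separatrix. This will affect convergence in higher orders since the n-th order terms will have a denominator $\omega_0^n$.
To find $\theta_1$ we use the fact that the Jacobian
\[ \frac{\partial(J, \theta)}{\partial(I, \phi)} = \frac{\partial J}{\partial I}\frac{\partial\theta}{\partial\phi} - \frac{\partial J}{\partial\phi}\frac{\partial\theta}{\partial I} = 1. \]
Substituting
\[ \frac{\partial\theta}{\partial\phi} = 1 + \epsilon\frac{\partial\theta_1}{\partial\phi}, \qquad \frac{\partial J}{\partial I} = 1 + \epsilon\frac{\partial J_1}{\partial I}, \qquad \frac{\partial\theta}{\partial I} = O(\epsilon), \qquad \frac{\partial J}{\partial\phi} = O(\epsilon), \]
we have
\[ \frac{\partial(J, \theta)}{\partial(I, \phi)} = 1 + \epsilon\left[\frac{\partial\theta_1}{\partial\phi} + \frac{\partial J_1}{\partial I}\right] \]
and therefore
\[ \frac{\partial\theta_1}{\partial\phi} = -\frac{\partial J_1}{\partial I}. \]
This completes the determination of all relevant quantities up to first order.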
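A simple example:
Consider the problem of the vertical pendulum in the limit of fast rotations. The Hamiltonian is
\[ H(p, \theta) = \frac{p^2}{2} - \omega^2\cos\theta = H_0 + \epsilon H_1. \]
The small parameter is $\epsilon = \omega^2$ in this problem and we may choose $(J, \theta) = (p, \theta)$. It is easy to see that
\[ K_0(I) = H_0(I) = \frac{I^2}{2}; \qquad \omega_0 = I \]
and
\[ K_1(I) = -\frac{1}{2\pi}\int_0^{2\pi} d\phi\,\cos\phi = 0. \]
\[ \epsilon\,\theta_1 = \omega^2\sin\phi/I^2 \]
apart from a constant of integration. Hence to first order we have the complete solution
\[ \theta = \phi + \omega^2\sin\phi/I^2 \]
\[ J = p = I + \omega^2\cos\phi/I \]
where $\phi = It + \beta$.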
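6.1.1
The steps are identical to those for one dof given above, but with some complications. Consider the Hamiltonian for a system with many dof:
\[ H(p_i, q_i) = H_0(p_i, q_i) + \epsilon H_1(p_i, q_i); \qquad i = 1, \ldots, n \]
\[ \dot{\vec I} = -\nabla_{\vec\phi} H, \qquad \dot{\vec\phi} = \nabla_{\vec I} H = \vec\omega(\vec I) \]
Consider now the sequence of canonical transformations
\[ (\vec p, \vec q) \xrightarrow{CT1} (\vec J, \vec\theta); \qquad (\vec p, \vec q) \xrightarrow{CT2} (\vec I, \vec\phi). \]
\[ K_0(\vec I) = H_0(\vec I) \]
\[ K_1(\vec I) = \nabla_{\vec I} H_0 \cdot \vec J_1 + H_1(\vec I, \vec\phi) = \vec\omega_0 \cdot \vec J_1 + H_1(\vec I, \vec\phi). \]
\[ K_1(\vec I) = \frac{1}{(2\pi)^n}\,\vec\omega_0(\vec I) \cdot \int \prod_i d\phi_i\,\vec J_1 + \langle H_1(\vec I, \vec\phi)\rangle. \]
Recall that the canonical transformation $(\vec J, \vec\theta) \to (\vec I, \vec\phi)$ can be induced by a generating function $S(\vec I, \vec\theta)$ (type II). We may then choose
\[ \vec J = \nabla_{\vec\theta} S(\vec I, \vec\theta); \qquad \vec\phi = \nabla_{\vec I} S(\vec I, \vec\theta). \]
In perturbation theory we may write
\[ S = S_0 + \epsilon S_1 = \vec\theta \cdot \vec I + \epsilon S_1. \]
The first term is simply the identity transformation in the limit $\epsilon \to 0$.
\[ \vec J = \nabla_{\vec\theta} S = \vec I + \epsilon\,\nabla_{\vec\theta} S_1 \]
The second term is $\epsilon\vec J_1$ and to first order in perturbation we have
\[ K_1 = \vec\omega_0 \cdot \nabla_{\vec\theta} S_1 + H_1(\vec I, \vec\theta). \]
As before, if we assume that $\nabla_{\vec\theta} S_1$ is a periodic function in $\vec\theta$, then taking the average on both sides we have
\[ K_1(\vec I) = \langle H_1(\vec I)\rangle = \frac{1}{(2\pi)^n} \int \prod_i d\theta_i\,H_1(\vec I, \vec\theta) \]
and
\[ \vec\omega_0 \cdot \nabla_{\vec\theta} S_1 = \langle H_1\rangle - H_1(\vec I, \vec\theta) = -\{H_1\}(\vec I, \vec\theta), \]
where the rhs, with $\{H_1\} := H_1 - \langle H_1\rangle$, is the deviation of $H_1$ from the mean. Thus if the perturbation is periodic then the deviation is also an oscillatory function of $\vec\theta$.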
138
where m
~ = (m1 , ...mn ) is a set of integers which may take both positive and negative
values. Since S1 is also periodic in we may expand it also in Fourier series
X
~
S1 =
S1m exp(im.
~ ).
m
~
Since
~ S1 =
~ 0 .
~ = H1 =
i~0 .mS
~ 1m exp(im.
~ )
~
H1m exp(im.
~ ),
where exp(im.
~ are orthogonal functions, we obtain
S1m =
and therefore
~ = i
~ )
S1 (I,
H1m
i~0 .m
~
X H1m exp(im.
~
~ )
~ 0 .m
~
We now encounter a serious problem with this perturbation method. Note that
$$\vec m \cdot \vec\omega_0 = \sum_k m_k\, \omega_{0k},$$
where the $m_k$ form a set of integers. The sum in $S_1$ runs over all possible integers in general. Therefore, when the frequencies $\omega_{0k}$ are commensurate (resonant or degenerate), there always exists a set $(m_k)$ such that
$$\vec\omega_0 \cdot \vec m = 0.$$
Therefore the sum diverges even in the first order of canonical perturbation theory (CPT). In fact, even when the frequencies are not commensurate, the denominator $\vec\omega_0 \cdot \vec m$ may become arbitrarily small. This is the content of the (in)famous small divisors problem which plagued the advance of mechanics for nearly a hundred years.
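The small divisors can be made concrete with a brute-force search. The following is only an illustrative sketch: the function name `smallest_divisor` and the two sample frequency vectors are assumptions chosen for this example, not part of the notes.

```python
import math
from itertools import product

def smallest_divisor(omega, max_order):
    """Return min |m . omega| over integer vectors m with 0 < sum|m_k| <= max_order."""
    best = math.inf
    for m in product(range(-max_order, max_order + 1), repeat=len(omega)):
        order = sum(abs(mk) for mk in m)
        if order == 0 or order > max_order:
            continue
        best = min(best, abs(sum(mk * wk for mk, wk in zip(m, omega))))
    return best

commensurate = (1.0, 2.0)               # resonant pair: 2*1 - 1*2 = 0
incommensurate = (1.0, math.sqrt(2.0))  # irrational frequency ratio

for order in (3, 10, 30):
    print(order,
          smallest_divisor(commensurate, order),
          smallest_divisor(incommensurate, order))
```

For the commensurate pair the divisor vanishes exactly, while for the incommensurate pair it never vanishes but keeps shrinking as higher orders of $\vec m$ are admitted.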
The main problem with CPT is that it assumes that the perturbed system is completely integrable, just as the unperturbed one is. We use this fact in the averaging process to get $K$. It forces integrability to be valid at every order of perturbation theory, a kind of self-fulfilling strategy. Nevertheless, it is not as bad as it sounds at first glance.
The problem was first addressed by Kolmogorov in 1954 and extended to Hamiltonian systems by Arnold and to other systems by Moser (twist maps). As we have established, the trajectories of an integrable Hamiltonian system are confined to invariant tori. The most general motion on the tori is quasi-periodic.
When this system is subjected to a weak non-linear perturbation, some of the invariant tori may be deformed but survive when they meet the non-resonance condition.
6.1.2
Diophantine condition
Let us look more closely at the convergence of the perturbation series in phase space. There must be some restrictions on the frequencies for convergence to hold. We may state this using two conditions:
Non-degeneracy condition: The unperturbed frequencies are given by
$$\vec\omega_0 = \nabla_J H_0.$$
In order that this be invertible we need the Hessian condition
$$\det\left[\frac{\partial \omega_{0\alpha}}{\partial J_\beta}\right] = \det\left[\frac{\partial^2 H_0}{\partial J_\alpha\, \partial J_\beta}\right] \neq 0.$$
This is so when the frequencies are non-degenerate.
Furthermore, if $H_0$ depends linearly on any $J$, the corresponding frequency does not vary from torus to torus. Then the tori cannot be characterised by a unique set of frequencies.
Second condition: The frequencies must satisfy
$$|\vec m \cdot \vec\omega| \geq \frac{\gamma(\epsilon)}{|\vec m|^k} \qquad \forall\, \vec m,$$
where $\gamma$ and $k$ are positive constants. Furthermore
$$\lim_{\epsilon \to 0} \gamma(\epsilon) \to 0.$$
This is known as the Diophantine condition from number theory and is a way of dealing with the small divisors problem.
Therefore for sufficiently small $\epsilon$ (how small?), frequencies $\vec\omega$ may be found that satisfy the above conditions for an appropriate $\gamma$ and define tori on which the perturbed flow is quasiperiodic.
The measure of such quasi-periodic frequencies increases as $\epsilon$ decreases. There will be aperiodic frequencies that do not define tori with quasiperiodic flows. Such tori break up when $\epsilon > 0$, with a measure $\mu|_{\epsilon = 0} = 0$.
Digression to number theory:
The KAM theorem mainly refers to irrational tori, which are preserved under sufficiently small perturbations, and applies to perturbed Hamiltonian systems. In this digression from the main theme, let us look at what is sufficiently irrational!
A straightforward way of looking at this aspect is to think of the vectors $\vec\omega_0$ and $\vec m$ as vectors in an integer lattice space. When they are orthogonal we have the situation where the denominators in CPT are singular. What $\gamma$ does is to define a width for the planes perpendicular to the $\vec\omega_0$ vector. The union of all such planes will be eliminated when the frequencies are sufficiently irrational. However, we are still left with a set of non-zero measure where the perturbation theory is valid. What happens to the rational frequencies is the content of yet another theorem, namely the Poincare-Birkhoff theorem, which we will discuss later.
To see how this comes about, qualitatively, consider the set of rational numbers $p/q$ in the interval $0 \leq p/q \leq 1$. These form a dense set, that is, the neighbourhood of any point, however small, contains rationals. In spite of being dense they occupy little space in the interval: since they are countable, their measure is zero.
Let us illustrate this by the following procedure: Order the rationals in the interval $[0, 1]$ (this is enough since we may take the ratios of frequencies with the largest frequency). Construct an interval of length $\epsilon$ about the first rational, $\epsilon^2$ about the second, and so on, with $\epsilon^n$ about the $n$-th. The total length of these intervals is then given by
$$\sigma = \epsilon + \epsilon^2 + \cdots = \frac{\epsilon}{1 - \epsilon}.$$
We may make $\sigma$ arbitrarily small by choosing $\epsilon$ as small as we want. Since many of these intervals may overlap, the actual length covered in this way is always less than $\sigma$.
A method more relevant to the KAM theorem is as follows: Each rational in the unit interval may be written as $p/q$ where $q > p$. For each integer $q$, there are at most $q - 1$ rationals in the unit interval. For example,
$$q = 5: \ \tfrac{1}{5}, \tfrac{2}{5}, \tfrac{3}{5}, \tfrac{4}{5}; \qquad q = 6: \ \tfrac{1}{6}, \tfrac{2}{6}, \tfrac{3}{6}, \tfrac{4}{6}, \tfrac{5}{6}.$$
However, $2/6 = 1/3$ and $4/6 = 2/3$ belong to $q = 3$, and $3/6 = 1/2$ belongs to $q = 2$. For every such $p/q$, construct an interval of size $1/q^3$, for example. The total length of these intervals is no more than $(q - 1)/q^3$ for each $q$. The sum of such intervals over all $q$ is more than the actual length $Q$ covered around the rationals in the unit interval. Therefore
$$Q < \sum_{q=2}^{\infty} \frac{q - 1}{q^3} < \sum_{q=2}^{\infty} \frac{1}{q^2} = \zeta(2) - 1 = \frac{\pi^2}{6} - 1,$$
where $\zeta(s)$ is the Riemann zeta function. The covered length may be decreased by replacing $1/q^3$ by $\gamma/q^3$ with $\gamma < 1$.
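The chain of inequalities above can be checked numerically. This is a minimal sketch under stated assumptions: the helper names `covered_length` and `crude_bound`, and the cut-off `q_max`, are introduced here for illustration only, and the sums are truncated at a finite $q$.

```python
import math

def covered_length(q_max, gamma=1.0, power=3):
    """Length of the union of intervals of size gamma/q**power centred on
    every reduced fraction p/q in (0, 1) with 2 <= q <= q_max."""
    intervals = []
    for q in range(2, q_max + 1):
        half = gamma / (2.0 * q**power)
        for p in range(1, q):
            if math.gcd(p, q) == 1:   # p/q not already counted at a smaller q
                centre = p / q
                intervals.append((max(0.0, centre - half), min(1.0, centre + half)))
    intervals.sort()
    total, lo_cur, hi_cur = 0.0, None, None
    for lo, hi in intervals:          # merge overlapping intervals
        if hi_cur is None or lo > hi_cur:
            if hi_cur is not None:
                total += hi_cur - lo_cur
            lo_cur, hi_cur = lo, hi
        else:
            hi_cur = max(hi_cur, hi)
    if hi_cur is not None:
        total += hi_cur - lo_cur
    return total

def crude_bound(q_max, gamma=1.0, power=3):
    """Truncated sum of gamma*(q - 1)/q**power, which overcounts the cover."""
    return sum(gamma * (q - 1) / q**power for q in range(2, q_max + 1))

print(covered_length(200), crude_bound(200), math.pi**2 / 6 - 1)
print(covered_length(200, gamma=0.1))
```

The actual covered length stays below the crude sum, which in turn stays below $\pi^2/6 - 1$; shrinking $\gamma$ leaves more of the unit interval uncovered.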
Every frequency $\omega_0$ that satisfies
$$\left|\omega_0 - \frac{p}{q}\right| < \frac{\gamma}{q^3}$$
for some $q$ lies in one of the covered intervals. The disconnected irrationals left uncovered by these intervals are those that satisfy the Diophantine condition
$$\left|\omega - \frac{p}{q}\right| \geq \frac{\gamma}{q^3} \qquad \forall\, q.$$
The measure of this set is non-zero and therefore there is a large number of irrationals that satisfy this condition. In fact, one may replace the power 3 by some higher power $k$ such that
$$\left|\omega - \frac{p}{q}\right| \geq \frac{\gamma}{q^k} \qquad \forall\, q.$$
The farther an irrational torus is from a rational one, the more likely it is to survive.
The distance of an irrational from the rationals is quantified through continued fractions. Note that every positive number $\alpha$ can be written as a continued fraction
$$\alpha = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cdots}}} = [a_0; a_1, a_2, a_3, \ldots].$$
If $\alpha$ is rational the continued fraction terminates, for example
$$\frac{1}{2} = [0; 2]; \qquad \frac{3}{5} = [0; 1, 1, 2].$$
If $\alpha$ is irrational then the continued fraction does not terminate:
$$\sqrt{2} = [1; 2, 2, 2, \ldots]$$
$$\sqrt{3} = [1; 1, 2, 1, 2, \ldots]$$
$$\sqrt{19} = [4; 2, 1, 3, 1, 2, 8, 2, 1, 3, 1, 2, 8, \ldots]$$
These are called quadratic irrationals since the pattern repeats indefinitely with some well defined period. On the other hand there is no such repetition in the case of transcendental numbers, for example
$$e = [2; 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, \ldots]$$
$$\pi = [3; 7, 15, 1, 292, 1, 1, 1, 2, 1, 3, 1, \ldots]$$
The most difficult irrational to approximate is the so-called golden ratio, given by
$$\frac{1 + \sqrt{5}}{2} = [1; 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \ldots].$$
Every such irrational may be approximated by its $k$-th approximant,
$$\alpha \approx \alpha_k = \frac{p_k}{q_k}, \qquad \alpha = \lim_{k \to \infty} \alpha_k,$$
with
$$\left|\alpha - \frac{p_k}{q_k}\right| < \frac{1}{q_k\, q_{k+1}}.$$
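The expansions and the approximation bound quoted above can be reproduced with exact arithmetic. The sketch below is illustrative only: the function names are assumptions, and `cf_sqrt` uses the standard integer recurrence for square roots, which the notes themselves do not describe.

```python
from fractions import Fraction
from math import isqrt, sqrt

def cf_rational(x, max_terms=50):
    """Partial quotients of a non-negative rational number (the expansion terminates)."""
    x = Fraction(x)
    terms = []
    while len(terms) < max_terms:
        a = x.numerator // x.denominator
        terms.append(a)
        x -= a
        if x == 0:
            break
        x = 1 / x
    return terms

def cf_sqrt(n, n_terms):
    """First n_terms partial quotients of sqrt(n) for a non-square n,
    computed with integers only."""
    a0 = isqrt(n)
    if a0 * a0 == n:
        raise ValueError("n must not be a perfect square")
    m, d, a = 0, 1, a0
    terms = [a0]
    while len(terms) < n_terms:
        m = d * a - m
        d = (n - m * m) // d
        a = (a0 + m) // d
        terms.append(a)
    return terms

def convergents(terms):
    """Convergents p_k/q_k of [a0; a1, a2, ...] as exact fractions."""
    p_prev, q_prev = 1, 0
    p, q = terms[0], 1
    out = [Fraction(p, q)]
    for a in terms[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        out.append(Fraction(p, q))
    return out

print(cf_rational(Fraction(3, 5)))
print(cf_sqrt(3, 5), cf_sqrt(19, 13))
approximants = convergents(cf_sqrt(2, 10))
for k in range(len(approximants) - 1):
    bound = 1.0 / (approximants[k].denominator * approximants[k + 1].denominator)
    print(approximants[k], abs(sqrt(2) - approximants[k]) < bound)
```

The last loop checks $|\alpha - p_k/q_k| < 1/(q_k q_{k+1})$ for the first few approximants of $\sqrt{2}$ in floating point, which is adequate only while the denominators remain small.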
Let us relate this to the statements of KAM theory. Suppose the torus corresponding to $J(\omega)$ has an irrational frequency $\omega$, whose $n$-th convergent is $\omega_n = p_n/q_n$. The closest rational torus is then $J(\omega_n)$ with a period $q < q_n$. Thus if $\gamma/q_n^k > 1/(q_n q_{n+1})$, then
$$|\omega - \omega_n| < \frac{\gamma}{q_n^k},$$
which does not satisfy the Diophantine condition, leading to the breakdown of the torus.
To use the above in the case of two degrees of freedom, we simply replace the frequency by the ratio of two frequencies, that is
$$\left|\frac{\omega_1}{\omega_2} - \frac{p}{q}\right| > \frac{\gamma}{q^k} \qquad \forall\, q,$$
where $k \simeq 2.5$ according to KAM.
6.1.3
For simplicity let us consider an integrable Hamiltonian system with two degrees of freedom, and the situation when it is perturbed. KAM theory relates to what happens to the irrational tori of such a system and provides a condition for the survival of such tori with small deformations. The Poincare-Birkhoff fixed point theorem relates to the case when the frequencies are rational.
Theorem: For the rational curve of an unperturbed system with rotation number r/s, only an even number of fixed points survive under perturbation.
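Consider an integrable Hamiltonian system $H(I_1, I_2)$ where $I_1, I_2$ are action variables (constants of motion) in involution. As a result
$$\dot I_1 = -\frac{\partial H}{\partial \theta_1} = 0; \qquad \dot I_2 = -\frac{\partial H}{\partial \theta_2} = 0.$$
Furthermore,
$$\dot\theta_1 = \frac{\partial H}{\partial I_1} = \omega_1(I_1, I_2); \qquad \dot\theta_2 = \frac{\partial H}{\partial I_2} = \omega_2(I_1, I_2).$$
Therefore
$$\theta_1 = \omega_1 t + \theta_1(0), \qquad \theta_2 = \omega_2 t + \theta_2(0).$$
The two periods are given by
$$T_1 = \frac{2\pi}{\omega_1}; \qquad T_2 = \frac{2\pi}{\omega_2}.$$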
The motion in phase space takes place on a 2-torus. If the ratio of the frequencies is irrational then the motion is quasi-periodic, otherwise the trajectories close.
One way of analysing the motion is through the Poincare surface of section, where the motion is viewed through a slice or section of the torus. A surface of section map is given either in the $(I_1, \theta_1)$ or the $(I_2, \theta_2)$ plane. As a phase trajectory moves around the torus it intersects a (circular) section periodically. For example consider a section in the $(I_1, \theta_1)$ plane. Successive intersections with this plane may be written as
$$(\theta_1)_{i+1} = (\theta_1)_i + \omega_1 T_2,$$
$$(I_1)_{i+1} = (I_1)_i,$$
that is, as the angle $\theta_1$ moves around the circle the trajectory intersects the section again after a period $T_2$ corresponding to the motion around the torus. This is indeed a map, since on the surface of section we are looking at the intersections after a finite time period. This is an area preserving map with the Jacobian
$$J = \left| \frac{\partial\big((\theta_1)_{i+1}, (I_1)_{i+1}\big)}{\partial\big((\theta_1)_i, (I_1)_i\big)} \right| = 1.$$
The curve $C$ generated by this map $T$ is an invariant curve since every subsequent iterate lies on the same curve,
$$C = T(C).$$
Furthermore, since $T_2 = 2\pi/\omega_2$ we have
$$(\theta_1)_{i+1} = (\theta_1)_i + 2\pi \frac{\omega_1}{\omega_2} = (\theta_1)_i + 2\pi \alpha(I_1, I_2),$$
where $\alpha = \omega_1/\omega_2$ is called the winding number, since it counts the number of times the map has gone around the curve $C$. For example, if $\alpha = 1/n$, then the map returns to the same point after every $n$ periods. In general if $\alpha$ is rational, only a finite number of points on the curve $C$ is visited. If $\alpha$ is irrational then the map never returns to the same point, so that the motion that results on the torus is quasiperiodic. The transformation
$$(\theta, I) \to (\theta', I' = I)$$
is a canonical transformation and is area preserving.
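The difference between rational and irrational winding numbers can be seen by iterating the angle map directly. The sketch below is illustrative: the function names are assumptions, angles are measured in units of $2\pi$, and the golden mean is used only as a convenient irrational example.

```python
from fractions import Fraction
import math

def orbit_rational(alpha, theta0=Fraction(0), n_iter=1000):
    """Distinct points (in units of 2*pi) visited by theta -> theta + 2*pi*alpha
    for a rational winding number alpha, using exact arithmetic."""
    theta = Fraction(theta0) % 1
    visited = set()
    for _ in range(n_iter):
        visited.add(theta)
        theta = (theta + alpha) % 1
    return visited

def closest_return(alpha, n_iter):
    """Smallest distance (in units of 2*pi) between the starting point and any
    of the first n_iter iterates, for a floating-point winding number."""
    best = 1.0
    theta = 0.0
    for _ in range(n_iter):
        theta = (theta + alpha) % 1.0
        best = min(best, theta, 1.0 - theta)
    return best

print(len(orbit_rational(Fraction(2, 7))))   # number of distinct points for alpha = 2/7
golden = (math.sqrt(5.0) - 1.0) / 2.0
print(closest_return(golden, 10), closest_return(golden, 1000))
```

For $\alpha = r/s$ in lowest terms only $s$ points are visited, whereas for the irrational value the orbit comes ever closer to its starting point without returning to it exactly.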
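Let us assume that this canonical transformation
$$\theta' = \theta + 2\pi\alpha$$
is generated by a type II function $F(I', \theta) = I'\theta + S_0(I')$, where the subscripts are omitted for simplicity. We have
$$I = \frac{\partial F}{\partial \theta} = I', \qquad \theta' = \frac{\partial F}{\partial I'} = \theta + \frac{\partial S_0(I')}{\partial I'},$$
where $\theta$ is just the initial value of $\theta'$. This map generates an invariant curve $C$, since under the action of the map $C$ is mapped on to itself.
Consider a small perturbation of this map, generated by
$$F = I'\theta + S_0(I') + \epsilon S_1(I', \theta)$$
such that
$$I = I' + \epsilon \frac{\partial S_1}{\partial \theta}, \qquad \theta' = \theta + \frac{\partial S_0(I')}{\partial I'} + \epsilon \frac{\partial S_1}{\partial I'}.$$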
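Let us denote this perturbed map by $T_\epsilon$. According to the KAM theorem most of the tori are preserved if $\epsilon$ is sufficiently small. Consider a set of invariant curves of the unperturbed map $T$, which form a family of nested tori, that is
$$T(C^+) = C^+; \qquad T(C) = C; \qquad T(C^-) = C^-,$$
where $C$ has the rational winding number $\alpha = r/s$ and $C^\pm$ are neighbouring curves, at actions on either side of that of $C$, with winding numbers $\alpha^\pm$. On $C$ the map repeated $s$ times gives
$$\theta' = \theta + 2\pi s \alpha = \theta + 2\pi r \equiv \theta.$$
Therefore under $T^s$ every point on $C$ is a fixed point. However, under $T^s$, the curves $C^\pm$ rotate relative to $C$ in opposite directions. That is, on $C^-$ and $C^+$ respectively,
$$\theta' = \theta + 2\pi s \alpha^-, \qquad \theta' = \theta + 2\pi s \alpha^+,$$
where $s\alpha^- - r$ and $s\alpha^+ - r$ have opposite signs.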
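When the perturbation is switched on, the first-order generating function has a Fourier expansion of the form
$$S_1 \propto \sum_{\vec m} \frac{H_{1\vec m}\, e^{i \vec m \cdot \vec\theta}}{\vec\omega \cdot \vec m},$$
and on a rational torus there always exists a set of integers such that the Fourier expansion becomes singular. Thus all rational tori corresponding to $\alpha = r/s$ are destroyed. However, for sufficiently small $\epsilon$, and if $\alpha^\pm$ are sufficiently irrational in the neighbourhood of $\alpha$, the tori corresponding to $C^\pm$ may be preserved. That is
$$T_\epsilon(C^\pm) = C^\pm.$$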
As in the case of $T^s$, the map $T_\epsilon^s$ leaves $C^\pm$ rotating in opposite directions. Along any radial line there is at least one point where the angular coordinate is preserved under the map $T_\epsilon^s$. Let $R$ be the curve joining all such points. Obviously
$$T_\epsilon(R) \neq R,$$
since $R$ is not an invariant curve of the map. Let us assume that $T_\epsilon^s$ leaves $C^\pm$ unchanged; $R$ is therefore a curve on which $\theta$ is preserved but not the radius. That is, not all points on $R$ are fixed points of the perturbed map repeated $s$ times. But there may be some fixed points of the map, and these must lie on $R$. To find these fixed points subject $R$ to the map $T_\epsilon^s$ again:
$$T_\epsilon^s(R) = R'.$$
Since this is an area preserving map, $R'$ must intersect $R$ at an even number of points. These must be the fixed points of the map $T_\epsilon^s$. This is essentially the content of the Poincare-Birkhoff theorem.
The PB theorem applies to rational tori, which are destroyed under perturbations, whereas the KAM theorem applies to irrational tori, which are stable under sufficiently small perturbations. The two theorems are complementary to each other.
6.2
Adiabatic Theory
Conservative Hamiltonian systems are especially simple: the phase curves, especially in one degree of freedom, are simply constant energy curves. However, when $H$ depends explicitly on time this simplicity is lost. Nevertheless, if there exists a related conservative $H$, then one can often obtain approximations.
There are two extreme possibilities:
The motion is approximately periodic: the period and the Hamiltonian change very little over a period (adiabatic condition). A simple pendulum with a slowly varying length, quartz oscillators, and the motion of planets due to the changing mass of the Sun (one part in $10^{13}$ in a year) are some examples.
An external periodic perturbation such that the period of the perturbation is very small compared to that of the unperturbed motion.
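For a particle of mass $m$ bouncing between two parallel planes, the Hamiltonian and the momentum are
$$H = E = \frac{p^2}{2m}, \qquad p = \pm\sqrt{2mE}.$$
The action is
$$I = \frac{1}{2\pi} \oint p\, dq = \frac{1}{2\pi}\, 2\sqrt{2mE}\; x = \frac{m v x}{\pi}, \qquad E = \frac{1}{2m} \left(\frac{\pi I}{x}\right)^2,$$
where $x$ is the distance between the planes, which is fixed for now, and $v$ is the velocity. We shall put $m = 1$ hereafter.
Now consider the case when the planes are moving relative to each other at a velocity $V$, such that
$$0 < |V| \ll |v|.$$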
For simplicity we may fix one plane and let the other move. We adopt the convention that $V$ is positive when the plane moves towards the other plane, and negative when it moves away.
Since $|V| \ll v$, the planes move very little between two successive collisions. Therefore,
$$\Delta t \approx \frac{2x}{v},$$
where $2x$ is the approximate distance travelled between two collisions. If $v'$ is the velocity after a collision with the moving plane, then
$$v' = v + 2V,$$
so that
$$\frac{dv}{dt} \approx \frac{v' - v}{\Delta t} \approx \frac{V}{x}\, v.$$
Since $I = vx/\pi$, the change in action is given by
$$\pi \frac{dI}{dt} = x \frac{dv}{dt} + v \frac{dx}{dt} = Vv - vV = 0.$$
Thus the action $I$ remains approximately constant. This is easy to see, since as $v$ increases $x$ decreases and vice versa.
One can make this more precise: Suppose $v_{n+1}$ is the velocity after the $n$th collision with the moving plane ($n = 0, 1, 2, \ldots$); then
$$v_{n+1} = v_n + 2V \;\Rightarrow\; v_n = v_0 + 2nV,$$
where $v_0$ is the initial velocity. If $x_n$ is the separation of the planes at the instant of the $n$th collision, then
$$x_n - x_{n+1} = V \Delta t_n,$$
$$x_{n+1} + x_n = v_{n+1} \Delta t_n.$$
Therefore
$$x_{n+1} = \frac{1}{2} (v_{n+1} - V)\, \Delta t_n = \frac{v_{n+1} - V}{v_{n+1} + V}\, x_n = \frac{1 + z_n}{1 + 3 z_n}\, x_n,$$
where
$$z_n = \frac{V}{v_0 + 2nV} < 1.$$
The ratio of successive actions is then
$$\frac{I_{n+1}}{I_n} = \frac{v_{n+1}\, x_{n+1}}{v_n\, x_n} = 1 + \frac{2 z_n^2}{1 + 3 z_n},$$
and correspondingly
$$E_n = \frac{1}{2} v_n^2 = \frac{1}{2} (v_0 + 2nV)^2; \qquad x_{n+1} = \frac{1 + z_n}{1 + 3 z_n}\, x_n.$$
Even though $x_n \to 0$ and $E_n \to \infty$ as $n \to \infty$ (when the walls are approaching each other), $I_n$ remains approximately constant. That is, the action changes very little under adiabatic conditions. If the planes move apart, then the amplitude $x$ increases but the energy decreases; however the action again remains approximately constant.
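The recurrences above are easy to iterate numerically. The following is a minimal sketch with unit mass; the function name `bounce` and the parameter values are assumptions chosen for illustration.

```python
import math

def bounce(v0, x0, V, n_collisions):
    """Iterate v_{n+1} = v_n + 2V and x_{n+1} = x_n (v_{n+1} - V)/(v_{n+1} + V).
    Returns lists of the speed v_n, the separation x_n and the action I_n = v_n x_n / pi."""
    v, x = v0, x0
    speeds, gaps, actions = [v], [x], [v * x / math.pi]
    for _ in range(n_collisions):
        v_next = v + 2.0 * V                    # bounce off the slowly moving plane
        x = x * (v_next - V) / (v_next + V)     # separation at the next collision
        v = v_next
        speeds.append(v)
        gaps.append(x)
        actions.append(v * x / math.pi)
    return speeds, gaps, actions

speeds, gaps, actions = bounce(v0=1.0, x0=1.0, V=1e-3, n_collisions=500)
print("energy ratio E_N/E_0 =", (speeds[-1] / speeds[0]) ** 2)
print("gap ratio    x_N/x_0 =", gaps[-1] / gaps[0])
print("action ratio I_N/I_0 =", actions[-1] / actions[0])
```

With these values the energy grows by a factor of about four and the gap roughly halves, while the action drifts only by a small fraction of a percent.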
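An elementary application of this is to a gas of non-interacting atoms in a cubic box of side $L$. If the walls move adiabatically, then $vL \approx$ constant. But the temperature is given by $kT \propto mv^2/2$. Since $L \propto V^{1/3}$, where $V$ is now the volume, and $v \propto T^{1/2}$, we have
$$T V^{2/3} = \text{constant},$$
which is the adiabatic law for an ideal monatomic gas.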
6.2.1
where $\epsilon$ is fixed such that the Hamiltonian changes significantly only over variations of $\lambda$ of order unity. Then
$$\frac{\partial H}{\partial t} = \epsilon \frac{\partial H}{\partial \lambda}.$$
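$$K = H(I, \lambda) + \frac{\partial F_1}{\partial t} = H(I, \lambda) + \epsilon \frac{\partial F_1}{\partial \lambda}.$$
$$R_1 = \frac{\partial^2 F_1}{\partial I\, \partial \lambda},$$
$$\dot I = \epsilon R_2(\theta, I, \lambda): \qquad R_2 = -\frac{\partial^2 F_1}{\partial \theta\, \partial \lambda}.$$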
6.3
6.3.1
Fast perturbations
F sin t
m 2
F cos t
and
1
2
Thus the average motion always follows the unperturbed motion given by $q_0(t) = vt$, which is linear in time. Without further proof we may in general assume that
$$q(t) = q_0(t) + O(1/\Omega^2), \qquad p(t) = p_0(t) + O(1/\Omega) \qquad (6.4)$$
in the presence of fast perturbations, that is when $\Omega$ is large. Thus in the ordering of the perturbation we assume that $\eta^2$ and $\xi$ are of the same order, where $\xi = q - q_0$ and $\eta = p - p_0$. We will use this fact presently.
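Consider now a general Hamiltonian with a perturbing field that is rapidly oscillating. We want to find the Hamiltonian of the mean motion. Let
$$H(p, q, t) = H_0(p, q) + V(q) \sin \Omega t, \qquad (6.5)$$
where $V(q) \sin \Omega t$ is an external perturbation. In general the solution may be written as a combination of a smooth and an oscillating part:
$$q(t) = q_0(t) + \xi(t), \qquad p(t) = p_0(t) + \eta(t),$$
with
$$\dot q = \dot q_0 + \dot\xi = \frac{\partial H}{\partial p} = \frac{\partial H_0}{\partial p}, \qquad \dot p = \dot p_0 + \dot\eta = -\frac{\partial H}{\partial q}.$$
Now consider making a Taylor expansion of the full Hamiltonian given in Eq.(6.5) in two variables around the variables of the mean motion $(p_0, q_0)$:
$$H_0(p, q) = H_0(p_0, q_0) + \eta \frac{\partial H_0}{\partial p_0} + \frac{\eta^2}{2} \frac{\partial^2 H_0}{\partial p_0^2} + \xi \frac{\partial H_0}{\partial q_0} + \cdots$$
and
$$V(q) = V(q_0) + \xi \frac{\partial V}{\partial q_0} + \cdots,$$
where we have used the fact that $\xi$ and $\eta^2$ are of the same order.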
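The equations of motion to first order in $\xi$ are given by
$$\dot q = \dot q_0 + \dot\xi = \frac{\partial H_0}{\partial p_0} + \eta \frac{\partial^2 H_0}{\partial p_0^2} + \frac{\eta^2}{2} \frac{\partial^3 H_0}{\partial p_0^3} + \xi \frac{\partial^2 H_0}{\partial q_0\, \partial p_0} + \cdots$$
$$\dot p = \dot p_0 + \dot\eta = -\left[\frac{\partial H_0}{\partial q_0} + \eta \frac{\partial^2 H_0}{\partial p_0\, \partial q_0} + \frac{\eta^2}{2} \frac{\partial^3 H_0}{\partial p_0^2\, \partial q_0} + \xi \frac{\partial^2 H_0}{\partial q_0^2}\right] - \left[\frac{\partial V}{\partial q_0} + \xi \frac{\partial^2 V}{\partial q_0^2}\right] \sin \Omega t + \cdots$$
In order to get the mean motion we average over the period of the rapid oscillations, with $\langle \xi \rangle = 0 = \langle \eta \rangle$. We have
$$\langle \dot q \rangle = \frac{\partial H_0}{\partial p_0} + \frac{\langle \eta^2 \rangle}{2} \frac{\partial^3 H_0}{\partial p_0^3} + \cdots$$
$$\langle \dot p \rangle = -\frac{\partial H_0}{\partial q_0} - \frac{\langle \eta^2 \rangle}{2} \frac{\partial^3 H_0}{\partial p_0^2\, \partial q_0} - \langle \xi \sin \Omega t \rangle \frac{\partial^2 V}{\partial q_0^2}.$$
The equations of motion of the oscillatory terms, keeping terms up to the leading order only, are given by
$$\dot\xi = \eta \frac{\partial^2 H_0}{\partial p_0^2} + \cdots$$
and
$$\dot\eta = -\frac{\partial V}{\partial q_0} \sin \Omega t + \cdots.$$
Assuming $q_0, p_0$ vary very little over a period, the approximate solutions for the oscillating parts $\xi(t), \eta(t)$ from the above equations are therefore given by
$$\eta(t) = \frac{\cos \Omega t}{\Omega} \frac{\partial V}{\partial q_0}$$
and
$$\xi(t) = \frac{\sin \Omega t}{\Omega^2} \frac{\partial V}{\partial q_0} \frac{\partial^2 H_0}{\partial p_0^2}.$$
Further we also have, for the averages,
$$\langle \xi(t) \sin \Omega t \rangle = \frac{1}{2\Omega^2} \frac{\partial V}{\partial q_0} \frac{\partial^2 H_0}{\partial p_0^2}$$
and
$$\langle \eta^2(t) \rangle = \frac{1}{2\Omega^2} \left(\frac{\partial V}{\partial q_0}\right)^2.$$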
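We now substitute these results in the equations of mean motion:
$$\langle \dot q \rangle = \frac{\partial H_0}{\partial p_0} + \frac{1}{4\Omega^2} \left(\frac{\partial V}{\partial q_0}\right)^2 \frac{\partial^3 H_0}{\partial p_0^3}$$
and
$$\langle \dot p \rangle = -\frac{\partial H_0}{\partial q_0} - \frac{1}{4\Omega^2} \left(\frac{\partial V}{\partial q_0}\right)^2 \frac{\partial^3 H_0}{\partial p_0^2\, \partial q_0} - \frac{1}{2\Omega^2} \frac{\partial V}{\partial q_0} \frac{\partial^2 H_0}{\partial p_0^2} \frac{\partial^2 V}{\partial q_0^2}.$$
Both these equations may be combined and written in the form of Hamiltonian equations, albeit with a new Hamiltonian $K$,
$$\langle \dot q \rangle = \frac{\partial K}{\partial p_0} \qquad \text{and} \qquad \langle \dot p \rangle = -\frac{\partial K}{\partial q_0},$$
where
$$K(q_0, p_0) = H_0 + \frac{1}{4\Omega^2} \left(\frac{\partial V}{\partial q_0}\right)^2 \frac{\partial^2 H_0}{\partial p_0^2}, \qquad (6.6)$$
which is the desired effective Hamiltonian of the mean motion. Note that the correction is of order $1/\Omega^2$.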
6.3.2
$$H = \frac{p^2}{2ml^2} - ml(g - \ddot F) \cos \theta,$$
where $F(t)$ is the vertical displacement of the point of suspension. This is indeed a nice form, since the forced vertical movement can at best alter the acceleration, hence shift $g$.
Now consider the case when $F$ is given by an oscillating form:
$$F(t) = A \sin \Omega t.$$
This is of the same form as the rapidly oscillating perturbation that we considered in the general theory in the previous section. The Hamiltonian may be written in the form
$$H = \frac{p^2}{2ml^2} - mlg\left(1 + \frac{A\Omega^2}{g} \sin \Omega t\right) \cos \theta.$$
Rather than solving this explicitly, we may consider the Hamiltonian of the mean motion $K$ given in Eq.(6.6). The additional potential due to the vibrations is given by
$$V(q) \sin \Omega t = -ml\Omega^2 A \sin \Omega t \cos \theta.$$
Substituting this in Eq.(6.6) we have
$$K(p_0, \theta) = \frac{p_0^2}{2ml^2} - mgl\left[\cos \theta - k \sin^2 \theta\right] = \frac{p_0^2}{2ml^2} + V_{eff}(\theta), \qquad (6.7)$$
where
$$k = \frac{A^2 \Omega^2}{4gl}.$$
The fixed points of the system are given by $p_0 = 0$ and $\theta = 0$, $\theta = \pi$, and $\cos \theta = -1/2k$. The first two are the old fixed points of the pendulum, and the third is due to the effective potential. At $\theta = \pi$,
$$\frac{d^2 V_{eff}}{d\theta^2} = mgl(2k - 1).$$
Thus the fixed point $\theta = \pi$ is always stable if $2k > 1$, or equivalently
$$\Omega^2 > 2gl/A^2,$$
that is, for fast enough oscillations the originally unstable fixed point can be made stable.
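The stability criterion can be checked against a direct integration of the full, time-dependent equation of motion $\ddot\theta = -(g + A\Omega^2 \sin \Omega t)\sin\theta / l$ that follows from the Hamiltonian above. This is a rough numerical sketch: the function name, the fourth-order Runge-Kutta scheme, and all parameter values (amplitude, length, initial offset, run time) are assumptions made for illustration.

```python
import math

def max_deviation_from_top(omega_drive, amplitude=0.05, length=1.0, g=9.81,
                           offset=0.1, t_end=3.0, dt=1e-4):
    """Integrate theta'' = -(g + A W^2 sin(W t)) sin(theta) / l with RK4,
    starting at rest at theta = pi - offset, and return max |theta - pi|."""
    def accel(t, theta):
        g_eff = g + amplitude * omega_drive**2 * math.sin(omega_drive * t)
        return -(g_eff / length) * math.sin(theta)

    theta, rate, t = math.pi - offset, 0.0, 0.0
    worst = abs(theta - math.pi)
    for _ in range(int(round(t_end / dt))):
        k1_th, k1_w = rate, accel(t, theta)
        k2_th, k2_w = rate + 0.5 * dt * k1_w, accel(t + 0.5 * dt, theta + 0.5 * dt * k1_th)
        k3_th, k3_w = rate + 0.5 * dt * k2_w, accel(t + 0.5 * dt, theta + 0.5 * dt * k2_th)
        k4_th, k4_w = rate + dt * k3_w, accel(t + dt, theta + dt * k3_th)
        theta += dt * (k1_th + 2 * k2_th + 2 * k3_th + k4_th) / 6.0
        rate += dt * (k1_w + 2 * k2_w + 2 * k3_w + k4_w) / 6.0
        t += dt
        worst = max(worst, abs(theta - math.pi))
    return worst

# With A = 0.05 and l = 1 the threshold Omega^2 = 2 g l / A^2 is near Omega = 89.
print("fast drive (Omega = 200):", max_deviation_from_top(200.0))
print("slow drive (Omega = 30): ", max_deviation_from_top(30.0))
```

Above the threshold the pendulum stays close to the inverted position, while below it the pendulum falls away from $\theta = \pi$.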
Chapter 7
Rigid Body Dynamics
Definition: A rigid body is a system of point particles held together by internal forces such that the distance between any two particles is always a constant. The internal forces may be imagined as due to light rigid rods connecting the pairs of particles. Such forces are usually regarded as forces of constraint, and by Newton's third law these forces of constraint do no work in the rigid motion of the system.
Obviously no solid body is ever perfectly rigid. The following theory is for ideal rigid bodies. In most applications however the deviation from the ideal rigid body is assumed to be insignificant.
A rigid body is characterised by six degrees of freedom. These are three translations and three rotations.
To describe the motion of a rigid body, let us consider two systems of coordinates: a fixed coordinate system $S: OXYZ$ and a coordinate system fixed at some point in the rigid body, $S^*: O^*X^*Y^*Z^*$. Note that the origin $O^*$ need not be the centre of mass in general. We call these the space fixed and body fixed coordinate systems respectively.
The motion of the rigid body can be represented as sum of two parts: The motion
of its centre of mass which is a translation of the rigid body without affecting its
orientation and the other part consists of rotation about the centre of mass whereby
the body moves to its final orientation. The rigid body motion can be described in
either of these frames depending on the convenience and simplicity. In what follows
we first describe the motion in the body fixed frame and later in the space fixed frame.
If $\vec v$ is the velocity of any point $P$ in the rigid body relative to the fixed system of coordinates, we have
$$\vec v = \vec V + \vec\omega \times \vec r^{\,*}, \qquad (7.1)$$
where $\vec V$ is the translational velocity of the centre of mass of the body, $\vec r^{\,*}$ is the position vector in the body fixed frame and $\vec\omega$ is the instantaneous angular velocity of the body, directed along the axis of rotation as seen before in chapter 2. In what follows the origin of the moving system of coordinates is taken to be at the centre of mass of the body, so that the axis of rotation passes through it. Considering the rigid body as a discrete system of particles, the kinetic energy may be written as
$$T = \sum_i \frac{1}{2} m_i (\vec V + \vec\omega \times \vec r_i)^2 = \frac{1}{2} M V^2 + \sum_i m_i \vec V \cdot (\vec\omega \times \vec r_i) + \frac{1}{2} \sum_i m_i (\vec\omega \times \vec r_i)^2. \qquad (7.2)$$
[Figure: space fixed frame with origin $O$ and body fixed frame with origin $O^*$ (axes $X^*$, $Z^*$); a point $P$ with the position vectors $\vec r$, $\vec r^{\,*}$ and $\vec R$.]
7.1
Inertia Tensor
Let us now consider the motion of the rigid body such that only its orientation is changing; we can always add the translational motion of the centre of mass to get the more general motion of the rigid body. That is, we consider the motion of a rigid body with one point fixed, say at $O$. We choose the frame $S$ such that its origin is at $O$ and call $S: OXYZ$ the laboratory frame or the space fixed frame. Relative to $S$ the total angular momentum of a system of $n$ particles is given by
$$\vec L = \sum_{i=1}^{n} \vec r_i \times \vec p_i, \qquad (7.4)$$
where $\vec r_i$ is the position vector of a typical particle of the system and $\vec p_i = m_i \vec v_i$ is its linear momentum. Assume that there is one fixed point in the rigid body with respect to $S$ such that the body can only rotate about an axis passing through the fixed point. Without loss of generality we can choose the origin of $S$ to be the fixed point itself.
We may now introduce a body fixed frame $S^*: OX^*Y^*Z^*$ which rotates along with the rigid body. Evidently the frames $S$ and $S^*$ have a common origin in $O$. If $\vec r^{\,*}$ is the relative coordinate in $S^*$ we have the relation
$$\frac{d\vec r}{dt} = \frac{d^*\vec r^{\,*}}{dt} + \vec\omega \times \vec r^{\,*}. \qquad (7.5)$$
But $\vec r^{\,*}$ is fixed relative to $S^*$. Therefore the first term is zero in the above equation and therefore
$$\vec v = \vec\omega \times \vec r^{\,*}. \qquad (7.6)$$
We recall here that $\vec\omega$ is the instantaneous angular velocity of $S^*$ relative to $S$ and hence it is the angular velocity of the rigid body relative to $S$. We drop the superscript $*$ and choose $\vec r^{\,*} = \vec r$ even for the body fixed frame. Substituting Eq.(7.6) in Eq.(7.4) we get
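$$\vec L = \sum_{i=1}^{n} \vec r_i \times (m_i \vec v_i) = \sum_i m_i \left(r_i^2\, \vec\omega - (\vec r_i \cdot \vec\omega)\, \vec r_i\right), \qquad (7.7)$$
which shows that the angular momentum $\vec L$ and $\vec\omega$ are not in general parallel to each other. Denoting the components of $\vec L = (L_1, L_2, L_3)$, $\vec r_i = (x_{i1}, x_{i2}, x_{i3})$, $\vec\omega = (\omega_1, \omega_2, \omega_3)$ relative to the frame $S^*$, we have
$$L_\alpha = \sum_i m_i \sum_\beta \left[r_i^2 \delta_{\alpha\beta} - x_{i\alpha} x_{i\beta}\right] \omega_\beta = \sum_\beta I_{\alpha\beta}\, \omega_\beta, \qquad (7.8)$$
where $I_{\alpha\beta}$ defines the Cartesian inertia tensor of rank 2 of the rigid body,
$$I_{\alpha\beta} = \sum_{i=1}^{n} m_i \left[r_i^2 \delta_{\alpha\beta} - x_{i\alpha} x_{i\beta}\right]. \qquad (7.9)$$
In matrix form,
$$\begin{pmatrix} L_1 \\ L_2 \\ L_3 \end{pmatrix} = \begin{pmatrix} I_{11} & I_{12} & I_{13} \\ I_{21} & I_{22} & I_{23} \\ I_{31} & I_{32} & I_{33} \end{pmatrix} \begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{pmatrix}. \qquad (7.10)$$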
Note that the numbers $I_{\alpha\beta}$ are the components of the Cartesian tensor written in the frame $S^*$. In $S$ the inertia tensor would have components which are in general different from the $I_{\alpha\beta}$ given above. In the body fixed frame, the components are independent of time, by definition, whereas they are in general time dependent in the space fixed frame. However the two are related by an orthogonal transformation since the inertia tensor is real and symmetric. We will discuss this transformation later.
To understand the meaning of the inertia tensor components, first consider a diagonal element, say $I_{11}$. We have
$$I_{11} = \sum_{i=1}^{n} m_i \left[r_i^2 - x_{i1}^2\right] = \sum_{i=1}^{n} m_i \left[x_{i2}^2 + x_{i3}^2\right], \qquad (7.11)$$
which we immediately recognise as the moment of inertia of the rigid body about the x-axis. Similarly $I_{22}$ and $I_{33}$ are the moments of inertia of the rigid body about the y- and z-axis respectively. A typical non-diagonal component such as $I_{12}$ is given by
$$I_{12} = -\sum_{i=1}^{n} m_i\, x_{i1} x_{i2}. \qquad (7.12)$$
For a continuous body with mass density $\rho$ the sums are replaced by integrals, for example
$$I_{33} = \int dV\, \rho\, (y^2 + x^2); \qquad I_{12} = -\int dV\, \rho\, xy. \qquad (7.14)$$
Principal moments of inertia: The inertia tensor, being a real symmetric matrix, can be diagonalised by a rotation of the axes. Thus we can always choose the body fixed frame $S^*$ such that in this frame the matrix of the inertia tensor, $I^*$, is diagonal,
$$(I^*_{\alpha\beta}) = \begin{pmatrix} I_1 & 0 & 0 \\ 0 & I_2 & 0 \\ 0 & 0 & I_3 \end{pmatrix}. \qquad (7.15)$$
The eigenvalues $I_i$ of the inertia tensor are called the principal moments of inertia of the body. We shall assume from now on that the body fixed frame $S^*$ has been so chosen that the axes are along the three eigenvectors (principal directions) of the inertia tensor of the rigid body. Hence the above equation gives the inertia tensor relative to the body frame $S^*$.
Two important properties of the inertia tensor that we need to remember are the following:
The eigenvalues $I_\alpha$ of the inertia tensor are real and positive: This is seen by contracting the inertia tensor with an arbitrary vector $\vec a$,
$$\sum_{\alpha, \beta} I_{\alpha\beta}\, a_\alpha a_\beta = \sum_i m_i \left[r_i^2 a^2 - (\vec r_i \cdot \vec a)^2\right] \geq 0. \qquad (7.16)$$
Parallel Axis Theorem: The computation of the inertia tensor is usually difficult since it depends on where the fixed point is located. Let us assume that this fixed point is the centre of mass of the rigid body. Let us now displace the point by $\vec a$ from the centre of mass. We have by definition
$$I'_{\alpha\beta} = \sum_i m_i \left[(\vec r_i - \vec a)^2 \delta_{\alpha\beta} - (\vec r_i - \vec a)_\alpha (\vec r_i - \vec a)_\beta\right]. \qquad (7.18)$$
However all the terms linear in $\vec r_i$ vanish since $\sum_i m_i \vec r_i = 0$ when $\vec r_i$ is measured from the centre of mass. Therefore we have the theorem that
$$I'_{\alpha\beta} = I_{\alpha\beta} + M \left[a^2 \delta_{\alpha\beta} - a_\alpha a_\beta\right]. \qquad (7.19)$$
The first term is the inertia tensor about the centre of mass and the second term is the inertia tensor the body would have if its entire mass were concentrated at the centre of mass.
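The definition of the inertia tensor and the parallel axis theorem can be verified numerically for a small set of point masses. The sketch below is illustrative only: the function names, the four masses and the displacement vector are hypothetical test data, not taken from the notes.

```python
def inertia_tensor(masses, positions, origin=(0.0, 0.0, 0.0)):
    """I_ab = sum_i m_i [ r_i^2 delta_ab - x_ia x_ib ], with positions measured from origin."""
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, pos in zip(masses, positions):
        x = [pos[k] - origin[k] for k in range(3)]
        r2 = sum(c * c for c in x)
        for a in range(3):
            for b in range(3):
                tensor[a][b] += m * ((r2 if a == b else 0.0) - x[a] * x[b])
    return tensor

def centre_of_mass(masses, positions):
    total = sum(masses)
    return tuple(sum(m * p[k] for m, p in zip(masses, positions)) / total
                 for k in range(3))

# Hypothetical test body: four point masses.
masses = [1.0, 2.0, 3.0, 4.0]
positions = [(1.0, 0.0, 0.0), (0.0, 2.0, 1.0), (-1.0, 1.0, 0.5), (0.5, -1.0, 2.0)]
shift = (0.3, -0.7, 1.1)   # displacement a of the new origin from the centre of mass

com = centre_of_mass(masses, positions)
I_cm = inertia_tensor(masses, positions, origin=com)
new_origin = tuple(com[k] + shift[k] for k in range(3))
I_shifted = inertia_tensor(masses, positions, origin=new_origin)

# Right-hand side of the theorem: I_cm + M [ a^2 delta_ab - a_a a_b ].
M = sum(masses)
a2 = sum(c * c for c in shift)
I_theorem = [[I_cm[a][b] + M * ((a2 if a == b else 0.0) - shift[a] * shift[b])
              for b in range(3)] for a in range(3)]

for row_direct, row_theorem in zip(I_shifted, I_theorem):
    print(row_direct, row_theorem)
```

The directly computed tensor about the displaced point and the one obtained from the theorem agree up to rounding error.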
7.2
The equations of motion for a rigid body with one point fixed are simply the equations of motion for its total angular momentum components, that is,
$$\frac{d\vec L}{dt} = \vec N = \sum_i \vec r_i \times \vec F_i, \qquad (7.20)$$
where $\vec F_i$ is the external force on a particle of the rigid body situated at $\vec r_i$. As we have seen before, the internal forces binding the particles of the rigid body together do not contribute to the total external torque experienced by the rigid body, a consequence of Newton's third law of motion. Note that this equation holds in $S$ as well as $S^*$. In the body fixed frame we have
$$\frac{d^*\vec L}{dt} + \vec\omega \times \vec L = \vec N, \qquad (7.21)$$
which is more suited for analysis in the body fixed frame. If we now choose in the body fixed frame $\vec L = (L_1, L_2, L_3)$, $\vec\omega = (\omega_1, \omega_2, \omega_3)$, $\vec N = (N_1, N_2, N_3)$, we obtain from Eq.(7.21)
$$\frac{d^* L_1}{dt} + (\omega_2 L_3 - \omega_3 L_2) = N_1$$
$$\frac{d^* L_2}{dt} + (\omega_3 L_1 - \omega_1 L_3) = N_2$$
$$\frac{d^* L_3}{dt} + (\omega_1 L_2 - \omega_2 L_1) = N_3 \qquad (7.22)$$
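Now substitute $L_i = I_i \omega_i$, since the inertia tensor is diagonal in the body fixed frame by choice. Furthermore we note that the eigenvalues $I_i$ of the inertia tensor are constants in the body fixed frame. We then obtain
$$I_1 \frac{d^*\omega_1}{dt} + (I_3 - I_2)\, \omega_2 \omega_3 = N_1$$
$$I_2 \frac{d^*\omega_2}{dt} + (I_1 - I_3)\, \omega_3 \omega_1 = N_2$$
$$I_3 \frac{d^*\omega_3}{dt} + (I_2 - I_1)\, \omega_1 \omega_2 = N_3 \qquad (7.23)$$
These are the celebrated Euler equations of motion for a rigid body with one point fixed. Note that these equations are written in the body fixed frame, as otherwise the $I_i$ are not constants.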
Force free motion of a rigid body with one point fixed: When the torque $\vec N$ is not zero, it is difficult to integrate Euler's equations (7.23), as the components of the torque are seldom known relative to the body fixed frame $S^*$. However, when the motion is force free, that is when the torque is zero, we get
$$I_\alpha \frac{d^*\omega_\alpha}{dt} + (I_\gamma - I_\beta)\, \omega_\beta \omega_\gamma = 0; \qquad (\alpha, \beta, \gamma \ \text{cyclic in} \ 1, 2, 3). \qquad (7.24)$$
Note that when the torque is zero, that is for force free motion, the angular momentum is conserved in the space fixed frame $S$. We shall integrate these equations in some simple cases.
Spherical top: For a spherical top the moments of inertia are $I_1 = I_2 = I_3 = I$. The Euler equations (7.24) give $\vec\omega =$ constant in the body fixed frame. This also implies that $\vec\omega =$ constant in the space fixed frame, since $d^*\vec\omega/dt = d\vec\omega/dt$. Further we have $\vec L = I\vec\omega$, that is, the angular momentum and angular velocity vectors are parallel to each other, which does not hold in general.
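Axially symmetric top: In this case we can choose the axis of symmetry along the z-axis. Then $I_1 = I_2 = I$ and $I_3 = I' \neq I$. Then Eq.(7.24) becomes
$$I \frac{d^*\omega_1}{dt} + (I' - I)\, \omega_2 \omega_3 = 0$$
$$I \frac{d^*\omega_2}{dt} + (I - I')\, \omega_3 \omega_1 = 0$$
$$I' \frac{d^*\omega_3}{dt} = 0, \qquad \omega_3 = \text{constant} \qquad (7.25)$$
Defining
$$\Omega = \omega_3\, \frac{I - I'}{I}, \qquad (7.26)$$
we may combine the equations in (7.25) as follows:
$$\frac{d^*(\omega_1 + i\omega_2)}{dt} - \Omega(\omega_2 - i\omega_1) = \frac{d^*(\omega_1 + i\omega_2)}{dt} + i\Omega(\omega_1 + i\omega_2) = 0. \qquad (7.27)$$
The solution is $\omega_1 + i\omega_2 = A e^{-i\Omega t}$, that is,
$$\omega_1 = |A| \cos(\Omega t + \delta), \qquad \omega_2 = -|A| \sin(\Omega t + \delta). \qquad (7.28)$$
Thus it follows that $\vec\omega$ precesses around the z axis (that is, the axis about which the rigid body is symmetric) with an angular speed $|\Omega|$.
[Figure: precession of $\vec\omega$ about the $Z^*$ axis in the body fixed frame $X^*Y^*Z^*$, shown for the two cases $I' > I$ and $I' < I$.]
Consequently we have
$$L_3 = I' \omega_3, \ \text{which is constant}$$
$$L_1 = I\omega_1 = I|A| \cos(\Omega t + \delta)$$
$$L_2 = I\omega_2 = -I|A| \sin(\Omega t + \delta) \qquad (7.29)$$
For the Earth, the period of this precession is
$$T = \frac{2\pi}{|\Omega|} \approx 300 \ \text{days},$$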
a result first obtained by Euler in 1749, who also obtained 300 days. The effect was detected much later by Chandler in 1891, and the wobble period was found to be 427 days. The amplitude of the precession is variable and very small, approximately 10 metres from the North Pole at the surface of the earth. The discrepancy between the calculated and the observed period (435 days according to recent measurements) is attributed to the fact that the earth is not a perfectly rigid body: it has a fluid outer core, etc. This discrepancy is not well understood, though there are various theories.
7.3
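$$T = \frac{1}{2} \sum_{\alpha, \beta} I_{\alpha\beta}\, \omega_\alpha \omega_\beta = \frac{1}{2}\, \vec\omega \cdot \vec L. \qquad (7.30)$$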
The asymmetrical top: Let us now apply Euler's equations to the more complicated case of the free rotation of an asymmetric top. Here all three moments of inertia are different. In the body fixed frame, since the motion is force free, there are two constants of motion, $T$ and $L^2$. We may write these in terms of the angular velocity as
$$2T = I_1 \omega_1^2 + I_2 \omega_2^2 + I_3 \omega_3^2$$
$$L^2 = I_1^2 \omega_1^2 + I_2^2 \omega_2^2 + I_3^2 \omega_3^2 \qquad (7.31)$$
We have chosen the axes in the body fixed frame to be the principal axes of inertia. These two equations can be written in terms of the components of the angular momentum vector $\vec L$ as
$$2T = \frac{L_1^2}{I_1} + \frac{L_2^2}{I_2} + \frac{L_3^2}{I_3}$$
$$L^2 = L_1^2 + L_2^2 + L_3^2 \qquad (7.32)$$
Regarding $L_1, L_2, L_3$ as coordinates, the first of these equations describes an ellipsoid with semi-axes $\sqrt{2TI_1}$, $\sqrt{2TI_2}$, $\sqrt{2TI_3}$, while the second describes a sphere of radius $L$.
The vector $\vec L$ then moves relative to the axes of inertia of the asymmetrical top along the intersection of these two surfaces. The existence of an intersection is guaranteed by the condition
$$2TI_3 > L^2 > 2TI_1.$$
The radius of the sphere lies between the smallest and the largest semi-axes of the ellipsoid.
Let us look at how the paths of intersection, corresponding to the tip of the vector $\vec L$, change as $L^2$ varies for a given kinetic energy. For $L^2$ slightly greater than the lower limit $2TI_1$, the sphere intersects the ellipsoid in two small closed curves around the $x_1$ axis. As $L^2$ approaches the limit, the curves degenerate to two points at the poles. As $L^2$ increases the closed curves become larger, and for $L^2 = 2TI_2$ they become ellipses which intersect at the poles of the $x_2$ axis. As $L^2 \to 2TI_3$ they again shrink to the poles on the $x_3$ axis. The corresponding curves described by $\vec\omega$ are called polhodes (meaning pole paths in Greek).
Since the paths are always closed, the motion of the tip of the vector $\vec L$ is periodic. During one period the tip of the vector describes some conical surface and returns to its original position. Close to the $x_1$ and $x_3$ axes the paths are always nearly circles and lie in the neighbourhood of the poles. The paths which pass through the poles on the $x_2$ axis are large ellipses which cover the ellipsoid. This has to do with the stability of rotation of the top: rotation about the $x_1$ and $x_3$ axes is stable, in the sense that any slight disturbance makes the top deviate only slightly from the original path, so that the motion resembles the original motion. However, rotation about the $x_2$ axis is unstable, since any small deviation is sufficient to take the top away from the original motion.
[Figure: paths traced by the tip of $\vec L$ on the ellipsoid.]
Let us try to understand this in terms of the angular velocity. The Eulers
equations may be written for the asymmetrical body in the body fixed frame may be
written as
d 1
+ A2 3 = 0
dt
d 2
B3 1 = 0
dt
d 3
+ C1 2 = 0,
(7.33)
dt
164
where A = (I3 I2 )/I1 , B = (I3 I1 )/I2 , C = (I2 I1 )/I3 are all positive since
I3 > I2 > I1 by choice.
Suppose the rotation is almost about the major axis but not quite, 1 >> 2,3 .
Therefore we have
d 1
0
dt
d 2
B3 1 = 0
dt
d 3
+ C1 2 = 0
dt
If we define =
(7.34)
Thus the angular velocity traces an ellipse centred around the major axis.
Suppose the rotation is almost about the minor axis but not quite, 3 >> 2,1 .
Therefore we have
d 3
0
dt
d 2
B3 1 = 0
dt
d 1
+ A3 2 = 0
dt
If we define =
(7.35)
Again the angular velocity traces an ellipse centred around the minor axis.
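Next consider the case where the rotation is almost about the intermediate axis, $\omega_2 \gg \omega_{1,3}$.
Therefore we have
\[
\begin{aligned}
\frac{d\omega_2}{dt} &\approx 0, \\
\frac{d\omega_3}{dt} + C\,\omega_2\omega_1 &= 0, \\
\frac{d\omega_1}{dt} + A\,\omega_3\omega_2 &= 0.
\end{aligned}
\]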
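If we now define $\kappa^2 = AC\,\omega_2^2$, then with $\omega_2$ constant to this order both $\omega_1$ and $\omega_3$ satisfy
\[
\frac{d^2\omega_{1,3}}{dt^2} - \kappa^2\,\omega_{1,3} = 0,
\tag{7.36}
\]
whose general solution contains a growing exponential $e^{\kappa t}$.
The angular velocity diverges exponentially away from the intermediate axis.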
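As a numerical illustration of the three cases above, one may integrate Eq. (7.33) directly. The following Python sketch is only illustrative: the moments of inertia $I = (1, 2, 3)$, the perturbation size `eps`, the step `dt`, and the helper names `euler_rhs`, `rk4_step` and `max_deviation` are assumptions made here and are not part of the notes. A standard fourth-order Runge-Kutta step is used.

```python
def euler_rhs(w, A, B, C):
    """Right-hand side of Eq. (7.33) for w = (w1, w2, w3)."""
    w1, w2, w3 = w
    return (-A * w2 * w3, B * w3 * w1, -C * w1 * w2)


def rk4_step(w, dt, A, B, C):
    """Advance w by one classical fourth-order Runge-Kutta step."""
    def shifted(base, k, h):
        return tuple(b + h * ki for b, ki in zip(base, k))

    k1 = euler_rhs(w, A, B, C)
    k2 = euler_rhs(shifted(w, k1, dt / 2), A, B, C)
    k3 = euler_rhs(shifted(w, k2, dt / 2), A, B, C)
    k4 = euler_rhs(shifted(w, k3, dt), A, B, C)
    return tuple(wi + dt / 6 * (a + 2 * b + 2 * c + d)
                 for wi, a, b, c, d in zip(w, k1, k2, k3, k4))


def max_deviation(axis, I=(1.0, 2.0, 3.0), eps=1e-3, dt=1e-3, steps=20000):
    """Largest magnitude reached by the two initially small components.

    axis = 0, 1, 2 selects rotation mainly about x1, x2, x3 (I1 < I2 < I3).
    """
    I1, I2, I3 = I
    A, B, C = (I3 - I2) / I1, (I3 - I1) / I2, (I2 - I1) / I3
    w = [eps, eps, eps]
    w[axis] = 1.0
    w = tuple(w)
    worst = 0.0
    for _ in range(steps):
        w = rk4_step(w, dt, A, B, C)
        worst = max(worst, max(abs(w[i]) for i in range(3) if i != axis))
    return worst
```

With these assumed values, `max_deviation(0)` and `max_deviation(2)` stay of the order of `eps`, while `max_deviation(1)` grows to order one, in line with the instability of rotation about the intermediate axis.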
7.4 Eulerian angles
Until now we have considered the motion of the rigid body from the body fixed frame.
As such there was no need to introduce the orientational degrees of freedom. However,
a view from the space fixed frame $S$ is possible only after specifying the orientation of the rigid body in
that frame. As mentioned in the beginning, the number of degrees of freedom
of a rigid body is six. Three of these may be chosen as the coordinates
of the centre of mass, and the other three as any three angles which determine the orientation of the axes
of the body fixed frame $S^*$ relative to the axes of the space fixed system $S$. There
are many ways of choosing these angles; a convenient representation is in terms
of Eulerian angles.
Euler's Theorem: An arbitrary rotation can be expressed as a product of three
successive rotations about three distinct axes.
As before let us assume that the two origins coincide and are located at the centre of
mass of the rigid body. The line $ON$ denotes the line of nodes, that is, the
intersection of the $XY$ plane of the space fixed system with the $X^*Y^*$ plane of the
body fixed system. The line of nodes is evidently perpendicular to both the $Z$ and $Z^*$ axes.
We take its positive direction along the vector product $\hat{Z} \times \hat{Z}^*$.
The orientation of the $X^*Y^*Z^*$ axes with respect to $XYZ$ is defined in terms of the
Euler angles $\theta$, $\phi$, $\psi$. Here $\theta$ is the angle between the $Z$ and $Z^*$ axes, $\phi$ is the
angle between the $X$ axis and the line of nodes, and $\psi$ is the angle between the $X^*$
axis and the line of nodes $ON$. The angles $\phi$ and $\psi$ are measured around the axes $Z$ and $Z^*$
respectively in the direction given by the corkscrew rule. Furthermore it is clear that
\[
0 \le \theta \le \pi, \qquad 0 \le \phi, \psi \le 2\pi .
\]
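We now express the components of the angular velocity $\vec{\omega}$ along the body
fixed axes in the frame $S^*$ in terms of the Eulerian angles and their derivatives. First
consider the angular velocities $\dot\theta$, $\dot\phi$, $\dot\psi$ separately.
The angular velocity $\dot\theta$ is along the line of
nodes $ON$. Its components along $X^*Y^*Z^*$ are given by
\[
\omega_1 = \dot\theta\cos\psi, \qquad \omega_2 = -\dot\theta\sin\psi, \qquad \omega_3 = 0 .
\]
The angular velocity $\dot\phi$ is along the $Z$ axis and its components are given by
\[
\omega_1 = \dot\phi\sin\theta\sin\psi, \qquad \omega_2 = \dot\phi\sin\theta\cos\psi, \qquad \omega_3 = \dot\phi\cos\theta .
\]
The angular velocity $\dot\psi$ is along the $Z^*$ axis and therefore
\[
\omega_1 = 0, \qquad \omega_2 = 0, \qquad \omega_3 = \dot\psi .
\]
Collecting the components along each axis, we have
\[
\begin{aligned}
\omega_1 &= \dot\phi\sin\theta\sin\psi + \dot\theta\cos\psi, \\
\omega_2 &= \dot\phi\sin\theta\cos\psi - \dot\theta\sin\psi, \\
\omega_3 &= \dot\phi\cos\theta + \dot\psi .
\end{aligned}
\tag{7.37}
\]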
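The components in Eq. (7.37) are easy to evaluate numerically. The short Python sketch below is an illustration only; the function name `body_angular_velocity` and its argument order are assumptions made here. It takes the sign of the $\dot\theta\sin\psi$ term in $\omega_2$ to be negative, as follows from adding the three contributions listed above.

```python
import math


def body_angular_velocity(theta, psi, theta_dot, phi_dot, psi_dot):
    """Return (w1, w2, w3) along the body fixed axes, following Eq. (7.37)."""
    w1 = phi_dot * math.sin(theta) * math.sin(psi) + theta_dot * math.cos(psi)
    w2 = phi_dot * math.sin(theta) * math.cos(psi) - theta_dot * math.sin(psi)
    w3 = phi_dot * math.cos(theta) + psi_dot
    return w1, w2, w3
```

A useful check is that $\omega_1^2 + \omega_2^2 = \dot\phi^2\sin^2\theta + \dot\theta^2$ for any $\psi$, which is the combination that enters the kinetic energy of a symmetric top.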
Heavy symmetrical top: When the axes $X^*Y^*Z^*$ are chosen such that they are
along the principal axes of inertia of the body, the kinetic energy of rotation takes a
simple form. Consider for example an axially symmetric top, for which $I_1 = I_2 \neq I_3$.
A simple calculation gives the kinetic energy as
\[
T = \frac{1}{2} I_1(\omega_1^2 + \omega_2^2) + \frac{1}{2} I_3\,\omega_3^2
  = \frac{1}{2} I_1(\dot\phi^2\sin^2\theta + \dot\theta^2) + \frac{1}{2} I_3(\dot\phi\cos\theta + \dot\psi)^2 .
\tag{7.38}
\]
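The Lagrangian for a heavy symmetrical top (a symmetrical top in a gravitational field,
with potential energy $Mgl\cos\theta$) is given by
\[
L = \frac{1}{2} I_1(\dot\phi^2\sin^2\theta + \dot\theta^2) + \frac{1}{2} I_3(\dot\phi\cos\theta + \dot\psi)^2 - Mgl\cos\theta .
\tag{7.39}
\]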
(To be precise we should write $I_1' = I_1 + Ml^2$, since the fixed point is at the bottom
of the top and not at the centre of mass; by the parallel axis theorem this adds a
constant to the moment of inertia. However we shall continue to use $I_1$ with the understanding that it is about
the fixed point.) We immediately notice that the angles $\phi$ and $\psi$ do not explicitly
appear in the Lagrangian, but only their time derivatives. These are cyclic or ignorable
coordinates. The corresponding momenta are therefore conserved and are given by
\[
p_\psi = \partial L/\partial\dot\psi = I_3(\dot\psi + \dot\phi\cos\theta) = I_3\,\omega_3 = L_{Z^*},
\]
which is the angular momentum component about the $Z^*$ axis, and
\[
p_\phi = \partial L/\partial\dot\phi = I_1\dot\phi\sin^2\theta + I_3(\dot\psi + \dot\phi\cos\theta)\cos\theta = L_Z,
\]
which is the angular momentum component about the $Z$ axis.
Apart from the momenta, the energy $E$ of the heavy symmetrical top is also conserved,
\[
E = \frac{1}{2} I_1(\dot\phi^2\sin^2\theta + \dot\theta^2) + \frac{1}{2} I_3\,\omega_3^2 + Mgl\cos\theta .
\tag{7.40}
\]
[Figure: the Euler angles, showing the axes X, X*, Z* and the line of nodes ON.]
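From the two conserved momenta we can solve for $\dot\phi$ and $\dot\psi$:
\[
\dot\phi = \frac{L_Z - L_{Z^*}\cos\theta}{I_1\sin^2\theta}
\tag{7.41}
\]
and
\[
\dot\psi = \frac{L_{Z^*}}{I_3} - \cos\theta\,\frac{L_Z - L_{Z^*}\cos\theta}{I_1\sin^2\theta} .
\tag{7.42}
\]
If now we can solve for $\theta(t)$, we can integrate the above equations to get $\phi(t)$ and
$\psi(t)$. Using the constants of motion we can define the reduced energy as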
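\[
E' = E - \frac{L_{Z^*}^2}{2I_3} - Mgl ,
\tag{7.43}
\]
so that, on substituting for $\dot\phi$ in the energy,
\[
E' = \frac{1}{2} I_1\dot\theta^2 + V_{eff}(\theta),
\tag{7.44}
\]
\[
V_{eff}(\theta) = \frac{(L_Z - L_{Z^*}\cos\theta)^2}{2I_1\sin^2\theta} - Mgl(1 - \cos\theta) .
\tag{7.45}
\]
This can be integrated to give
\[
t = \int \frac{d\theta}{\sqrt{2\,[E' - V_{eff}(\theta)]/I_1}} ,
\tag{7.46}
\]
which is a Jacobian elliptic integral. Once we know $\theta(t)$, in principle the solutions for
$\phi$, $\psi$ may be written by using the same procedure, a rather difficult task in practice.
We shall discuss the motion arising from the above equations only qualitatively.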
Firstly, the time variation of $\psi$ describes the rotation of the top about its own $Z^*$ axis.
Since $L_{Z^*} = I_3\,\omega_3$ is a constant of motion, the top spins around this axis with uniform angular
velocity $\omega_3$.
Next consider the range of variation of the angle $\theta$, the inclination of the top with respect to the
$Z$ axis in the space fixed frame (as we normally view the top). The range of $\theta$ can be
obtained by noting that $E' \ge V_{eff}(\theta)$. The function $V_{eff}$ in general tends to infinity at $\theta = 0, \pi$,
and has a minimum between the two values. The range of allowed values of $\theta$ is
obtained from the two roots of the equation $E' = V_{eff}(\theta)$. Suppose we obtain $\theta_1 < \theta_2$
as the two roots, then we have the following cases:
• $(L_Z - L_{Z^*}\cos\theta) > 0$ for $\theta_1 \le \theta \le \theta_2$. In this case $\dot\phi$ does not change sign and
the axis of the top precesses about the vertical while oscillating up and down.
This is shown in the first figure below, where the curve traces the trajectory of
the $Z^*$ axis on the surface of a sphere whose centre is at the fixed point of the top.
• If $(L_Z - L_{Z^*}\cos\theta)$ changes sign for $\theta_1 \le \theta \le \theta_2$, then $\dot\phi$ changes sign. The
precession direction is opposite at the two ends of the range of $\theta$. The axis of the
top describes loops, as shown in the figure, as it moves around the vertical
in the space fixed frame.
• If one of the $\theta_i$ is a zero of $(L_Z - L_{Z^*}\cos\theta)$, then both $\dot\phi$ and
$\dot\theta$ vanish on the corresponding limiting circle; the path then has cusps as
shown in the figure.
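[Figure: paths traced by the axis of the top on a sphere, between the limiting circles $\theta_1$ and $\theta_2$, for the three cases above.]
For a top spinning about the vertical ($\theta = 0$, so that $L_Z = L_{Z^*}$), expanding the effective potential for small $\theta$ gives
\[
V_{eff} \approx \left(\frac{L_{Z^*}^2}{8I_1} - \frac{Mgl}{2}\right)\theta^2 ,
\]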
So if $L_{Z^*}^2 > 4I_1 Mgl$, or equivalently $\omega_3^2 > 4I_1 Mgl/I_3^2$, the top is stable; otherwise it is unstable.
This is something we actually observe in the motion of a top: when the angular velocity
is large according to the above condition, a slight disturbance is followed by a return
to the stable position. However, due to friction the angular velocity decreases,
and below the critical value even a slight disturbance will take the top away from the
equilibrium rotation.
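The stability condition above is simple enough to check numerically. The following Python sketch is illustrative only; the function names `sleeping_top_is_stable` and `critical_spin` and the sample values used with them are assumptions made here, and the moment of inertia `I1` is understood to be taken about the fixed point.

```python
def sleeping_top_is_stable(I1, I3, omega3, M, g, l):
    """True when (I3 * omega3)**2 exceeds 4 * I1 * M * g * l."""
    return (I3 * omega3) ** 2 > 4.0 * I1 * M * g * l


def critical_spin(I1, I3, M, g, l):
    """Spin omega3 at which the vertical top changes from stable to unstable."""
    return 2.0 * (I1 * M * g * l) ** 0.5 / I3
```

For example, with the assumed values `I1 = 1.0`, `I3 = 0.5`, `M = 1.0`, `g = 9.8`, `l = 0.5`, a spin of 20 satisfies the condition while a spin of 5 does not.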
Chapter 8
Non-linear Maps and Chaos
8.1 Introduction
The Classical Mechanics that is presented in classrooms is in most cases
limited to specific kinds of systems which are exactly solvable. We have, however,
extended our analytical tools to handle systems which are not exactly solvable with
the following methods:
• Identifying the fixed points in the phase space.
• Doing a stability analysis to obtain local information even in the absence of global
understanding.
• Using perturbation theory when exact solutions are not available.
While this enlarges the class of problems that one can deal with, these methods do
not go too far either. It is more a rule than an exception to have systems which are
not amenable to analytical tools. Quite often we have to resort to numerical methods
and fast computers to solve complex systems.
However, there are systems, more common than is normally expected, which
are qualitatively different from anything that we have known before. They may
be chaotic. Even when the system does not look complicated (e.g. the dynamical
equations may look very simple), it may exhibit a rich repertoire of dynamics
and can go beyond the scope of any theoretical prediction within a very short time.
Determinism vs. predictability:
The basic structure of Classical Mechanics tells us that once we know the differential equations governing the dynamics of some system (from Newton's second
law, or from the knowledge of the Hamiltonian of the system) and the values of the
quantities of interest at some point in time (initial conditions), we know in principle
all about the past and the future of the system. Thus, we say Classical Mechanics is
a deterministic theory.
For centuries, people including prolific mathematicians like Lagrange thought
that from a deterministic theory we can always predict everything about the system,
given enough computing power. It is only recently that people have come to appreciate an important property of chaotic systems: long term predictions
are doomed no matter how large the computer is. In fact, we will see that the outcomes of such systems behave almost like random variates, which are by definition
unpredictable¹.
Therefore, determinism does not ensure predictability.
Let us go back a little to what we learnt from Hamiltonian dynamics. If the
system has one degree of freedom, the phase space is two dimensional. If the Hamiltonian is
independent of time, then energy is conserved, and the phase space trajectories
are determined by the constancy of energy alone. In general a one degree of freedom system with a two dimensional phase space may display many fixed points, and
depending on the nature of the fixed points one may get either stable periodic orbits or
unstable hyperbolic orbits. If the dynamics is dissipative, such trajectories or families
of trajectories may end up in a single periodic orbit, a limit cycle, after some time.
This is so because on a plane the phase space trajectories cannot cross in a deterministic evolution. That is about as complicated as such a system gets.
If we go to systems with many degrees of freedom, things get more complicated.
Even when the Hamiltonian is integrable, we have to find the appropriate action-angle
variables, which is complicated in general. Once these are found, we can predict
the long time behaviour of the system for a given set of initial conditions. If the system
is non-integrable in the Liouville sense, then the situation is more complicated. We
may therefore qualitatively generate a hierarchy of motions (we consider bounded
motion, unless otherwise specified, as these dominate physical situations):
• In an integrable Hamiltonian system a bounded motion may be completely
periodic if the frequencies are commensurate. The motion is then confined to a
torus in the phase space, with the trajectories closing after a period which is
given by the frequencies for a given set of initial conditions. The phase space
in general resembles nested tori when the dynamics is considered for a set of
initial conditions.
• When the frequencies are not commensurate, the motion that results is in general
quasi-periodic. In fact, it may be ergodic on the surface of the torus, that is, if
we wait long enough the trajectory visits the neighbourhood of every point on
the surface of the torus, but not of the full phase space.
We have already seen examples of the above type earlier.
• On the other hand, if the dynamics is not integrable then the phase space trajectory still lies on an energy surface for a conservative Hamiltonian. We start
with a point $\vec{\xi}$ in some part of the phase space. A trajectory is a sequence of
points
\[
T(\vec{\xi}),\ T^2(\vec{\xi}),\ T^3(\vec{\xi}),\ \ldots,\ T^n(\vec{\xi}),\ \ldots
\]
For a function $F$ on the phase space we may define the time average along the trajectory,
\[
\langle F \rangle = \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} F(T^k(\vec{\xi})),
\]
and the phase space average,
\[
\bar{F} = \int_R F(\vec{\xi})\, d\mu ,
\]
where $R$ is the relevant region of the phase space and $d\mu$ is a measure of the
volume (normalised to unity over $R$). The dynamics is said to be ergodic if for a sufficiently large class of
functions we have
\[
\langle F \rangle = \bar{F} .
\]
The time average may vary from point to point, but if the system is sufficiently
ergodic it may remain almost independent of the initial point.
The flow may be ergodic, in particular, only on an invariant surface, as in the
case of quasi-periodic motion on a torus.
¹ We should not confuse chaotic systems with stochastic systems, where the system is intrinsically
non-deterministic due to the presence of external random disturbance or noise.
• There is yet another property of phase space flows which is sometimes
realised. Consider a small volume $A_0$ in phase space embedded in a larger volume
$R$. Just as an ink drop in a volume of clear water splits and spreads over the
whole volume over time, the points in $A_0$ may spread over the whole volume. The flow
is ergodic in this sense, but it may be worse than just being ergodic. The initial
volume, even if its size is preserved as in the case of Hamiltonian flows, may
be distorted so much that the little pieces of the original $A_0$ are arbitrarily far
apart, even as far as the size of $R$ itself, just as the little pieces of a drop of
ink are found over the whole volume of water. This critical concept is known as
mixing. Such a mixing is of course ergodic, but ergodicity is a weaker condition,
in the sense that the points in $A_0$ may visit arbitrary neighbourhoods over a long period
of time while still remaining close to each other. Therefore mixing implies ergodicity,
but the converse is not always true.
We can make this quantitative with the following definition of what is usually
called strong mixing. Let $R$ denote the volume of the phase space, that is, the set of all
points in the phase space. Let $A_0$ denote a small volume element in this
space. We may define a measure $\mu(A_0)$ as the volume of $A_0$ itself. Over
time this evolves into $A_1, A_2, A_3, \ldots, A_n, \ldots$, where each $A_i$ denotes
what $A_0$ has evolved into at a later time, with measure $\mu(A_i)$. Suppose we
define a fixed volume $B$ inside $R$, with measure $\mu(B)$, as a reference
volume. The concept of mixing is then expressed through the fraction of the points of
$A_0$ found in the reference volume $B$ after the phase space fluid is stirred
during time evolution. Strong mixing occurs if
\[
\frac{\mu(A_n \cap B)}{\mu(A_0)} = \frac{\mu(B)}{\mu(R)} ; \qquad n \to \infty .
\]
• The next important question is, how fast does this mixing occur? Consider two
nearby points in $A_0$. In the presence of strong mixing these points separate.
We want to know the rate at which points starting from
an initial region in phase space separate. It is surprising, but in most dynamical systems
(of dimension greater than two) this separation happens exponentially fast.
If $d(0)$ denotes the separation between two points at time $t = 0$, at time $t$ we
may write
\[
d(t) = e^{\lambda t}\, d(0),
\]
with $\lambda$ as the controlling parameter characterising the separation of two
initially close points. The parameter $\lambda$ is called the Lyapunov exponent and is a diagnostic of the behaviour of orbits. We may define the exponent for each projection
or orientation, say in some direction $x$, in which case there may be more than
one Lyapunov exponent. For example, elliptic orbits have zero Lyapunov exponents in all projections, whereas hyperbolic orbits have positive and negative
exponents for distances in different directions.
However, if the Lyapunov exponents are positive in all directions, we have what
is referred to as exponential sensitivity. If we consider the distance between any
two points, instead of looking at the separation in different orientations, in general
the rate of expansion is controlled by the largest exponent.
This is in general an average statement, since the phase space is bounded.
Furthermore, there may still be regions of regularity in phase space where this
does not occur. In fact the same system may start out as regular almost everywhere in phase space and end up as completely irregular with exponential
sensitivity, depending on a parameter in the Hamiltonian. In this case we have
global exponential instability.
There are however some trivial cases where there could be exponential sensitivity. For example, the one dimensional dynamical system
\[
\dot{x} = x
\]
displays exponential sensitivity, since two points at $x$ and $x + \epsilon$ are
separated by $\epsilon e^{t}$ after time $t$, however small $\epsilon$ is. But this is a perfectly integrable system
and there is no irregular behaviour here.
Now we are in a position to define what chaos is, or more precisely what is
commonly called deterministic chaos, since the equations of motion are deterministic. There is no unique way of defining this, but we give below a definition that is
intuitive.
What is Chaos? The essential property due to which systems become chaotic
is the sensitivity to initial conditions. By this we mean that any two phase space
trajectories that start out very close to each other separate exponentially with time. In
practice this means that after a short time the separation $\delta\vec{x}(t)$ attains the magnitude
of $L$, the characteristic length scale of the system as a whole. As pointed out above, this
alone is not sufficient for the onset of chaos.
A chaotic dynamical system must have exponential growth of error in computing
a phase space trajectory, and typically should also have an attractor within a bounded
phase space volume along with unstable fixed points. To satisfy both these conditions,
the phase trajectories should come back close together, a property of mixing, and the system is hence
ergodic. Chaotic systems may have phase space diagrams where a certain pattern
seems to occur over and over infinitely many times, but never actually repeats.
Using these facts, we may therefore define chaos as:
    The system obeys deterministic laws of evolution, but the outcome is
    highly sensitive to small uncertainties in the specification of the initial
    state: locally unstable and globally mixing.
The sensitivity to initial conditions is quantified by
the Lyapunov exponent,
\[
|\delta\vec{x}(t)| \approx e^{\lambda t}\, |\delta\vec{x}(0)| ,
\]
where $\lambda$, the mean rate of separation of the trajectories of the system, is called the
Lyapunov exponent. For the system to be chaotic, the Lyapunov exponent has
to be positive. As pointed out earlier, a positive Lyapunov exponent alone does not
guarantee chaos. This is because Lyapunov exponents carry absolutely
no information about the mixing property.
It is also worth pointing out that while dynamics described by continuous evolution in time can become chaotic only when the dimension of phase space is $\ge 3$, the
discrete time dynamics of so-called maps may display complex behaviour even in
the simplest cases. Next we will focus on modelling the dynamics through maps.
8.2
To study the dynamics of a system, we first need to model it in a neat mathematical construct. In most physical systems this is a set of differential equations
which describes how the quantities to be observed change with a parameter, time. By
discretising time, we can convert our differential equations to difference equations or
maps. For example, the differential equation
\[
\frac{dx(t)}{dt} = g(t)
\]
has the equivalent map
\[
x_{n+1} = x_n + g_n ,
\]
where the discretised time is labelled by $n$, with some arbitrary constant time step
$\Delta t$ (absorbed into $g_n = g(t_n)\,\Delta t$).
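The passage from a differential equation to a map can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `euler_map`, the starting time `t0` and the explicit factor `dt` multiplying `g` are choices made here (the simplest forward discretisation), not a prescription from the notes.

```python
def euler_map(g, x0, dt, n_steps, t0=0.0):
    """Iterate x_{n+1} = x_n + g(t_n) * dt and return [x_0, ..., x_n_steps]."""
    xs = [x0]
    x, t = x0, t0
    for _ in range(n_steps):
        x = x + g(t) * dt   # one application of the map
        t = t + dt          # advance the discretised time
        xs.append(x)
    return xs
```

For instance, `euler_map(lambda t: t, 0.0, 0.001, 1000)` ends close to the exact value $x(1) = 1/2$ of the continuous problem, the small difference being the discretisation error.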
This eases the computational task of solving or simulating the system to a
great extent and allows us to concentrate on the qualitative nature of the dynamics
without caring a great deal about the mathematical difficulties of the model.
More importantly, in practical applications, what we have is a set of observations,
a sequence of states, at times differing by multiples of a period. Such a situation is
common in biological evolution, economics, agriculture or medical data. In some
cases this is the real situation, since the changes occur at specific time intervals, like the breeding season in some
populations. The dynamics is then genuinely discrete, and the discretisation is not
just a convenience. Maps turn out to be a natural choice to express such dynamics.
The system may have many variables, and a corresponding number of dimensions in the
phase space. Henceforth we shall confine ourselves to maps involving one variable.
8.2.1
\[
f_r : \mathbb{R} \to \mathbb{R}, \qquad x \mapsto f_r(x),
\tag{8.1}
\]
that is,
\[
x_{n+1} = f_r(x_n), \qquad n \in \mathbb{N}.
\tag{8.2}
\]
Here the variable is $x$ and $r$ denotes the control parameter(s), which are system dependent.
To determine the variable after $k$ time steps, the map can be iterated $k$ times to
obtain a new iterated map denoted by $f^k$, which is given by
\[
f^k(x_n) := \underbrace{f(f(\ldots f(x_n)\ldots))}_{k\ \text{times}} = x_{n+k} .
\tag{8.3}
\]
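The Lyapunov exponent of the map at a point $x_0$ is defined through the separation, after $k$ iterations, of two points initially a distance $\epsilon$ apart:
\[
\lambda(x_0) = \lim_{k\to\infty}\ \lim_{\epsilon\to 0}\ \frac{1}{k} \ln\left| \frac{f^k(x_0 + \epsilon) - f^k(x_0)}{\epsilon} \right| .
\tag{8.4}
\]
It is important to keep the order in which the limits are taken as shown above, since
if the number of time steps is taken to be very large first, the limit may diverge. This may be
written in terms of the derivatives of the map function: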
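\[
\lambda(x_0) = \lim_{k\to\infty} \frac{1}{k} \ln\left| \frac{df^k}{dx} \right|_{x = x_0} ,
\tag{8.5}
\]
and since $f^k(x_0) = f(f(f(\cdots f(x_0)\cdots)))$, we have by the chain rule
\[
\lambda(x_0) = \lim_{k\to\infty} \frac{1}{k} \ln\left| \frac{df(x_{k-1})}{dx_{k-1}}\, \frac{df(x_{k-2})}{dx_{k-2}} \cdots \frac{df(x_1)}{dx_1}\, \frac{df(x_0)}{dx_0} \right| ;
\tag{8.6}
\]
as a result the Lyapunov exponent depends on the entire trajectory. Note that the
notation here means that the derivative is evaluated at
the discrete points $x_i$ along the trajectory.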
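The product of derivatives along the trajectory turns into a sum of logarithms, which is how the exponent is usually estimated in practice. The Python sketch below is illustrative: the function name `lyapunov_exponent` and the convention of dividing the sum by the number of iterations `k` (with a finite `k` standing in for the limit) are assumptions made here.

```python
import math


def lyapunov_exponent(f, dfdx, x0, k):
    """Estimate the Lyapunov exponent from k iterations starting at x0.

    f is the map and dfdx its derivative; the result is the average of
    ln|f'(x_i)| along the trajectory x_0, x_1, ..., x_{k-1}.
    math.log raises ValueError if the derivative vanishes at some x_i.
    """
    total = 0.0
    x = x0
    for _ in range(k):
        total += math.log(abs(dfdx(x)))
        x = f(x)
    return total / k
```

As a simple check, for a map with constant slope $r$ every term in the sum equals $\ln|r|$, so the estimate is $\ln|r|$ for any starting point and any `k`.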
A fixed point $x_r^*$ of the map, if it exists, is obtained from
\[
x_r^* = f_r(x_r^*) .
\tag{8.7}
\]
Note that, as before, once the system is at a fixed
point it stays there for all times, that is, for all further iterations of the
map.
The fixed point may be, in general, stable or unstable. It is stable if the system
tends towards the fixed point, that is,
\[
\lim_{n\to\infty} x_n = \lim_{n\to\infty} f^n(x_1) = x^*
\tag{8.8}
\]
for starting points $x_1$ in a neighbourhood of $x^*$.
Linear map: The simplest example is the linear map
\[
f_r(x) = r\,x .
\tag{8.9}
\]
One may add a further constant, but that can be absorbed in the redefinition of $x$
itself. For $r \neq 1$, the fixed point is $x^* = 0$. Further iterations of the map give
\[
f_r^k(x_0) = r^k x_0 .
\tag{8.10}
\]
Thus the system evolves according to two possible scenarios. When $|r| < 1$, for any
initial value $x_0 \in \mathbb{R}$ the system converges to $x^* = 0$. The fixed point is therefore
stable.
For $|r| > 1$ the fixed point at the origin is unstable. The Lyapunov exponent is
(from Eq. (8.4))
\[
\lambda(x_0) = \lim_{k\to\infty} \frac{1}{k} \ln\left| \frac{df^k(x)}{dx} \right|_{x = x_0}
\tag{8.11}
\]
\[
= \lim_{k\to\infty} \frac{1}{k} \ln\left| r^k \right|
\tag{8.12}
\]
\[
= \ln |r|
\tag{8.13}
\]
and is independent of the initial starting point $x_0$. The behaviour of the system may
be summarised as follows:

    |r| < 1    x_n -> 0 as n -> infinity           x* = 0 stable FP      lambda < 0
    |r| > 1    |x_n| -> infinity as n -> infinity  x* = 0 unstable FP    lambda > 0

The linear map is obviously not chaotic. But a small variation on it can lead to sensitivity to initial conditions.
Bernoulli Map:
This is a variation of the linear map which is produced by the rule
\[
x_{n+1} = (2x_n) \bmod 1, \qquad n \ge 0 .
\]
Equivalently the map function may be written as
\[
f(x) = 2x \quad (0 \le x < 0.5), \qquad f(x) = 2x - 1 \quad (0.5 \le x < 1).
\]
All the iterates of the map $x_0, x_1, x_2, \ldots$ lie in the region $d = [0, 1)$. This is often
called the dyadic transformation. The map function is non-linear and is not invertible.
The fixed point of the map is given by
\[
f(x^*) = x^* \quad\Rightarrow\quad x_1^* = 0 .
\]
The fixed point is unstable since the magnitude of the slope there is 2, which is greater than 1.
The basic map function may be used to define higher order maps: for example
\[
g(x) = f(f(x)) = f^2(x) = 4x \bmod 1 .
\]
The fixed points of $g(x) = f^2(x)$ are then given by, apart from $x_1^*$,
\[
x_2^* = 1/3, \qquad x_3^* = 2/3,
\]
which are unstable. It is easy to see that under the basic map function $f$ we have
\[
g(x_{2,3}^*) = x_{2,3}^* , \qquad f(x_2^*) = x_3^* , \qquad f(x_3^*) = x_2^* .
\]
This leads to the interesting behaviour that under the action of $f$, the fixed points of
the second order map generate a periodic orbit of period 2.
It is straightforward to see that the $q$-th iterate of the map generates its own
set of $2^q - 1$ unstable fixed points of the form $p/(2^q - 1)$, $p = 0, 1, \ldots, 2^q - 2$. The basic map function maps
this set to itself, each point returning to itself after $q$ iterations. Therefore if the initial point $x_0$ is
rational, its orbit contains a finite number of points in the interval $[0,1)$ and is always
eventually periodic. All periods are possible. For example
\[
1/3 \to 2/3 \to 1/3 \to \cdots .
\]
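The periodic orbits of rational points can be followed exactly with rational arithmetic. The Python sketch below is an illustration under stated assumptions: the helper names `bernoulli`, `orbit` and `fixed_points_of_iterate` are made up here, and the fixed points of the $q$-th iterate are taken to be $p/(2^q - 1)$ with $p = 0, \ldots, 2^q - 2$, which follows from solving $2^q x - p = x$.

```python
from fractions import Fraction


def bernoulli(x):
    """One step of the map x -> 2x mod 1 (works for Fraction and float)."""
    return (2 * x) % 1


def orbit(x0, n):
    """Return [x_0, x_1, ..., x_n] under the Bernoulli map."""
    xs = [x0]
    for _ in range(n):
        xs.append(bernoulli(xs[-1]))
    return xs


def fixed_points_of_iterate(q):
    """Exact fixed points p / (2**q - 1) of the q-th iterate of the map."""
    return [Fraction(p, 2 ** q - 1) for p in range(2 ** q - 1)]
```

Calling `orbit(Fraction(1, 3), 2)` reproduces the period-2 orbit quoted above. Exact fractions matter here: with ordinary floating point numbers each doubling shifts the binary digits by one place, so a start such as `0.1` collapses to exactly `0.0` after a few dozen iterations, which is an artefact of finite precision and not a property of the map.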
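\[
x_{n+1} = \frac{1}{x_n} - \left[\frac{1}{x_n}\right] ,
\]
where $[\,\cdot\,]$ denotes the integer part, so that the iterates lie in $[0, 1)$. The fixed point in the first strip is
\[
x^* = \frac{\sqrt{5} - 1}{2} = \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}}} ,
\]
which is the golden ratio (more precisely its inverse, $0.618\ldots$), the most irrational number. In fact, since in each strip an
integer $n = 1, 2, 3, \ldots$ (counting from the right) is subtracted, there are infinitely many fixed
points given by the solutions of the equation
\[
x_n^* = \frac{1}{x_n^*} - n ; \qquad n = 1, 2, 3, \ldots
\]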
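The solutions lying in the unit interval are
\[
x_n^* = \frac{\sqrt{n^2 + 4} - n}{2} = \cfrac{1}{n + \cfrac{1}{n + \cfrac{1}{n + \cdots}}} .
\]
All the fixed points are unstable and the map is actually chaotic.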
It is easy to guess that points of the type
\[
x^* = \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{2 + \cdots}}}} , \qquad
x^* = \cfrac{1}{2 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{1 + \cdots}}}}
\]
are not fixed points of the basic map function, but fixed points of the first iterate
of the map, $f^2(x) = f(f(x))$. This results in a period-2 cycle. One may use the higher iterates
to generate the fixed points of period $n$ and hence the corresponding cycles or periodic orbits. All
these fixed points of the iterates are also unstable.
Here the slope of the function in the neighbourhood of a fixed point increases
as the fixed point moves towards zero. Without going into details, we just give the average
value of the Lyapunov exponent here:
\[
\lambda = \frac{\pi^2}{6 \ln 2} ,
\]
which reflects the average of the slope. The system therefore has infinitely many
unstable periodic orbits, is sensitive to the initial conditions and is therefore chaotic.
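The quoted value can be compared with a direct numerical estimate. The Python sketch below is only a rough check under stated assumptions: it takes the map to be $x \to 1/x$ minus its integer part on the unit interval (often called the continued fraction or Gauss map), whose slope has magnitude $1/x^2$; the function names, the starting point and the number of iterations are choices made here.

```python
import math


def gauss_map(x):
    """One step of x -> 1/x minus its integer part, for 0 < x < 1."""
    y = 1.0 / x
    return y - math.floor(y)


def gauss_lyapunov(x0, n):
    """Average of ln|f'(x)| = -2 ln(x) over at most n iterations from x0."""
    total, count, x = 0.0, 0, x0
    for _ in range(n):
        if x == 0.0:              # the orbit landed exactly on 0: stop here
            break
        total += -2.0 * math.log(x)
        x = gauss_map(x)
        count += 1
    return total / count
```

With a generic starting point such as `math.pi - 3.0` and a few hundred thousand iterations, the estimate should come out close to $\pi^2/(6\ln 2) \approx 2.37$, with the usual statistical scatter of a finite average.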
Quadratic maps: Quadratic 1-D maps have the most general form
\[
f : \mathbb{R} \to \mathbb{R}, \qquad x \mapsto g x^2 + h x + c .
\tag{8.14}
\]
For simplicity, we shall consider one-parameter maps, i.e. maps where $g$ and $h$ can
be parametrised by a single variable $r$ as $g = g(r)$ and $h = h(r)$. These maps are
very important in dynamics because:
• A quadratic term is the minimum possible non-linearity. So these are the simplest
of all systems that exhibit chaos, as we shall show.
• They manifest almost all the qualitatively different and interesting features of
a typical chaotic system in their dynamics.
• As these have only one observable in phase space and only one variable in the
parameter space, all of their asymptotic dynamics can be plotted on a single
2-D graph to give us a pictorial understanding of chaotic motion.
The prototype of 1-dimensional chaos is the famous Logistic Map. It produces so rich a dynamics with the least possible analytic or computational clutter that
we shall devote a whole section to the study of this map which will, in its course,
reveal a host of interesting properties of non-linear phenomena.
8.3
The logistic equation was first introduced by P.F. Verhulst in 1845 to model the population dynamics of a single biological species constrained only by the natural resources
(food, say) available. The corresponding map was popularised in a seminal 1976 paper by the
biologist Robert May³. The logistic equation is
\[
\dot{x} = r x (1 - x) .
\]
This equation can be solved analytically and leads to a stable population.
The implicit assumption here is that the birth and death rates are uniform over
all time. This, however, is not a reasonable assumption in many situations. Often
species have their own breeding seasons and we are interested in population
data taken yearly, or seasonally. This is where maps come in handy, since only periodic
updates are needed without recourse to continuum dynamics.
Following the procedure for getting maps from differential equations mentioned earlier, we define the Logistic Map as
\[
f_r : [0, 1] \to [0, 1], \qquad x \mapsto f_r(x) = r\, x (1 - x) .
\]
Therefore
\[
x_{n+1} = r\, x_n (1 - x_n) .
\tag{8.15}
\]
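Iterating Eq. (8.15) on a computer takes only a few lines. The Python sketch below is illustrative; the function names `logistic` and `iterate_logistic` and the sample parameter values mentioned afterwards are assumptions made here.

```python
def logistic(x, r):
    """One step of the logistic map, Eq. (8.15)."""
    return r * x * (1.0 - x)


def iterate_logistic(x0, r, n):
    """Return the list [x_0, x_1, ..., x_n] generated by the logistic map."""
    xs = [x0]
    for _ in range(n):
        xs.append(logistic(xs[-1], r))
    return xs
```

For example, `iterate_logistic(0.2, 2.5, 200)` settles on the value $1 - 1/r = 0.6$, the non-zero solution of $x = r x (1 - x)$, while for $0 \le r \le 4$ every iterate stays inside $[0, 1]$.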
Figures 8.3 and 8.4 illustrate the dynamics of the map for two parameter values
in the fixed point regime. At $r = 3$ the stable fixed point $x_2^*$ has $|f'(x_2^*)| = 1$,
and it becomes unstable for $r > 3$. Thus both fixed points of the first order
quadratic map are unstable beyond this point. That is, all further iterates move
away from these two fixed points unless we start exactly on them. The question then
is: where does the system go?
The answer is hidden in the second order map, $f^2$, that is the map of second
iterates. This map, which goes directly from $x_n$ to $x_{n+2}$, has the mapping function
\[
g_r(x) := f_r(f_r(x)) = f_r^2(x) = r^2 x (1 - x) - r^3 x^2 (1 - x)^2 .
\tag{8.16}
\]
The fixed points of this map are determined by the quartic equation
\[
x^* = r^2 x^* (1 - x^*) - r^3 x^{*2} (1 - x^*)^2 .
\tag{8.17}
\]
Two of its roots are the fixed points of $f_r$ itself; the other two are
\[
x_{3,4}^* = \frac{(r + 1) \pm \sqrt{(r + 1)(r - 3)}}{2r} .
\tag{8.18}
\]
It turns out that both these fixed points are attractive for the map of the second
iterates. Furthermore it is easy to show that
\[
x_3^* = f(x_4^*), \qquad x_4^* = f(x_3^*),
\]
so that under $f$ the iterates alternate between these two values in a cycle of period 2.
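The period-2 points can be checked numerically. The Python sketch below assumes the standard closed form $x^* = [(r+1) \pm \sqrt{(r+1)(r-3)}]/(2r)$ for the two roots of the quartic that are not fixed points of $f$ itself; the function names and the sample value $r = 3.2$ mentioned afterwards are choices made here for illustration.

```python
import math


def logistic(x, r):
    """One step of the logistic map x -> r x (1 - x)."""
    return r * x * (1.0 - x)


def period_two_points(r):
    """The two points of the period-2 cycle, which exists for r >= 3."""
    if r < 3.0:
        raise ValueError("the period-2 cycle exists only for r >= 3")
    root = math.sqrt((r + 1.0) * (r - 3.0))
    return (r + 1.0 + root) / (2.0 * r), (r + 1.0 - root) / (2.0 * r)


def cycle_multiplier(r):
    """Slope of f(f(x)) on the cycle: f'(a) * f'(b) with f'(x) = r (1 - 2x)."""
    a, b = period_two_points(r)
    return r * (1.0 - 2.0 * a) * r * (1.0 - 2.0 * b)
```

At $r = 3.2$ the two points map into each other under `logistic`, and `cycle_multiplier(3.2)` has magnitude less than one, which is the statement that the cycle is attractive for the second iterate at this parameter value.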
³ R. M. May, "Simple mathematical models with very complicated dynamics", Nature 261, 459 (1976).
⁴ We still get periodic points, and the system becomes 2, 4, 8, ... -periodic as we increase $r$ further,
until at $r \approx 3.57$ it becomes chaotic.
Without submitting this map to a rigorous algebraic study, we may resort to a
computer experiment. All the four fixed points known until now become unstable,
but four new attractive or stable fixed points appear in place of the old ones:
\[
x_6^* = f(x_5^*), \qquad x_7^* = f(x_6^*), \qquad x_8^* = f(x_7^*), \qquad x_5^* = f(x_8^*).
\]
The iterates of the original map, after the initial transient, approach these values and
stay forever in the above cycle with period $4T_0$.
The point to note here is that as we keep increasing the parameter $r$, the pattern
of period doublings continues to occur. Old fixed points become unstable, but the reign
of order is still found in higher order maps⁵.
Bifurcation diagram:
Equipped with the power of computers today, we are in a position to ask the
question: what happens to the system's long-term behaviour as we vary the parameter $r$ over the whole continuous spectrum of allowed values? The set of instructions
that we may give to the computer is:
1. Set r = 0.0.
2. Iterate the Logistic Map 200 (say) times, starting with a random initial
condition.
3. Plot the next 100 iteration values on an $x_n$ vs. $r$ graph.
4. If r ≥ 4.0, terminate; else increase r by 0.001 and repeat from step 2.
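The four steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `bifurcation_data`, its keyword arguments and the fixed random `seed` are choices made here, and the points are returned as a list of `(r, x)` pairs instead of being plotted, so that any plotting tool may be used afterwards.

```python
import random


def bifurcation_data(r_min=0.0, r_max=4.0, dr=0.001,
                     n_transient=200, n_keep=100, seed=0):
    """Collect (r, x) pairs of the logistic map after discarding transients."""
    rng = random.Random(seed)
    n_r = int(round((r_max - r_min) / dr)) + 1   # number of r values scanned
    points = []
    for i in range(n_r):
        r = r_min + i * dr                       # steps 1 and 4: scan r
        x = rng.random()                         # random initial condition
        for _ in range(n_transient):             # step 2: discard the transient
            x = r * x * (1.0 - x)
        for _ in range(n_keep):                  # step 3: keep the next values
            x = r * x * (1.0 - x)
            points.append((r, x))
    return points
```

Restricting the scan to a single value, for instance `bifurcation_data(r_min=3.2, r_max=3.2)`, returns 100 points that take only two distinct values, the two branches of the diagram at that $r$.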
⁵ A fixed point of a map is also a fixed point of its higher order iterated maps, but the converse is not
true.
This produces the figure given above. We observe exactly what is to be expected
from the pattern above, and a lot more. We see the fixed point solution increasing
and giving rise to period-2 points, then to 4-periods and so on. But what happens as $r$ is
increased further? At each branching the original fixed point becomes unstable and
gives rise to two new stable fixed points. The values of $r$ at which these bifurcations
occur are called bifurcation points. As we have seen, the first bifurcation occurs at
$r = 3.0$, the next at $r \approx 3.45$ and so on. If the $n$th bifurcation occurs at a point $r_n$, it is clear
from the figure that $r_{n+1} - r_n$ decreases rapidly as $n$ increases. By the time $r$ reaches
the value $r_\infty = 3.5699456\ldots$, an infinite number of bifurcations will have occurred,
with as many fixed points in the interval $0 < x < 1$.
We now consider the question of sensitivity to initial conditions. This is characterised by the Lyapunov exponent $\lambda$ defined by the equation
\[
\delta x(t) = \delta x(0) \exp(\lambda t),
\]
where $\delta x(0)$ is a tiny separation between two initial points and $\delta x(t)$ is the separation
between the two trajectories after time $t$. If indeed the trajectories diverge chaotically,
then $\lambda$ will be found to be positive for almost all choices of the initial points. Computer
calculations for $0 < r < r_\infty$ show that the Lyapunov exponent is either zero or
negative throughout this range. This is to be expected, since the motion after the
initial transient is almost always periodic for $r < r_\infty$.
However, once $r > r_\infty$, the iterates of the logistic map jump chaotically in the
interval $0 < x < 1$. This is confirmed by calculations of the Lyapunov exponent,
which show that $\lambda$ is positive for almost all values of $r$ in the range $r_\infty < r < 4.00$.
However, this is not the whole story. Within this regime of chaos there exist small
windows of regular behaviour where $\lambda < 0$. These rather narrow windows are
characterised by cycles of period 3, 5, 6, ... instead of the cycles of period $2^n$ characteristic of the
regular domain $r < r_\infty$. What we find therefore is an extraordinary mixture of chaotic
and regular behaviour.
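The sign of the Lyapunov exponent at a given $r$ can be estimated by averaging $\ln|f_r'(x_n)|$ along an orbit, with $f_r'(x) = r(1 - 2x)$. The Python sketch below is illustrative: the function name `logistic_lyapunov`, the starting point, the numbers of transient and averaging steps, and the small floor on the slope are all assumptions made here.

```python
import math


def logistic_lyapunov(r, x0=0.3, n_transient=1000, n_average=100000):
    """Estimate the Lyapunov exponent of the logistic map at parameter r."""
    x = x0
    for _ in range(n_transient):          # let the initial transient die out
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(n_average):
        slope = abs(r * (1.0 - 2.0 * x))
        # the floor only guards against log(0) if the orbit hits x = 1/2 exactly
        total += math.log(max(slope, 1e-300))
        x = r * x * (1.0 - x)
    return total / n_average
```

With these choices the estimate is negative in the periodic regime (for example at $r = 2.5$ and $r = 3.2$) and positive at a value such as $r = 3.9$ inside the chaotic regime.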
The Feigenbaum constants and :
No treatment of the logistic equation is complete without a mention of the seminal discoveries of M.J. Feigenbaum which led to the application of statistical methods
in chaos analysis. We have already emphasised the cascade of bifurcations and the
period-doublings as a function of the control parameter r.
It is clear from the bifurcation figure that each time a bifurcation takes place
the new stable fixed points are distributed in the same way as before but appears
reduced in size and resembles a pitchfork, hence the name pitch-fork bifurcation.
Let us denote the width of a pitchfork by dn at the nth bifurcation. Based on high
precision numerical analysis, Feigenbaum found that the ratio of successive pitchfork
widths approached a limiting value as n became large. This gave rise to the universal
constant defined as
dn
= 2.5029078750957 .
n dn+1
= lim
(8.19)
Furthermore, the positions $r_n$ at which bifurcations occur also show a self-similar
behaviour. Roughly speaking, if $r_n$ and $r_{n+1}$ are the values of r at which successive
bifurcations occur, we define a second universal constant, denoted $\delta$, by
$$\delta = \lim_{n\to\infty} \frac{r_n - r_{n-1}}{r_{n+1} - r_n} = 4.669201609\ldots \qquad (8.20)$$
The reason why we call them universal constants is the following. The existence of the
constants $\alpha$ and $\delta$ in the case of the logistic equation may not be wholly
unexpected. What is remarkable is the discovery by Feigenbaum that a whole class of maps,
called quadratic maps, display a bifurcation pattern whereby the constants $\alpha$, $\delta$
defined as above have exactly the same values as determined from the logistic map. It is in
this sense that we call these constants universal. Quadratic maps are defined as maps which
have only a single maximum in the interval 0 < x < 1. All functions that give rise to
bifurcations which scale according to the same $\alpha$ and $\delta$ are said to belong to
the same universality class, a term borrowed from the theory of critical phenomena in
statistical mechanics.
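The convergence towards $\delta$ can be illustrated numerically. The sketch below is not Feigenbaum's original procedure and does not use the bifurcation points $r_n$ themselves: it swaps in the superstable parameter values (those at which the critical point belongs to the $2^i$-cycle), whose successive spacings are known to shrink by the same factor $\delta$. It also works with the quadratic map x -> a - x^2, which is related to the logistic map by a linear change of variable with a = r(r - 2)/4. The function name, the starting guess 3.2 and the iteration counts are assumptions made for illustration.

```python
def feigenbaum_delta(n_levels=10, newton_steps=10):
    """Estimate delta from superstable parameters of the map x -> a - x*x."""
    a_prev, a_curr = 0.0, 1.0   # superstable period-1 and period-2 parameters
    delta = 3.2                 # rough starting guess for the spacing ratio
    estimates = []
    for i in range(2, n_levels + 1):
        # Extrapolate the next superstable parameter, then refine by Newton's
        # method on g(a) = f^(2^i)(0), where x = 0 is the critical point.
        a = a_curr + (a_curr - a_prev) / delta
        for _ in range(newton_steps):
            x, dx = 0.0, 0.0            # dx tracks d x / d a along the orbit
            for _ in range(1 << i):
                dx = 1.0 - 2.0 * dx * x
                x = a - x * x
            a -= x / dx
        delta = (a_curr - a_prev) / (a - a_curr)
        estimates.append(delta)
        a_prev, a_curr = a_curr, a
    return estimates

for level, value in enumerate(feigenbaum_delta(), start=2):
    print(level, value)
```

The levels are deliberately stopped at a moderate value, since in double precision the spacings between successive parameters soon become too small for the ratio to be reliable.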
8.4
$$\begin{aligned}
\dot{x} &= \sigma(-x + y), \\
\dot{y} &= -xz + \rho x - y, \\
\dot{z} &= xy - bz,
\end{aligned} \qquad (8.21)$$
where x, y, z denote the state of the system at some time t, and $\sigma$, $\rho$, b are
parameters of the system. Here
$$\rho = \frac{R_a}{R_c}$$
is the ratio of the Rayleigh number to the critical value at which convection
sets in, $\sigma \approx 10$ is the so-called Prandtl number of fluid dynamics and
$b \approx 8/3$ is the ratio of length scales describing the convective cells. The phase
space is 3-dimensional and obviously it is not a Hamiltonian system.
The fixed points of the system are given by
1. $x_0 = y_0 = z_0 = 0$,
2. $x_1 = y_1 = \sqrt{b(\rho - 1)}$, $z_1 = \rho - 1$,
3. $x_2 = y_2 = -\sqrt{b(\rho - 1)}$, $z_2 = \rho - 1$.
$$\rho = \frac{\sigma(\sigma + b + 3)}{\sigma - b - 1} \approx 24.74.$$
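The fixed points and the threshold value near 24.74 can be verified directly. The sketch below assumes the standard Lorenz form of the equations, with parameters written as `sigma`, `rho` and `b` (taken here as 10, 28 and 8/3), and assumes the threshold expression sigma (sigma + b + 3) / (sigma - b - 1); the function names are illustrative.

```python
import math

def lorenz_rhs(x, y, z, sigma=10.0, rho=28.0, b=8.0 / 3.0):
    """Right-hand side of the Lorenz equations (assumed standard form)."""
    return (sigma * (y - x), -x * z + rho * x - y, x * y - b * z)

def lorenz_fixed_points(rho=28.0, b=8.0 / 3.0):
    """Origin plus, for rho > 1, the two symmetric non-trivial fixed points."""
    points = [(0.0, 0.0, 0.0)]
    if rho > 1.0:
        s = math.sqrt(b * (rho - 1.0))
        points += [(s, s, rho - 1.0), (-s, -s, rho - 1.0)]
    return points

def critical_rho(sigma=10.0, b=8.0 / 3.0):
    """Value of rho at which the non-trivial fixed points lose stability."""
    return sigma * (sigma + b + 3.0) / (sigma - b - 1.0)

for p in lorenz_fixed_points():
    print(p, lorenz_rhs(*p))      # each right-hand side should vanish
print(critical_rho())             # close to 24.74 for sigma = 10, b = 8/3
```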
$\rho > 24.74$: Probably Lorenz would have been happy to get the behaviour above,
which relates to the qualitative behaviour of Rayleigh-Bénard convection.
But he also found something very strange for $\rho > 24.74$. In his original paper
Lorenz computed up to the value $\rho = 28$. The fixed points 2 and
3 were no longer stable point attractors. Rather, any trajectory spirals around
one of these fixed points many times over and is then ejected towards the
other fixed point, where it again spirals around
the fixed point before being ejected back towards the first.
This can repeat infinitely many times, with a varying number of turns around the fixed
point each time. Although the spirals appear to lie on a two dimensional plane, a
careful analysis shows that they develop many close lying sheets. This complex
geometry allows the trajectory to continue for ever without crossing or closing
upon itself.
The fixed points are therefore not really attractors even though the trajectory
appears to move towards the fixed points for short durations. The set on which
the trajectory settles is called a strange attractor, or simply the Lorenz attractor
as it is sometimes called.
The number of turns that a trajectory makes before a particular cross-over has
no particular pattern and the motion is in general chaotic even though the trajectories
are completely deterministic. For a given initial condition, one may integrate the
equations numerically and obtain the evolution for all times. Two
trajectories starting from almost the same initial conditions diverge exponentially
after some initial time. The Lyapunov exponent is $\lambda \approx 0.9$, which is positive
and makes the system unpredictable over long times.
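The divergence of nearby trajectories can be seen with a short numerical experiment. The following is a minimal sketch, assuming the standard Lorenz equations with parameters 10, 28 and 8/3 and a fixed-step fourth-order Runge-Kutta integrator. The initial point (1, 1, 1), the offset 1e-8, the step size and the function names are illustrative assumptions.

```python
import math

def lorenz(state, sigma=10.0, rho=28.0, b=8.0 / 3.0):
    x, y, z = state
    return (sigma * (y - x), -x * z + rho * x - y, x * y - b * z)

def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def separation_history(eps=1e-8, dt=0.01, n_steps=4000):
    """Distance between two trajectories started a distance eps apart."""
    p = (1.0, 1.0, 1.0)
    q = (1.0 + eps, 1.0, 1.0)
    history = []
    for _ in range(n_steps):
        p = rk4_step(lorenz, p, dt)
        q = rk4_step(lorenz, q, dt)
        history.append(math.dist(p, q))
    return history

history = separation_history()
for step in (0, 1000, 2000, 3000, 3999):
    print(step * 0.01, history[step])
```

The separation should grow from about 1e-8 to the size of the attractor itself, after which it can grow no further; a slope fitted to ln(separation) against time over the growth phase gives a rough estimate of the largest Lyapunov exponent.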
8.4.1
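The Hénon-Heiles potential
$$V(x, y) = \frac{m\omega^2}{2}(x^2 + y^2) + \alpha\left(x^2 y - \frac{1}{3} y^3\right), \qquad (8.22)$$
or, in polar coordinates,
$$V(r, \theta) = \frac{1}{2} m\omega^2 r^2 + \frac{\alpha}{3} r^3 \sin(3\theta). \qquad (8.23)$$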
As can be seen from the polar form, the potential has discrete symmetries corresponding to reflections about the three axes with the polar angles $\theta = \pi/2$ and $\theta = \pm\pi/6$,
and to rotations through the angles $\pm 2\pi/3$. The three saddle points lying on the symmetry axes correspond to an energy $E^* = (m\omega^2)^3/6\alpha^2$. The particle may escape if
$E > E^*$, depending on the direction of motion.
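The saddle-point energy follows from a one-line calculation along a symmetry axis. The derivation below is a sketch that writes the strength of the cubic term as $\alpha$ (the symbol is an assumption made here) and takes the axis $\theta = \pi/2$, on which $\sin(3\theta) = -1$.

```latex
\begin{align*}
V\!\left(r, \tfrac{\pi}{2}\right) &= \tfrac{1}{2} m\omega^2 r^2 - \tfrac{\alpha}{3} r^3, \\
\frac{\partial V}{\partial r} &= m\omega^2 r - \alpha r^2 = 0
  \quad\Longrightarrow\quad r^* = \frac{m\omega^2}{\alpha}, \\
E^* = V\!\left(r^*, \tfrac{\pi}{2}\right)
  &= \frac{(m\omega^2)^3}{2\alpha^2} - \frac{(m\omega^2)^3}{3\alpha^2}
   = \frac{(m\omega^2)^3}{6\alpha^2}.
\end{align*}
```

By the threefold rotational symmetry the same value is obtained at the other two saddle points.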
There is only one constant of motion in the HH potential, namely the energy.
Hence the system is non-integrable. The trajectories depend on the two variables E
and $\alpha$. These may be eliminated using the scaled variables
$$u = \alpha x, \qquad v = \alpha y, \qquad (8.24)$$
such that the Hamiltonian may be written in terms of the scaled variables with a
scaled energy e
$$e = 6\alpha^2 E = E/E^* = 3(\dot{u}^2 + \dot{v}^2) + 3(u^2 + v^2) + 6vu^2 - 2v^3, \qquad (8.25)$$
where we have further set $m = \omega = 1$. The form of the potential and a section along
the symmetry axis is shown in the figure.
The HH Hamiltonian is an example of a system which is regular to quasi regular
for energies e < 1 and is chaotic for energies e > 1.
The equations of motion for the scaled coordinates u, v become
$$\ddot{u} = -u - 2uv, \qquad \ddot{v} = -v + (v^2 - u^2). \qquad (8.26)$$
These equations contain no free parameter, so the motion depends only on the scaled
energy e. They are solved numerically to obtain the trajectories in the phase space.
The motion of a particle in the HH potential may therefore be described in terms of the
scaled energy e.
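A numerical solution of the scaled equations of motion can be sketched as follows. The code assumes the scaled equations u'' = -u - 2uv, v'' = -v + v^2 - u^2 and the scaled energy e = 3(pu^2 + pv^2) + 3(u^2 + v^2) + 6 v u^2 - 2 v^3 with pu = du/dt and pv = dv/dt. It uses a fixed-step Runge-Kutta integrator and records (v, pv) each time the orbit crosses u = 0 with pu > 0. The choice of section plane, the initial condition and all function names are assumptions made for illustration.

```python
import math

def hh_rhs(state):
    u, v, pu, pv = state
    return (pu, pv, -u - 2.0 * u * v, -v + v * v - u * u)

def scaled_energy(state):
    u, v, pu, pv = state
    return (3.0 * (pu**2 + pv**2) + 3.0 * (u**2 + v**2)
            + 6.0 * v * u**2 - 2.0 * v**3)

def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def initial_state(e, v0=0.0, pv0=0.1):
    """Start on u = 0 and fix pu > 0 from the requested scaled energy."""
    pu_sq = (e - 3.0 * pv0**2 - 3.0 * v0**2 + 2.0 * v0**3) / 3.0
    return (0.0, v0, math.sqrt(pu_sq), pv0)

def run(e=0.5, dt=0.01, n_steps=20000):
    """Integrate and collect section points (v, pv) at upward crossings of u = 0."""
    state = initial_state(e)
    section = []
    for _ in range(n_steps):
        new = rk4_step(hh_rhs, state, dt)
        if state[0] < 0.0 <= new[0]:
            w = -state[0] / (new[0] - state[0])   # linear interpolation weight
            section.append((state[1] + w * (new[1] - state[1]),
                            state[3] + w * (new[3] - state[3])))
        state = new
    return state, section

final_state, section = run()
print(scaled_energy(final_state), len(section))
```

Monitoring the scaled energy along the orbit is a simple check on the integrator, since e should stay at its initial value up to the integration error. Plotting the collected (v, pv) points for several initial conditions at the same e gives a surface of section.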
6 The brief discussion here is mainly taken from the book Semiclassical Physics by M. Brack and
R. K. Bhaduri, Addison-Wesley (1996), p. 238.
[Figure 8.7, two panels. Right-panel axis label: 6Vo(v); marked in the figure: the level e = 1, the energy e, the points v1, v2, v3 and the orbit C.]
Figure 8.7: Left: Equipotential contour lines in scaled energy units e in the plane of
scaled variables u, v. The dashed lines are the symmetry axes. The three shortest
periodic orbits A, B, and C (evaluated at the energy e = 1) are shown by the heavy
solid lines. Right: Cut of the scaled potential along u = 0.
[Figure 8.8, four panels. Axes: u and v in the upper panels; v and pv in the lower panels.]
Figure 8.8: Examples of a quasi-regular orbit (left panels) and a chaotic orbit (right
panels) in coordinate space (u, v) (upper panels) and in the Poincaré surface of section
(v, pv) (lower panels). Note the irregular filling of most of the phase space by the
chaotic orbit in the lower right panel. The empty islands correspond to the quasi-regular motion seen on the left; elliptic fixed points at their centers correspond to the
stable double-loop orbit D.