I Preliminaries 5
1 Physical theories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.1 Consequences of Newtonian dynamical and measurement theories 8
2 The physical arena . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.1 Symmetry and groups . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2 Lie groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.2.1 Topological spaces and manifolds . . . . . . . . . . . . . . 20
2.3 The Euclidean group . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.4 The construction of Euclidean 3-space from the Euclidean group 30
3 Measurement in Euclidean 3-space . . . . . . . . . . . . . . . . . . . . 32
3.1 Newtonian measurement theory . . . . . . . . . . . . . . . . . . . 32
3.2 Curves, lengths and extrema . . . . . . . . . . . . . . . . . . . . 33
3.2.1 Curves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.2.2 Lengths . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.3 The Functional Derivative . . . . . . . . . . . . . . . . . . . . . . 36
3.3.1 An intuitive approach . . . . . . . . . . . . . . . . . . . . 36
3.3.2 Formal definition of functional differentiation . . . . . . . 42
3.4 Functional integration . . . . . . . . . . . . . . . . . . . . . . . . 53
4 The objects of measurement . . . . . . . . . . . . . . . . . . . . . . . . 54
4.1 Examples of tensors . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.1.1 Scalars and non-scalars . . . . . . . . . . . . . . . . . . . 55
4.1.2 Vector transformations . . . . . . . . . . . . . . . . . . . . 56
4.1.3 The Levi-Civita tensor . . . . . . . . . . . . . . . . . . . . 57
4.1.4 Some second rank tensors . . . . . . . . . . . . . . . . . . 59
4.2 Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.2.1 Vectors as algebraic objects . . . . . . . . . . . . . . . . . 65
4.2.2 Vectors in space . . . . . . . . . . . . . . . . . . . . . . . 68
4.3 The metric . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4.3.1 The inner product of vectors . . . . . . . . . . . . . . . . 74
4.3.2 Duality and linear maps on vectors . . . . . . . . . . . . . 75
4.3.3 Orthonormal frames . . . . . . . . . . . . . . . . . . . . . 81
4.4 Group representations . . . . . . . . . . . . . . . . . . . . . . . . 85
4.5 Tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
II Motion: Lagrangian mechanics 92
5 Covariance of the Euler-Lagrange equation . . . . . . . . . . . . . . . 94
6 Symmetries and the Euler-Lagrange equation . . . . . . . . . . . . . . 97
6.1 Noether's theorem for the generalized Euler-Lagrange equation . 97
6.2 Conserved quantities in restricted Euler-Lagrange systems . . . . 101
6.2.1 Cyclic coordinates and conserved momentum . . . . . . . 101
6.2.2 Rotational symmetry and conservation of angular momen-
tum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.2.3 Conservation of energy . . . . . . . . . . . . . . . . . . . . 108
6.2.4 Scale invariance . . . . . . . . . . . . . . . . . . . . . . . 109
6.3 Conserved quantities in generalized Euler-Lagrange systems . . . 112
6.3.1 Conserved momenta . . . . . . . . . . . . . . . . . . . . . 112
6.3.2 Angular momentum . . . . . . . . . . . . . . . . . . . . . 114
6.3.3 Energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
6.3.4 Scale invariance . . . . . . . . . . . . . . . . . . . . . . . . 117
6.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
7 The physical Lagrangian . . . . . . . . . . . . . . . . . . . . . . . . . . 119
7.1 Galilean symmetry and the invariance of Newton's Law . . . . . 120
7.2 Galileo, Lagrange and inertia . . . . . . . . . . . . . . . . . . . . 122
7.3 Gauging Newton's law . . . . . . . . . . . . . . . . . . . . . . . . 128
8 Motion in central forces . . . . . . . . . . . . . . . . . . . . . . . . . . 134
8.1 Regularization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
8.1.1 Euler's regularization . . . . . . . . . . . . . . . . . . . . 138
8.1.2 Higher dimensions . . . . . . . . . . . . . . . . . . . . . . 140
8.2 General central potentials . . . . . . . . . . . . . . . . . . . . . . 143
8.3 Energy, angular momentum and convexity . . . . . . . . . . . . . 145
8.4 Bertrand's theorem: closed orbits . . . . . . . . . . . . . . . . . . 148
8.5 Symmetries of motion for the Kepler problem . . . . . . . . . . . 152
8.5.1 Conic sections . . . . . . . . . . . . . . . . . . . . . . . . 155
8.6 Newtonian gravity . . . . . . . . . . . . . . . . . . . . . . . . . . 157
9 Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
10 Rotating coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
10.1 Rotations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
10.2 The Coriolis theorem . . . . . . . . . . . . . . . . . . . . . . . . . 170
11 Inequivalent Lagrangians . . . . . . . . . . . . . . . . . . . . . . . . . . 172
11.1 General free particle Lagrangians . . . . . . . . . . . . . . . . . . 172
11.2 Inequivalent Lagrangians . . . . . . . . . . . . . . . . . . . . . . . 175
11.2.1 Are inequivalent Lagrangians equivalent? . . . . . . . . . 178
11.3 Inequivalent Lagrangians in higher dimensions . . . . . . . . . . . 179
IV Bonus sections 251
16 Classical spin, statistics and pseudomechanics . . . . . . . . . . . . . . 251
16.1 Spin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251
16.2 Statistics and pseudomechanics . . . . . . . . . . . . . . . . . . . 254
16.3 Spin-statistics theorem . . . . . . . . . . . . . . . . . . . . . . . . 258
17 Gauge theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
17.1 Group theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
17.2 Lie algebras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267
17.2.1 The Lie algebra so(3) . . . . . . . . . . . . . . . . . . . . 270
17.2.2 The Lie algebras so(p, q ) . . . . . . . . . . . . . . . . . . . 274
17.2.3 Lie algebras: a general approach . . . . . . . . . . . . . . 276
17.3 Differential forms . . . . . . . . . . . . . . . . . . . . . . . . . . . 280
17.4 The exterior derivative . . . . . . . . . . . . . . . . . . . . . . . . 285
17.5 The Hodge dual . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
17.6 Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
17.7 The Levi-Civita tensor in arbitrary coordinates . . . . . . . . . . 291
17.8 Differential calculus . . . . . . . . . . . . . . . . . . . . . . . . . 292
17.8.1 Grad, Div, Curl and Laplacian . . . . . . . . . . . . . . . 294
17.9 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301
Part I
Preliminaries
Geometry is physics; physics is geometry. It is human nature to unify our experience, and one of the important specializations of humans is to develop language to describe our world. Thus, we find the unifying power of geometric description a powerful tool for spotting patterns in our experience, while at the same time, the patterns we find lead us to create new terms, concepts, and most importantly, pictures. Applying geometric abstractions to model things in the world, we discover new physics. Creating new pictures, we invent new mathematics.
Our ability to make predictions based on perceived patterns is what makes
physics such an important subject. The more mathematical tools we have at
hand, the more patterns we can model, and therefore the more predictions we
can make. We therefore start this adventure with some new tools, Lie groups
and the calculus of variations. We will also cast some old familiar tools in a new
form.
[DESCRIBE EACH]
With this extension of the familiar calculus, along with some new ways to look at curves and spaces, we will be able to demonstrate the naturalness and advantages of the most elegant formulation of Newton's laws of mechanics: the phase space formulation due to Euler, Lagrange, Hamilton and many others, in which the position and the momentum of each particle are treated as independent variables coupled by a system of first-order differential equations.
We will take a gradual approach to this formulation. Our starting point
centers on the idea of space, an abstraction which started in ancient Greece.
Basing our assumptions in our direct experience of the world, we provisionally
work in a 3-dimensional space, together with time. Because we want to associate
physical properties with objects which move in this space rather than with the
space itself, we demand that the space be homogeneous and isotropic. This leads
us to the construction of (among other possibilities) Euclidean 3-space. In order
to make measurements in this space, we introduce a metric, which allows us to
characterize what is meant by uniform motion. We also treat the description of matter from a fundamental approach, carefully defining the mathematical structures that might be used to model the stuff of the world. Finally, we develop techniques to describe the motion of matter.
Each of these developments involves the introduction of new mathematics.
The description of uniform motion leads to the calculus of variations, the description of matter leads to a discussion of vectors and tensors, and our analysis of motion requires techniques of differential forms, connections on manifolds, and gauge theory.
Once these tools are in place, we derive various well-known and not so well-known techniques of classical (and quantum) mechanics.
Numerous examples and exercises are scattered throughout.
[MORE HERE]
Enjoy the ride!
1. Physical theories
Within any theory of matter and motion we may distinguish two conceptually
different features: dynamical laws and measurement theory. We discuss each in
turn.
By dynamical laws, we mean the description of various motions of objects, both singly and in combination. The central feature of our description is generally some set of dynamical equations. In classical mechanics, the dynamical equation is Newton's second law,
\[ \mathbf{F} = m \frac{d\mathbf{v}}{dt} \]
or its relativistic generalization, while in classical electrodynamics two of the Maxwell equations serve the same function:
\[ \frac{1}{c}\frac{\partial \mathbf{E}}{\partial t} - \nabla \times \mathbf{B} = -\frac{4\pi}{c}\mathbf{J} \]
\[ \frac{1}{c}\frac{\partial \mathbf{B}}{\partial t} + \nabla \times \mathbf{E} = 0 \]
The remaining two Maxwell equations may be regarded as constraints on the initial field configuration. In general relativity the Einstein equation gives the
time evolution of the metric. Finally, in quantum mechanics the dynamical law is the Schrödinger equation
\[ \hat{H}\psi = i\hbar \frac{\partial \psi}{\partial t} \]
which governs the time evolution of the wave function, ψ.
Several important features are implicit in these descriptions. Of course there are different objects (particles, fields or probability amplitudes) that must be specified. But perhaps the most important feature is the existence of some arena within which the motion occurs. In Newtonian mechanics the arena is Euclidean 3-space, and the motion is assumed to be parameterized by universal time. Relativity modified this to a 4-dimensional spacetime, which in general relativity becomes a curved Riemannian manifold. In quantum mechanics the arena is phase space, comprising both position and momentum variables, and again having a universal time. Given this diverse collection of spaces for dynamical laws, you may well ask if there is any principle that determines a preferred space. As we shall see, the answer is a qualified yes. It turns out that symmetry gives us an important guide to choosing the dynamical arena.
A measurement theory is what establishes the correspondence between calculations and measurable numbers. For example, in Newtonian mechanics the primary dynamical variable for a particle is the position vector, x. While the dynamical law predicts this vector as a function of time, we never measure a vector directly. In order to extract measurable magnitudes we use the Euclidean inner product,
\[ \langle \mathbf{u}, \mathbf{v} \rangle = \mathbf{u} \cdot \mathbf{v} \]
If we want to know the position, we specify a vector basis (\(\mathbf{i}, \mathbf{j}, \mathbf{k}\), say) for comparison, then specify the numbers
\[ x = \mathbf{i} \cdot \mathbf{x}, \qquad y = \mathbf{j} \cdot \mathbf{x}, \qquad z = \mathbf{k} \cdot \mathbf{x} \]
These numbers are then expressed as dimensionless ratios by choosing a length standard, l. If l is chosen as the meter, then in saying the position of a particle is (x, y, z) = (3 m, 4 m, 5 m) we are specifying the dimensionless ratios
\[ \frac{x}{l} = 3, \qquad \frac{y}{l} = 4, \qquad \frac{z}{l} = 5 \]
A further assumption of Newtonian measurement theory is that particles move along unique, well-defined curves. This is macroscopically sound epistemology,
since we can see a body such as a ball move smoothly through an arc. However, when matter is not continuously monitored the assumption becomes suspect. Indeed, the occasional measurements we are able to make on fundamental particles do not allow us to claim a unique path is knowable, and quantum experiments show that it is incorrect to assume that unique paths exist.
Thus, quantum mechanics provides a distinct example of a measurement theory: we do not assume unique evolution. Perhaps the chief element of quantum measurement theory is the Hermitian inner product on Hilbert space,
\[ \langle \psi | \psi \rangle = \int_V \psi^* \psi \, d^3x \]
and its interpretation as the probability of finding the particle related to ψ in the volume V. As noted in the preceding paragraph, it is incorrect to assume a unique path of motion. The relationship between expectation values of operators and measurement probabilities is a further element of quantum measurement theory.
The importance of the distinction between dynamical laws and measurement
theories will become clear when we introduce the additional element of symmetry
in the final sections of the book. In particular, we will see that the techniques
of gauge theory allow us to reconcile differences between the symmetry of a
dynamical law and the symmetry of the associated measurement theory. We
shall show how different applications of gauge theory lead to the Lagrangian
and Hamiltonian formulations of classical mechanics, and eventually, how a small
change in the measurement theory leads to quantum mechanics.
and physicists have returned again and again to the inescapable fact that we
only know space through the relationships between objects. Still, the idea of a
continuum in which objects move may be made rigorous by considering the full
set of possible positions of objects. We will reconsider the idea in light of some
more contemporary philosophy.
Before beginning our arguments concerning space, we define another abstraction: the particle. By a particle, we mean an object sufficiently small and uncomplicated that its behavior may be accurately captured by specifying its position only. This is our physical model for a mathematical point. Naturally, the smallness of size required depends on the fineness of the description. For macroscopic purposes a small, smooth marble may serve as a model particle, but for the description of atoms it becomes questionable whether such a model even exists. For the present, we assume the existence of effectively point particles, and proceed to examine space. It is possible (and desirable if we take seriously the arguments of such philosophers as Popper and Berkeley¹) to begin with our immediate experience.
Curiously enough, the most directly accessible geometric feature of the world is time. Our experience is a near-continuum of events. This is an immediate consequence of the richness of our experience. In fact, we might define any continuous or nearly continuous element of our experience, a succession of colors or a smooth variation of tones, say, as a direction for time. The fact that we do not rely on any one particular experience for this is probably because we choose to label time in a way that makes sense of the most possible experiences we can. This leads us to rely on correlations between many different experiences, optimizing over apparently causal relationships to identify time. Henceforward, we assume that our experience unfolds in an ordered sequence.
A simple experiment can convince us that a 3-dim model is convenient for describing that experience. First, I note that my sense of touch allows me to trace a line down my arm with my finger. This establishes the existence of a single continuum of points which I can distinguish by placing them in 1-1 correspondence with successive times. Further, I can lay my hand flat on my arm, experiencing an entire 2-dim region of my skin. Finally, still holding my hand against my arm, I cup it so that the planar surface of my arm and the planar surface of my hand are not in contact, although they still maintain a continuous border. This establishes the usefulness of a third dimension.
A second, similar experiment makes use of vision. Reflecting the 2-dim nature of our retina, visual images appear planar: I can draw lines between pairs of objects in such a way that the lines intersect in a single intermediate point. This cannot be done in one dimension. Furthermore, I am not presented with a single image, but perceive a succession in time. As time progresses, I see some images pass out of view as they approach others, then reemerge later. Such an occultation is easily explained by a third dimension. The vanishing object has passed on the far side of the second object. In this way we rationalize the difference between our (at least) 2-dim visual experience and our (at least) 3-dim tactile experience.

¹ Brief statement about Popper and Berkeley.
Exercise 1.1. What idea of spatial relation can we gain from the senses of
smell and taste?
As we all know, these three spatial dimensions together with time provide a useful model for the physical world. Still, a simple mathematical proof will demonstrate the arbitrariness of this choice. Suppose we have a predictive physical model in 3-dim that adequately accounts for various phenomena. Then there exists a completely equivalent predictive model in any dimension. The proof follows from the proof that there exist 1-1, onto maps between dimensions, which we present first.

For simplicity we focus on a unit cube. For any point in the three-dimensional unit cube, let the decimal expansions for the Cartesian coordinates be
\[ x = .a_1 a_2 a_3 \ldots, \qquad y = .b_1 b_2 b_3 \ldots, \qquad z = .c_1 c_2 c_3 \ldots \]
and map the triple to the single number obtained by interleaving the digits,
\[ w = .a_1 b_1 c_1 a_2 b_2 c_2 a_3 b_3 c_3 \ldots \]
This mapping is clearly 1-1 and onto. To map to a higher dimension, we take any given w,
\[ w = .d_1 d_2 d_3 d_4 \ldots \]
and partition the decimal expansion,
\[ x_1 = .d_1 d_{n+1} d_{2n+1} \ldots, \quad x_2 = .d_2 d_{n+2} d_{2n+2} \ldots, \quad \ldots, \quad x_n = .d_n d_{2n} d_{3n} \ldots \]
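The interleaving map and its inverse partition are easy to make concrete. Here is a minimal Python sketch (the function names and fixed-precision representation are our own; the true map works on infinite decimal expansions, which finite floats can only approximate):

```python
def interleave(coords, digits=6):
    """Map an n-tuple of numbers in [0,1) to one number in [0,1)
    by interleaving their decimal digits (finite precision only)."""
    n = len(coords)
    # digit strings, e.g. 0.125 -> "125000"
    strs = [f"{c:.{digits}f}"[2:] for c in coords]
    # take digit k of coordinate i, cycling over the coordinates
    w = "".join(strs[i][k] for k in range(digits) for i in range(n))
    return float("0." + w)

def partition(w, n, digits=6):
    """Inverse map: split the digits of w back into n coordinates."""
    s = f"{w:.{n * digits}f}"[2:]
    return tuple(float("0." + s[i::n]) for i in range(n))

# a point of the unit cube and its 1-dim image
print(interleave((0.123, 0.456, 0.789), digits=3))   # prints 0.147258369
```

Applying `partition` to the result recovers the original triple, illustrating that no information is lost in the reduction of dimension.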
Now suppose we have a physical model making, say, a prediction of the position of a particle in 3-dim as a function of time,
\[ x^i(t) \]
The map above then converts this into a completely equivalent 1-dim prediction,
\[ w(t) \]
Exercise 1.3. Here is an example that shows how descriptions in different dimensions can reveal different physical information about a body. Consider the description of an extended body. In a three dimensional representation, we might specify a 3-parameter family of positions, x^i(α, β, γ), together with suitable ranges for the parameters. Alternatively, we may represent this as a single number as follows. Divide a region of 3-space into 1 meter cubes; divide each cube into 1000 smaller cubes, each one decimeter on a side, and so on. Number the 1 meter cubes from 0 to 999; number the decimeter cubes within each meter cube from 0 to 999, and so on. Then a specific location in space may be expressed as a sequence of numbers between 000 and 999,
\[ w = .999\,345\,801 \ldots \]
This is clearly a 1-1, onto map. Now, for an extended body, choose a point in the body. About this point there will be a smallest cube contained entirely within the body. The specification of this cube is a finite decimal expansion,
\[ w = .999\,345\,801 \ldots 274 \]
Additional cubes which together fill the body may be specified. Discuss the optimization of this list of numbers, and argue that an examination of a suitably defined list can quickly give information about the total size and shape of the body.
x = a1a2 . . . an .b1b2b3 . . .
Exercise 1.5. Devise a 1-1, onto mapping from the 3-dim position of a particle to a 1-dim representation in such a way that the number w(t) is always within 10⁻ⁿ of the x component of (x(t), y(t), z(t)).
2.1. Symmetry and groups
Mathematically, it is possible to construct a space with any desired symmetry using standard techniques. We begin with a simple case, reserving more involved examples for later chapters. To begin, we first define a mathematical object capable of representing symmetry. We may think of a symmetry as a collection of transformations that leave some essential properties of a system unchanged. Such a collection, G, of transformations must have certain properties:

1. We may always define an identity transformation, e, which leaves the system unchanged: e ∈ G.

2. For every transformation, g, taking the system from description A to another equivalent description A′, there must be another transformation, denoted g⁻¹, that reverses this, taking A′ to A. The combined effect of the two transformations is therefore g⁻¹g = e. We may write: for all g ∈ G there exists g⁻¹ ∈ G such that g⁻¹g = e.

3. Any two transformations must give a result which is also achievable by a transformation. That is, for all g₁, g₂ ∈ G there exists g₃ ∈ G such that g₁g₂ = g₃.

4. Applying three transformations in a given order has the same effect if we replace either consecutive pair by their combined result. Thus, we have associativity: for all g₁, g₂, g₃ ∈ G, g₁(g₂g₃) = (g₁g₂)g₃.

These are the defining properties of a mathematical group. Precisely, a group is a set, S, of objects together with a binary operation satisfying properties 1-4.
We provide some simple examples.
The binary, or Boolean, group, B, consists of the pair B = {{1, −1}, ×}, where × is ordinary multiplication. The multiplication table is therefore

×     1    −1
1     1    −1
−1    −1    1

Naturally, 1 is the identity, while each element is its own inverse. Closure is evident by looking at the table, while associativity is checked by tabulating all triple products:

1 × (1 × (−1)) = −1 = (1 × 1) × (−1)
1 × ((−1) × (−1)) = 1 = (1 × (−1)) × (−1)

etc.
The pair B is therefore a group.
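These checks are mechanical enough to automate. The following brute-force verifier is our own sketch, not part of the text; it tests all four group properties for any finite set with a given product:

```python
from itertools import product

def is_group(elements, op):
    """Brute-force check of the four group axioms for a finite set."""
    elements = list(elements)
    # 3. closure: every product lands back in the set
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # 1. identity: some e with e*a = a*e = a for all a
    ident = next((e for e in elements
                  if all(op(e, a) == a and op(a, e) == a for a in elements)), None)
    if ident is None:
        return False
    # 2. inverses: every a has some b with a*b = e
    if any(all(op(a, b) != ident for b in elements) for a in elements):
        return False
    # 4. associativity: check all triples
    return all(op(a, op(b, c)) == op(op(a, b), c)
               for a, b, c in product(elements, repeat=3))

print(is_group([1, -1], lambda a, b: a * b))   # the Boolean group: prints True
```

The same function rejects non-groups; for instance {1, 2} under multiplication fails closure immediately.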
There is another way to write the Boolean group, involving modular addition. We define:

Definition 2.1. Let S be the set of n consecutive integers beginning with zero, S = {0, 1, 2, ..., n − 1}. Addition modulo n (or mod n), ⊕ₙ, is cyclic addition on S. That is, for all a, b ∈ S,
\[ a \oplus_n b = \begin{cases} a + b & a + b < n \\ a + b - n & a + b \geq n \end{cases} \]
where + is the usual addition of real numbers.

Addition mod n always produces a group with n elements:
Theorem 2.3. The pair Gₙ = (S, ⊕ₙ) is a group.
Proof. We see immediately that addition mod n is closed, because if a + b < n then a + b ∈ S, while if a + b ≥ n then a + b − n ∈ S. Zero is the additive identity, while the inverse of a is n − a. Finally, to check associativity, we have four cases:

a + b < n, b + c < n
a + b < n, b + c ≥ n
a + b ≥ n, b + c < n
a + b ≥ n, b + c ≥ n

The first case is immediate because
\[ (a \oplus_n b) \oplus_n c = (a + b) \oplus_n c = \begin{cases} a + b + c & a + b + c < n \\ a + b + c - n & a + b + c \geq n \end{cases} \]
\[ a \oplus_n (b \oplus_n c) = a \oplus_n (b + c) = \begin{cases} a + b + c & a + b + c < n \\ a + b + c - n & a + b + c \geq n \end{cases} \]
In the second case, we note that since a + b < n and b + c ≥ n, we must have n ≤ a + b + c < 2n. Therefore,
\[ (a \oplus_n b) \oplus_n c = (a + b) \oplus_n c = a + b + c - n \]
\[ a \oplus_n (b \oplus_n c) = a \oplus_n (b + c - n) = \begin{cases} a + b + c - n & a + b + c < 2n \\ a + b + c - 2n & a + b + c \geq 2n \end{cases} = a + b + c - n \]
The third case follows from the second by an identical argument with the roles of a and c exchanged.
For the final case, we have two subcases:

n ≤ a + b + c < 2n
2n ≤ a + b + c < 3n

In the first subcase,
\[ (a \oplus_n b) \oplus_n c = (a + b - n) \oplus_n c = a + b + c - n \]
\[ a \oplus_n (b \oplus_n c) = a \oplus_n (b + c - n) = a + b + c - n \]
while in the second,
\[ (a \oplus_n b) \oplus_n c = (a + b - n) \oplus_n c = a + b + c - 2n \]
\[ a \oplus_n (b \oplus_n c) = a \oplus_n (b + c - n) = a + b + c - 2n \]
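The case analysis can also be spot-checked numerically. A small sketch (ours) confirms closure, inverses, and associativity of ⊕ₙ for several values of n, implementing the piecewise rule of Definition 2.1 literally:

```python
def add_mod(n):
    """Cyclic addition on S = {0, 1, ..., n-1}, as in Definition 2.1."""
    return lambda a, b: a + b if a + b < n else a + b - n

for n in range(2, 8):
    S = range(n)
    op = add_mod(n)
    # closure: every sum lands back in S
    assert all(op(a, b) in S for a in S for b in S)
    # identity 0, and inverse of a is n - a (0 is its own inverse)
    assert all(op(a, (n - a) % n) == 0 for a in S)
    # associativity over all triples
    assert all(op(a, op(b, c)) == op(op(a, b), c)
               for a in S for b in S for c in S)
print("addition mod n is a group for n = 2..7")
```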
⊕₂    0    1
0     0    1
1     1    0
Finally, since b must have an inverse, and its inverse cannot be a, we must fill in the final spot with the identity, thereby making b its own inverse:

∘    a    b
a    a    b
b    b    a
Definition 2.4. Let G = (S, ⊕) and H = (T, ∘) be two groups and let φ be a one-to-one, onto mapping between G and H. Then φ is an isomorphism if it preserves the group product in the sense that for all g₁, g₂ in G,
\[ \varphi(g_1 \oplus g_2) = \varphi(g_1) \circ \varphi(g_2) \tag{2.1} \]
When there exists an isomorphism between G and H, then G and H are said to be isomorphic to one another.

The definition essentially means that φ provides a renaming of the elements of the group. Thus, suppose g₁ ⊕ g₂ = g₃. Thinking of h = φ(g) as the new name for g, and setting

h₁ = φ(g₁)
h₂ = φ(g₂)
h₃ = φ(g₃)

eq.(2.1) becomes
\[ h_1 \circ h_2 = h_3 \]
Applying the group product may be done before or after applying φ, with the same result. In the Boolean case, for example, setting φ(a) = 0 and φ(b) = 1 shows that G = ({a, b}, ∘) and H = ({0, 1}, ⊕₂) are isomorphic.
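The isomorphism condition is directly checkable. In the sketch below (our own), we use the Boolean group's concrete elements and rename 1 → 0 and −1 → 1:

```python
phi = {1: 0, -1: 1}            # the renaming map

def bool_op(a, b):             # product in the Boolean group {1, -1}
    return a * b

def mod2_op(a, b):             # addition mod 2 on {0, 1}
    return (a + b) % 2

# isomorphism condition, eq.(2.1): phi(g1 x g2) == phi(g1) (+)_2 phi(g2)
assert all(phi[bool_op(g1, g2)] == mod2_op(phi[g1], phi[g2])
           for g1 in (1, -1) for g2 in (1, -1))
print("phi is an isomorphism")
```

The check runs over all four products, which is the entire multiplication table of a two-element group.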
Now consider a slightly bigger group. We may find all groups with three elements as follows. Let G = {{a, b, c}, ∘}, where the group operation, ∘, remains to be defined by its multiplication table. In order for G to be a group, one of the elements must be the identity. Without loss of generality, we pick a = e. Then the multiplication table becomes

∘    e    b    c
e    e    b    c
b    b
c    c
Next, we show that no element of the group may occur more than once in any given row or column. To prove this, suppose some element, z, occurs in the c column twice. Then there are two distinct elements (say, for generality, x and y) such that

x ∘ c = z
y ∘ c = z

Then

(x ∘ c) ∘ c⁻¹ = (y ∘ c) ∘ c⁻¹
x ∘ (c ∘ c⁻¹) = y ∘ (c ∘ c⁻¹)
x ∘ e = y ∘ e
x = y

in contradiction with the assumption that x and y are distinct. Each element therefore appears exactly once in each row and column, and filling in the remaining entries accordingly gives

∘    e    b    c
e    e    b    c
b    b    c    e
c    c    e    b
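A brute-force search confirms that this is the only way to complete the table. In the sketch below (ours; the elements are encoded as 0 = e, 1 = b, 2 = c), we try every filling of the four unknown entries:

```python
from itertools import product

def is_group_table(table):
    """Check a finite multiplication table, with element 0 as the identity."""
    n = len(table)
    rng = list(range(n))
    # identity row and column
    if table[0] != rng or any(table[i][0] != i for i in rng):
        return False
    # no repeats in any row or column (Latin square)
    if any(sorted(row) != rng for row in table):
        return False
    if any(sorted(table[i][j] for i in rng) != rng for j in rng):
        return False
    # associativity over all triples
    return all(table[table[a][b]][c] == table[a][table[b][c]]
               for a, b, c in product(rng, repeat=3))

solutions = [t for t in ([[0, 1, 2], [1, p, q], [2, r, s]]
                         for p, q, r, s in product(range(3), repeat=4))
             if is_group_table(t)]
print(solutions)   # exactly one table survives: the cyclic group of order 3
```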
Example 2.5. Let G = {Z, +}, the integers under addition. For all integers a, b, c we have a + b ∈ Z (closure); 0 + a = a + 0 = a (identity); a + (−a) = 0 (inverse); a + (b + c) = (a + b) + c (associativity). Therefore, G is a group. The integers also form a group under addition mod p, where p is any integer (recall that a = b mod p if there exists an integer n such that a = b + np).
Example 2.6. Let G = {R, +}, the real numbers under addition. For all real numbers a, b, c we have a + b ∈ R (closure); 0 + a = a + 0 = a (identity); a + (−a) = 0 (inverse); a + (b + c) = (a + b) + c (associativity). Therefore, G is a group. The rationals, Q, also form a group under addition; notice, however, that they are not closed under infinite sums:
\[ \pi = 3 + .1 + .04 + .001 + .0005 + .00009 + \ldots \]
Of course, the real numbers form a field, which is a much nicer object than a group.
In working with groups it is generally convenient to omit the multiplication sign, writing simply ab in place of a ∘ b.

Exercise 2.1. A subgroup of a group G = {S, ∘} is a group G′ = {S′, ∘}, with the same product, ∘, such that S′ is a subset of S. Prove that, for n > 1, a group with n + 1 elements has no subgroup with n elements. (Hint: write the multiplication table for the n-element subgroup and try adding one row and column.)
Exercise 2.2. Find all groups (up to isomorphism) with four elements.
Exercise 2.3. Show that the three reflections of the plane

Rx : (x, y) → (−x, y)
Ry : (x, y) → (x, −y)
Rxy : (x, y) → (−x, −y)

together with the identity transformation form a group. Write out the multiplication table.

       e      Rx     Ry     Rxy
e      e      Rx     Ry     Rxy
Rx     Rx     e      Rxy    Ry
Ry     Ry     Rxy    e      Rx
Rxy    Rxy    Ry     Rx     e
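Composing the reflections as coordinate maps reproduces this table entry by entry. A short check (ours; the transforms are identified by their action on a sample point, which suffices for linear maps distinguished by sign flips):

```python
# the three reflections and the identity, as coordinate maps
transforms = {
    "e":   lambda x, y: (x, y),
    "Rx":  lambda x, y: (-x, y),
    "Ry":  lambda x, y: (x, -y),
    "Rxy": lambda x, y: (-x, -y),
}

def compose(f, g, point=(1.0, 2.0)):
    """Name of the transform equal to 'f after g', found via a sample point."""
    image = transforms[f](*transforms[g](*point))
    return next(name for name, t in transforms.items() if t(*point) == image)

assert compose("Rx", "Ry") == "Rxy"    # row Rx, column Ry of the table
assert compose("Rx", "Rxy") == "Ry"
assert compose("Rxy", "Rxy") == "e"    # every element is its own inverse
print("reflection table confirmed")
```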
Exercise 2.4. Find the 8-element group built from the three dimensional reflections and their products.
2.2. Lie groups
While groups having a finite number of elements are entertaining, and even find use in crystallography, most groups encountered in physics have infinitely many elements. To specify these elements requires one or more continuous parameters. We begin with some familiar examples.
Example 2.7. The real numbers under addition, G = {R, +} , form a Lie group
because each element of R provides its own label. Since only one label is required,
R is a 1-dimensional Lie group.
Example 2.8. The real, n-dim vector space V n under vector addition is an
n-dim Lie group, since each element of the group may be labeled by n real
numbers.
The rotation matrices
\[ R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \]
acting on the plane by
\[ x' = x\cos\theta - y\sin\theta, \qquad y' = x\sin\theta + y\cos\theta \]
where × is normal matrix multiplication, form a group. To see this, consider the product of two elements,
\[ R(\theta)R(\varphi) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix} \]
\[ = \begin{pmatrix} \cos\theta\cos\varphi - \sin\theta\sin\varphi & -\cos\theta\sin\varphi - \sin\theta\cos\varphi \\ \sin\theta\cos\varphi + \cos\theta\sin\varphi & -\sin\theta\sin\varphi + \cos\theta\cos\varphi \end{pmatrix} \]
\[ = \begin{pmatrix} \cos(\theta + \varphi) & -\sin(\theta + \varphi) \\ \sin(\theta + \varphi) & \cos(\theta + \varphi) \end{pmatrix} \]
so the set is closed under multiplication as long as we consider the addition of angles to be addition modulo 2π. We immediately see that the inverse to any rotation R(θ) is the rotation R(2π − θ), and the associativity of (modular) addition guarantees associativity of the product. Notice that rotations in the plane commute. A group in which all products commute is called Abelian.

Exercise 2.5. Show that the 2-dim rotation group R(θ) preserves the Euclidean length, l² = x² + y².
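Both the closure rule R(θ)R(φ) = R(θ + φ) and the length preservation of Exercise 2.5 are easy to confirm numerically. A sketch, assuming NumPy is available:

```python
import numpy as np

def R(theta):
    """2-dim rotation matrix R(theta)."""
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

a, b = 0.7, 1.9
# closure: R(a) R(b) = R(a + b)
assert np.allclose(R(a) @ R(b), R(a + b))

# Exercise 2.5: rotation preserves the Euclidean length l^2 = x^2 + y^2
x = np.array([3.0, 4.0])
assert np.isclose(np.linalg.norm(R(a) @ x), np.linalg.norm(x))
print("closure and length preservation confirmed")
```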
Definition 2.12. A topological space, S, is a set S, for which we have a collection, τ, of subsets of S satisfying the following properties:

1. S itself and the empty set are in τ.
2. The union of any collection of sets in τ is also in τ.
3. The intersection of any finite number of sets in τ is also in τ.

The restriction to finite intersections matters. On the real line, for example, take the open intervals (−1/n, 1/n) for every positive integer n. Then the infinite intersection over all n is the set containing a single point, {0}, which is not open.

Another familiar example is the plane, R². Let τ₀ be the collection of open disks, and let τ be the collection of all unions and finite intersections of sets in τ₀. This is the usual topology of the plane. If we have any "open" region V of the plane, that is, an arbitrarily shaped region of contiguous points without any boundary points, then it is in τ. To see this, pick any point in V. Around this point we can find an open disk, U_{ε(P)}(P), that is small enough that it lies entirely within V. Repeating this for every point in V, we see that V is the union of these open disks,
\[ V = \bigcup_P U_{\varepsilon(P)}(P) \]
so that V is in τ.
Two topologies on a given set are equal if they contain the same open sets. Typically, we can define more than one distinct topology for any given set S, and there is more than one way to specify a given topology. To see the first, return to the real line but define the open sets to be the half-open intervals,
\[ [a, b) \]
together with their unions and finite intersections. No half-open set is included in the usual topology because such an interval is not the union or finite intersection of open intervals. To see the second, we need a way to compare two topologies. The technique we used above for the plane works in general, for suppose we want to show that two topologies, τ and τ′, are equal. Then for an arbitrary open set U in τ and any point P in U, find an open set V_P in τ′ which contains P and lies entirely in U. Since the union of the sets V_P is U, the set U must be in τ′. Then we repeat the argument to show that sets in τ′ also lie in τ.
As an example of the method, consider a second topology for the plane consisting of unions and finite intersections of open squares, V(a, b). Picking any point P in any V(a, b), we can find an open disk centered on P and lying entirely within V(a, b). Conversely, for any point of any open disk, we can find a square containing the point and lying entirely within the disk. Therefore, any set that can be built as a union or finite intersection of open squares may also be built as a union or finite intersection of disks, and vice versa. The two topologies are therefore equal.
This concludes our brief foray into topology. The essential idea is enough to proceed with the definition of a differentiable manifold.
\[ \varphi_\alpha : U_\alpha \to V_\alpha \]
for some fixed dimension n. Here α is simply a label for convenience. These mappings must satisfy the following properties: on each overlap region U_{αβ} = U_α ∩ U_β, the restrictions
\[ \varphi_\alpha|_\beta : U_{\alpha\beta} \to V_{\alpha|\beta} \subset V_\alpha \]
\[ \varphi_\beta|_\alpha : U_{\alpha\beta} \to V_{\beta|\alpha} \subset V_\beta \]
must be compatible, in the sense that we require the composition
\[ \varphi_\alpha \circ \varphi_\beta^{-1} : V_{\beta|\alpha} \to V_{\alpha|\beta} \]
to be differentiable.
The basic idea here is that we have a correspondence between small regions of the manifold and regions of real n-space: the manifold is essentially Rⁿ if you look closely enough. The overlap condition allows us to carry this correspondence to larger regions, but it is weak enough to allow M to be distinct from Rⁿ. For example, consider the circle, S¹. The usual angle θ maps points on S¹ in a 1-1 way to points in the interval [0, 2π), but this interval is not open in R¹. Nonetheless, S¹ is a manifold because we can choose two mappings from all but one point of S¹. Every point of the circle lies in at least one of the sets S¹ − {π} or S¹ − {0}, each angle maps open sets to open sets, and on the overlap region S¹ − {π} − {0}, the mapping
\[ \theta \circ \varphi^{-1} \]
is just
\[ \theta \circ \varphi^{-1}(x) = x - \pi \]
Exercise 2.6. Prove that the 2-sphere, S² = {(x, y, z) | x² + y² + z² = 1}, is a
manifold.

Exercise 2.7. Prove that the 2-torus, T² = {(x, y) | 0 ≤ x < 1, 0 ≤ y < 1} with
the periodic boundary conditions (0, y) = (1, y) and (x, 0) = (x, 1), is a manifold.

Exercise 2.8. Show that a cylinder is a manifold.

Exercise 2.9. Show that a Möbius strip is a manifold.
which shows that T_a T_b = T_{a+b} and T_a⁻¹ = T_{−a}. Closure is guaranteed as long
as each of the three parameters ranges over all real numbers, and associativity
follows from the associativity of addition (or, equivalently, the associativity of
matrix multiplication).
Isotropy is independence of direction in space, equivalent to invariance un-
der rotations. The simplest way to characterize rotations is as transformations
preserving lengths. Therefore, any rotation,
x′ = R(x)
must satisfy
x′ · x′ = R(x) · R(x) = x · x
for all vectors x. Such relations are often most usefully expressed in terms of
index notation:
x′_i = Σ_{j=1}^{3} R_ij x_j

Σ_{i=1}^{3} x′_i x′_i = Σ_{i=1}^{3} ( Σ_{j=1}^{3} R_ij x_j ) ( Σ_{k=1}^{3} R_ik x_k ) = Σ_{i=1}^{3} x_i x_i
or

T_imn v_m w_n + a_i = S_ij u_j

but not

T_ijj v_j w_j + a_i = S_im u_m

because using the same dummy index twice in the term T_ijj v_j w_j means we
don't know whether to sum v_j with the second or the third index of T_ijk. We
will employ the Einstein summation convention throughout the remainder of
the book, with one further modification occurring in Section 5.
Defining the transpose of R_ij by

Rᵗ_ij = R_ji

this becomes

Rᵗ_ji x_j R_ik x_k = x_i x_i

Index notation has the advantage that we can rearrange terms as long as we
don't change the index relations. We may therefore write

Rᵗ_ji R_ik x_j x_k = x_i x_i

and

x_i x_i = x_j δ_jk x_k = δ_jk x_j x_k
Since there are no unpaired, or "free", indices, it doesn't matter that we have i
on the left and j, k on the right. These "dummy" indices can be anything we
like, as long as we always have exactly two of each kind in each term. In fact, we
can change the names of dummy indices any time it is convenient. For example,
it is correct to write

x_i x_i = x_j x_j

or

x′_i = R_ij x_j = R_ik x_k
We now have

Rᵗ_ji R_ik x_j x_k = δ_jk x_j x_k

This expression must hold for all x_j and x_k. But notice that the product x_j x_k is
really a symmetric matrix, the outer product of the vector with itself,

          ( x²  xy  xz )
x_j x_k = ( yx  y²  yz )
          ( zx  zy  z² )
Beyond symmetry, this matrix is arbitrary.
Now, we see that the double sum of Rᵗ_ji R_ik − δ_jk with an arbitrary symmetric
matrix,

( Rᵗ_ji R_ik − δ_jk ) x_j x_k = 0

must vanish. The sum would vanish automatically for any purely antisymmetric
matrix in parentheses, so it is the symmetric part of the matrix in parentheses
that must vanish:

( Rᵗ_ji R_ik − δ_jk ) + ( Rᵗ_ki R_ij − δ_kj ) = 0

Combining terms and dividing by two we finally reach the conclusion that R
times its transpose must be the identity:

Rᵗ_ji R_ik = δ_jk
RᵗR = 1
Notice that we may use the formal notation for matrix multiplication whenever
the indices are in the correct positions for normal matrix multiplication, i.e.,
with adjacent indices summed. Thus,

M_ij N_jk = S_ik

is equivalent to MN = S, while

M_ji N_jk = S_ik

is equivalent to MᵗN = S.
Returning to the problem of rotations, we have shown that the transformation
matrix R is a rotation if and only if

RᵗR = 1

Since R must be invertible, we may multiply by R⁻¹ on the right to show that

Rᵗ = R⁻¹
Exercise 2.10. Show that the orthogonal transformations form a group. You
may use the fact that matrix multiplication is associative.
Exercise 2.11. Show that the orthogonal transformations form a Lie group.
Remark 2. We need to show that the elements of the rotation group form a
manifold. This requires a 1-1, onto mapping between a neighborhood of any
given group element and a region of R³. To begin, find a neighborhood of the
identity. Let R be a 3-dim matrix satisfying RᵗR = 1 such that

R = 1 + A

where A is another matrix with all of its elements small, |A_ij| ≪ 1 for all
i, j = 1, 2, 3. Then we may expand

1 = RᵗR = (1 + Aᵗ)(1 + A)
  = 1 + Aᵗ + A + AᵗA
  ≅ 1 + Aᵗ + A

so that, to first order,

Aᵗ = −A
The matrix A is therefore antisymmetric, and may be written as

        (  0   a   b )
A_ij =  ( −a   0   c )
        ( −b  −c   0 )
Next, consider a neighborhood of an arbitrary group element R₀. Multiplying R₀
by an element R = 1 + A near the identity gives

RR₀ = (1 + A) R₀
    = R₀ + AR₀

Since the components of R₀ are bounded, the components of AR₀ are both
bounded and proportional to one or more of a, b, c. We may therefore make the
components of A sufficiently small that |(AR₀)_ij| ≪ 1, providing the required
neighborhood.
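The expansion in the remark can be checked numerically; the parameter values below are arbitrary small numbers chosen for illustration. The deviation of (1 + A) from orthogonality is exactly AᵗA, second order in the small parameters:

```python
import numpy as np

a, b, c = 1e-4, -2e-4, 3e-4            # small parameters (arbitrary)
A = np.array([[ 0.0,  a,   b ],
              [-a,    0.0, c ],
              [-b,   -c,   0.0]])       # antisymmetric: A.T == -A

R = np.eye(3) + A                       # candidate element near the identity
err = R.T @ R - np.eye(3)               # deviation from R^t R = 1

# Since A^t = -A, the first-order terms cancel and err = A^t A exactly:
assert np.allclose(err, A.T @ A)
assert np.abs(err).max() < 1e-6         # second order in a, b, c
```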
and Rb + a is a 3-vector.
2.4. The construction of Euclidean 3-space from the Euclidean group
The Euclidean group is a Lie group, and therefore a manifold. This means
that starting from the dimensionality of space and its local properties of homogeneity
and isotropy, we have constructed a manifold. Unfortunately, points
on the Euclidean group manifold are specified by six parameters, not three:
the Euclidean group is six dimensional. There is a simple way to reduce this
dimension, however, and there are important ways of elaborating the procedure
to arrive at more complicated objects, including the curved spaces of general
relativity and the higher dimensional spaces of Hamiltonian mechanics.
To recover 3-dim space, we first identify the isotropy subgroup of the Euclidean
group. In principle this could be almost any subgroup, but we want it to be
a group that leaves points fixed. The rotation group does this, but the translation
group does not. The idea is to identify all points of the group manifold
that differ by isotropy. That is, any two points of the 6-dimensional Euclidean
group manifold that differ by a pure rotation will be regarded as identical.
As we show below, the result in the present case is simply Euclidean 3-space.
Why go to all this trouble to construct a space which was an obvious choice from
the start? The answer is that the world is not described by Euclidean 3-space,
but the technique we use here generalizes to more complicated symmetries. When
we carefully analyze the assumptions of our measurement theory, we will find
additional symmetries which should be taken into account. What we are doing
here is illustrating a technique for moving from local symmetries of measurement
to possible arenas for dynamics. Ultimately, we will find the most relevant arenas
to be curved and of more than three dimensions. Still, the procedure we outline
here will let us deduce the correct spaces.
Returning to the problem of rotationally equivalent points, it is easy to see
which points these are. The points of the Euclidean group are labeled by the
six parameters in matrices of the form
            ( R  a )
p(R, a)  =  ( 0  1 )
We define a point in Euclidean space to be the set of all p(R, a) that are related
by a rotation. Thus, two points p(R, a) and p′(R′, a′) are regarded as equivalent
if there exists a rotation

       ( R″  0 )
R″  =  ( 0   1 )

such that

p′(R′, a′) = R″ p(R, a)
To see what Euclidean space looks like, we choose one representative from each
equivalence class. To do this, start with an arbitrary point, p, and apply an
arbitrary rotation to get

        ( R″  0 ) ( R  a )
R″ p =  ( 0   1 ) ( 0  1 )

        ( R″R  R″a )
     =  ( 0     1  )

In particular, choosing R″ = R⁻¹ gives a representative whose rotation part is
the identity and whose translation part is

R⁻¹ a = x

Since R is invertible, there is exactly one x for each a: each is simply some
triple of numbers. This means that distinct equivalence classes are labeled by
matrices of the form

( 1  x )
( 0  1 )
which are obviously in 1-1 relationship with triples of numbers, x = (x₁, x₂, x₃).
The points of our homogeneous, isotropic 3-dim space are now in 1-1 correspondence
with triples of numbers, called coordinates. It is important to distinguish
between a manifold and its coordinate charts, and a vector space. Thus, we
speak of R3 when we mean a space of points and V 3 when we wish to talk of
vectors. It is the source of much confusion that these two spaces are essentially
identical, because most of the generalizations of classical mechanics require us
to keep close track of the difference between manifolds and vector spaces. When
we want to discuss the 2-sphere, S², as a spherical space of points labeled by two
angles (θ, φ), it is clear that (θ, φ) is a coordinate chart and not a vector. Thus,
assigning vectors to points on S 2 must be done separately from specifying the
space itself. As we treat further examples for which points and vectors differ,
the distinction will become clearer, but it is best to be mindful of the difference
from the start. When in doubt, remember that the 2-sphere is a homogeneous,
isotropic space!
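The quotient construction above can be sketched in code. The 4×4 block form and the label x = R⁻¹a follow the construction in the text; the particular rotations and translation used are arbitrary test values:

```python
import numpy as np

def embed(R, a):
    # The 4x4 block matrix p(R, a) with R in the upper-left, a in the last column.
    p = np.eye(4)
    p[:3, :3] = R
    p[:3, 3] = a
    return p

def label(p):
    # Equivalence-class label x = R^{-1} a of a group element p(R, a).
    R, a = p[:3, :3], p[:3, 3]
    return np.linalg.inv(R) @ a

def rot_z(t):
    return np.array([[np.cos(t), -np.sin(t), 0.0],
                     [np.sin(t),  np.cos(t), 0.0],
                     [0.0,        0.0,       1.0]])

p = embed(rot_z(0.3), np.array([1.0, 2.0, 3.0]))

# Multiplying on the left by a pure rotation changes (R, a) but not the label:
q = embed(rot_z(1.1), np.zeros(3)) @ p
assert np.allclose(label(p), label(q))
```

Because the label is invariant under left multiplication by pure rotations, it depends only on the equivalence class, exactly as the 1-1 correspondence with triples x requires.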
3.2. Curves, lengths and extrema
So far, all we know about our Euclidean space is that we can place its points in
1-1 correspondence with triples of numbers,
x = (x₁, x₂, x₃)
and that the properties of the space are invariant under translations and rotations.
We proceed to the definition of a curve.
3.2.1. Curves
Intuitively, we imagine a curve drawn in the plane, winding peacefully from
point A to point B. Now assign a monotonic parameter to points along the curve
so that as the parameter increases, the specified point moves steadily along the
curve. In Cartesian coordinates in the plane, this amounts to assigning a number
λ to each point (x, y) of the curve, giving two functions (x(λ), y(λ)). We may
write:

C(λ) = (x(λ), y(λ))

Sometimes it is useful to think of C as a map. If the parameter λ is chosen so
that (x(0), y(0)) are the coordinates of the initial point A, and (x(1), y(1)) are
the coordinates of the point B, then λ lies in the interval [0, 1] and the curve C
is a map from [0, 1] into the plane:

C : [0, 1] → R²

Notice that a parameter λ easily specifies the entire curve even if the curve loops
around and crosses itself. The alternative procedure of specifying, say, y as a
function of x, breaks down for many curves.
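A figure-eight curve (an illustrative example, not one from the text) shows why the parametric description is the right one: the curve crosses itself at the origin, so y is not a function of x, yet the map C(λ) is perfectly well-defined.

```python
import numpy as np

def C(lam):
    # A figure eight traced once as lam runs over [0, 1].
    t = 2 * np.pi * lam
    return np.array([np.sin(2 * t), np.sin(t)])

# The curve passes through the origin at two different parameter values,
# which no single-valued function y(x) could describe:
assert np.allclose(C(0.0), [0.0, 0.0])
assert np.allclose(C(0.5), [0.0, 0.0])
```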
We therefore make the brief definition:

Definition 3.1. A curve is a map from the reals into a manifold,

C : R → M
This suffices. Suppose we have a curve that passes through some point p
of our manifold. Around p there is some open set that is mapped by a set of
coordinates into a region of R³,

p ∈ U
φ : U → V ⊂ R³

More concretely, φ(p) is some triple of numbers in R³, and composing φ with
the curve gives coordinate functions xⁱ(λ).
Definition 3.2. A differentiable curve is one for which the functions xⁱ(λ) are
differentiable. A smooth curve is one for which the functions xⁱ(λ) are infinitely
differentiable.
3.2.2. Lengths
In keeping with Newtonian measurement theory, we need a means of choosing
particular curves as the path of motion of a particle. Taking our cue from
Galileo, we want to describe "uniform motion." This should mean something
like covering equal distances in equal times, and this brings us to a notion of
distance.

The simplest idea of distance is given by the Pythagorean theorem for triangles,

a² + b² = c²
Applying this to pairs of points, if two points in the plane are at positions (x₁, y₁)
and (x₂, y₂) then the distance between them is

l = √( (x₂ − x₁)² + (y₂ − y₁)² )

We may generalize this immediately to three dimensions.
Exercise. Generalize the distance formula to n dimensions.
Perhaps more important for our discussion of curves is the specialization of
this formula to infinitesimal separation of the points. For two infinitesimally
nearby points, the proper distance between them is

ds = √( dx² + dy² + dz² )

This form of the Pythagorean theorem allows us to compute the length of an
arbitrary curve by adding up the lengths of infinitesimal bits of the curve.
Consider a curve, C(λ) with φ ∘ C(λ) = xⁱ(λ). In going from λ to λ + dλ,
the change in coordinates is

dxⁱ = (dxⁱ/dλ) dλ
so the length of the curve from λ = 0 to λ = 1 is

l₀₁ = ∫₀¹ ds(λ)

    = ∫₀¹ √( g_ij (dxⁱ/dλ)(dxʲ/dλ) ) dλ
which is an ordinary integral. This integral gives us the means to assign an
unbiased meaning to Galileo's idea of uniform motion. We have, in principle, a
positive number l₀₁ for each curve from p(0) to p(1). Since these numbers are
bounded below, there exist one or more curves of shortest length. This shortest
length is the infimum of the numbers l₀₁(C) over all curves C.
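The length integral can be approximated by summing chord lengths between closely spaced parameter values; the quarter circle below is an illustrative test case (not from the text) whose exact length, π/2, is known:

```python
import numpy as np

def length(x, y, n=200000):
    # Approximate the integral of ds = sqrt(x'^2 + y'^2) d(lam) over [0, 1]
    # by the total length of a fine polygonal approximation.
    lam = np.linspace(0.0, 1.0, n)
    xs, ys = x(lam), y(lam)
    return np.sum(np.hypot(np.diff(xs), np.diff(ys)))

# Quarter circle of radius 1, parametrized by lam in [0, 1]:
l = length(lambda s: np.cos(np.pi * s / 2),
           lambda s: np.sin(np.pi * s / 2))
assert np.isclose(l, np.pi / 2, atol=1e-6)
```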
For the next several sections we will address the following question: Which
curve C has the shortest length?
To answer this question, we begin with a simplified case. Consider the class
of curves in the plane, given in Cartesian coordinates:

C(λ) = (x(λ), y(λ))

The length of this curve between λ = 0 and λ = 1 is

s = ∫₀¹ ds

  = ∫₀¹ √( dx² + dy² )

  = ∫₀¹ √( (dx/dλ)² + (dy/dλ)² ) dλ
If the curve always has finite slope with respect to x, so that dx/dλ never vanishes,
we can choose λ = x as the parameter. Then the curve may be written in terms
of a single function, y(λ),

C(λ) = (λ, y(λ))

with length

s = ∫₀¹ √( 1 + (dy/dλ)² ) dλ
We begin by studying this example.
Most functionals that arise in physics take the form of integrals over some
function of the curve,

f[x] = ∫₀¹ L( x, dx/dλ, d²x/dλ², …, dⁿx/dλⁿ ) dλ

where L is a known function of n variables. Our simple example of the length
of a curve given above takes this form with

L(y) = √( 1 + (y′)² )

While such integrals are uncommon, nonlocal functions are not. For example, we will later
consider the well-known iterative map

x_{n+1} = a ( 1 + b x_n² )
we would have to restrict h(λ) to be everywhere nonzero and carefully specify
its uniform approach to zero. To avoid this, we change the argument a little bit
and look at just the numerator. Let h(x) be any continuous, bounded function
that vanishes at the endpoints, x_A and x_B, but is otherwise arbitrary, and set

y(x) = y₀(x) + h(x)

where y₀(x) is some fixed function. For the moment, let h(x) be considered
small. Then we can look at the difference

f[y(x)] = f[y₀(x)] + ( δf[y(x)]/δy(x) ) h(x) + …
Let's see how this works with our length functional, s[y(λ)]:

s = ∫₀¹ dλ √( 1 + (y′)² )

Since the integrand is an ordinary function, we can expand the square root.
This works as long as we treat h′ as small,

√( 1 + (y₀′ + h′)² ) = √( 1 + (y₀′)² + 2y₀′h′ + (h′)² )

                     = √( 1 + (y₀′)² ) √( 1 + ( 2y₀′h′ + (h′)² ) / ( 1 + (y₀′)² ) )

                     = √( 1 + (y₀′)² ) ( 1 + y₀′h′ / ( 1 + (y₀′)² ) + ··· )    (3.2)
where we omit terms of order (h′)² and higher. At this order,

δs = ∫₀¹ dλ √( 1 + (y₀′)² ) ( 1 + y₀′h′ / ( 1 + (y₀′)² ) − 1 )

   = ∫₀¹ dλ y₀′h′ / √( 1 + (y₀′)² )
The expression depends on the derivative of the arbitrary function h, but there's
no reason not to integrate by parts:

δs_AB = ∫₀¹ dλ [ d/dλ ( y₀′h / √( 1 + (y₀′)² ) ) − h d/dλ ( y₀′ / √( 1 + (y₀′)² ) ) ]

      = y₀′h / √( 1 + (y₀′)² ) |_{λ=1} − y₀′h / √( 1 + (y₀′)² ) |_{λ=0}

        − ∫₀¹ dλ h(λ) d/dλ ( y₀′ / √( 1 + (y₀′)² ) )
This relation must hold for any sufficiently small function h(x). Since h(x) represents
a small change, δy, in the function y(λ), we are justified in writing

δs = − ∫₀¹ dλ δy d/dλ ( y₀′ / √( 1 + (y₀′)² ) )

so the functional derivative we want, δs/δy, seems to be close at hand. But how do
we get at it? We still have δs written as an integral and it contains an arbitrary
function, δy = h(λ). How can we extract information about y(λ)? In particular,
what is δs/δy?
First, consider the problem of curves of extremal length. For such curves we
expect δs/δy, and therefore δs, to vanish for any variation δy. Therefore, demand

− ∫₀¹ dλ δy d/dλ ( y₀′ / √( 1 + (y₀′)² ) ) = 0
But the point λ₀ was arbitrary, so we can drop the λ = λ₀ condition: the
argument holds at every point of the interval. Therefore the function y₀(λ) that
makes the length extremal must satisfy

d/dλ ( y₀′ / √( 1 + (y₀′)² ) ) = 0    (3.9)

This is just what we need, if it really gives the condition we want! Integrating
once gives

y₀′ / √( 1 + (y₀′)² ) = c = const.    (3.10)

so that

y₀′ = ± c / √( 1 − c² ) ≡ b = const.

y₀(λ) = a + bλ    (3.11)

Finally, with initial conditions y₀(0) = y_A and y₀(1) = y_B,

y_A = a
y_B = a + b    (3.12)

we find

y₀(λ) = y_A + (y_B − y_A) λ    (3.13)

This is the unique straight line connecting (0, y_A) and (1, y_B).
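A numerical check (not in the text; the endpoint values and perturbation shapes are arbitrary choices): perturbing the straight line by any function vanishing at the endpoints increases the length, as the extremal condition predicts.

```python
import numpy as np

def curve_length(y, n=20001):
    # Polygonal approximation to the arc length of (lam, y(lam)) on [0, 1].
    lam = np.linspace(0.0, 1.0, n)
    return np.sum(np.hypot(np.diff(lam), np.diff(y(lam))))

yA, yB = 0.0, 1.0
straight = lambda lam: yA + (yB - yA) * lam

# sin(pi*lam) vanishes at both endpoints, as the variation h must:
for eps in [0.3, -0.2, 0.05]:
    wiggled = lambda lam, e=eps: straight(lam) + e * np.sin(np.pi * lam)
    assert curve_length(wiggled) > curve_length(straight)
```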
Exercise 3.3. Find the form of the function x(λ) that makes the variation, δf,
of each of the following functionals vanish:

1. f[x] = ∫₀^λ ẋ² dλ where ẋ = dx/dλ.

2. f[x] = ∫₀^λ ( ẋ² + x² ) dλ

3. f[x] = ∫₀^λ ( ẋⁿ − xᵐ ) dλ

4. f[x] = ∫₀^λ ( ẋ²(λ) − V(x(λ)) ) dλ where V(x) is an arbitrary function of
x(λ).

5. f[x] = ∫₀^λ ẋ² dλ where x = (x, y, z) and ẋ² = ẋ² + ẏ² + ż². Notice that
there are three functions to be determined here: x(λ), y(λ) and z(λ).
3.3.2. Formal definition of functional differentiation

There are several reasons to go beyond the simple variation of the previous
section. First, we have only extracted a meaningful result in the case of vanishing
variation. We would like to define a functional derivative that exists even when
it does not vanish! Second, we would like a procedure that can be applied
iteratively to define higher functional derivatives. Third, the procedure can
become difficult or impossible to apply for functionals more general than curve
length. It is especially important to have a definition that will generalize to field
theory. Finally, the added rigor of a formal definition allows careful proofs of
the properties of functionals, including generalizations to a complete functional
calculus.
In this section we look closely at the calculation of the previous section to
formally develop the functional derivative, the generalization of differentiation
to functionals. We will give a general definition sufficient to find δf/δx for any
functional of the form

f[x] = ∫₀¹ L( x, dx/dλ, d²x/dλ², …, dⁿx/dλⁿ ) dλ
One advantage of treating variations in this more formal way is that we can
equally well apply the technique to relativistic systems and classical field theory.
There are two principal points to be clarified:

1. We would like to allow fully arbitrary variations, h(λ).

2. We require a rigorous way to remove the integral, even when the variation,
δf, does not vanish.
We treat these points in turn.
such that at extrema, δf[x(λ)]/δx(λ) = 0 is equivalent to vanishing variation of f[x].
Let f be given by

f[x(λ)] = ∫₀¹ L( x(λ), x⁽¹⁾(λ), …, x⁽ⁿ⁾(λ) ) dλ    (3.16)

and think of x(λ, α) as a function of two variables for any given h(λ). Equivalently,
we may think of x(λ, α) as a 1-parameter family of functions, one for each
value of α. Then, although f[x(λ) + αh(λ)] is a functional of x(λ) and h(λ), it
is a simple function of α:

f[x(λ, α)] = ∫ L( x(λ, α), x⁽¹⁾(λ, α), …, x⁽ⁿ⁾(λ, α) ) dλ

           = ∫ L( x(λ) + αh(λ), x⁽¹⁾ + αh⁽¹⁾, … ) dλ    (3.18)
Differentiating with respect to α and setting α = 0,

df/dα |_{α=0} = ∫ ( (∂L(x(λ, α))/∂x) h + … + (∂L(x(λ, α))/∂x⁽ⁿ⁾) h⁽ⁿ⁾ ) |_{α=0} dλ

= ∫ ( (∂L(x(λ))/∂x) h + (∂L(x(λ))/∂x⁽¹⁾) h⁽¹⁾ + … + (∂L(x(λ))/∂x⁽ⁿ⁾) h⁽ⁿ⁾ ) dλ
Notice that all dependence of L on h has dropped out, so the expression is linear
in h(λ) and its derivatives.

Now continue as before. Integrate the h⁽¹⁾ and higher derivative terms by
parts. We assume that h(λ) and its first n − 1 derivatives all vanish at the
endpoints. The kth term becomes

∫₀¹ ( ∂L/∂x⁽ᵏ⁾ ) |_{x(λ,α)=x(λ)} h⁽ᵏ⁾(λ) dλ = (−1)ᵏ ∫₀¹ dλ ( dᵏ/dλᵏ ∂L/∂x⁽ᵏ⁾ ) |_{x=x(λ)} h(λ)
so we have

δf[x(λ)] = ( d/dα f[x(λ, α)] ) |_{α=0}

= ∫₀¹ ( ∂L(x(λ))/∂x − d/dλ ∂L(x(λ))/∂x⁽¹⁾

  + … + (−1)ⁿ dⁿ/dλⁿ ∂L(x(λ))/∂x⁽ⁿ⁾ ) h(λ) dλ    (3.21)

where now, h(λ) is fully arbitrary.
Exercise 3.4. Fill in the details of the preceding derivation, paying particular
attention to the integrations by parts.
The Dirac delta function. To do this correctly, we need yet another class
of new objects: distributions. Distributions are objects whose integrals are well-defined
though their values may not be. They may be identified with infinite
sequences of functions. Therefore, distributions include functions, but include
other objects as well.
The development of the theory of distributions is similar in some ways to the
definition of real numbers as limits of Cauchy sequences of rational numbers.
For example, we can think of π as the limit of the sequence of rationals

3, 31/10, 314/100, 3141/1000, 31415/10000, 314159/100000, …

By considering all sequences of rationals, we move to the larger class of numbers
called the reals. Similarly, we can consider sequences of functions,
f₁(x), f₂(x), …

The limit of such a sequence is not necessarily a function. For example, the
sequence of functions

(1/2) e^{−|x|}, e^{−2|x|}, (3/2) e^{−3|x|}, …, (m/2) e^{−m|x|}, …

vanishes at nonzero x and diverges at x = 0 as m → ∞. Nonetheless, the
integrals of these functions over (−∞, ∞) are independent of m:

∫_{−∞}^{∞} (m/2) e^{−m|x|} dx = m ∫₀^{∞} e^{−mx} dx

= −e^{−mx} |₀^{∞}

= 1
Let f(x) be any function. The Dirac delta function, δ(x), is defined by the
property

∫ f(x) δ(x) dx = f(0)

Heuristically, we can think of δ(x) as being zero everywhere but where its argument
(in this case, simply x) vanishes. At the point x = 0, its value is sufficiently
divergent that the integral of δ(x) is still one.
Formally we can define δ(x) as the limit of a sequence of functions such
as the f_m(x) above, but note that there are many different sequences that give the same
properties. To prove that a sequence such as the Gaussian one works, we must show that

lim_{m→∞} ∫ f(x) √(m/2π) e^{−m x²/2} dx = f(0)

for an arbitrary function f(x). Then it is correct to write

δ(x) = lim_{m→∞} √(m/2π) e^{−m x²/2}

This proof is left as an exercise.
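Short of the analytic proof, one can check the defining property numerically; the test function below is an arbitrary smooth choice (not from the text):

```python
import numpy as np

def delta_seq(x, m):
    # The Gaussian sequence sqrt(m/2pi) exp(-m x^2 / 2).
    return np.sqrt(m / (2 * np.pi)) * np.exp(-m * x**2 / 2)

f = lambda x: np.cos(x) + x**2      # an arbitrary smooth test function
x = np.linspace(-10.0, 10.0, 400001)
dx = x[1] - x[0]

# Riemann sums of f(x) * delta_seq(x, m) for increasing m:
vals = [float(np.sum(f(x) * delta_seq(x, m)) * dx) for m in (10, 100, 1000)]

# The integrals approach f(0) = 1 as m grows.
assert abs(vals[-1] - 1.0) < 1e-3
```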
We don't usually need a specific sequence to solve problems using delta functions,
because they are only meaningful under integral signs and they are very
easy to integrate using their defining property. In physics they are extremely
useful for making "continuous" distributions out of discrete ones. For example,
suppose we have a point charge Q located at x₀ = (x₀, y₀, z₀). Then we can write
a corresponding charge density as

ρ(x) = Q δ³(x − x₀) = Q δ(x − x₀) δ(y − y₀) δ(z − z₀)
Exercise 3.5. Perform the following integrals over Dirac delta functions:

1. ∫ f(x) δ(x − x₀) dx

2. ∫ f(x) δ(ax) dx (Hint: By integrating over both expressions, show that
δ(ax) = (1/|a|) δ(x). Note that to integrate δ(ax) you need to do a change of
variables.)

3. Evaluate ∫ f(x) δ⁽ⁿ⁾(x) dx where

δ⁽ⁿ⁾(x) = dⁿδ(x)/dxⁿ

4. ∫ f(x) δ(x² − x₀²) dx (This is tricky!)
The definition at last! Using the idea of a sequence of functions and the
Dirac delta function, we can now extract the variation. So far, we have defined
the variation as

δf[x(λ)] ≡ ( d/dα f[x(λ, α)] ) |_{α=0}    (3.22)
where x(λ, α) = x(λ) + αh(λ), and shown that the derivative on the right
reduces to

δf[x(λ)] = ∫₀¹ ( ∂L(x(λ))/∂x − d/dλ ∂L(x(λ))/∂x⁽¹⁾

  + … + (−1)ⁿ dⁿ/dλⁿ ∂L(x(λ))/∂x⁽ⁿ⁾ ) h(λ) dλ    (3.23)

Now choose a sequence of functions h_m(λ − β), peaked about λ = β, such that

lim_{m→∞} h_m(λ − β) = δ(λ − β)
Since we have chosen h_m(λ − β) to approach a Dirac delta function, and
since nothing except h_m(λ − β) on the right hand side of eq.(3.24) depends on
m, we have

δf[x(β)]/δx(β) ≡ lim_{m→∞} [ d/dα f[x_m(λ, α, β)] ]_{α=0}

= ∫₀¹ ( ∂L(x(λ))/∂x − … + (−1)ⁿ dⁿ/dλⁿ ∂L(x(λ))/∂x⁽ⁿ⁾ ) lim_{m→∞} h_m(λ − β) dλ

= ∫₀¹ ( ∂L(x(λ))/∂x − … + (−1)ⁿ dⁿ/dλⁿ ∂L(x(λ))/∂x⁽ⁿ⁾ ) δ(λ − β) dλ

= ∂L(x(β))/∂x − … + (−1)ⁿ [ dⁿ/dλⁿ ∂L(x(λ))/∂x⁽ⁿ⁾ ]_{λ=β}

Demanding an extremum,

δf[x(λ)]/δx(λ) = 0

we recover the same result as we got from vanishing variation. The condition
for the extremum is the generalized Euler-Lagrange equation,
∂L/∂x |_{x(λ)} − d/dλ ( ∂L/∂x⁽¹⁾ ) |_{x(λ)} + ··· + (−1)ⁿ dⁿ/dλⁿ ( ∂L/∂x⁽ⁿ⁾ ) |_{x(λ)} = 0    (3.26)

This equation is the central result of this chapter. Consider, for a moment, what
we have accomplished. Any functional of the form

S[x(t)] = ∫_C L(x, ẋ, …) dt
The integrand L of the action functional is called the Lagrangian. In general,
once we choose a Lagrangian, L(x, ẋ, …), eq.(3.26) determines the preferred
paths of motion associated with L. What we now must do is to find a way to
associate particular forms of L with particular physical systems in such a way
that the extremal paths are the paths the physical system actually follows.
In subsequent chapters we will devote considerable attention to the generalized
Euler-Lagrange equation. But before trying to associate any particular
form for L with particular physical systems, we will prove a number of general
results. Most of these have to do with symmetries and conservation laws. We
shall see that many physical laws hold by virtue of very general properties of S,
and not the particular form. Indeed, there are cases where it is possible to write
infinitely many distinct functionals which all have the same extremals! Until we
reach our discussion of gauge theory, we will have no reason to prefer any one of
these descriptions over another. Until then we take the pragmatic view that any
functional whose extremals give the correct equation of motion is satisfactory.
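For simple Lagrangians the Euler-Lagrange condition can be checked with a computer algebra system. The sketch below uses SymPy's euler_equations (an outside tool, not part of the text) on the arc-length Lagrangian from earlier in the chapter, and confirms that the extremal condition reduces to y″ = 0, the straight lines found above:

```python
from sympy import Function, Rational, simplify, sqrt, symbols
from sympy.calculus.euler import euler_equations

lam = symbols('lam')
y = Function('y')
yp = y(lam).diff(lam)

# Arc-length Lagrangian L = sqrt(1 + y'^2)
L = sqrt(1 + yp**2)
(eq,) = euler_equations(L, [y(lam)], [lam])

# Clearing the (1 + y'^2)^(3/2) denominator, the condition is just -y'' = 0.
cleared = simplify(eq.lhs * (1 + yp**2)**Rational(3, 2))
residual = simplify(cleared + y(lam).diff(lam, 2))
assert residual == 0
```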
Exercise 3.11. Using the formal definition, eq.(3.25), find the functional derivative
of each of the following functionals.

1. f[x] = ∫₀ᵗ ẋ² dt where ẋ = dx/dt.

2. f[x] = ∫₀ᵗ ( ẋ²(t) − V(x(t)) ) dt where V(x) is an arbitrary function of x(t).
Exercise 3.12. Define the functional Taylor series by

δf = ∫ dλ ( δf/δx ) δx

+ (1/2) ∫∫ dλ₁ dλ₂ ( δ²f / δx(λ₂)δx(λ₁) ) δx(λ₁) δx(λ₂) + ···

and let f[x(λ)] = ∫ L(x, x′) dλ. Using the techniques of this section, show that
the first and second functional derivatives are given by

δf/δx = ∂L/∂x − d/dλ ∂L/∂x′

δ²f / δx(λ₁)δx(λ₂) = ( ∂²L/∂x∂x − d/dλ₁ ∂²L/∂x∂x′ ) δ(λ₁ − λ₂)

− ( d/dλ₁ ∂²L/∂x′∂x′ ) ∂δ(λ₁ − λ₂)/∂λ₁

− ( ∂²L/∂x′∂x′ ) ∂²δ(λ₁ − λ₂)/∂λ₁²
Remark 3. To define higher order derivatives, we can simply carry the expansion
of L to higher order.

S = ∫ L(x, x′) dλ

δS = ∫ L(x + δx, x′ + δx′) dλ − ∫ L(x, x′) dλ

= ∫ [ L(x, x′) + (∂L/∂x) δx + (∂L/∂x′) δx′ + (1/2!) (∂²L/∂x∂x) (δx)²

  + (∂²L/∂x∂x′) δx δx′ + (1/2!) (∂²L/∂x′∂x′) (δx′)² − L(x, x′) ] dλ

= ∫ [ (∂L/∂x) δx + (∂L/∂x′) δx′ + (1/2!) (∂²L/∂x∂x) (δx)²

  + (∂²L/∂x∂x′) δx δx′ + (1/2!) (∂²L/∂x′∂x′) (δx′)² ] dλ
Now, to integrate each of the various terms by parts, we need to insert some
delta functions. Look at one term at a time:

∫ (∂L/∂x′) δx′ dλ = − ∫ ( d/dλ ∂L/∂x′ ) δx dλ

I₂ = (1/2!) ∫ dλ (∂²L/∂x∂x) (δx)²

   = (1/2) ∫∫ dλ₁ dλ₂ δ(λ₁ − λ₂) (∂²L/∂x∂x) δx(λ₁) δx(λ₂)
#
2L
I3 = d. -x-x#
xx#
# #
1
= d.1 d.2 - (.1 ! .2 )
2
2L
" #
(-x (.1) -x# (.2 ) + -x (.2 ) -x# (.1 ))
xx
# # ' ' (
1 d 2L
= ! d.1 d.2 - (.1 ! .2 )
2 d.2 xx#
' ((
d 2L
+ - (. 1 ! . 2 ) -x (.1 ) -x (.2 )
d.1 xx#
# # '
1 2L
= ! d.1 d.2 - (.1 ! .2)
2 .2 xx#
51
2L
+ - (.1 ! .2 )
.1 xx# (
d 2L
+- (. 1 ! . 2 ) -x (.1 ) -x (.2 )
d.1 xx#
# # ' (
1 d 2L
= ! d.1 d.2 - (.1 ! .2 ) -x (.1 ) -x (.2 )
2 d.1 xx#
and "nally,
#
1 2L 2
I4 = d. # #
(-x#)
2! x (.) x (.)
# #
1 2L
= d.1 d.2 - (.1 ! .2 ) # -x# (.1) -x# (.2 )
2 x (.2 ) x# (.1 )
# # ' (
1 2L
= d.1 d.2 - (. 1 ! . 2 ) # -x (.1 ) -x (.2 )
2 .1 .2 x (.1 ) x# (.2)
# # ' (
1 2L
= d.1 d.2 - (.1 ! .2) # # -x (.1 ) -x (.2 )
2 .1 .2 x x
# # ' 2
(
1 2L
= d.1 d.2 - (.1 ! .2) # # -x (.1 ) -x (.2 )
2 .1 .2 x x
# # ' (
1 d 2L
+ d.1 d.2 - (. 1 ! . 2 ) -x (.1 ) -x (.2 )
2 .2 d.1 x#x#
# # ' (
1 2 2L
= d.1 d.2 ! 2 - (.1 ! .2) # # -x (.1 ) -x (.2 )
2 . x x
# # ' 1 (
1 d 2L
+ d.1 d.2 ! - (. 1 ! . 2 ) -x (.1) -x (.2)
2 .1 d.1 x#x#
Combining,

δS = ∫ dλ ( ∂L/∂x − d/dλ ∂L/∂x′ ) δx

+ (1/2) ∫∫ dλ₁ dλ₂ δ(λ₁ − λ₂) (∂²L/∂x∂x) δx(λ₁) δx(λ₂)

− (1/2) ∫∫ dλ₁ dλ₂ δ(λ₁ − λ₂) ( d/dλ₁ ∂²L/∂x∂x′ ) δx(λ₁) δx(λ₂)

+ (1/2) ∫∫ dλ₁ dλ₂ ( −∂²δ(λ₁ − λ₂)/∂λ₁² ) (∂²L/∂x′∂x′) δx(λ₁) δx(λ₂)

+ (1/2) ∫∫ dλ₁ dλ₂ ( −∂δ(λ₁ − λ₂)/∂λ₁ ) ( d/dλ₁ ∂²L/∂x′∂x′ ) δx(λ₁) δx(λ₂)

= ∫ dλ ( ∂L/∂x − d/dλ ∂L/∂x′ ) δx

+ (1/2) ∫∫ dλ₁ dλ₂ [ δ(λ₁ − λ₂) ( ∂²L/∂x∂x − d/dλ₁ ∂²L/∂x∂x′ )

− ( ∂δ(λ₁ − λ₂)/∂λ₁ ) ( d/dλ₁ ∂²L/∂x′∂x′ )

− ( ∂²δ(λ₁ − λ₂)/∂λ₁² ) (∂²L/∂x′∂x′) ] δx(λ₁) δx(λ₂)
This agrees with DeWitt when there is only one function x. Setting

δS = ∫ dλ ( δS/δx ) δx + (1/2) ∫∫ dλ₁ dλ₂ ( δ²S / δx(λ₁)δx(λ₂) ) δx(λ₁) δx(λ₂) + ···

we identify:

δS/δx = ∂L/∂x − d/dλ ∂L/∂x′

δ²S / δx(λ₁)δx(λ₂) = ( ∂²L/∂x∂x − d/dλ₁ ∂²L/∂x∂x′ ) δ(λ₁ − λ₂)

− ( d/dλ₁ ∂²L/∂x′∂x′ ) ∂δ(λ₁ − λ₂)/∂λ₁

− ( ∂²L/∂x′∂x′ ) ∂²δ(λ₁ − λ₂)/∂λ₁²

Third and higher order derivatives may be defined by extending this procedure.
The result may also be found by taking two independent variations from the
start.
infinite limit of the product of finitely many normal integrals. For more difficult
functional integrals, there are many approximation methods. Since functional
integrals have not appeared in classical mechanics, we do not treat them further
here. However, they do provide one approach to quantum mechanics, and play
an important role in quantum field theory.
T′ = ΛT
S′ = SΛ⁻¹
for some tensors of each type, then it is immediate that combining such a pair
gives a scalar, or invariant quantity,

S′T′ = SΛ⁻¹ΛT = ST

It is also immediate that tensor equations are covariant. This means that the
form of tensor equations does not change when the system is transformed. Thus,
if we arrange any tensor equation to have the form

T = 0

then in any other frame it takes the same form,

T′ = ΛT = 0
from properties like L_z; invariant quantities like L can be physical, but coordinate-dependent
quantities like L_z cannot.
There are many kinds of objects that contain physical information. You
are probably most familiar with numbers, functions and vectors. A function of
position such as temperature has a certain value at each point, regardless of how
we label the point. Similarly, we can characterize vectors by their magnitude
and direction relative to other vectors, and this information is independent of
coordinates. But there are other objects that share these properties.
A′_i = M_ij A_j
B′_i = M_ij B_j
C′_i = N_ij C_j + a_i

Notice that C_i does not transform in the same way as A_i and B_i, nor
does it transform homogeneously. It makes mathematical sense to postulate a
relationship between A_i and B_i such as

A_i = λB_i

since the transformed objects then satisfy

A′_i = λB′_i

But the relations

A_i = λC_i

and

A′_i = λC′_i

are not equivalent. Indeed, the second is equivalent to

M_ij A_j = λ ( N_ij C_j + a_i )
"1
or, multiplying both sides by Mmi ,
"1 "1
Mmi Mij Aj = %Mmi (Nij Cj + ai)
"1 "1
Am = %Mmi Nij Cj + %Mmi ai
so that unless N = M "1 and a = 0, the two expression are quite different.
e₁₂₃ = 1

ε_ijk = e_ijk
It is easy to check that this expression is correct by noting that in order for
the left hand side to be nonzero, each of the sets (ijk ) and (lmn) must take
a permutation of the values (123) . Therefore, since i must be 1 or 2 or 3, it
must be equal to exactly one of l, m or n. Similarly, j must equal one of the
remaining two indices, and k the third. The right hand side is therefore a list
of the possible ways this can happen, with appropriate signs. Alternatively,
noting that specifying a single component of a totally antisymmetric object
determines all of the remaining components, we can argue that any two totally
antisymmetric tensors must be proportional. It is then easy to establish that
the proportionality constant is 1.
Exercise 4.1. Prove that eq.(4.1) is correct by the second method. First, show
that the right side of eq.(4.1) is antisymmetric under any pair exchange of the
indices (ijk) and also under any pairwise exchange of (lmn). This allows us to
write

ε_ijk ε_lmn = λ ( δ_il δ_jm δ_kn + δ_im δ_jn δ_kl + δ_in δ_jl δ_km

− δ_il δ_jn δ_km − δ_in δ_jm δ_kl − δ_im δ_jl δ_kn )

for some constant λ. Show that λ = 1 by setting l = i, m = j and n = k on both
sides and finding the resulting triple sum on each side.
We can now produce further useful results. If, instead of summing over all
three pairs of indices, we only sum over one, we find that

ε_ijk ε_imn = δ_ii δ_jm δ_kn + δ_im δ_jn δ_ki + δ_in δ_ji δ_km

− δ_ii δ_jn δ_km − δ_in δ_jm δ_ki − δ_im δ_ji δ_kn

= 3δ_jm δ_kn + δ_km δ_jn + δ_jn δ_km − 3δ_jn δ_km − δ_kn δ_jm − δ_jm δ_kn

= δ_jm δ_kn − δ_jn δ_km    (4.2)
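These index identities involve only finitely many components, so a short script can verify them directly (a numerical sketch, not part of the text); it also checks the bac-cab rule discussed below on random vectors:

```python
import numpy as np

# Levi-Civita symbol as a 3x3x3 array
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0   # even permutations of (1,2,3)
    eps[k, j, i] = -1.0  # odd permutations

delta = np.eye(3)

# eps_ijk eps_imn = delta_jm delta_kn - delta_jn delta_km   (eq. 4.2)
lhs = np.einsum('ijk,imn->jkmn', eps, eps)
rhs = (np.einsum('jm,kn->jkmn', delta, delta)
       - np.einsum('jn,km->jkmn', delta, delta))
assert np.allclose(lhs, rhs)

# eps_ijk eps_ijn = 2 delta_kn   (Exercise 4.3)
assert np.allclose(np.einsum('ijk,ijn->kn', eps, eps), 2 * delta)

# bac-cab on random vectors: a x (b x c) = b (a.c) - c (a.b)
rng = np.random.default_rng(0)
a, b, c = rng.normal(size=(3, 3))
assert np.allclose(np.cross(a, np.cross(b, c)), b * (a @ c) - c * (a @ b))
```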
Since the cross product may be written in terms of the Levi-Civita tensor as

[u × v]_i = ε_ijk u_j v_k

this identity gives a simple way to reduce multiple cross products.
Exercise 4.2. Prove that the components of the cross product may be written
as

[u × v]_i = ε_ijk u_j v_k
then use the identity of eq.(4.2) to prove the "bac-cab" rule:

a × (b × c) = b (a · c) − c (a · b)
Exercise 4.3. Prove that

ε_ijk ε_ijn = 2δ_kn
When we generalize these results to arbitrary coordinates, we will find that
the Levi-Civita tensor e_ijk requires a multiplicative function. When we generalize
the Levi-Civita tensor to higher dimensions it is always a maximally
antisymmetric tensor, and therefore unique up to an overall multiple. Thus, in
d-dimensions, the Levi-Civita tensor has d indices, e_{i₁i₂…i_d}, and is antisymmetric
under all pair interchanges. In any dimension, a pair of Levi-Civita tensors may
be rewritten in terms of antisymmetrized products of Kronecker deltas.
so we can pull out the time derivative. Let the body rotate with angular velocity
ω, where the direction of ω is along the axis of rotation according to the right-hand
rule. Then, since the density is independent of time,

N = d/dt ∫ ρ (r × v) d³x

= d/dt ∫ ρ ( r × (ω × r) ) d³x

= d/dt ∫ ρ ( ω r² − r (ω · r) ) d³x
Here's the important part: we can separate the dynamics from the physical
properties of the rigid body if we can get ω out of the integral, because then
the integral will depend only on intrinsic properties of the rigid body. We can
accomplish the separation if we write the torque in components:

N_i = d/dt ∫ ρ (r × v)_i d³x
    = d/dt ∫ ρ (ω_i r² − r_i (ω_j r_j)) d³x
    = d/dt ∫ ρ (δ_{ij} ω_j r² − r_i (ω_j r_j)) d³x
    = dω_j/dt ∫ ρ (δ_{ij} r² − r_i r_j) d³x
Notice how we inserted an identity matrix, using ω_i = δ_{ij} ω_j, to get the
indices on both factors of ω_j to be the same. Now define the moment of inertia
tensor,

I_{ij} ≡ ∫ ρ (δ_{ij} r² − r_i r_j) d³x
This is nine separate equations, one for each value of i and each value of j.
However, since the moment of inertia tensor is symmetric, I_{ij} = I_{ji}, only
six of these numbers are independent. Once we have computed each of them, we
can work with the much simpler expression,

N_i = I_{ij} dω_j/dt

The sum on the right is the normal way of taking the product of a matrix on a
vector. The product is the angular momentum vector,

L_i = I_{ij} ω_j
and we have, simply,

N_i = dL_i/dt
We can also write the rotational kinetic energy of a rigid body in terms of the
moment of inertia tensor,

T = ½ ∫ v² dm
  = ½ ∫ (ω × r) · (ω × r) ρ(x) d³x

Working out the messy product using the cyclic property of the triple product,
a · (b × c) = c · (a × b), we find (ω × r) · (ω × r) = ω · (r × (ω × r)).
Therefore,

T = ½ ∫ ω · (r × (ω × r)) ρ(x) d³x
  = ½ ω_i ω_j ∫ (δ_{ij} r² − r_i r_j) ρ(x) d³x
  = ½ I_{ij} ω_i ω_j
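The whole construction can be checked numerically. The sketch below (our illustration, not the book's example) approximates I_{ij} on a grid for a solid cube of side 2a with constant density ρ = 1, compares it to the analytic value (16/3) a⁵ δ_{ij} for this cube, and forms L_i = I_{ij} ω_j and T = ½ I_{ij} ω_i ω_j:

```python
import numpy as np

# Midpoint-rule discretization of I_ij = ∫ ρ (δ_ij r² − r_i r_j) d³x
# over a solid cube [-a, a]³ with ρ = 1.
a, n = 1.0, 60
s = (np.arange(n) + 0.5) * (2 * a / n) - a     # midpoint sample points
dV = (2 * a / n) ** 3
X, Y, Z = np.meshgrid(s, s, s, indexing='ij')
r = np.array([X, Y, Z])
r2 = X**2 + Y**2 + Z**2

I = np.empty((3, 3))
for i in range(3):
    for j in range(3):
        I[i, j] = (((i == j) * r2 - r[i] * r[j]) * dV).sum()

# Analytic result for this cube: I_ij = (16/3) a^5 δ_ij
assert np.allclose(I, (16.0 / 3.0) * a**5 * np.eye(3), atol=1e-2)

omega = np.array([0.2, -1.0, 0.5])   # arbitrary angular velocity
L = I @ omega                        # L_i = I_ij ω_j
T = 0.5 * omega @ I @ omega          # rotational kinetic energy
assert np.isclose(T, 0.5 * omega @ L)
```

Because the cube is symmetric, the off-diagonal integrals ∫ρ r_i r_j d³x vanish and I_{ij} comes out proportional to the identity, exactly as the symmetry argument predicts.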
The metric tensor. We have already used the Pythagorean formula to specify
the length of a curve, but the usual formula works only in Cartesian
coordinates. However, it is not difficult to find an expression valid in any
coordinate system. Indeed, we already know the squared separation of points in
Cartesian, polar and spherical coordinates,

ds² = dx² + dy² + dz²
ds² = dρ² + ρ² dφ² + dz²
ds² = dr² + r² dθ² + r² sin²θ dφ²

respectively. Notice that all of these forms are quadratic in coordinate
differentials. In fact, this must be the case for any coordinates. For suppose we
have coordinates y^i given as arbitrary invertible functions of a set of
Cartesian coordinates x^i,

y^i = y^i(x^j)
x^i = x^i(y^j)

Then the differentials are related by

dx^i = (∂x^i/∂y^j) dy^j

The squared separation is therefore

ds² = dx² + dy² + dz²
    = dx^i dx^i
    = (∂x^i/∂y^j)(∂x^i/∂y^k) dy^j dy^k
    = g_{jk} dy^j dy^k

where in the last step we define the metric tensor, g_{jk}, by

g_{jk} ≡ (∂x^i/∂y^j)(∂x^i/∂y^k)
Using the metric tensor, we can now write the distance between nearby points
regardless of the coordinate system. For two points with coordinate separations
dx^i, the distance between them is

ds = √(g_{ij} dx^i dx^j)

We have already introduced the metric as a way of writing the infinitesimal
line element in arbitrary coordinates,

ds² = g_{ij} dx^i dx^j

We show below that the metric also characterizes the inner product of two
vectors:

u · v = g_{ij} u^i v^j

The metric is a rank two covariant tensor.
Exercise 4.4. Find the metric tensor for:

1. cylindrical coordinates.

2. spherical coordinates.
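The definition g_{jk} = (∂x^i/∂y^j)(∂x^i/∂y^k) can be checked directly. The sketch below (our illustration, not part of the text) builds the Jacobian of the spherical-to-Cartesian map by finite differences and recovers the spherical metric diag(1, r², r² sin²θ):

```python
import numpy as np

def cart(y):
    """Cartesian coordinates x^i as functions of spherical y = (r, θ, φ)."""
    r, th, ph = y
    return np.array([r * np.sin(th) * np.cos(ph),
                     r * np.sin(th) * np.sin(ph),
                     r * np.cos(th)])

def metric(y, h=1e-6):
    """g_jk = (∂x^i/∂y^j)(∂x^i/∂y^k), Jacobian by central differences."""
    J = np.empty((3, 3))
    for j in range(3):
        dy = np.zeros(3)
        dy[j] = h
        J[:, j] = (cart(y + dy) - cart(y - dy)) / (2 * h)
    return J.T @ J          # sum over the Cartesian index i

y = np.array([2.0, 0.7, 1.2])          # an arbitrary point (r, θ, φ)
r, th, _ = y
expected = np.diag([1.0, r**2, (r * np.sin(th))**2])
assert np.allclose(metric(y), expected, atol=1e-6)
```

The same `metric` function with a different `cart` map reproduces the cylindrical metric diag(1, ρ², 1), which is one way to check an answer to the exercise.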
dS_i = n_i d²x

then the numbers p_i are just the forces per unit area (i.e., pressures) in each
of the three independent directions. Any off-diagonal elements of P_{ij} are
called stresses. These are due to components of the force that are parallel
rather than perpendicular to the surface, which therefore tend to produce
shear, shifting parallel area elements along one another.

Thus, when P_{12} is nonzero, there is an x-component to the force on a surface
whose normal is in the y direction. But now consider P_{21}, which gives the
y-component of the force on a similar surface, with normal in the x direction.
The z-component of the torque produced on a cube of side 2a by these two forces
together, about the center of the cube, is
The moment of inertia of the infinitesimal cube, taking the density constant,
is

I_{ij} = (1/12) ρ a⁵ ( 1 0 0 )
                     ( 0 1 0 )
                     ( 0 0 1 )
4.2. Vectors

The simplest tensors are scalars, which are the measurable quantities of a
theory, left invariant by symmetry transformations. By far the most common
non-scalars are the vectors, also called rank-1 tensors. Vectors hold a
distinguished position among tensors; indeed, tensors must be defined in terms
of vectors. The reason for their importance is that, while tensors are those
objects that transform linearly and homogeneously under a given set of
transformations, we require vectors in order to define the action of the
symmetry in the first place. Thus, vectors cannot be defined in terms of their
transformations.

In the next subsection, we provide an axiomatic, algebraic definition of
vectors. Then we show how to associate two distinct vector spaces with points
of a manifold. Somewhat paradoxically, one of these vector spaces is called the
space of vectors while the other is called the space of 1-forms. Fortunately,
the existence of a metric on the manifold allows us to relate these two spaces
in a 1-1, onto way. Moreover, the metric allows us to define an inner product
on each of the two vector spaces. Therefore, we discuss the properties of
metrics in some detail.
After the geometric description of vectors and forms, we turn to
transformations of vectors. Using the action of a group on a vector space to
define a linear representation of the group, we are finally able to define
outer products of vectors and give a general definition of tensors in terms of
their transformation properties.
2. Scalar identity: 1v = v
4. Distributive 1: (a + b) v = av + bv
5. Distributive 2: a (u + v) = au + av
All of the familiar properties of vectors follow from these. An important
example is the existence of a basis for any finite dimensional vector space. We
prove this in several steps as follows.

First, define linear dependence. A set of vectors {v_i | i = 1, ..., n} is
linearly dependent if there exist numbers {a_i | i = 1, ..., n}, not all of
which are zero, such that the sum a_i v_i vanishes,

a_i v_i = 0

B = {v_i | i = 1, ..., n}

Then, since every set with n + 1 elements is linearly dependent, the set

{u} ∪ B = {u, v_i | i = 1, ..., n}
Exercise 4.5. Prove that two vectors are equal if and only if their components
are equal.
Notice that we have chosen to write the labels on the basis vectors as
subscripts, while we write the components of a vector as superscripts. This
choice is arbitrary, but leads to considerable convenience later. Therefore, we
will carefully maintain these positions in what follows.

Often vector spaces are given an inner product. An inner product on a vector
space is a symmetric bilinear mapping from pairs of vectors to the relevant
field, F,

g : V × V → F

Here the Cartesian product V × V means the set of all ordered pairs of vectors,
(u, v), and bilinear means that g is linear in each of its two arguments.
Symmetric means that g(u, v) = g(v, u).

There are a number of important consequences of inner products.

Suppose we have an inner product which gives a nonnegative real number
whenever the two vectors it acts on are identical:

g(v, v) = s² ≥ 0

where the equal sign holds if and only if v is the zero vector. Then g is a
norm or metric on V; it provides a notion of length for each vector. If the
inner product satisfies the triangle inequality,

g(u + v, u + v) ≤ g(u, u) + g(v, v)

so if we know how g acts on the basis vectors, we know how it acts on any pair
of vectors. We can summarize this knowledge by defining the matrix

g_{ij} ≡ g(v_i, v_j)
Now, we can write the inner product of any two vectors as

g(u, v) = a^i g_{ij} b^j = g_{ij} a^i b^j

It's OK to think of this as sandwiching the metric, g_{ij}, between a row
vector a^i on the left and a column vector b^j on the right. However, index
notation is more powerful than the notions of row and column vectors, and in
the long run it is more convenient to just note which sums are required. A
great deal of computation can be accomplished without actually carrying out
sums. We will discuss inner products in more detail in later Sections.
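The "sandwich" g_{ij} a^i b^j can be made concrete with a small numerical sketch (ours, not the text's; the particular basis vectors are arbitrary choices). With g_{ij} = g(v_i, v_j) built from an ordinary dot product, the component formula agrees with the dot product of the reconstructed vectors:

```python
import numpy as np

# A non-orthonormal basis in the plane (an arbitrary illustrative choice)
v1 = np.array([1.0, 0.0])
v2 = np.array([1.0, 1.0])            # not orthogonal to v1
basis = [v1, v2]

# g_ij = g(v_i, v_j), taking g to be the ordinary dot product
g = np.array([[vi @ vj for vj in basis] for vi in basis])

a = np.array([2.0, -1.0])            # components a^i of u in this basis
b = np.array([0.5, 3.0])             # components b^j of v

# Sandwich form a^i g_ij b^j versus the Cartesian dot product
u_cart = a[0] * v1 + a[1] * v2
v_cart = b[0] * v1 + b[1] * v2
assert np.isclose(a @ g @ b, u_cart @ v_cart)
```

The point of the index notation is visible here: once g_{ij} is known, every inner product is a double sum over components, with no further reference to the basis vectors themselves.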
Given a space together with functions and curves, there are two ways to
associate a space of vectors with each point. We will call these two spaces
forms and vectors, respectively, even though both are vector spaces in the
algebraic sense. The existence of two distinct vector spaces associated with
a manifold leads us to introduce some new notation. From now on, vectors
from the space of vectors will have components written with a raised index,
v^i, while the components of forms will be written with the index lowered, ω_i.
The convention is natural if we begin by writing the indices on coordinates in
the raised position and think of a derivative with respect to the coordinates as
having the index in the lowered position. The benefits of this convention will
quickly become evident. As an additional aid to keeping track of which vector
space we mean, whenever practical we will name forms with Greek letters and
vectors with Latin letters.
The definitions are as follows:

Definition 4.1. A form is defined for each function as a linear mapping from
curves into the reals. The vector space of forms is denoted V_*.

Definition 4.2. A vector is defined for each curve as a linear mapping from
functions into the reals. The vector space of vectors is denoted V^*.

Here's how it works. For a form, start with a function and think of the form,
φ_f, as the differential of the function, φ_f = df. Thus, for each function we
have a form. The form is defined as a linear mapping on curves, φ_f : C → ℝ.
We can think of the linear mapping as integration along the curve C, so

φ_f(C) = ∫_C df = f(C(1)) − f(C(0))
This argument shows that integrals of the differentials of functions are forms,
but the converse is also true: any linear mapping from curves to the reals may
be written as the integral of the differential of a function. The proof is as
follows.
Let φ be a linear map from differentiable curves to the reals, and let the
curve C be parameterized by s ∈ [0, 1]. Break C into N pieces, C_i,
parameterized by s ∈ [(i−1)/N, i/N], for i = 1, 2, ..., N. By linearity, φ(C)
is given by

φ(C) = Σ_{i=1}^{N} φ(C_i)

By the differentiability (hence continuity) of φ, we know that φ(C_i) maps C_i
to a bounded interval in ℝ, say (a_i, b_i), of length |b_i − a_i|. As we
increase N, each of the numbers |b_i − a_i| tends monotonically to zero so that
the value of φ(C_i) approaches arbitrarily close to the value a_i. We may
therefore express φ(C) by

φ(C) = lim_{N→∞} Σ_{i=1}^{N} φ(C_i)
     = lim_{N→∞} Σ_{i=1}^{N} ∫_{C_i} (a_i / |b_i − a_i|) ds
so that the linear map on curves is the integral of a differential.

For vectors, we start with the curve and think of the corresponding vector
as the tangent to the curve. But this "tangent vector" isn't an intrinsic
object: straight arrows don't fit into curved spaces. So for each curve we
define a vector as a linear map from functions to the reals, namely, the
directional derivative of f along the curve C. The directional derivative can
be defined just using C(λ):

v(f) = lim_{δλ→0} [f(C(λ + δλ)) − f(C(λ))] / δλ

It is straightforward to show that the set of directional derivatives forms a
vector space. In coordinates, we're used to thinking of the directional
derivative as just

v · ∇f

and this is just right if we replace v by the tangent vector, dx^i/dλ:

v(f) = (dx^i/dλ) (∂f/∂x^i)

We can abstract v as the differential operator

v = (dx^i/dλ) (∂/∂x^i)

and think of dx^i/dλ as the components of v and ∂/∂x^i as a set of basis
vectors.
For both forms and vectors, the linear character of integration and
differentiation guarantees the algebraic properties of vector spaces, while the
usual chain rule applied to the basis vectors,

dx^i = (∂x^i/∂y^k) dy^k
∂/∂x^i = (∂y^k/∂x^i) (∂/∂y^k)

together with the coordinate invariance of the formal symbols ω and v, shows
that the components of ω and v transform to a new coordinate system y^i(x^k)
according to

v'^k = dy^k/dλ = (∂y^k/∂x^m)(dx^m/dλ) = (∂y^k/∂x^m) v^m
ω'_k = ∂f/∂y^k = (∂x^m/∂y^k)(∂f/∂x^m) = (∂x^m/∂y^k) ω_m
Since the Jacobian matrix, J^k_m, and its inverse, J̄^k_m, are given by

J^k_m = ∂y^k/∂x^m
J̄^k_m = ∂x^k/∂y^m

we can write the transformation laws for vectors and forms as

v'^k = J^k_m v^m        (4.3)
ω'_k = J̄^m_k ω_m        (4.4)
Exercise 4.7. Prove that the set of directional derivatives satisfies the
algebraic definition of a vector space.

Exercise 4.8. Prove that the set of forms satisfies the algebraic definition of
a vector space.
There is a natural duality between the coordinate basis for vectors and the
coordinate basis for forms. We define the bracket between the respective basis
vectors by

⟨∂/∂x^j, dx^i⟩ = δ^i_j

This induces a linear map from V^* × V_* into the reals,

⟨ , ⟩ : V^* × V_* → ℝ

given by

⟨v, ω⟩ = ⟨v^j ∂/∂x^j, ω_i dx^i⟩
       = v^j ω_i ⟨∂/∂x^j, dx^i⟩
       = v^j ω_i δ^i_j
       = v^i ω_i

The bracket is coordinate invariant, for if we transform the components of v
and ω,

⟨v', ω'⟩ = v'^i ω'_i
        = (J^i_m v^m)(J̄^n_i ω_n)
        = J̄^n_i J^i_m v^m ω_n
        = δ^n_m v^m ω_n
        = v^m ω_m
        = ⟨v, ω⟩
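The cancellation of the Jacobian against its inverse is easy to see numerically. The sketch below (our illustration) applies the transformation laws (4.3) and (4.4) with an arbitrary invertible matrix standing in for the Jacobian, and checks that the pairing ⟨v, ω⟩ = v^i ω_i is unchanged:

```python
import numpy as np

# A generic invertible matrix standing in for the Jacobian J^k_m = ∂y^k/∂x^m
rng = np.random.default_rng(0)
J = rng.normal(size=(3, 3)) + 3 * np.eye(3)
Jbar = np.linalg.inv(J)               # J̄^m_k = ∂x^m/∂y^k

v = rng.normal(size=3)                # vector components v^m
w = rng.normal(size=3)                # form components ω_m

v_new = J @ v                         # v'^k = J^k_m v^m        (4.3)
w_new = Jbar.T @ w                    # ω'_k = J̄^m_k ω_m        (4.4)

# <v', ω'> = <v, ω>: the dummy-index Jacobians cancel
assert np.isclose(v_new @ w_new, v @ w)
```

In matrix language the invariance is just (Jv)ᵀ(J̄ᵀω) = vᵀ(J̄J)ᵀω = vᵀω, the same cancellation δ^n_m that appears in the index calculation above.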
4.3. The metric

We have already introduced the metric as a line element, giving the distance
between infinitesimally separated points on a manifold. This definition may be
used to define an inner product on vectors and forms as well. Recall that we
may write a vector as the differential operator

v = (dx^i/dλ) (∂/∂x^i)

with components dx^i/dλ and ∂/∂x^i as a set of basis vectors. Then rewriting
the line element as

ds² = g_{ij} (dx^i/dλ)(dx^j/dλ) dλ²
(ds/dλ)² = g_{ij} (dx^i/dλ)(dx^j/dλ)        (4.5)

shows us that g_{ij} provides a norm on vectors. Formally, g is a linear map
from a vector into the reals,

g : A → ℝ

where

A = (dx^i(λ)/dλ) (∂/∂x^i) |_P

We generalize this to arbitrary pairs of vectors as follows.
We generalize this to arbitrary pairs of vectors as follows.
74
An inner product on the space of vectors at a point on a manifold is a symmetric,
bilinear mapping
g : (A, B) R
g (A, B) = Ai B j gij
Since each basis set can be expanded in terms of a coordinate basis,

⟨e_j^m ∂/∂x^m, e_n^i dx^n⟩ = δ^i_j

It follows that the matrix e_n^i giving the form basis in terms of the
coordinate differentials is inverse to the matrix giving the dual basis for
vectors.

Now consider an arbitrary vector, v ∈ V^*, and form, ω ∈ V_*. The duality
relation becomes a map,

⟨v, ω⟩ = ⟨v^j e_j, ω_i e^i⟩
       = v^j ω_i ⟨e_j, e^i⟩
       = v^j ω_i δ^i_j
       = v^i ω_i

Using this map together with the metric, we define a unique 1-1 relationship
between vectors and forms. Let w be an arbitrary vector. The 1-form ω
corresponding to w is defined by demanding

g(v, w) = ⟨v, ω⟩        (4.6)
g_{ij} v^i w^j = v^i ω_i

In order for this to hold for all vectors v^i, the components of the form ω
must be related to those of w by

ω_i = g_{ij} w^j

Formally, we are treating the map

g : (V^*, V^*) → ℝ

as a mapping from V^* to ℝ by leaving one slot empty:

g(v, ·) : V^* → ℝ
Now equating these and using the relationship between vectors and forms,

g_{ik} u^k g_{jl} v^l g(e^i, e^j) = u^k v^l g_{kl}

This bit of notation will prove extremely useful. Notice that g_{ij} is the
only matrix whose inverse is given this special notation. It allows us to write

g^{ij} g_{jk} = δ^i_k = g_{kj} g^{ji}

and, as we shall see, makes the relationship between forms and vectors quite
transparent. Continuing, we multiply both sides of our expression by two
inverse metrics:

g^{mk} g^{nl} g_{ik} g_{jl} g(e^i, e^j) = g^{mk} g^{nl} g_{kl}
δ^m_i δ^n_j g(e^i, e^j) = g^{mk} δ^n_k
g(e^m, e^n) = g^{mn}
This establishes the action of the metric on the basis. The inner product of
two arbitrary 1-forms follows immediately:

g(ω₁, ω₂) = ω_{1i} ω_{2j} g(e^i, e^j) = g^{ij} ω_{1i} ω_{2j}

and if ω₁ and ω₂ are the forms corresponding to vectors u and v,

g(ω₁, ω₂) = g(u, v)

To summarize, we have the relations

⟨v, ω⟩ = v^i ω_i
w^i = g^{ij} ω_j
ω_i = g_{ij} w^j
g(u, v) = u^i v^j g_{ij}
g(ω₁, ω₂) = g^{ij} ω_{1i} ω_{2j}

Exercise 4.10. Suppose the components of a certain vector field in polar
coordinates are given by v^i = (ρ sin φ, −cos φ). Find the components of the
corresponding form, ω_i. What is the duality invariant, ⟨v, ω⟩? What is the
norm, g(v, v)?
index is repeated, once up and once down, we perform a sum. This still leads
to equations with some doubled indices and some free indices, such as

T_{ij} v^j = ω_i

Using the metric and inverse metric, we may raise and lower free indices,

v^i = g^{ij} v_j
v_i = g_{ij} v^j
When we do this, the only distinction between forms and vectors is the position
of the index. We can "raise" and "lower" free indices in any equation by using
the inverse metric and the metric. Let's look at the torque equation for an
example. The angular momentum is the inner product of the moment of inertia
tensor with the angular velocity, so the equation requires a metric (in the
previous section we assumed Cartesian coordinates, so the metric was just the
identity matrix). Therefore we have

L^i = I^{ij} g_{jk} ω^k = I^i_k ω^k

where we have used the metric to lower the second index on I^{ij}. Then

N^i = d/dt (I^i_k ω^k)
    = I^i_k dω^k/dt

where we assume the moment of inertia is constant in time. If we multiply the
entire equation by g_{mi}, it becomes

g_{mi} N^i = g_{mi} I^i_k dω^k/dt
N_m = I_{mk} dω^k/dt

The two forms of the equation are completely equivalent since we can always
return the indices to their original positions. The same principles work with
each index of any tensor, regardless of the rank. For example, if we have a
tensor of type (4, 3) with components T^{ij}_{kl}{}^{mn}_o, we can convert it
to a tensor of type (5, 2) by raising any of the indices k, l or o:

T^{ijs}_l{}^{mn}_o = g^{sk} T^{ij}_{kl}{}^{mn}_o
T^{ij}_k{}^{smn}_o = g^{sl} T^{ij}_{kl}{}^{mn}_o
T^{ij}_{kl}{}^{mns} = g^{so} T^{ij}_{kl}{}^{mn}_o
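Index raising and lowering is just matrix multiplication by g_{ij} or g^{ij}, which a short sketch makes explicit (our illustration; the component values are arbitrary):

```python
import numpy as np

# Polar-coordinate metric g_ij = diag(1, ρ², 1) and its inverse g^ij
rho = 2.0
g = np.diag([1.0, rho**2, 1.0])
g_inv = np.linalg.inv(g)

assert np.allclose(g_inv @ g, np.eye(3))     # g^ij g_jk = δ^i_k

v_up = np.array([3.0, 0.5, -1.0])            # contravariant components v^i
v_down = g @ v_up                            # v_i = g_ij v^j
assert np.allclose(g_inv @ v_down, v_up)     # raising again returns v^i

# Lowering the second index of a hypothetical I^ij, as in
# L^i = I^ij g_jk ω^k = I^i_k ω^k: grouping does not matter.
I_upup = np.diag([1.0, 2.0, 3.0])
omega_up = np.array([0.1, 0.2, 0.3])
L1 = I_upup @ (g @ omega_up)                 # I^ij (g_jk ω^k)
L2 = (I_upup @ g) @ omega_up                 # (I^i_k) ω^k
assert np.allclose(L1, L2)
```

Because g^{ij} is the matrix inverse of g_{ij}, lowering an index and then raising it (or vice versa) always returns the original components, which is why the two forms of the torque equation carry identical content.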
Exercise 4.12. The examples of Section (2.2) are now correctly written as

T_{ijk} v^j w^k + ω_i = S_{ij} u^j

This now represents a relationship between certain 1-forms. Rewrite the
expression as a relationship between vectors.
g_{ij} = g(e_i, e_j)
       = e_i^m e_j^n g(∂/∂x^m, ∂/∂x^n)
       = e_i^m e_j^n g_{mn}

Then, since g_{mn} is invertible, none of the a_i vanish and we can define

e_i = (1/√|a_i|) E_i

Then the norm of each e_i is either +1 or −1, and the inner product takes the
form

η_{ab} = g(e_a, e_b) = diag(1, ..., 1, −1, ..., −1)
where there are p positive and q negative elements, with p + q = d. We will
reserve the symbol η_{ab} for such orthonormal frames, and will distinguish two
kinds of indices. Letters from the beginning of the alphabet, a, b, c, ...,
will refer to an orthonormal basis, while letters from the middle of the
alphabet, i, j, k, ..., will refer to a coordinate basis.

The relationship between an orthonormal frame (or pseudo-orthonormal if q
is nonzero) and a coordinate frame therefore appears as

e_a = e_a^m ∂/∂x^m

The form basis dual to an orthonormal vector basis is given by

e^a = e_m^a dx^m

The coordinate metric and the orthonormal form of the metric are then related
by

g_{mn} = e_m^a e_n^b η_{ab}
η_{ab} = e_a^m e_b^n g_{mn}        (4.7)

Exercise 4.14. Let g^{mn} denote the inverse to g_{mn}, so that
g^{in} g_{nm} = δ^i_m. Similarly, let η^{ab} be inverse to η_{ab}. Show that

g^{mn} = η^{ab} e_a^m e_b^n
η^{ab} = e_m^a e_n^b g^{mn}
Exercise 4.15. The polar basis vectors (ρ̂, φ̂, k̂) form an orthonormal set,
and since the line element is

ds² = dρ² + ρ² dφ² + dz²

the metric in polar coordinates is

g_{mn} = diag(1, ρ², 1)

1. Find the inverse metric.

2. Express the orthonormal basis e_a = (ρ̂, φ̂, k̂) in terms of the coordinate
basis (∂/∂ρ, ∂/∂φ, ∂/∂z).
For the second part, write

e_a = e_a^m ∂/∂x^m

where

η^{ab} e_a^m e_b^n = g^{mn}

with

η_{ab} = diag(1, 1, 1)

Therefore, we may choose

e_a^m = diag(1, 1/ρ, 1)

so that

e_a = e_a^m ∂/∂x^m = (∂/∂ρ, (1/ρ) ∂/∂φ, ∂/∂z)

and therefore

ρ̂ = ∂/∂ρ
φ̂ = (1/ρ) ∂/∂φ
k̂ = ∂/∂z
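The orthonormality condition (4.7) for this frame can be checked by direct contraction (a numerical sketch of ours, not from the text):

```python
import numpy as np

# Polar coordinate metric g_mn and the frame components e_a^m chosen above
rho = 1.7
g = np.diag([1.0, rho**2, 1.0])       # g_mn = diag(1, ρ², 1)
e = np.diag([1.0, 1.0 / rho, 1.0])    # e_a^m = diag(1, 1/ρ, 1)

# η_ab = e_a^m e_b^n g_mn should be the Euclidean identity
eta = np.einsum('am,bn,mn->ab', e, e, g)
assert np.allclose(eta, np.eye(3))

# Inverse relation from Exercise 4.14: g^mn = η^ab e_a^m e_b^n
g_inv = np.einsum('ab,am,bn->mn', np.eye(3), e, e)
assert np.allclose(g_inv, np.linalg.inv(g))
```

The factor 1/ρ in e_a^m is exactly what is needed to cancel the ρ² in g_{mn}, which is the computational content of the frame condition.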
Exercise 4.16. The spherical basis vectors (r̂, θ̂, φ̂) form an orthonormal
set, and since the line element is

ds² = dr² + r² dθ² + r² sin²θ dφ²
4.4. Group representations

One important way to understand tensors is through their transformation
properties. Indeed, we will define tensors to be the set of objects which
transform linearly and homogeneously under a given set of transformations.
Thus, to understand tensors, we need to understand transformations, and in
particular, transformation groups. The most important transformation groups are
the Lie groups. To specify how a Lie group acts on tensors, we first define a
linear representation of a Lie group. Such a representation requires a vector
space.

To understand the idea of a representation, note that there are many ways to
write any given Lie group. For example, the group of translations in 1-dim may
be written as

g(a) = exp(a d/dx)

because when we act on an arbitrary function of x, g(a) just gives the Taylor
series for f(x + a) expanded about f(x):

g(a) f(x) = exp(a d/dx) f
          = Σ_{n=0}^{∞} (a^n/n!) (d^n f/dx^n)(x)
          = f(x + a)

In particular, the Taylor series for the action of g on x has only two terms,

g(a) x = exp(a d/dx) x
       = x + a

But we have already seen a quite different representation for translations.
Representing a point on the real line by a pair

(x, 1)ᵀ
so that

g(a) (x, 1)ᵀ = (x + a, 1)ᵀ

When considering tensors, we will use only linear representations of Lie
groups. A linear representation of a group is a vector space upon which the
group acts. Be aware that vector spaces can take surprising forms; both of the
examples above for the translations are linear representations. The difference
is that the first vector space is a function space, whereas the second is a
2-dim vector space. Thus, different vector spaces give rise to different
representations.

Once we choose a vector space on which a group acts, the form of the group
elements is fixed. Thus, if we choose a function space as the representation
and demand that

g(a) x = x + a        (4.8a)

then the form of g(a) is determined. To see how, notice that we may rewrite
the right side of eq.(4.8a) as

x + a = (1 + a d/dx) x
      = exp(a d/dx) x
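Both representations of the translation group can be exercised concretely. The sketch below (our illustration) applies the Taylor-series operator to the test function f(x) = x³, for which the series terminates, and applies the 2×2 matrix form to the pair (x, 1):

```python
import numpy as np
from math import factorial

a, x = 0.3, 1.5

# (1) Function-space representation: g(a) f(x) = Σ aⁿ/n! f⁽ⁿ⁾(x) = f(x + a).
#     For f(x) = x³ the derivatives at x are x³, 3x², 6x, 6, then zero.
f_derivs = [x**3, 3 * x**2, 6 * x, 6]
series = sum(a**n / factorial(n) * d for n, d in enumerate(f_derivs))
assert np.isclose(series, (x + a)**3)

# (2) Matrix representation on the pair (x, 1)
g_mat = np.array([[1.0, a],
                  [0.0, 1.0]])
assert np.allclose(g_mat @ np.array([x, 1.0]), [x + a, 1.0])
```

The two computations implement the same abstract group element g(a) on two different vector spaces: an infinite-dimensional function space in the first case, and a 2-dim space in the second.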
so the form of g is determined by its action on the basis vectors. By closure,
this action must be another vector, and therefore expressible in terms of the
basis vectors,

g(e_i) = u_{(i)} = u_{(i)}^j e_j ≡ e_j g^j_i

w^j = g^j_i v^i
4.5. Tensors

There is a great deal that is new in the notation we have just introduced, and
the reader may wonder why we need these tools. Certainly there is a great deal
of power in being able to use any coordinates, but we could probably figure out
the expressions we need on a case by case basis. However, there are some deeper
things going on. First, we are gaining access to new objects: we introduced
the moment of inertia tensor and the metric tensor, and in time will introduce
other tensors of even higher rank. Without these tools, these new objects won't
make much sense, even though the objects directly describe physical properties
of material bodies.
But there is a more important reason for learning these techniques. Over the
last 50 years, symmetry has come to play a central role in understanding the
fundamental interactions of nature and the most basic constituents of matter.
In order to study these particles and interactions, we need to work with
objects that transform in a simple way under the relevant symmetry. Thus, if we
want to use the group SU(2) to study the weak interaction, we need to be able
to write down SU(2) invariant quantities in a systematic way. Similarly, if we
want to study special relativity, we need to work with Lorentz tensors, that
is, objects which transform linearly and homogeneously under Lorentz
transformations. Knowing these, we can easily construct objects which are
Lorentz invariant using the 4-dimensional equivalent of the three dimensional
dot product. Such invariants will be the same for all observers, so we won't
need to worry about actually doing Lorentz transformations. We will have
formulated the physics in terms of quantities that can be calculated in any
frame of reference.

We are now in a position to develop tensors of arbitrary rank (0, 1, 2, ...)
and type (form, vector). We accomplish this by taking outer products of vectors
and forms. Given two vectors u and v we can define their (linear) outer
product,

u ⊗ v

If we think of u and v as directional derivatives along curves parameterized by
λ and μ respectively, then we can let the outer product act on a pair of
functions (f, g) to get

(u ⊗ v)(f, g) = (df/dλ)(dg/dμ)

so the product is a doubly linear operator,

u ⊗ v = (d/dλ) ⊗ (d/dμ)

We can also expand in a basis and think of the product as a matrix,

M = u ⊗ v
  = u^i v^j e_i ⊗ e_j

with components

[M]^{ij} = u^i v^j

This is just what we would get if we took a column-row product:

( u¹ )                  ( u¹v¹  u¹v²  u¹v³ )
( u² ) (v¹, v², v³)  =  ( u²v¹  u²v²  u²v³ )
( u³ )                  ( u³v¹  u³v²  u³v³ )
Of course, the most general 3 × 3 matrix cannot be written as u^i v^j since
there are not enough degrees of freedom. We can fix this by taking linear
combinations. To see what this means, we examine the basis elements.

Consider the basis elements, e_i ⊗ e_j. These are regarded as formally
independent, and they may be taken as a basis for matrices. If we choose an
orthonormal basis with

e₁ = ( 1 )    e₂ = ( 0 )    e₃ = ( 0 )
     ( 0 ),        ( 1 ),        ( 0 )
     ( 0 )         ( 0 )         ( 1 )

then e_i ⊗ e_j has a 1 in the ith row and the jth column and zeros everywhere
else. Linear combinations of these clearly give every possible 3 × 3 matrix.
Because the outer product is linear, we can add two or more of these matrices
together,

M = u ⊗ v + w ⊗ s + ...
  = (u^i v^j + w^i s^j + ...) e_i ⊗ e_j
  = [M]^{ij} e_i ⊗ e_j

Since the nine matrices e_i ⊗ e_j form a basis for all matrices, it is clear
that by adding together enough outer products of vectors we can construct any
matrix we like.
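The rank-counting argument above can be illustrated numerically (a sketch of ours): a single outer product has matrix rank 1, while sums over the basis e_i ⊗ e_j reconstruct an arbitrary matrix:

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
M = np.outer(u, v)                    # [M]^ij = u^i v^j, the column-row product
assert np.isclose(M[1, 2], u[1] * v[2])

# A single outer product of nonzero vectors has matrix rank 1, so it
# cannot represent the most general 3×3 matrix...
assert np.linalg.matrix_rank(M) == 1

# ...but sums of outer products of the basis vectors span all matrices:
# e_i ⊗ e_j has a 1 in row i, column j. Rebuild an arbitrary matrix A.
e = np.eye(3)
A = np.arange(9.0).reshape(3, 3)
B = sum(A[i, j] * np.outer(e[i], e[j]) for i in range(3) for j in range(3))
assert np.allclose(A, B)
```

A 3×3 matrix has nine independent entries while u^i v^j has only five independent parameters (two unit directions and a scale), which is the "not enough degrees of freedom" counted in the text.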
Exercise 4.17. Show that the set of all matrices, M = {M} , forms a vector
space. What is the dimension of M?
The vector space M is the space of second rank tensors. To prove this we
must show that they transform linearly and homogeneously under coordinate
transformations. They do because they are built linearly from vectors.
Exercise 4.18. Let M be expanded in a coordinate basis,

M = [M]^{ij} (∂/∂x^i) ⊗ (∂/∂x^j)

Treating M as invariant, show that the components of M transform with two
factors of the Jacobian matrix, i.e.,

[M']^{ij} = J^i_m J^j_n [M]^{mn}
T = u₁ ⊗ u₂ ⊗ ··· ⊗ uₙ + v₁ ⊗ v₂ ⊗ ··· ⊗ vₙ + ...

In order to keep track of the rank of tensors, we can use abstract index
notation, putting a lower case Latin index on T for each rank. A rank-n tensor
will have n labels

T^{a₁a₂...aₙ}

Keep in mind that these are labels, not indices: T^{a₁a₂...aₙ} is the same
(invariant) object as T, but with a bit of extra information attached. By
looking at the labels we immediately know that T^{a₁a₂...aₙ} is n-times
contravariant.

Alternatively, we can write the components of T in any basis we choose.
These are written using letters from further down the alphabet, or a different
alphabet altogether,

T^{m₁m₂...mₙ}

Since these objects are components in a particular basis, they change when we
change the basis. The relationship between the notations is evident if we
write

T^{a₁a₂...aₙ} = T^{m₁m₂...mₙ} e_{m₁} ⊗ e_{m₂} ⊗ ··· ⊗ e_{mₙ}

Usually one or the other form is being used at a time, so little confusion
occurs. Each value of each m_i gives a different component of the tensor, so in
3-dimensions a rank-n tensor will have 3ⁿ different components.
We will not use abstract index notation outside of this section of this book,
for two reasons. First, it is unnecessarily formal for an introductory work.
There are numerous differential geometry and general relativity books that use
the convention throughout, so that the interested reader can easily learn to
use it elsewhere. Second, we use early-alphabet and middle-alphabet letters to
distinguish between orthonormal and coordinate bases, a different distinction
that will be more important for our purposes. This convention will be
introduced in our discussion of differential forms.
When working with vectors and forms together, we can take arbitrary outer
products of both:

T = u₁ ⊗ ω₂ ⊗ ··· ⊗ uₙ + v₁ ⊗ ξ₂ ⊗ ··· ⊗ vₙ + ...

Notice that any alternation of vectors and forms must always occur in the same
order. In this example, the second position is always a form. When forms are
present, the corresponding labels or indices of T will be written as subscripts
to indicate that they are in a form basis. We need to keep the horizontal
spacing of the indices fixed so we don't lose track of which position the forms
occur in. Thus,

T^{a₁}_{a₂}{}^{a₃...aₙ} = T^{m₁}_{m₂}{}^{m₃...mₙ} e_{m₁} ⊗ e^{m₂} ⊗ e_{m₃} ⊗ ··· ⊗ e_{mₙ}

If we have a metric, so that forms and vectors correspond to the same object,
we can raise and lower any indices we like on any tensor. For example, in
components,

T^{m₁}_{m₂}{}^{m₃...mₙ} = g_{m₂n} T^{m₁n m₃...mₙ}

In general, the abstract indices are moved about in the same way as coordinate
indices. This is only confusing for a while.

The most general tensors have m contravariant and n covariant labels or
indices. They are of type (m, n), and may be regarded as multilinear mappings
from n functions and m curves to the reals. They transform with m copies of
the Jacobian matrix and n copies of its inverse.
The space of tensors of all ranks is generally large enough to encompass our
needs for physical quantities. Tensors themselves are not measurable directly
because their components change when we change coordinates. But, like the
length of a vector, we can form truly invariant combinations. Making invariant
combinations is easy because tensors transform covariantly. This is a different
use of the word covariant! All we mean here is that tensors, of any type and
rank, transform linearly and homogeneously under coordinate transformations.
Because of this, whenever we form a sum between a form-type tensor index and
a vector-type index, the "dummy" indices no longer transform: the Jacobian
of one dummy index cancels the inverse Jacobian of the other. Therefore, any
inner products of tensors transform according to the number of free indices.
To form an invariant quantity, one capable of physical measurement, we only
need to produce an expression with no free indices. For example, the rotational
kinetic energy

T = ½ I_{ij} ω^i ω^j

is coordinate invariant because it has no free indices. Any exotic combination
will do. Thus,

T^{ijk}_{mn} v^m I_{jk} R^n_i

is coordinate invariant, and, in principle, measurable.
We also consider type (m, n) tensors which have m contravariant and n covariant
indices, in some specified order. Notice how the convention for index placement
corresponds to the tensor type. When dealing with mixed tensors, it is
important to exactly maintain index order: T^a_b{}^c is a different object than
T^{ac}_b!

By having tensors at hand, it becomes easy to form quantities which, like the
dot product, are invariant under a set of transformations. It is these
invariant quantities that must constitute our physical theory, because physical
quantities cannot depend on our choice of coordinates.
The tensors we have discussed here are covariant with respect to the
diffeomorphism group in 3 dimensions. Evaluated at any point, such a
transformation is a general linear transformation, hence an element of the Lie
group GL(3). However, we may define other objects where the relevant
transformations are given by a different Lie group. In the next sections, for
example, we will consider orthonormal bases for vectors and forms. By placing
this restriction on the allowed bases, we restrict the allowed transformations
to the orthogonal group, SO(3). Numerous other choices have physical
application as well.
Part II
Motion: Lagrangian mechanics
Starting our investigation with our immediate perceptions of the world, we chose
to model the world with a 3-dimensional space, with a universal time. In order
to guarantee that the physical properties of an object do not depend on the
absolute position or orientation of the object, we asked for the space to be
homogeneous and isotropic. This led us to construct space from the Euclidean
group of translations and rotations. To be able to describe uniform motion in
the resulting space, we developed the tools of variational calculus. The result
was the generalized Euler-Lagrange equation. We now turn to a systematic
study of the generalized Euler-Lagrange equation, eq.(3.26), as a description of
motion in Euclidean 3-space. In this chapter we explore some of the properties
of Lagrangian systems which depend only on certain general properties of the
action functional.
According to the claims of the previous chapters, in order for the
Euler-Lagrange equation to have physical content, it must be a tensor. It may
seem to be sufficient for it to be a tensor with respect to the Euclidean
group, since that is the symmetry of our chosen arena. But remember that we
also need to avoid any dependence on our choice of coordinates: the laws of
motion should not depend on how we label points. For this reason, it is very
important that our description of motion be covariant with respect to the full
diffeomorphism group.
Once we have established the tensor character of the Euler-Lagrange equation,
we turn to the idea of symmetry. Defining symmetry as an invariance of the
action, we prove the Noether theorem: for every continuous symmetry of a
Lagrangian system, we can find a corresponding conserved quantity. We then
study a number of common symmetries and their corresponding conservation laws.
Because the generalized Euler-Lagrange equation conceals as much as it reveals
about these conservation laws, we begin with a restricted class of Lagrangians:
those which depend only on the path, x^i(t), and its first time derivative, the
velocity v^i = ẋ^i. Then, for completeness, we extend some of the results to
general Lagrangians.
5. Covariance of the Euler-Lagrange equation
We begin with the symmetry of sets of variational equations. Suppose we have
a functional S , expressed as an integral along a curve, C = xi (t) :
#
S [x(t)] = L (x, x, . . .) dt
C
The function L is called the Lagrangian. Suppose further that the Lagrangian L depends only on x, \dot x and t, but not on higher derivatives. Then the generalized Euler-Lagrange equation, eq.(3.26), reduces to the Euler-Lagrange equation,
\[ \frac{\partial L}{\partial x^k} - \frac{d}{dt}\frac{\partial L}{\partial \dot x^k} = 0 \tag{5.1} \]
dropping the surface term in the final step, since the variation is taken to vanish at the endpoints. Now compare what we get by varying S[x(q^i(t), t)] with respect to q^i(t):
\[ 0 = \delta S = \delta \int_C L\left(x(q,t), \dot x(q,\dot q,t)\right) dt \]
\[ = \int_C \left( \frac{\partial L}{\partial x^k}\frac{\partial x^k}{\partial q^i}\,\delta q^i + \frac{\partial L}{\partial \dot x^k}\frac{\partial \dot x^k}{\partial q^i}\,\delta q^i + \frac{\partial L}{\partial \dot x^k}\frac{\partial \dot x^k}{\partial \dot q^i}\,\delta \dot q^i + \frac{\partial L}{\partial x^k}\frac{\partial x^k}{\partial \dot q^i}\,\delta \dot q^i \right) dt \]
Since x^k is a function of q^j and t only, \frac{\partial x^k}{\partial \dot q^i} = 0 and the last term vanishes.
Expanding the velocity, \dot x^k, explicitly, we have
\[ \dot x^k = \frac{dx^k}{dt} = \frac{d}{dt}\,x^k\!\left(q^i(t), t\right) = \frac{\partial x^k}{\partial q^i}\dot q^i + \frac{\partial x^k}{\partial t} \tag{5.2} \]
so that, differentiating,
\[ \frac{\partial \dot x^k}{\partial \dot q^i} = \frac{\partial x^k}{\partial q^i} \]
Finally, we differentiate eq.(5.2) for the velocity with respect to q^i:
\[ \frac{\partial \dot x^k}{\partial q^i} = \frac{\partial^2 x^k}{\partial q^i \partial q^j}\dot q^j + \frac{\partial^2 x^k}{\partial q^i \partial t} = \frac{\partial}{\partial q^j}\!\left(\frac{\partial x^k}{\partial q^i}\right)\dot q^j + \frac{\partial}{\partial t}\frac{\partial x^k}{\partial q^i} = \frac{d}{dt}\frac{\partial x^k}{\partial q^i} \]
Substituting, the variation now reduces to
\[ 0 = \delta S = \int_C \left( \frac{\partial L}{\partial x^k}\frac{\partial x^k}{\partial q^i}\,\delta q^i + \frac{\partial L}{\partial \dot x^k}\frac{d}{dt}\!\left(\frac{\partial x^k}{\partial q^i}\right)\delta q^i + \frac{\partial L}{\partial \dot x^k}\frac{\partial x^k}{\partial q^i}\,\delta \dot q^i \right) dt \]
\[ = \int_C \left( \frac{\partial L}{\partial x^k}\frac{\partial x^k}{\partial q^i} + \frac{\partial L}{\partial \dot x^k}\frac{d}{dt}\!\left(\frac{\partial x^k}{\partial q^i}\right) - \frac{d}{dt}\!\left(\frac{\partial L}{\partial \dot x^k}\frac{\partial x^k}{\partial q^i}\right) \right)\delta q^i\, dt + \text{surface term} \]
\[ = \int_C \left( \frac{\partial L}{\partial x^k}\frac{\partial x^k}{\partial q^i} - \frac{d}{dt}\!\left(\frac{\partial L}{\partial \dot x^k}\right)\frac{\partial x^k}{\partial q^i} \right)\delta q^i\, dt \]
\[ = \int_C \left( \frac{\partial L}{\partial x^k} - \frac{d}{dt}\frac{\partial L}{\partial \dot x^k} \right)\frac{\partial x^k}{\partial q^i}\,\delta q^i\, dt \]
We can now write \delta S in either of two ways:
\[ \delta S = \int_C \left( \frac{\partial L}{\partial q^i} - \frac{d}{dt}\frac{\partial L}{\partial \dot q^i} \right)\delta q^i\, dt \]
or
\[ \delta S = \int_C \left( \frac{\partial L}{\partial x^k} - \frac{d}{dt}\frac{\partial L}{\partial \dot x^k} \right)\frac{\partial x^k}{\partial q^i}\,\delta q^i\, dt \]
Since the coefficient of \delta q^i must be the same whichever way we write the variation, the two expressions are related by
\[ \frac{\partial L}{\partial q^i} - \frac{d}{dt}\frac{\partial L}{\partial \dot q^i} = \left( \frac{\partial L}{\partial x^k} - \frac{d}{dt}\frac{\partial L}{\partial \dot x^k} \right)\frac{\partial x^k}{\partial q^i} \]
that is, the Euler-Lagrange equation transforms linearly, with the Jacobian matrix \frac{\partial x^k}{\partial q^i}, under the diffeomorphism.
Exercise 5.1. Let L be a function of x^i(t) and its first and second derivatives, and consider an arbitrary diffeomorphism, x^i = x^i(q(t), t). Repeat the preceding calculation, considering variations \delta q^i which vanish at the endpoints of the motion, together with their first time derivative,
\[ \delta q^i = \delta \dot q^i = 0 \]
to show that the generalized Euler-Lagrange equation is a diffeomorphism group tensor.
again to find
\[ \ddot x^k = \frac{d\dot x^k}{dt} = \frac{d}{dt}\left( \frac{\partial x^k}{\partial q^i}\dot q^i + \frac{\partial x^k}{\partial t} \right) = \frac{\partial x^k}{\partial q^i}\ddot q^i + \frac{\partial^2 x^k}{\partial q^i \partial q^j}\dot q^i \dot q^j + 2\frac{\partial^2 x^k}{\partial q^i \partial t}\dot q^i + \frac{\partial^2 x^k}{\partial t^2} \]
from which we can easily find the partial derivatives of x^k, \dot x^k, \ddot x^k with respect to q^m, \dot q^m and \ddot q^m. It is also useful to show by expanding in partial derivatives that
\[ \frac{d}{dt}\frac{\partial x^k}{\partial q^m} = \frac{\partial \dot x^k}{\partial q^m} \]
The remainder of the problem is straightforward, but challenging.
where x^i_{(k)} denotes the kth derivative of x^i(t), is extremal when x^i(t) satisfies the generalized Euler-Lagrange equation,
\[ \left.\frac{\partial L}{\partial x}\right|_{x(t)} - \frac{d}{dt}\left.\frac{\partial L}{\partial x_{(1)}}\right|_{x(t)} + \cdots + (-1)^n \frac{d^n}{dt^n}\left.\frac{\partial L}{\partial x_{(n)}}\right|_{x(t)} = 0 \tag{6.1} \]
This condition guarantees that \delta S vanishes for all variations, x(t) \to x(t) + \delta x(t), which vanish at the endpoints of the motion.
Sometimes it is the case that \delta S vanishes for certain limited variations of the path without imposing any condition at all. When this happens, we say that S has a symmetry:
\[ x^i \to x'^i = x^i + \xi^i(x) \]
\[ \delta x^i = x'^i - x^i = \xi^i(x) \]
Theorem 6.3. (Noether's Theorem) Suppose an action has a Lie symmetry, so that it is invariant under
\[ x^i \to x^i + \xi^i(x) \]
Then the corresponding quantity \frac{\partial L}{\partial \dot x^i}\,\xi^i(x) (for n = 1; see exercise 6.1 for general n) is conserved.
Remark 6. We prove the theorem for n = 1. The proof for arbitrary n is left as an exercise. When n = 1, variation of the action gives
\[ 0 = \delta S[x(t)] \equiv \int_{t_1}^{t_2} \left( \frac{\partial L(x(t))}{\partial x^i}\,\xi^i(x) + \frac{\partial L(x(t))}{\partial \dot x^i}\,\frac{d\xi^i(x)}{dt} \right) dt \]
This expression must vanish for every path. Now suppose x^i(t) is an actual classical path of the motion, that is, one that satisfies the Euler-Lagrange equation,
\[ \frac{\partial L}{\partial x^i} - \frac{d}{dt}\frac{\partial L}{\partial \dot x^i} = 0 \]
Then, integrating by parts, the integrand vanishes for that path and it follows that
\[ 0 = \delta S[x] = \left. \frac{\partial L}{\partial \dot x^i}\,\xi^i(x(t)) \right|_{t_1}^{t_2} \]
or
\[ \frac{\partial L(x(t_2), \dot x(t_2))}{\partial \dot x^i}\,\xi^i(x(t_2)) = \frac{\partial L(x(t_1), \dot x(t_1))}{\partial \dot x^i}\,\xi^i(x(t_1)) \]
Since t_1 and t_2 are arbitrary, the function
\[ \frac{\partial L(x, \dot x)}{\partial \dot x^i}\,\xi^i(x) \]
is a constant of the motion.
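As a quick numerical illustration of the n = 1 result (this sketch is not from the text; the Lagrangian, integrator, and values are my own choices), take L = \frac{m}{2}(\dot x^2 + \dot y^2) - mgy, which is invariant under the translation symmetry \xi = (1, 0); the theorem then predicts the constant Q = m\dot x.

```python
# Sketch: numerically check the Noether constant Q = (dL/dxdot) * xi for the
# translation symmetry x -> x + a of L = m/2 (vx^2 + vy^2) - m g y.
# Here xi = (1, 0), so Q = m * vx. All values are illustrative choices.
m, g, h = 2.0, 9.8, 0.01
x, y, vx, vy = 0.0, 10.0, 3.0, 0.0

Q_start = m * vx
for _ in range(1000):
    # exact projectile update: no force in x, constant force -m g in y
    x += vx * h
    y += vy * h - 0.5 * g * h * h
    vy -= g * h
Q_end = m * vx
```

Because there is no force along x, the charge is constant to machine precision here; the same check with a y-dependent force in the x-direction would break the symmetry and the conservation law together.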
Exercise 6.1. Prove Noether's theorem when the Lagrangian depends on the first n time derivatives of the position x^i(t). That is, show that the quantity
\[ I = \sum_{k=1}^{n}\sum_{m=1}^{k} (-1)^{m-1}\,\frac{d^{m-1}}{dt^{m-1}}\!\left(\frac{\partial L(x(t))}{\partial x^i_{(k)}}\right)\frac{d^{k-m}}{dt^{k-m}}\,\xi^i(x) \]
is conserved.
Remark 7. Hint: For the general case the variation of the action is
\[ \delta S[x(t)] \equiv \int_{t_1}^{t_2} \left( \frac{\partial L(x(t))}{\partial x^i}\,\xi^i(x) + \ldots + \frac{\partial L(x(t))}{\partial x^i_{(n)}}\,\frac{d^n}{dt^n}\xi^i(x) \right) dt \]
Integrate this term by parts k times, keeping careful track of the surface terms. After writing the surface term for the kth integral as a sum over m, sum over all k.
6.2. Conserved quantities in restricted Euler-Lagrange systems
For restricted Euler-Lagrange systems, the Lagrangian takes the form
\[ L = L\left(x^i, \dot x^i = x^i_{(1)}\right) \]
and the quantity
\[ Q = \frac{\partial L(x, \dot x)}{\partial \dot x^i}\,\xi^i(x) \]
is conserved. We make one further definition for restricted Euler-Lagrange systems:
1. The system has translational symmetry, since the action is invariant under the translation
\[ q \to q + a \]
Remark 8. To prove the first result, simply notice that if
\[ \frac{\partial L}{\partial q} = 0 \]
then L has no dependence on q at all. Therefore, replacing q by q + a does nothing to L, hence nothing to the action. Equivalently, the variation of the action with respect to the infinitesimal symmetry (\xi = a),
\[ \delta q = \xi, \qquad \delta \dot q = 0 \]
is
\[ \delta S = \int \left( \frac{\partial L}{\partial q}\,\delta q + \frac{\partial L}{\partial \dot q}\,\delta \dot q \right) dt = \int \left( 0 \cdot \delta q + \frac{\partial L}{\partial \dot q} \cdot 0 \right) dt = 0 \]
Remark 9. For the second result, the Euler-Lagrange equation for the coordinate q immediately gives
\[ 0 = \frac{\partial L}{\partial q} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q}\right) = -\frac{d}{dt}\left(\frac{\partial L}{\partial \dot q}\right) \]
so that
\[ p = \frac{\partial L}{\partial \dot q} \]
is conserved.
or infinitesimally,
\[ \delta x^i = \xi^i \]
We may express the invariance of S under \delta x^i = \xi^i explicitly,
\[ 0 = \delta S = \int_{t_1}^{t_2} \left( \frac{\partial L}{\partial x^i}\,\delta x^i + \frac{\partial L}{\partial \dot x^i}\,\delta \dot x^i \right) dt \]
\[ = \int_{t_1}^{t_2} \left( \frac{\partial L}{\partial x^i}\,\delta x^i + \frac{d}{dt}\!\left(\frac{\partial L}{\partial \dot x^i}\,\delta x^i\right) - \frac{d}{dt}\!\left(\frac{\partial L}{\partial \dot x^i}\right)\delta x^i \right) dt \]
\[ = \left. \frac{\partial L}{\partial \dot x^i}\,\xi^i \right|_{t_1}^{t_2} + \int_{t_1}^{t_2} \left( \frac{\partial L}{\partial x^i} - \frac{d}{dt}\frac{\partial L}{\partial \dot x^i} \right)\xi^i\, dt \]
For a particle which satisfies the Euler-Lagrange equation, the integral vanishes. Then, since t_1 and t_2 are arbitrary, we must have
\[ \frac{\partial L}{\partial \dot x^i}\,\xi^i = p_i\,\xi^i \]
conserved for all constants \xi^i. Therefore, the momentum p_i conjugate to x^i is conserved as a result of translational invariance.
The variation \xi^i(x) will be infinitesimal if the angle \theta is infinitesimal, so to first order in \theta we have
\[ \xi^1 = x\cos\theta - y\sin\theta - x = x\left(1 - \frac{1}{2}\theta^2 + \cdots\right) - y\left(\theta - \frac{1}{6}\theta^3 + \cdots\right) - x = -y\theta \]
\[ \xi^2 = x\theta \]
and the corresponding conserved quantity is proportional to the angular momentum,
\[ J = m\left(x\dot y - y\dot x\right) \]
For a general rotation,
\[ x'^i = R^i{}_j x^j \]
or, infinitesimally,
\[ \delta x^i = x'^i - x^i = \left(\delta^i{}_j + A^i{}_j\right)x^j - x^i = A^i{}_j x^j \]
Now consider the (vanishing) variation of S under a rotation. We have
\[ 0 = \delta S = \int_{t_1}^{t_2} \left( \frac{\partial L}{\partial x^i}\,\delta x^i + \frac{\partial L}{\partial \dot x^i}\,\delta \dot x^i \right) dt \]
\[ = \int_{t_1}^{t_2} \left( \frac{\partial L}{\partial x^i}\,\delta x^i + \frac{d}{dt}\!\left(\frac{\partial L}{\partial \dot x^i}\,\delta x^i\right) - \frac{d}{dt}\!\left(\frac{\partial L}{\partial \dot x^i}\right)\delta x^i \right) dt \]
\[ = \left. \frac{\partial L}{\partial \dot x^i}\,\delta x^i \right|_{t_1}^{t_2} + \int_{t_1}^{t_2} \left( \frac{\partial L}{\partial x^i} - \frac{d}{dt}\frac{\partial L}{\partial \dot x^i} \right)\delta x^i\, dt \]
For any particle obeying the Euler-Lagrange equation, the final integral vanishes. Since t_1 and t_2 are arbitrary times, we find the conserved quantity,
\[ M = \frac{\partial L}{\partial \dot x^i}\,\delta x^i = \frac{\partial L}{\partial \dot x^i}\,A^i{}_j x^j = p_i A^i{}_j x^j = p_i A^{ij} g_{jk} x^k = A^{ij} p_i x_j = \frac{1}{2}A^{ij}\left(p_i x_j - p_j x_i\right) \]
Since we may write a general antisymmetric matrix using the Levi-Civita tensor as
\[ A^{ij} = w_k\,\varepsilon^{ijk} \]
where
\[ w_m = \frac{1}{2}A^{ij}\varepsilon_{ijm} \]
we have
\[ M = \frac{1}{2}w_k\,\varepsilon^{ijk}\left(p_i x_j - p_j x_i\right) \equiv -\frac{1}{2}w_k M^k \]
Since w_k is an arbitrary constant vector and we may drop an overall constant -\frac{1}{2}, the vector
\[ \mathbf{M} = \mathbf{x} \times \mathbf{p} \]
must be conserved. Thus, conservation of angular momentum is a consequence of rotational symmetry.
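A numerical sketch of this conclusion (the central potential, step size, and RK4 integrator below are my own illustrative choices, not the text's): for L = \frac{m}{2}(\dot x^2 + \dot y^2) - \frac{k}{2}(x^2 + y^2), the rotation generator gives the planar angular momentum M = m(x\dot y - y\dot x), which should stay constant along any solution of the equations of motion.

```python
# Sketch: check conservation of M = m (x vy - y vx) for a rotationally
# symmetric potential V = k/2 (x^2 + y^2), integrating with classical RK4.
m, k = 1.0, 3.0

def deriv(s):
    x, y, vx, vy = s
    return [vx, vy, -k * x / m, -k * y / m]

def rk4_step(s, h):
    k1 = deriv(s)
    k2 = deriv([a + 0.5 * h * b for a, b in zip(s, k1)])
    k3 = deriv([a + 0.5 * h * b for a, b in zip(s, k2)])
    k4 = deriv([a + h * b for a, b in zip(s, k3)])
    return [a + h / 6.0 * (p + 2 * q + 2 * r + w)
            for a, p, q, r, w in zip(s, k1, k2, k3, k4)]

def ang_mom(s):
    x, y, vx, vy = s
    return m * (x * vy - y * vx)

state = [1.0, 0.0, 0.3, 1.1]
M_start = ang_mom(state)
for _ in range(2000):
    state = rk4_step(state, 0.005)
M_end = ang_mom(state)
```

RK4 does not conserve quadratic invariants exactly, so M drifts slightly, but at this step size the drift is many orders of magnitude below the charge itself.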
Conservation of angular momentum is a property of a number of important physical systems; moreover, the total angular momentum of any isolated system is conserved. The following theorems illustrate the usefulness of this conservation law in the case of a central potential, V(r), for which the equation of motion is
\[ m\frac{d^2 x^i}{dt^2} = -V'\,\frac{x^i}{r} \]
The total angular momentum
\[ M_{ij} = x_i p_j - x_j p_i = m\left(x_i \dot x_j - x_j \dot x_i\right) \]
is conserved, since
\[ \frac{d}{dt}M_{ij} = m\frac{d}{dt}\left(x_i \dot x_j - x_j \dot x_i\right) = m\left(x_i \frac{d^2 x_j}{dt^2} - x_j \frac{d^2 x_i}{dt^2}\right) = -\frac{V'}{r}\left(x_i x_j - x_j x_i\right) = 0 \]
Proof: Let x_0 and v_0 be the initial position and velocity, with x_0 measured from the center of force, and let P be the plane they span,
\[ P = \left\{ \mathbf{v} = \alpha\,\mathbf{x}_0 + \beta\,\mathbf{v}_0 \mid \alpha, \beta \in \mathbb{R} \right\} \]
Choose vectors \mathbf{w}^{(a)} orthogonal to this plane,
\[ \mathbf{w}^{(a)} \cdot \mathbf{x}_0 = \mathbf{w}^{(a)} \cdot \mathbf{v}_0 = 0 \]
so that the set \{\mathbf{x}_0, \mathbf{v}_0, \mathbf{w}^{(a)}\} forms a basis. We consider the M_{ij} = 0 case below. For M_{ij} nonzero, for all a,
\[ w^{(a)i} M_{ij} = 0 \]
where
\[ M_{ij} = m\left(x_i v_j - x_j v_i\right) \]
Suppose that at some later time there were some a_0 with
\[ \mathbf{w}^{(a_0)} \cdot \mathbf{x} \neq 0 \]
Then
\[ \mathbf{v} = \mathbf{x}\;\frac{\mathbf{w}^{(a_0)} \cdot \mathbf{v}}{\mathbf{w}^{(a_0)} \cdot \mathbf{x}} \]
and M_{ij} is identically zero, in contradiction to its constancy. Therefore, we conclude
\[ \mathbf{w}^{(a)} \cdot \mathbf{x} = 0 \]
for all a. A parallel argument shows that
\[ \mathbf{w}^{(a)} \cdot \mathbf{v} = 0 \]
for all a, so the motion continues to lie in the original plane. Finally, if M_{ij} = 0 then at any time t,
\[ M_{ij} = m\left(x_i v_j - x_j v_i\right) = 0 \]
so x^i and v^i are always parallel and we can write
\[ x^i = \lambda v^i \]
for some \lambda(x^j, t) and all t. Then at any time
\[ v^i = \frac{d}{dt}x^i = \frac{d\lambda}{dt}v^i + \lambda\frac{dv^i}{dt} \]
so any change in velocity is parallel to the velocity:
\[ \frac{dv^i}{dt} = \frac{1}{\lambda}\left(1 - \frac{d\lambda}{dt}\right)v^i \]
and the motion remains along the initial line.
Bringing both terms to the same side, we have
\[ \frac{d}{dt}\left( \dot x^i \frac{\partial L}{\partial \dot x^i} - L \right) = 0 \]
so that the quantity
\[ E \equiv \dot x^i \frac{\partial L}{\partial \dot x^i} - L \]
is conserved. The quantity E is called the energy.
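To make the definition concrete, here is a small numerical sketch (my own choice of Lagrangian and integrator, not the text's): for L = \frac{1}{2}m\dot x^2 - \frac{1}{2}kx^2, the formula gives E = \dot x(m\dot x) - L = \frac{1}{2}m\dot x^2 + \frac{1}{2}kx^2, and E computed along a velocity-Verlet trajectory should stay (nearly) constant.

```python
# Sketch: E = xdot * dL/dxdot - L for L = m/2 xdot^2 - k/2 x^2, evaluated
# along a velocity-Verlet trajectory of m xddot = -k x.
m, k, h = 1.0, 4.0, 0.001
x, v = 1.0, 0.0

def energy(x, v):
    L = 0.5 * m * v * v - 0.5 * k * x * x
    return v * (m * v) - L          # = m/2 v^2 + k/2 x^2

E_start = energy(x, v)
for _ in range(5000):
    a = -k * x / m
    x += v * h + 0.5 * a * h * h
    v += 0.5 * (a + (-k * x / m)) * h
E_end = energy(x, v)
```

Velocity Verlet was chosen because its energy error stays bounded rather than growing secularly, which makes the conservation easy to see at modest step sizes.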
These parameters might include masses, lengths, spring constants and so on. Further, suppose that each of these variables may be rescaled by some factor in such a way that L changes by only an overall factor. That is, when we make the replacements
\[ x^i \to \alpha x^i, \qquad t \to \beta t, \qquad \dot x^i \to \frac{\alpha}{\beta}\dot x^i, \qquad a_i \to \gamma_i a_i \]
for certain constants (\alpha, \beta, \gamma_1, \ldots, \gamma_n), we find that
\[ L\left(\alpha x^i, \frac{\alpha}{\beta}\dot x^i, \gamma_1 a_1, \ldots, \gamma_n a_n, \beta t\right) = \sigma\, L\left(x^i, \dot x^i, a_1, \ldots, a_n, t\right) \]
for some constant \sigma which depends on the scaling constants. Then the Euler-Lagrange equations for the system described by L\left(\alpha x^i, \frac{\alpha}{\beta}\dot x^i, \gamma_1 a_1, \ldots, \gamma_n a_n, \beta t\right) are the same as for the original Lagrangian, and we may make the replacements in the solution.
Example 6.9. Consider the 1-dimensional simple harmonic oscillator. The motion of the oscillator may be described by the Lagrangian
\[ L = \frac{1}{12}m^2\dot x^4 + \frac{1}{2}km\dot x^2 x^2 - \frac{1}{4}k^2 x^4 \]
since the restricted Euler-Lagrange equation gives
\[ 0 = \frac{\partial L}{\partial x} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot x}\right) = km\dot x^2 x - k^2 x^3 - \frac{d}{dt}\left( \frac{1}{3}m^2\dot x^3 + km x^2 \dot x \right) \]
\[ = km\dot x^2 x - k^2 x^3 - \left( m^2\dot x^2 \ddot x + km x^2 \ddot x + 2km x\dot x^2 \right) = -\left(kx^2 + m\dot x^2\right)kx - \left(m\dot x^2 + kx^2\right)m\ddot x \]
or simply
\[ \left(m\dot x^2 + kx^2\right)\left(kx + m\ddot x\right) = 0 \]
Assuming m and k are positive, the factor (m\dot x^2 + kx^2) is positive definite except for the case of degenerate motion, x(t) \equiv 0. Dividing by this factor, we have the usual equation of motion,
\[ m\ddot x + kx = 0 \]
Using scalings of the variables, we do not need to solve this equation to gain insight into the solution. For example, suppose we know that the motion is periodic, with period T. Now we may make any or all of the replacements
\[ x \to \alpha x, \qquad t \to \beta t, \qquad \dot x \to \frac{\alpha}{\beta}\dot x, \qquad m \to \gamma m, \qquad k \to \delta k \]
for constants (\alpha, \beta, \gamma, \delta). The Lagrangian becomes
\[ L = \frac{1}{12}\gamma^2\frac{\alpha^4}{\beta^4}m^2\dot x^4 + \frac{1}{2}\gamma\delta\frac{\alpha^4}{\beta^2}km\dot x^2 x^2 - \frac{1}{4}\delta^2\alpha^4 k^2 x^4 \]
and this is an overall multiple \sigma of the original Lagrangian provided
\[ \sigma = \gamma^2\frac{\alpha^4}{\beta^4} = \gamma\delta\frac{\alpha^4}{\beta^2} = \delta^2\alpha^4 \]
The value of \alpha is arbitrary, while the remaining constants must satisfy
\[ \frac{\gamma^2}{\beta^4} = \frac{\gamma\delta}{\beta^2} = \delta^2 \]
Both conditions are satisfied by the single condition,
\[ \gamma = \delta\beta^2 \]
Returning to the periodicity of the oscillator, we now know that if we change the mass by a factor \gamma and the spring constant k by a factor \delta, then the period changes by a factor \beta = \sqrt{\gamma/\delta}. Now suppose we start with a system with m_0 = k_0 = 1 and period T_0. Then with
\[ m = \gamma m_0 = \gamma, \qquad k = \delta k_0 = \delta \]
the period is
\[ T = \beta T_0 = \sqrt{\frac{\gamma}{\delta}}\,T_0 = \sqrt{\frac{m}{k}}\,T_0 \]
We therefore learn that the frequency is proportional to \sqrt{k/m} without solving for the motion.
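The scaling claim is easy to test numerically. The sketch below (the integration scheme and values are my own choices) measures the quarter period of m\ddot x + kx = 0 starting from x = 1, \dot x = 0, and checks that quadrupling the mass doubles the period, as T \propto \sqrt{m/k} requires.

```python
import math

# Sketch: measure T/4 for m xddot = -k x as the time of the first zero
# crossing starting from x = 1, v = 0; analytically T/4 = (pi/2) sqrt(m/k).
def quarter_period(m, k, h=1e-4):
    x, v, t = 1.0, 0.0, 0.0
    while True:
        a = -k * x / m
        x_new = x + v * h + 0.5 * a * h * h
        v_new = v + 0.5 * (a + (-k * x_new / m)) * h
        if x_new <= 0.0:
            # linear interpolation for the crossing time
            return t + h * x / (x - x_new)
        x, v, t = x_new, v_new, t + h

t11 = quarter_period(1.0, 1.0)    # expect about pi/2
t41 = quarter_period(4.0, 1.0)    # expect about pi: m -> 4m doubles the period
```

The same routine applied to a nonlinear potential such as V = ax^4 would show the period varying with amplitude, since the scaling argument no longer eliminates the amplitude dependence.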
6.3. Conserved quantities in generalized Euler-Lagrange systems
Most of the results of the preceding sections have forms that hold for generalized Euler-Lagrange systems. Recall that if L = L\left(x, \dot x, \ddot x, \ldots, x_{(n)}, t\right), the resulting variation leads to
\[ \sum_{k=0}^{n} (-1)^k \frac{d^k}{dt^k}\frac{\partial L}{\partial x_{(k)}} = 0 \]
This generalized Euler-Lagrange equation is generically of order 2n.
Integrating the kth term by parts, and noting that \delta x^i_{(k)} = 0 at the endpoints unless k = 0, gives
\[ \int_{t_1}^{t_2} \frac{\partial L}{\partial x^i_{(k)}}\,\delta x^i_{(k)}\, dt = \left. \frac{\partial L}{\partial x^i_{(k)}}\,\delta x^i_{(k-1)} \right|_{t_1}^{t_2} - \int_{t_1}^{t_2} \frac{d}{dt}\frac{\partial L}{\partial x^i_{(k)}}\,\delta x^i_{(k-1)}\, dt \]
\[ = -\int_{t_1}^{t_2} \frac{d}{dt}\frac{\partial L}{\partial x^i_{(k)}}\,\delta x^i_{(k-1)}\, dt = (-1)^2 \int_{t_1}^{t_2} \frac{d^2}{dt^2}\!\left(\frac{\partial L}{\partial x^i_{(k)}}\right)\delta x^i_{(k-2)}\, dt \]
\[ \vdots \]
\[ = (-1)^{k-1}\left. \frac{d^{k-1}}{dt^{k-1}}\!\left(\frac{\partial L}{\partial x^i_{(k)}}\right)\delta x^i \right|_{t_1}^{t_2} + (-1)^k \int_{t_1}^{t_2} \frac{d^k}{dt^k}\!\left(\frac{\partial L}{\partial x^i_{(k)}}\right)\delta x^i\, dt \]
so the variation gives
\[ 0 = \delta S = \sum_{k=1}^{n} (-1)^{k-1}\left. \frac{d^{k-1}}{dt^{k-1}}\!\left(\frac{\partial L}{\partial x^i_{(k)}}\right)\delta x^i \right|_{t_1}^{t_2} + \sum_{k=0}^{n} \int_{t_1}^{t_2} (-1)^k \frac{d^k}{dt^k}\!\left(\frac{\partial L}{\partial x^i_{(k)}}\right)\delta x^i\, dt \]
The coefficient of \delta x^i in the integrand of the final integral is the Euler-Lagrange equation, so when the Euler-Lagrange equation is satisfied, we are left with the surface term, whose coefficient
\[ p_i = \sum_{k=1}^{n} (-1)^{k-1} \frac{d^{k-1}}{dt^{k-1}}\!\left(\frac{\partial L}{\partial x^i_{(k)}}\right) \tag{6.2} \]
is the generalized conjugate momentum to x^i. Notice that when n = 1, p_i reduces to the previous expression. When any of the coordinates x^i is cyclic, the corresponding momentum is conserved.
Exercise 6.2. Prove this claim directly from the generalized Euler-Lagrange equation. Suppose that some coordinate q is cyclic, so that \frac{\partial L}{\partial q} = 0. Show that
\[ p = \sum_{k=1}^{n} (-1)^{k-1} \frac{d^{k-1}}{dt^{k-1}}\!\left(\frac{\partial L}{\partial q_{(k)}}\right) \]
is conserved.
Rewriting the sum we have
\[ 0 = \frac{d^m}{dt^m} \sum_{k=m}^{n} (-1)^{k-m} \frac{d^{k-m}}{dt^{k-m}}\frac{\partial L}{\partial x_{(k)}} = \frac{d^m}{dt^m} \sum_{k=0}^{n-m} (-1)^{k} \frac{d^{k}}{dt^{k}}\frac{\partial L}{\partial x_{(k+m)}} \]
vanish. Thus, \frac{d^{m-1}f}{dt^{m-1}} is conserved. Moreover, we may immediately integrate m times, introducing m - 1 additional constants,
\[ f(t) = \sum_{k=0}^{m-1} \frac{1}{k!}\,p_k t^k \]
\[ A_{ij} = -A_{ji} \]
This time we must keep all of the surface terms. For the kth term,
\[ \int_{t_1}^{t_2} \frac{\partial L}{\partial x^i_{(k)}}\,\delta x^i_{(k)}\, dt = \left. \frac{\partial L}{\partial x^i_{(k)}}\,\delta x^i_{(k-1)} \right|_{t_1}^{t_2} - \int_{t_1}^{t_2} \frac{d}{dt}\!\left(\frac{\partial L}{\partial x^i_{(k)}}\right)\delta x^i_{(k-1)}\, dt \]
\[ = \left. \frac{\partial L}{\partial x^i_{(k)}}\,\delta x^i_{(k-1)} \right|_{t_1}^{t_2} - \left. \frac{d}{dt}\!\left(\frac{\partial L}{\partial x^i_{(k)}}\right)\delta x^i_{(k-2)} \right|_{t_1}^{t_2} + \int_{t_1}^{t_2} \frac{d^2}{dt^2}\!\left(\frac{\partial L}{\partial x^i_{(k)}}\right)\delta x^i_{(k-2)}\, dt \]
\[ \vdots \]
\[ = \sum_{m=1}^{l} (-1)^{m-1}\left. \frac{d^{m-1}}{dt^{m-1}}\!\left(\frac{\partial L}{\partial x^i_{(k)}}\right)\delta x^i_{(k-m)} \right|_{t_1}^{t_2} + (-1)^l \int_{t_1}^{t_2} \frac{d^l}{dt^l}\!\left(\frac{\partial L}{\partial x^i_{(k)}}\right)\delta x^i_{(k-l)}\, dt \]
\[ \vdots \]
\[ = \sum_{m=1}^{k} (-1)^{m-1}\left. \frac{d^{m-1}}{dt^{m-1}}\!\left(\frac{\partial L}{\partial x^i_{(k)}}\right)\delta x^i_{(k-m)} \right|_{t_1}^{t_2} + (-1)^k \int_{t_1}^{t_2} \frac{d^k}{dt^k}\!\left(\frac{\partial L}{\partial x^i_{(k)}}\right)\delta x^i\, dt \]
Summing over k, the variation becomes
\[ 0 = \delta S = \sum_{k=1}^{n}\sum_{m=1}^{k} (-1)^{m-1}\left. \frac{d^{m-1}}{dt^{m-1}}\!\left(\frac{\partial L}{\partial x^i_{(k)}}\right)\delta x^i_{(k-m)} \right|_{t_1}^{t_2} + \sum_{k=0}^{n} \int_{t_1}^{t_2} (-1)^k \frac{d^k}{dt^k}\!\left(\frac{\partial L}{\partial x^i_{(k)}}\right)\delta x^i\, dt \]
As usual, the final integral vanishes when we apply the equation of motion, so the quantity
\[ M = \sum_{k=1}^{n}\sum_{m=1}^{k} (-1)^{m-1}\,\frac{d^{m-1}}{dt^{m-1}}\!\left(\frac{\partial L}{\partial x^i_{(k)}}\right)A^i{}_j x^j_{(k-m)} = A^{ij}\sum_{k=1}^{n}\sum_{m=1}^{k} (-1)^{m-1}\,\frac{d^{m-1}}{dt^{m-1}}\!\left(\frac{\partial L}{\partial x^i_{(k)}}\right)x_{j(k-m)} \]
is constant in time for all antisymmetric A^{ij}. Therefore, we may write the conserved angular momentum vector as
\[ M^s = \sum_{k=1}^{n}\sum_{m=1}^{k} (-1)^{m}\,\varepsilon^{ijs}\, x_{i(k-m)}\,\frac{d^{m-1}}{dt^{m-1}}\!\left(\frac{\partial L}{\partial x^j_{(k)}}\right) \]
Notice that the conjugate momentum p_i contains only the (k-1)st derivative of \frac{\partial L}{\partial x^i_{(k)}}, whereas M_{ij} depends on all derivatives
\[ \frac{d^{m-1}}{dt^{m-1}}\!\left(\frac{\partial L}{\partial x^j_{(k)}}\right) \]
up to and including this one.
6.3.3. Energy
Finally we consider energy. Suppose L is independent of time. Then
\[ \frac{dL}{dt} = \sum_{k=0}^{n} x^{(k+1)}\,\frac{\partial L}{\partial x_{(k)}} \]
But
\[ x^{(k+1)}\frac{\partial L}{\partial x_{(k)}} = \frac{d}{dt}\!\left( x^{(k)}\frac{\partial L}{\partial x_{(k)}} \right) - x^{(k)}\frac{d}{dt}\frac{\partial L}{\partial x_{(k)}} \]
\[ = \frac{d}{dt}\!\left( x^{(k)}\frac{\partial L}{\partial x_{(k)}} \right) - \frac{d}{dt}\!\left( x^{(k-1)}\frac{d}{dt}\frac{\partial L}{\partial x_{(k)}} \right) + x^{(k-1)}\frac{d^2}{dt^2}\frac{\partial L}{\partial x_{(k)}} \]
\[ \vdots \]
\[ = \sum_{m=0}^{k-1} (-1)^m \frac{d}{dt}\!\left( x^{(k-m)}\frac{d^m}{dt^m}\frac{\partial L}{\partial x_{(k)}} \right) + (-1)^k x^{(1)}\frac{d^k}{dt^k}\frac{\partial L}{\partial x_{(k)}} \]
so
\[ \frac{dL}{dt} = \frac{d}{dt}\sum_{k=0}^{n}\sum_{m=0}^{k-1} (-1)^m x^{(k-m)}\frac{d^m}{dt^m}\frac{\partial L}{\partial x_{(k)}} + x^{(1)}\sum_{k=0}^{n} (-1)^k \frac{d^k}{dt^k}\frac{\partial L}{\partial x_{(k)}} \]
Using the equation of motion,
\[ \sum_{k=0}^{n} (-1)^k \frac{d^k}{dt^k}\frac{\partial L}{\partial x_{(k)}} = 0 \]
the last term vanishes, and the energy
\[ E \equiv \sum_{k=0}^{n}\sum_{m=0}^{k-1} (-1)^m x^{(k-m)}\frac{d^m}{dt^m}\frac{\partial L}{\partial x_{(k)}} - L \]
is conserved.
6.4. Exercises
Exercise 6.3. Find the Euler-Lagrange equation for the following action functionals:
1. S[x] = \int \exp\left(\alpha x^2 + \beta v^2\right) dt for constants \alpha and \beta.
2. S[x] = \int f\left(x^2 v^2\right) dt for any given function, f.
3. S[x] = \int \dot{\mathbf{x}} \cdot \mathbf{a}\, dt + \int \mathbf{x} \cdot \mathbf{a}\, dt, where \mathbf{a} = \ddot{\mathbf{x}}.
Exercise 6.5. Consider the 3-dimensional action
\[ S = \int \left( \frac{1}{2}m\dot{\mathbf{x}}^2 - mgz \right) dt \]
where \mathbf{x} = (x, y, z).
Find all rescalings of the parameters and coordinates (m, g, l, t) which leave S changed by no more than an overall constant. Use these rescalings to show that the period of the motion is proportional to \sqrt{l/g}.
7. The physical Lagrangian
We have seen that the extrema of functionals give preferred paths of motion among the class of curves. These extremal paths appear to be smooth curves. Indeed, when the action is taken to be path length, the extremal curves are straight lines. Now we return to the problem of physical motion stated in Chapter III: can we specify a functional for specific physical problems in such a way that the extremals are the physical paths of motion?
We have seen examples where this is the case. For example, the Lagrangian
\[ L = \frac{1}{12}m^2\dot x^4 + \frac{1}{2}km\dot x^2 x^2 - \frac{1}{4}k^2 x^4 \tag{7.1} \]
describes the simple harmonic oscillator in 1-dimension. Now we take a more systematic approach, based on the principles of gauge theory. Our gauging of Newton's second law begins with the observation that the symmetry which preserves Newton's law is different from the symmetry of the Euler-Lagrange equation. Specifically, Newton's second law is invariant under a special set of constant transformations called Galilean transformations. By contrast, the Euler-Lagrange equation is invariant under the diffeomorphism group, that is, the group of all coordinate transformations.
It follows that, if we are to develop a systematic way of writing physical equations of motion using the Euler-Lagrange equation, then it must involve a generalization of Newton's second law to arbitrary coordinate systems. This is exactly what gauging accomplishes: the systematic extension of a symmetry. There is nothing surprising in such a generalization. In fact, we would find it odd not to be able to work out a form of the second law valid in any coordinate system of our choosing. The key point here is to do it systematically, rather than just substituting a specific transformation of the coordinates.
There are several steps to our generalization. First, we derive the most general set of transformations under which Newton's law is covariant: the Galilean group. This group has important subgroups. In particular, we are interested in the subgroup that also preserves Newtonian measurement theory, the Euclidean group.
Next, we derive the geodesic equation. Geodesics are the straightest possible lines in arbitrary, even curved, spaces. In our Euclidean 3-space, the geodesic equation simply provides a diffeomorphism-invariant way of describing a straight line: the condition of vanishing acceleration. As a result, we will have expressed the acceleration in arbitrary coordinates.
Finally, we gauge Newton's law by writing it in terms of coordinate covariant expressions. We conclude by showing that this generalized form of Newton's law indeed follows as the extremum of a functional, and derive a general expression for the action.
\[ q^i = q^i(x, t), \qquad t' = t'(t) \tag{7.4} \]
\[ x^i = x^i(q, t), \qquad t = t(t') \]
since the Newtonian assumption of universal time requires equal intervals, dt = dt'.
The real limitation on covariance comes from the time derivatives in the acceleration term. Along any path specified by x^i(t) = x^i(q^j(t), t), the acceleration \ddot x^i may be written in terms of \ddot q^j and \dot q^k as
\[ \ddot x^i = \frac{d}{dt}\!\left( \frac{\partial x^i}{\partial q^j}\dot q^j + \frac{\partial x^i}{\partial t} \right) = \frac{\partial x^i}{\partial q^j}\ddot q^j + \frac{\partial^2 x^i}{\partial q^k \partial q^j}\dot q^k \dot q^j + \frac{\partial^2 x^i}{\partial q^k \partial t}\dot q^k + \frac{\partial^2 x^i}{\partial t\,\partial q^j}\dot q^j + \frac{\partial^2 x^i}{\partial t^2} \]
\[ = \frac{\partial x^i}{\partial q^j}\ddot q^j + \frac{\partial^2 x^i}{\partial q^k \partial q^j}\dot q^k \dot q^j + 2\frac{\partial^2 x^i}{\partial q^k \partial t}\dot q^k + \frac{\partial^2 x^i}{\partial t^2} \]
The first term is proportional to the second time derivative of q^i, but the remaining terms are not. Comparing to the actual transformation of the acceleration, covariance therefore requires
\[ \frac{\partial x^i}{\partial q^j}\ddot q^j = \frac{\partial x^i}{\partial q^j}\ddot q^j + \frac{\partial^2 x^i}{\partial q^k \partial q^j}\dot q^k \dot q^j + 2\frac{\partial^2 x^i}{\partial q^k \partial t}\dot q^k + \frac{\partial^2 x^i}{\partial t^2} \]
\[ 0 = \frac{\partial^2 x^i}{\partial q^k \partial q^j}\dot q^k \dot q^j + 2\frac{\partial^2 x^i}{\partial q^k \partial t}\dot q^k + \frac{\partial^2 x^i}{\partial t^2} \]
Since this must hold for all velocities \dot q^j, the coefficient of each order in velocity must vanish. Thus,
\[ 0 = \frac{\partial^2 x^i}{\partial t^2}, \qquad 0 = \frac{\partial^2 x^i}{\partial q^k \partial t}, \qquad 0 = \frac{\partial^2 x^i}{\partial q^k \partial q^j} \]
Integrating the first equation shows that x^i must be linear in the time,
\[ x^i = x^i_0 + \dot x^i_0\, t \]
The second equation then requires \dot x^i_0 = v^i = \text{constant}. Finally, we see that x^i_0 must be linear in q^j:
\[ \frac{\partial^2 x^i}{\partial q^k \partial q^j} = 0 \]
\[ \frac{\partial x^i}{\partial q^j} = M^i{}_j \]
\[ x^i = M^i{}_j q^j + a^i \]
where M^i{}_j and a^i are constant. Therefore, the most general coordinate transformation for which Newton's second law is covariant is
\[ x^i(q, t) = M^i{}_j q^j + a^i + v^i t \tag{7.5} \]
and the velocity and acceleration transform as
\[ \dot x^i = M^i{}_j \dot q^j + v^i = \frac{\partial x^i}{\partial q^j}\dot q^j + v^i \]
\[ \ddot x^i = \frac{\partial x^i}{\partial q^j}\ddot q^j \]
\[ x^i(q) = R^i{}_j q^j + a^i \tag{7.6} \]
show that we have recovered the Euclidean symmetry of the background space. We now derive a covariant expression for the acceleration.
We have seen that extremals of length in the Euclidean plane are straight lines, and this coincides with our notion of uniform, or unaccelerated, motion,
\[ \frac{d^2 x^i}{dt^2} = 0 \]
By using a variational principle to find the covariant description of a straight line, we will have found a covariant expression for the acceleration.
We proceed to find the general expression for geodesics. In arbitrary coordinates, y^i, the infinitesimal separation between two points is given by
\[ ds = \sqrt{g_{ij}\, dy^i dy^j} \tag{7.7} \]
where we set \frac{dy^i}{d\lambda} = \dot y^i. Varying the path, remembering that g_{ij} is a function of the coordinates, and using \delta \dot y^i = \frac{d\,\delta y^i}{d\lambda}, we have
\[ 0 = \delta S[y] = \int_0^1 \frac{1}{2\sqrt{g_{mn}\dot y^m \dot y^n}}\left( \frac{\partial g_{ij}}{\partial y^k}\,\delta y^k\,\dot y^i \dot y^j + g_{ij}\frac{d\,\delta y^i}{d\lambda}\dot y^j + g_{ij}\dot y^i\frac{d\,\delta y^j}{d\lambda} \right) d\lambda \]
\[ = \int_0^1 \frac{1}{2\sqrt{g_{mn}\dot y^m \dot y^n}}\,\frac{\partial g_{ij}}{\partial y^k}\,\dot y^i \dot y^j\,\delta y^k\, d\lambda - \int_0^1 \frac{1}{2}\frac{d}{d\lambda}\!\left[ g_{kj}\dot y^j\left(g_{mn}\dot y^m \dot y^n\right)^{-1/2} + g_{ik}\dot y^i\left(g_{mn}\dot y^m \dot y^n\right)^{-1/2} \right]\delta y^k\, d\lambda \]
where we have set the surface terms to zero as usual. The geodesic equation is therefore
\[ 0 = \frac{1}{2\sqrt{g_{mn}\dot y^m \dot y^n}}\,\frac{\partial g_{ij}}{\partial y^k}\,\dot y^i \dot y^j - \frac{1}{2}\frac{d}{d\lambda}\!\left[ g_{kj}\dot y^j\left(g_{mn}\dot y^m \dot y^n\right)^{-1/2} \right] - \frac{1}{2}\frac{d}{d\lambda}\!\left[ g_{ik}\dot y^i\left(g_{mn}\dot y^m \dot y^n\right)^{-1/2} \right] \]
A considerable simplification is achieved if we choose the parameter \lambda along curves to be the path length, s, itself. Then, from eq.(7.7), we have
\[ g_{mn}\dot y^m \dot y^n = 1 \]
Since
\[ \frac{d}{ds}\left(g_{ik}\dot y^i\right) = \frac{\partial g_{ik}}{\partial y^m}\dot y^m \dot y^i + g_{ik}\ddot y^i \]
this becomes
\[ 0 = \frac{1}{2}\frac{\partial g_{ij}}{\partial y^k}\dot y^i \dot y^j - \frac{1}{2}\left( \frac{\partial g_{kj}}{\partial y^m}\dot y^m \dot y^j + g_{kj}\ddot y^j \right) - \frac{1}{2}\left( \frac{\partial g_{ik}}{\partial y^m}\dot y^m \dot y^i + g_{ik}\ddot y^i \right) \]
\[ = \frac{1}{2}\left( \frac{\partial g_{mn}}{\partial y^k} - \frac{\partial g_{kn}}{\partial y^m} - \frac{\partial g_{km}}{\partial y^n} \right)\dot y^m \dot y^n - g_{jk}\ddot y^j \]
where we have used the symmetry g_{mn} = g_{nm}, and the symmetry \dot y^m \dot y^n = \dot y^n \dot y^m, in the last step. We can give this a more compact appearance with some new notation. First, let partial derivatives be denoted by
\[ \partial_k = \frac{\partial}{\partial y^k} \]
Next, we define the convenient symbol
\[ \Gamma_{kmn} = \frac{1}{2}\left( \partial_m g_{kn} + \partial_n g_{km} - \partial_k g_{mn} \right) \]
The object \Gamma_{kmn} is called the Christoffel connection for reasons that will become clear later. Notice that \Gamma_{kmn} is symmetric in the last two indices, \Gamma_{kmn} = \Gamma_{knm}. Substituting, and using the inverse metric g^{ik}, we have
\[ 0 = g^{ik}\Gamma_{kmn}\frac{dy^m}{ds}\frac{dy^n}{ds} + g^{ik}g_{jk}\frac{d^2 y^j}{ds^2} \]
or the final form of the geodesic equation,
\[ \frac{d^2 y^i}{ds^2} + \Gamma^i{}_{mn}\frac{dy^m}{ds}\frac{dy^n}{ds} = 0 \]
This equation may be regarded as either a second order equation for a curve y^i(s), or as a first order equation for the tangent vector u^i = \frac{dy^i}{ds},
\[ \frac{du^i}{ds} + \Gamma^i{}_{mn}u^m u^n = 0 \]
For the vector equation, we may also write
\[ u^m\left( \partial_m u^i + u^n \Gamma^i{}_{nm} \right) = 0 \]
The term in parentheses will reappear when we discuss gauge theory in more detail.
We consider two examples.
First, if the coordinates are Cartesian, the metric is simply g_{ij} = \delta_{ij}. Then all derivatives of the metric vanish, \partial_k \delta_{ij} = 0, and the Christoffel symbols \Gamma_{kmn} and \Gamma^i{}_{mn} vanish. Therefore,
\[ \frac{d^2 x^i}{ds^2} = 0 \]
\[ x^i = x^i_0 + v^i_0\, s \]
we find
\[ \Gamma^\varphi{}_{\theta\varphi} = \Gamma^\varphi{}_{\varphi\theta} = \frac{\cos\theta}{\sin\theta} \]
\[ \Gamma^\theta{}_{\varphi\varphi} = -\sin\theta\cos\theta \]
and the equations for geodesics become
\[ 0 = \frac{d^2\theta}{ds^2} + \Gamma^\theta{}_{\varphi\varphi}\frac{d\varphi}{ds}\frac{d\varphi}{ds} \]
\[ 0 = \frac{d^2\varphi}{ds^2} + \Gamma^\varphi{}_{\theta\varphi}\frac{d\theta}{ds}\frac{d\varphi}{ds} + \Gamma^\varphi{}_{\varphi\theta}\frac{d\varphi}{ds}\frac{d\theta}{ds} \]
and therefore,
\[ \frac{d^2\theta}{ds^2} - \sin\theta\cos\theta\,\frac{d\varphi}{ds}\frac{d\varphi}{ds} = 0 \]
\[ \frac{d^2\varphi}{ds^2} + 2\,\frac{\cos\theta}{\sin\theta}\,\frac{d\varphi}{ds}\frac{d\theta}{ds} = 0 \tag{7.9} \]
We can see immediately that if the motion starts with \frac{d\varphi}{ds} = 0, then initially
\[ \frac{d^2\theta}{ds^2} = 0, \qquad \frac{d^2\varphi}{ds^2} = 0 \]
The second of these shows that \frac{d\varphi}{ds} remains zero, so this form of the equations continues to hold for all s. The solution is therefore
\[ \theta = a + bs, \qquad \varphi = c \]
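This special solution can be confirmed numerically. The sketch below (the RK4 integrator and initial values are my own choices) integrates eqs.(7.9) with d\varphi/ds = 0 initially and checks that \theta advances linearly while \varphi stays fixed.

```python
import math

# Sketch: integrate the sphere geodesic equations (7.9),
#   theta'' = sin(theta) cos(theta) (phi')^2
#   phi''   = -2 (cos(theta)/sin(theta)) theta' phi'
# starting with phi' = 0; the solution should remain theta = a + b s, phi = c.
def deriv(s):
    th, ph, dth, dph = s
    return [dth, dph,
            math.sin(th) * math.cos(th) * dph * dph,
            -2.0 * (math.cos(th) / math.sin(th)) * dth * dph]

def rk4_step(s, h):
    k1 = deriv(s)
    k2 = deriv([a + 0.5 * h * b for a, b in zip(s, k1)])
    k3 = deriv([a + 0.5 * h * b for a, b in zip(s, k2)])
    k4 = deriv([a + h * b for a, b in zip(s, k3)])
    return [a + h / 6.0 * (p + 2 * q + 2 * r + w)
            for a, p, q, r, w in zip(s, k1, k2, k3, k4)]

a, b, c = 0.5, 0.8, 1.2
state = [a, c, b, 0.0]                 # theta = a, phi = c, theta' = b, phi' = 0
for _ in range(1000):
    state = rk4_step(state, 0.001)     # integrate to s = 1
theta_end, phi_end = state[0], state[1]
```

Starting instead with a nonzero d\varphi/ds would still produce a great circle, but one not aligned with the coordinate grid, which is harder to read off directly from (\theta, \varphi).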
Exercise 7.1. Using eq.(7.8) for the metric, find an expression for \Gamma_{kmn} = \frac{1}{2}\left(\partial_m g_{kn} + \partial_n g_{km} - \partial_k g_{mn}\right) in terms of first and second derivatives of x^i with respect to y^j.
Exercise 7.2. Show that the geodesic equation describes a straight line, by writing it in Cartesian coordinates.
Exercise 7.3. Find the equation for a straight line in the plane using polar coordinates, using two methods. First, find the extremum of the integral of the line element,
\[ S\left[q^i(t)\right] = \int_C ds = \int \sqrt{\dot\rho^2 + \rho^2\dot\varphi^2}\, dt \]
\[ 0 = \frac{ds}{dt}\frac{d}{ds}\!\left( \frac{ds}{dt}\frac{dx^i}{ds} \right) + \Gamma^i{}_{mn}\left( \frac{ds}{dt}\frac{dx^m}{ds} \right)\left( \frac{ds}{dt}\frac{dx^n}{ds} \right) \]
\[ \frac{d^2 x^i}{dt^2} + \Gamma^i{}_{mn}\frac{dx^m}{dt}\frac{dx^n}{dt} = 0 \]
Notice that in Cartesian coordinates, where \Gamma^i{}_{mn} = 0, this equation just expresses vanishing acceleration, \frac{d^2 x^i}{dt^2} = 0. Since the equation is covariant, it must express vanishing acceleration in any coordinates, and the acceleration is covariantly described by
\[ a^i \equiv \frac{d^2 x^i}{dt^2} + \Gamma^i{}_{mn}\frac{dx^m}{dt}\frac{dx^n}{dt} \]
whether it vanishes or not.
To rewrite Newton's second law in arbitrary coordinates, we may simply multiply a^i by the mass and equate it to the force:
\[ F^i = m\frac{dv^i}{dt} + m\Gamma^i{}_{mn}v^m v^n \]
For the force, recall that it is often possible to write F^i as minus the gradient of a potential. In general coordinates, however, the gradient requires a metric:
\[ \left[\nabla f\right]^i = g^{ij}\frac{\partial f}{\partial y^j} \]
Newton's law may therefore be written as
\[ -g^{ij}\frac{\partial V}{\partial y^j} = m\frac{dv^i}{dt} + m\Gamma^i{}_{mn}v^m v^n \tag{7.10} \]
Eq.(7.10) holds in any coordinate system if Newton's second law holds in Cartesian coordinates.
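As a sanity check of eq.(7.10), the sketch below (my own construction, not from the text) takes V = 0 in polar coordinates (r, \varphi), where the nonzero Christoffel symbols are \Gamma^r{}_{\varphi\varphi} = -r and \Gamma^\varphi{}_{r\varphi} = \Gamma^\varphi{}_{\varphi r} = 1/r, and verifies that the resulting equations \ddot r = r\dot\varphi^2, \ddot\varphi = -2\dot r\dot\varphi/r trace out a Cartesian straight line.

```python
import math

# Sketch: free-particle form of eq.(7.10) in polar coordinates,
#   rddot = r phidot^2,   phiddot = -2 rdot phidot / r,
# integrated with RK4 and compared against the straight line it should follow.
def deriv(s):
    r, ph, dr, dph = s
    return [dr, dph, r * dph * dph, -2.0 * dr * dph / r]

def rk4_step(s, h):
    k1 = deriv(s)
    k2 = deriv([a + 0.5 * h * b for a, b in zip(s, k1)])
    k3 = deriv([a + 0.5 * h * b for a, b in zip(s, k2)])
    k4 = deriv([a + h * b for a, b in zip(s, k3)])
    return [a + h / 6.0 * (p + 2 * q + 2 * rr + w)
            for a, p, q, rr, w in zip(s, k1, k2, k3, k4)]

# Cartesian line x(t) = 1 + t, y(t) = 2t, rewritten as polar initial data
x0, y0, vx, vy = 1.0, 0.0, 1.0, 2.0
r0 = math.hypot(x0, y0)
state = [r0, math.atan2(y0, x0),
         (x0 * vx + y0 * vy) / r0,            # rdot
         (x0 * vy - y0 * vx) / (r0 * r0)]     # phidot
for _ in range(2000):
    state = rk4_step(state, 0.0005)           # integrate to t = 1
x_end = state[0] * math.cos(state[1])
y_end = state[0] * math.sin(state[1])
# the line predicts (x, y) = (2, 2) at t = 1
```

The point of the check is that the Christoffel terms are exactly what is needed to make "zero acceleration" mean the same thing in curvilinear coordinates as it does in Cartesian ones.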
Now we manipulate eq.(7.10). Substituting the expression for the connection,
\[ -g^{ij}\frac{\partial V}{\partial y^j} = m\frac{dv^i}{dt} + \frac{m}{2}g^{ik}\left( \partial_m g_{kn} + \partial_n g_{km} - \partial_k g_{mn} \right)v^m v^n \]
\[ -\frac{\partial V}{\partial y^k} = mg_{ki}\frac{dv^i}{dt} + \frac{m}{2}\left( \left(v^m \partial_m g_{kn}\right)v^n + \left(v^n \partial_n g_{km}\right)v^m - \left(\partial_k g_{mn}\right)v^m v^n \right) \]
\[ -\frac{\partial V}{\partial y^k} = mg_{ki}\frac{dv^i}{dt} + \frac{m}{2}\left( \frac{dg_{kn}}{dt}v^n + \frac{dg_{km}}{dt}v^m - \left(\partial_k g_{mn}\right)v^m v^n \right) \]
where we have used the chain rule to write v^m \partial_m g_{kn} = \frac{dg_{kn}}{dt}. Collecting the gradient terms on the right,
\[ 0 = mg_{ki}\frac{dv^i}{dt} + m\frac{dg_{kn}}{dt}v^n - \frac{\partial}{\partial y^k}\left( \frac{m}{2}g_{mn}v^m v^n - V \right) = \frac{d}{dt}\left( mg_{kn}v^n \right) - \frac{\partial}{\partial y^k}\left( \frac{m}{2}g_{mn}v^m v^n - V \right) \]
Now, observing that
\[ \frac{\partial}{\partial v^k}\left( \frac{m}{2}g_{mn}v^m v^n \right) = mg_{kn}v^n \]
we substitute to get
\[ 0 = \frac{d}{dt}\left[ \frac{\partial}{\partial v^k}\left( \frac{m}{2}g_{mn}v^m v^n \right) \right] - \frac{\partial}{\partial y^k}\left( \frac{m}{2}g_{mn}v^m v^n - V \right) \]
Since V depends only on position and not velocity, we can put it into the first term as well as the second,
\[ 0 = \frac{d}{dt}\left[ \frac{\partial}{\partial v^k}\left( \frac{m}{2}g_{mn}v^m v^n - V \right) \right] - \frac{\partial}{\partial y^k}\left( \frac{m}{2}g_{mn}v^m v^n - V \right) \]
Finally, we recognize the Euler-Lagrange equation,
\[ 0 = \frac{d}{dt}\left( \frac{\partial L}{\partial v^k} \right) - \frac{\partial L}{\partial y^k} \]
with the Lagrangian
\[ L = \frac{m}{2}g_{mn}v^n v^m - V \]
We identify the first term on the right as the kinetic energy,
\[ T = \frac{m}{2}g_{mn}v^n v^m \]
since it reduces to \frac{1}{2}mv^2 in Cartesian coordinates.
We have successfully gauged Newton's second law, and shown that the new diffeomorphism-invariant version is given by extrema of the action functional
\[ S\left[x^i(t)\right] = \int \left( T - V \right) dt \]
The kinetic energy is therefore
\[ T = \frac{1}{2}m\,\dot{\mathbf{x}}_1 \cdot \dot{\mathbf{x}}_1 + \frac{1}{2}m\,\dot{\mathbf{x}}_2 \cdot \dot{\mathbf{x}}_2 \]
\[ = \frac{1}{2}m\left( \dot{\mathbf{R}} + \frac{L}{2}\dot\varphi\left(-\mathbf{i}\sin\varphi + \mathbf{j}\cos\varphi\right) \right)^2 + \frac{1}{2}m\left( \dot{\mathbf{R}} - \frac{L}{2}\dot\varphi\left(-\mathbf{i}\sin\varphi + \mathbf{j}\cos\varphi\right) \right)^2 \]
\[ = \frac{1}{2}m\left( 2\dot{\mathbf{R}}^2 + \frac{L^2}{2}\dot\varphi^2 \right) \]
Since there is no potential energy, the action is simply
\[ S = \int m\left( \dot{\mathbf{R}}^2 + \frac{L^2}{4}\dot\varphi^2 \right) dt \]
Since both \mathbf{R} and \varphi are cyclic we immediately have two conservation laws,
\[ P_i = \frac{\partial L}{\partial \dot R^i} = 2m\dot R_i = \text{const.} \]
\[ J = \frac{\partial L}{\partial \dot\varphi} = \frac{1}{2}mL^2\dot\varphi = \text{const.} \]
Integrating, we have the complete solution,
\[ R^i = R^i_0 + \frac{P^i}{2m}\,t \]
\[ \varphi = \varphi_0 + \frac{2J}{mL^2}\,t \]
Example 7.3. Write the action for the Kepler problem. The Kepler problem describes motion in the gravitational potential V = -\frac{GMm}{r}. To formulate the problem in spherical coordinates we first write the kinetic energy. The easiest way to find v^2 is to divide the squared infinitesimal line element by dt^2:
\[ ds^2 = dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\, d\varphi^2 \]
\[ v^2 = \left(\frac{ds}{dt}\right)^2 = \left(\frac{dr}{dt}\right)^2 + r^2\left(\frac{d\theta}{dt}\right)^2 + r^2\sin^2\theta\left(\frac{d\varphi}{dt}\right)^2 \]
The kinetic energy is therefore just
\[ T = \frac{1}{2}mv^2 = \frac{m}{2}\left( \dot r^2 + r^2\dot\theta^2 + r^2\dot\varphi^2\sin^2\theta \right) \]
and the Lagrangian is
\[ L = T - V = \frac{m}{2}\left( \dot r^2 + r^2\dot\theta^2 + r^2\dot\varphi^2\sin^2\theta \right) + \frac{\alpha}{r} \]
where \alpha = GMm. Thus, the Kepler action is
\[ S = \int \left[ \frac{m}{2}\left( \dot r^2 + r^2\dot\theta^2 + r^2\dot\varphi^2\sin^2\theta \right) + \frac{\alpha}{r} \right] dt \]
Exercise 7.5. We showed that the action of eq.(7.1) describes the simple harmonic oscillator, but according to our new physical correspondence, extrema of the simpler action functional
\[ S = \int \left( T - V \right) dt = \int \frac{1}{2}\left( m\dot x^2 - kx^2 \right) dt \]
should describe this same motion. Show that vanishing variation of this simpler action does give the correct equation of motion.
Exercise 7.6. For each of the following, give the number of Cartesian coordinates required to describe the system, and give the actual number of degrees of freedom of the problem. Then write the Lagrangian, L = T - V, and find the equations of motion for each of the following physical systems:
3. A ball moves frictionlessly on a horizontal tabletop. The ball of mass m
is connected to a string of length L which passes through a hole in the
tabletop and is fastened to a pendulum of mass M. The string is free to
slide through the hole in either direction.
Exercise 7.7. Use scaling arguments to show how the frequency of small oscillations depends on the amplitude for any potential of the form
\[ V = ax^n \]
2. Power law potentials, V = ar^n
3. Potentials with perturbatively closed orbits
4. Bertrand's theorem: potentials with non-perturbatively closed orbits
1. Kepler/Coulomb potential, V = -\frac{\alpha}{r}
2. Isotropic oscillator, V = ar^2
5. Newtonian gravity
Exercise 8.2. In the empty space surrounding an isolated black hole, general relativity must be used to correctly describe gravitational effects, since the escape velocity reaches the speed of light at the event horizon. However, sufficiently far from the hole, the Newtonian theory is approximately correct. If the horizon of a solar-mass black hole has radius 1 km, approximately how far from the hole will the Newtonian approximation give answers correct to within one percent? In other words, at what distance is the escape velocity .01c, where c is the speed of light?
plus the motion of a single particle of effective mass
\[ \mu = \frac{m_1 m_2}{m_1 + m_2} \]
and position \mathbf{r} = \mathbf{x}_1 - \mathbf{x}_2 in a central potential, V(r), showing that the action may be rewritten in the form
\[ S = \int \left[ \frac{1}{2}\left(m_1 + m_2\right)\dot{\mathbf{R}}^2 + \frac{1}{2}\mu\,\dot{\mathbf{r}}^2 - V\left(\mathbf{x}_1 - \mathbf{x}_2\right) \right] dt \]
Exercise 8.4. Solve the Kepler problem completely. Begin with the Lagrangian L = T - V in spherical coordinates, with the potential
\[ V = -\frac{\alpha}{r} \]
Use any techniques of the preceding sections to simplify the problem. Solve for both bounded and unbounded motions.
Exercise 8.5. Study the motion of an isotropic harmonic oscillator, that is, a particle moving in 3-dimensions in the central potential
\[ V = \frac{1}{2}kr^2 \]
where r is the radial coordinate.
There is a great deal that can be said about central forces without specifying the force law further. These results are largely consequences of the conservation of angular momentum. Recalling theorem 6.7, the total angular momentum of a particle moving in a central potential is always conserved, and by theorem 6.8 the resulting motion is confined to a plane. We may therefore always reduce the action to the form
\[ S = \int \left[ \frac{1}{2}m\left( \dot r^2 + r^2\dot\varphi^2 \right) - V(r) \right] dt \]
Moreover, we immediately have two conserved quantities. Since \frac{\partial L}{\partial t} = 0, energy is conserved,
\[ E = \frac{1}{2}m\left( \dot r^2 + r^2\dot\varphi^2 \right) + V(r) \]
and because \varphi is cyclic, the magnitude of the angular momentum is conserved,
\[ J = mr^2\dot\varphi \]
Using J to eliminate \dot\varphi from the energy,
\[ E = \frac{1}{2}m\dot r^2 + \frac{J^2}{2mr^2} + V(r) \]
the problem is equivalent to a particle in 1-dimension moving in the effective potential
\[ U(r) = V(r) + \frac{J^2}{2mr^2} \]
In general, orbits will therefore be of three types. Near any minimum, r_0, of the effective potential U, there will be bound states with some maximum and minimum values of r given by the nearest roots of
\[ E - U(r) = 0 \]
If the energy is high enough that this equation has only one root, the motion will have (at most) one turning point and then move off to infinity. Finally, if there are no roots to E - U = 0, the motion is unbounded with no turning points.
From the conserved energy, we may reduce the problem to quadratures. Solving for \frac{dr}{dt}, we find
\[ t = \sqrt{\frac{m}{2}} \int \frac{dr}{\sqrt{E - U(r)}} \]
Alternatively, after solving for \dot r we may divide by J = mr^2\dot\varphi, converting the solution to one for the orbit equation, r(\varphi):
\[ \frac{dr}{d\varphi} = \frac{dr/dt}{d\varphi/dt} = \frac{mr^2}{J}\sqrt{\frac{2}{m}\left(E - U\right)} \]
\[ \varphi = \frac{J}{\sqrt{2m}} \int \frac{dr}{r^2\sqrt{E - U(r)}} \]
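These turning points are easy to find numerically. The sketch below (the specific values and the bisection routine are my own choices) takes the Kepler case V = -\alpha/r and locates the two roots of E - U(r) = 0 for a bound orbit; for m = J = \alpha = 1 and E = -3/8 they are analytically r = 2/3 and r = 2.

```python
# Sketch: turning points of a bound orbit from E - U(r) = 0, with
# U(r) = V(r) + J^2/(2 m r^2) and V(r) = -alpha/r. Illustrative values only.
m, J, alpha = 1.0, 1.0, 1.0
E = -0.375

def g(r):
    U = -alpha / r + J * J / (2.0 * m * r * r)
    return E - U

def bisect(f, lo, hi, steps=200):
    # simple bisection; assumes f changes sign on [lo, hi]
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

r_min = bisect(g, 0.5, 1.0)   # inner turning point; analytically 2/3
r_max = bisect(g, 1.5, 3.0)   # outer turning point; analytically 2
```

Raising E toward zero pushes r_max to infinity, and for E >= 0 only the inner root survives, reproducing the three orbit types described above.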
8.1. Regularization
We can regularize a physical problem if we can transform it smoothly into another problem with well-behaved solutions. In the case of central potentials, it is always possible to transform the problem of bound orbits into the isotropic oscillator. For certain of these transformations, notably the Kepler problem, there are no singular points of the transformation.
In one dimension, we may use the substitution
\[ x = -u^{-2}, \qquad \frac{d}{dt} = u^3\frac{d}{d\tau} \]
to turn the 1-dim Kepler equation of motion into the 1-dim harmonic oscillator. Thus,
\[ m\frac{d^2 x}{dt^2} = -\frac{\alpha}{x^2} \]
becomes simply
\[ \frac{d^2 u}{d\tau^2} = -\frac{\alpha}{2m}\,u \]
Before moving to a proof for the general \(n\)-dim case, we note that more general transformations are possible in the 1-dim case. Suppose we begin with an arbitrary potential, \(V(x)\):
\[
m\frac{d^2x}{dt^2} = -\frac{dV}{dx}
\]
Then substituting
\[
x = f(u)
\]
\[
\frac{d}{dt} = \frac{1}{f'}\frac{d}{d\tau}
\]
we have
\[
m\frac{1}{f'}\frac{d}{d\tau}\left(\frac{1}{f'}\frac{d}{d\tau}f(u)\right) = -\frac{dV(f(u))}{dx}
\]
\[
m\frac{1}{f'}\frac{d}{d\tau}\frac{du}{d\tau} = -\frac{dV(f(u))}{df}
\]
\[
m\frac{d}{d\tau}\frac{du}{d\tau} = -\frac{dV(f(u))}{df}\frac{df}{du}
\]
\[
m\frac{d^2u}{d\tau^2} = -\frac{dV(f(u))}{du}
\]
If we choose \(f = V^{-1}\), so that \(V(f(u)) = \frac{1}{2}u^2\), then the right side is just \(-u\) and we have
\[
u = u_0\sin\tau
\]
Thus, we may turn any one-dimensional problem into a harmonic oscillator! The catch, of course, is that we have changed the time from \(t\) to \(\tau\), and transforming back requires a rather troublesome integral. For instance, suppose \(V\) is a power law, \(V = ax^n\), for any \(n\). Then we choose
\[
x = f(u) = u^{2/n}
\]
so that the time transforms as
\[
d\tau = \frac{n}{2}u^{(n-2)/n}\,dt
\]
\[
t = \frac{2}{n}\int u^{(2-n)/n}\,d\tau = \frac{2}{n}\int A^{(2-n)/n}\sin^{(2-n)/n}\tau\,d\tau
\]
where \(A\) is the amplitude of \(u\).
8.1.2. Higher dimensions
Consider the general central force motion in any dimension \(d \geq 2\). We begin from the action
\[
S = \int dt\left[\frac{1}{2}m\delta_{ij}\frac{dx^i}{dt}\frac{dx^j}{dt} - V(r)\right]
\]
where the \(x^i\) are Cartesian coordinates and \(r = \sqrt{\delta_{ij}x^ix^j}\). We have shown that angular momentum is conserved, and that the motion lies in a plane. Therefore, choosing polar coordinates in the plane of motion, the problem reduces to two dimensions. With coordinates \(x^{(a)}\) in the directions orthogonal to the plane of motion, the central force equations of motion are
\[
m\frac{d^2x^{(a)}}{dt^2} = 0
\]
\[
m\left(\frac{d^2r}{dt^2} - r\left(\frac{d\varphi}{dt}\right)^2\right) = -V'(r)
\]
\[
\frac{d}{dt}\left(mr^2\frac{d\varphi}{dt}\right) = 0 \tag{8.1}
\]
Substituting \(r = f(u)\), the angular momentum and force terms of the radial equation take the form
\[
\frac{J^2f'}{m^2f^3} - \frac{dV}{df}\frac{df}{du} = \frac{J^2}{m^2f^3}\frac{df}{du} - \frac{dV}{du}
\]
To obtain the isotropic harmonic oscillator we require the combination of terms on the right to give both the angular momentum and force terms of the oscillator:
\[
\frac{J^2}{m^2f^3}\frac{df}{du} - \frac{d}{du}V(f(u)) = \frac{J^2}{m^2u^3} - ku
\]
Integrating,
\[
\frac{J^2}{2m^2f^2} + V(f(u)) = \frac{J^2}{2m^2u^2} + \frac{1}{2}ku^2 + \frac{c}{2} \tag{8.3}
\]
If we define
\[
g(f) \equiv \frac{J^2}{2m^2f^2} + V(f)
\]
the required function \(f\) is
\[
f = g^{-1}\left(\frac{J^2}{2m^2u^2} + \frac{1}{2}ku^2 + \frac{c}{2}\right)
\]
Substituting this solution into the equation of motion, we obtain the equation for the isotropic oscillator,
\[
m\frac{d^2u}{dt^2} - \frac{J^2}{mu^3} = -ku
\]
Therefore, every central force problem is locally equivalent to the isotropic harmonic oscillator. We shall see that the same result follows from Hamilton-Jacobi theory, since every pair of classical systems with the same number of degrees of freedom is related by some time-dependent canonical transformation.
The solution takes a particularly simple form for the Kepler problem, \(V = -\frac{\alpha}{r}\). In this case, eq.(8.3) becomes
\[
\frac{J^2}{2m^2f^2} - \frac{\alpha}{f} - \frac{J^2}{2m^2u^2} - \frac{1}{2}ku^2 - \frac{c}{2} = 0
\]
Solving the quadratic for \(\frac{1}{f}\), we take the positive solution
\[
\frac{1}{f} = \frac{m^2}{J^2}\left[\alpha + \sqrt{\alpha^2 + \frac{J^2}{m^2}\left(\frac{J^2}{m^2u^2} + ku^2 + c\right)}\right]
= \frac{\alpha m^2}{J^2}\left[1 + \sqrt{1 + \left(\frac{J}{\alpha mu}\right)^2\left(ku^4 + cu^2 + \frac{J^2}{m^2}\right)}\right]
\]
Choosing the integration constant, \(c = \frac{2J\sqrt{k}}{m} - \frac{\alpha^2m^2}{J^2}\), to make the argument of the square root a perfect square, this reduces to
\[
\frac{1}{f} = \frac{\alpha m^2}{J^2} + \frac{m\sqrt{k}}{J}u + \frac{1}{u}
\]
or
\[
f = \frac{u}{\frac{m\sqrt{k}}{J}u^2 + \frac{\alpha m^2}{J^2}u + 1}
\]
The zeros of the denominator never occur for positive \(u\), so the transformation \(f\) is regular in the Kepler case.
The regularity of the Kepler case is not typical: it is easy to see that the solution for \(f\) may have many branches. The singular points of the transformation in these cases should give information about the numbers of extrema of the orbits, the stability of orbits, and other global properties. The present calculation may provide a useful tool for studying these global properties in detail.
Exercise 8.7. Consider the regularizing transformation when the radial potential is a power law, \(V = ar^n\). Show that the solution for \(f(u)\) is given by the polynomial equation
\[
af^{n+2} - h(u)f^2 + b = 0
\]
and find the form of the function \(h(u)\). What values of \(n\) allow a simple solution?
8.2. General central potentials
We now examine various properties of orbits for different classes of potential.
The next exercise illustrates one of the difficulties encountered in dealing with
arbitrary potentials.
\[
V = \alpha\left(r - r_0\right)^{2p}
\]
The multiplicity of oscillations per orbit is just one of the things that can happen in arbitrary central potentials. In fact, general central potentials are just as general as arbitrary 1-dimensional potentials. If we choose
\[
V(r) = -\frac{J_0^2}{2mr^2} + \tilde{V}(r)
\]
then a particle having \(J = J_0\) moves in the totally arbitrary effective potential \(U(r) = \tilde{V}(r)\). In the next section, we explore some conditions which restrict the motion in ways which are more in keeping with our expectations for planetary motion, demanding one or more of the following:
1. At any given value of the energy and angular momentum, there is exactly one stable orbit.
2. Circular orbits of increasing radius have increasing angular momentum and increasing energy.
3. Perturbations of circular orbits may precess, but do not wobble; that is, the frequency of small oscillations about circularity is no more than twice the orbital frequency.
Of course, there is no reason that orbits in nature must follow rules such as
these, but it tells us a great deal about the structure of the theory to classify
potentials by the shapes of the resulting orbits.
As shown in the previous problem, some central forces allow an orbiting particle to wobble arbitrarily many times while following an approximately circular orbit. One of the simplest constraints we might place on potentials is that they should be monotonic. However, a simple counterexample shows that orbits in monotonic potentials may have arbitrarily many extrema.
For a counterexample, consider motion in the potential
\[
V(r) = -\frac{J_0^2}{2mr^2} + A\sin kr
\]
Monotonicity requires the derivative to be positive definite for all \(r\):
\[
V'(r) = \frac{J_0^2}{mr^3} + Ak\cos kr > 0
\]
This is easy to arrange by taking \(Ak\) small or \(\frac{J_0^2}{mr^3}\) large. Specifically, if we want many periods of \(kr\) per orbit, we allow \(kr\) to take values up to
\[
kr \lesssim N\pi
\]
while maintaining
\[
\frac{J_0^2}{mr^3} - Ak > 0
\]
Combining these, there will be up to \(N\) periods of oscillation per orbit if
\[
\frac{J_0^2}{mr^2} > AN\pi
\]
that is,
\[
r^2 < \frac{J_0^2}{ANm\pi}
\]
which can always be satisfied by choosing \(A\) small. Now, since the conserved energy for a central potential is
\[
E = \frac{1}{2}m\left(\dot{r}^2 + r^2\dot{\varphi}^2\right) + V(r) = \frac{1}{2}m\dot{r}^2 + \frac{J^2}{2mr^2} + V(r)
\]
the motion occurs in an effective, single particle potential
\[
U = \frac{J^2}{2mr^2} + V(r)
\]
Substituting for \(V\), the effective potential has minima when
\[
U' = -\frac{J_0^2}{mr^3} + \frac{J^2}{mr^3} + Ak\cos kr = 0
\]
Therefore, when the angular momentum \(J\) is equal to \(J_0\) the first terms cancel, leaving
\[
Ak\cos kr = 0
\]
Since \(kr\) may range up to \(N\pi\), this has \(N\) solutions. Therefore, monotonic potentials may have arbitrarily many extrema at a given value of the energy and angular momentum. This means that a particle of given energy might be found orbiting at any of \(N\) distinct radii.
The minima occur where \(\sin kr = -1\), that is, at the radii
\[
r_l = \left(2l + \frac{3}{2}\right)\frac{\pi}{k}
\]
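These claims are easy to confirm numerically. In the sketch below (the values of \(A\), \(k\), \(J_0\) and \(m\) are illustrative choices, not from the text), \(V\) is monotonic over several periods of \(kr\), yet the effective potential at \(J = J_0\) has a minimum at each \(r_l\).

```python
import math

# A monotonically increasing central potential whose effective potential
# nonetheless has several minima. All parameter values are illustrative;
# A*k is chosen small enough that J0^2/(m r^3) dominates on the range.
m, J0, k, A = 1.0, 1.0, 1.0, 1.0e-4
rmax = 5.0 * math.pi                       # allow kr up to N*pi with N = 5

def Vprime(r):
    return J0**2 / (m * r**3) + A * k * math.cos(k * r)

# V is monotonic on (0, rmax] ...
grid = [rmax * (i + 1) / 2000 for i in range(2000)]
monotonic = all(Vprime(r) > 0.0 for r in grid)

# ... yet the effective potential U = A sin(kr) (at J = J0) has a
# minimum at every r_l = (2l + 3/2) pi / k inside the range.
minima = [(2 * l + 1.5) * math.pi / k for l in range(10)
          if (2 * l + 1.5) * math.pi / k <= rmax]
```

Each listed radius satisfies \(\cos kr_l = 0\) with \(\sin kr_l = -1\), so \(U'(r_l) = 0\) and \(U''(r_l) = Ak^2 > 0\): a genuine minimum.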
Exercise 8.9. For the potential
\[
V(r) = -\frac{J_0^2}{2mr^2} + A\sin kr
\]
with \(J = J_0\), find the frequency of small oscillations about a circular orbit at \(r_l\).
Combining these results, and solving for \(E\) and \(J^2\),
\[
J^2 = mr^3V'(r)
\]
\[
E = V(r) + \frac{r}{2}V'(r)
\]
Clearly we can find some values of \(E\) and \(J\) that give a circular orbit at any \(r\). The problem of multiple orbits at a given energy or angular momentum may be avoided if we demand that, as functions of \(r\), both \(J^2\) and \(E\) increase monotonically. Then we have
\[
\frac{dJ^2}{dr} = 3mr^2V' + mr^3V'' = mr^2\left(3V' + rV''\right) > 0
\]
\[
\frac{dE}{dr} = V' + \frac{1}{2}V' + \frac{r}{2}V'' = \frac{1}{2}\left(3V' + rV''\right) > 0
\]
In both cases we find the condition
\[
V'' > -\frac{3}{r}V'
\]
When this condition is met at each \(r\), there can be only one circular orbit of a given energy and angular momentum.
Check this for our oscillating example, \(V(r) = -\frac{J_0^2}{2mr^2} + A\sin kr\). We have
\[
-\frac{3}{r_0}V' = -\frac{3J_0^2}{mr_0^4} - \frac{3}{r_0}Ak\cos kr_0
\]
\[
V'' = -\frac{3J_0^2}{mr_0^4} - Ak^2\sin kr_0
\]
so the condition requires
\[
-Ak^2\sin kr > -\frac{3}{r}Ak\cos kr
\]
\[
kr\sin kr < 3\cos kr
\]
While this condition is satisfied at each minimum, it cannot be true for all values of \(r\), so the condition rules out this potential.
Now consider perturbations around circular orbits. For a circular orbit the orbital frequency is
\[
\omega = \frac{J}{mr_0^2}
\]
Let \(r = r_0 + r_0x\), with \(x \ll 1\), so that, expanding the energy to quadratic order,
\[
E = \frac{1}{2}mr_0^2\dot{x}^2 + \frac{J^2}{2mr_0^2\left(1 + x\right)^2} + V\left(r_0 + r_0x\right)
\]
\[
= \frac{1}{2}mr_0^2\dot{x}^2 + \frac{J^2}{2mr_0^2}\left(1 - 2x + 3x^2\right) + V(r_0) + V'(r_0)r_0x + \frac{1}{2}V''(r_0)r_0^2x^2
\]
\[
= \frac{1}{2}mr_0^2\dot{x}^2 - \frac{J^2}{mr_0^2}x + V'(r_0)r_0x + \frac{3J^2}{2mr_0^2}x^2 + \frac{1}{2}V''(r_0)r_0^2x^2 + \mathrm{const.}
\]
\[
= \frac{1}{2}mr_0^2\dot{x}^2 + \frac{1}{2}\left(\frac{3J^2}{mr_0^2} + r_0^2V''(r_0)\right)x^2 + \mathrm{const.}
\]
where the linear terms cancel by the circular orbit condition, \(\frac{J^2}{mr_0^3} = V'(r_0)\). These perturbations therefore oscillate with effective mass \(mr_0^2\) and constant \(\left(\frac{3J^2}{mr_0^2} + r_0^2V''(r_0)\right)\), so the frequency of oscillations is given by
\[
\omega_{\mathrm{osc}}^2 = \frac{1}{mr_0^2}\left(\frac{3J^2}{mr_0^2} + r_0^2V''(r_0)\right)
\]
Comparing to the orbital frequency, we have
\[
\frac{\omega_{\mathrm{osc}}^2}{\omega^2} = \frac{\frac{3J^2}{m^2r_0^4} + \frac{1}{m}V''(r_0)}{\frac{J^2}{m^2r_0^4}} = 3 + \frac{mr_0^4}{J^2}V''(r_0)
\]
Using the condition for minimal potential, \(\frac{J^2}{mr_0^3} = V'(r_0)\), this becomes
\[
\frac{\omega_{\mathrm{osc}}^2}{\omega^2} = 3 + \frac{r_0}{V'(r_0)}V''(r_0)
\]
Finally, with the convexity condition
\[
V''(r_0) > -\frac{3}{r_0}V'(r_0)
\]
the ratio of frequencies is given by
\[
\frac{\omega_{\mathrm{osc}}^2}{\omega^2} = 3 + \frac{r_0}{V'(r_0)}V''(r_0) > 0
\]
The second derivative condition is therefore the necessary and sufficient condition for the existence of simple harmonic oscillations about circular orbits. To insure that these oscillations produce precession but not wobble, we further require
\[
4 \geq \frac{\omega_{\mathrm{osc}}^2}{\omega^2} = 3 + \frac{r_0}{V'(r_0)}V''(r_0) > 0
\]
or
\[
-3 < \frac{r_0V''}{V'} \leq +1
\]
For a power law potential, \(V = ar^n\), this gives
\[
V' = nar^{n-1}
\]
\[
V'' = n\left(n - 1\right)ar^{n-2}
\]
so that
\[
-3 < \left(n - 1\right) \leq +1
\]
\[
-2 < n \leq 2
\]
Not surprisingly, any power law potential with \(n > -2\) satisfies the energy and momentum conditions, but there is an upper limit, \(n \leq 2\), if we are to avoid wobble. As we shall see, this condition is closely related to the condition for nonperturbatively stable closed orbits.
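The frequency ratio can be checked without any algebra by differentiating \(V\) numerically. The sketch below (the values of \(a\) and \(r_0\) are illustrative choices) confirms that \(3 + r_0V''/V'\) reproduces \(n + 2\) for power-law potentials, so the wobble-free bound of 4 is exceeded exactly when \(n > 2\).

```python
import math

# Precession criterion: for a circular orbit of radius r0, the squared
# ratio of oscillation to orbital frequency is 3 + r0 V''(r0)/V'(r0),
# which should equal n + 2 for V = a r^n.  Derivatives are taken by
# central differences; a and r0 are illustrative values.
def freq_ratio_squared(V, r0, h=1.0e-5):
    V1 = (V(r0 + h) - V(r0 - h)) / (2.0 * h)
    V2 = (V(r0 + h) - 2.0 * V(r0) + V(r0 - h)) / h**2
    return 3.0 + r0 * V2 / V1

a, r0 = 1.0, 1.3
ratios = {n: freq_ratio_squared(lambda r, n=n: a * r**n, r0)
          for n in (-1, 1, 2, 3)}
# n <= 2 keeps the ratio at or below 4 (no wobble); n = 3 exceeds it.
```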
Consider a circular orbit of radius \(r_0\) in the power law potential
\[
V = ar^n
\]
with the energy and angular momentum given by
\[
E_0 = \frac{J_0^2}{2mr_0^2} + ar_0^n
\]
\[
J_0 = mr_0^2\omega_0
\]
where \(\omega_0\) is the frequency of the orbit. We also know that the motion occurs at a minimum of the effective potential, so that
\[
U'(r_0) = nar_0^{n-1} - \frac{J_0^2}{mr_0^3} = 0
\]
Now consider small oscillations around the circular orbits. Let
\[
r = r_0\left(1 + x\right)
\]
\[
\omega = \omega_0\left(1 + \alpha x + \beta x^2\right)
\]
\[
E = E_0 + \varepsilon
\]
\[
J = J_0 + j
\]
where we assume \(x \ll 1\). First consider the angular momentum. We have, to quadratic order in \(x\),
\[
J = J_0 + j = mr_0^2\omega_0\left(1 + 2x + x^2\right)\left(1 + \alpha x + \beta x^2\right)
\]
so that
\[
j = mr_0^2\omega_0\left(\left(2 + \alpha\right)x + \left(1 + 2\alpha + \beta\right)x^2\right)
\]
Since \(J\) is conserved while \(x\) oscillates, we must have \(j = 0\), so that
\[
\alpha = -2, \qquad \beta = 3
\]
and therefore
\[
J = J_0 = mr_0^2\omega_0
\]
Expanding the energy to second order in small quantities, we have
\[
E_0 + \varepsilon = \frac{1}{2}mr_0^2\dot{x}^2 + \frac{J_0^2}{2mr_0^2\left(1 + x\right)^2} + ar_0^n\left(1 + x\right)^n
\]
\[
= \frac{1}{2}mr_0^2\dot{x}^2 + \frac{J_0^2}{2mr_0^2}\left(1 - 2x + 3x^2\right) + ar_0^n\left(1 + nx + \frac{1}{2}n\left(n - 1\right)x^2\right)
\]
\[
= \frac{1}{2}mr_0^2\dot{x}^2 + \left(\frac{J_0^2}{2mr_0^2} + ar_0^n\right) + \left(-\frac{J_0^2}{mr_0^2} + nar_0^n\right)x + \left(\frac{3J_0^2}{2mr_0^2} + \frac{1}{2}n\left(n - 1\right)ar_0^n\right)x^2
\]
Using the value of \(E_0\) and the condition for a minimum of the potential, the constant and linear terms cancel. Replacing \(nar_0^n = \frac{J_0^2}{mr_0^2}\) leaves
\[
\varepsilon = \frac{1}{2}mr_0^2\dot{x}^2 + \frac{1}{2}\frac{J_0^2}{mr_0^2}\left(3 + \left(n - 1\right)\right)x^2
\]
This is clearly the energy for a simple harmonic oscillator with effective mass
\[
\mu = mr_0^2
\]
and constant
\[
k = \left(n + 2\right)\frac{J_0^2}{mr_0^2}
\]
and therefore of squared frequency
\[
\omega^2 = \frac{k}{\mu} = \left(n + 2\right)\frac{J_0^2}{m^2r_0^4} = \left(n + 2\right)\omega_0^2
\]
The orbits will close after \(q\) circuits if \(\omega\) is a rational multiple, \(\frac{p}{q}\), of \(\omega_0\), for non-negative integers \(p\) and \(q\). This occurs when
\[
\sqrt{n + 2} = \frac{p}{q}
\]
\[
n = \frac{p^2}{q^2} - 2
\]
When \(q = 1\), each single orbital sweep is identical. In this case we have \(n = p^2 - 2\), giving the sequence
\[
n \in \left\{-2, -1, 2, 7, 14, \ldots\right\}
\]
Notice that we have computed the frequency of small oscillations only to lowest order, and that at the same order the orbital frequency is still \(\omega_0\):
\[
\omega = \omega_0\left(1 - 2x + 3x^2\right) \approx \omega_0
\]
It can be shown that the \(n = -2\) case allows no bound orbits at all, and second, that unless \(n = 2\) or \(n = -1\), the result holds only for orbits which are perturbatively close to circular, while orbits deviating nonperturbatively fail to close. The conclusion is therefore that generic orbits close only for the Kepler/Coulomb and Hooke's law potentials,
\[
V = -\frac{a}{r}
\]
\[
V = ar^2
\]
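The allowed exponents are easy to enumerate with exact rational arithmetic. This is a sketch; the ranges of \(p\) and \(q\) are arbitrary cutoffs.

```python
from fractions import Fraction

# Power-law exponents n = p^2/q^2 - 2 that give closed orbits.
# q = 1 (orbits closing after a single circuit) reproduces the sequence
# -2, -1, 2, 7, 14, ... quoted above.
single_circuit = sorted(p * p - 2 for p in range(6))
all_closed = sorted({Fraction(p * p, q * q) - 2
                     for p in range(6) for q in range(1, 4)})
```

Note that every exponent in the list is at least \(-2\), consistent with the lower bound found from the energy and angular momentum conditions.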
Exercise 8.10. Study motion in a central potential \(V = -\frac{a}{r^2}\). Prove the following:
2. Orbits which initially increase in distance from the center of force continue to spiral to infinity. Find an expression for the angle as a function of time, and find the limiting angle as \(t \to \infty\).
3. Orbits which initially decrease in distance from the center of force spiral inward, reaching the center in a finite amount of time. During this finite time, the angle increases without bound.
8.5. Symmetries of motion for the Kepler problem
Recent decades have seen new techniques and revivals of long-forgotten symmetries of the Kepler problem ([32],[33]). The best-known rediscovery concerning the Kepler problem is that, in addition to the energy and angular momentum,
\[
E = \frac{1}{2}m\dot{\mathbf{x}}^2 - \frac{\alpha}{r}
\]
\[
\mathbf{L} = \mathbf{r}\times\mathbf{p} \tag{8.4}
\]
there is a further conserved vector. To construct it, consider the unit vector \(\hat{\boldsymbol{\varphi}}\) in the plane of the orbit,
\[
\hat{\boldsymbol{\varphi}} = -\hat{\imath}\sin\varphi + \hat{\jmath}\cos\varphi
\]
Its time derivative is
\[
\frac{d\hat{\boldsymbol{\varphi}}}{dt} = \left(-\hat{\imath}\cos\varphi - \hat{\jmath}\sin\varphi\right)\dot{\varphi} = -\dot{\varphi}\,\hat{\mathbf{r}}
\]
Using the force law and the angular momentum, we can write this as
\[
\frac{d\hat{\boldsymbol{\varphi}}}{dt} = -\dot{\varphi}\,\hat{\mathbf{r}} = -\frac{L}{mr^2}\hat{\mathbf{r}} = \frac{L}{m\alpha}\mathbf{f}(r)
\]
where
\[
\mathbf{f}(r) = -\frac{\alpha}{r^2}\hat{\mathbf{r}}
\]
is Newton's gravitational force. By the law of motion, we have
\[
\frac{d\mathbf{p}}{dt} = \mathbf{f}(r)
\]
so we may write
\[
\frac{d\hat{\boldsymbol{\varphi}}}{dt} = \frac{L}{m\alpha}\frac{d\mathbf{p}}{dt}
\]
or simply
\[
\frac{d}{dt}\left(\mathbf{p} - \frac{m\alpha}{L}\hat{\boldsymbol{\varphi}}\right) = 0
\]
This provides a conservation law for Kepler orbits. We define Hamilton's vector,
\[
\mathbf{h} = \mathbf{p} - \frac{m\alpha}{L}\hat{\boldsymbol{\varphi}}
\]
as the conserved quantity.
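The conservation of Hamilton's vector is easy to confirm by direct numerical integration. The sketch below uses illustrative units \(m = \alpha = 1\) and arbitrary initial conditions, and checks that \(\mathbf{h}\) is unchanged after several orbital periods.

```python
import math

# Integrate a bound Kepler orbit in the plane and check that Hamilton's
# vector h = p - (m*alpha/L) * phi_hat is constant along the motion.
# Units m = alpha = 1 and the initial conditions are illustrative.
m, alpha = 1.0, 1.0

def accel(x, y):
    r3 = (x * x + y * y) ** 1.5
    return -alpha * x / (m * r3), -alpha * y / (m * r3)

def hamilton_vector(x, y, vx, vy):
    L = m * (x * vy - y * vx)          # z-component of angular momentum
    r = math.hypot(x, y)
    # phi_hat = (-sin(phi), cos(phi)) = (-y/r, x/r)
    return (m * vx + (m * alpha / L) * (y / r),
            m * vy - (m * alpha / L) * (x / r))

x, y, vx, vy = 1.0, 0.0, 0.0, 0.9      # E = v^2/2 - 1/r < 0: bound orbit
h0 = hamilton_vector(x, y, vx, vy)

dt, steps = 1.0e-3, 20000              # roughly four orbital periods
ax, ay = accel(x, y)
for _ in range(steps):                 # leapfrog (kick-drift-kick)
    vx += 0.5 * dt * ax; vy += 0.5 * dt * ay
    x += dt * vx; y += dt * vy
    ax, ay = accel(x, y)
    vx += 0.5 * dt * ax; vy += 0.5 * dt * ay
h1 = hamilton_vector(x, y, vx, vy)
```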
An alternative conserved quantity, the Laplace-Runge-Lenz vector, is given by the cross product of \(\mathbf{h}\) with the angular momentum,
\[
\mathbf{A} = \mathbf{h}\times\mathbf{L}
\]
The Laplace-Runge-Lenz vector, the Hamilton vector and the angular momentum form a mutually orthogonal basis.
Exercise 8.12. Check directly that \(\frac{d\mathbf{A}}{dt} = 0\). Choose coordinates so that the motion lies in the \(xy\)-plane with the perihelion on the \(x\)-axis. Show that the Laplace-Runge-Lenz vector points in the direction of perihelion.
Hamilton's vector may be used to find the motion of the system. We follow a proof due to Muñoz [2]. Let motion be in the \(xy\)-plane and choose the perihelion of the orbit to occur at time \(t = 0\) on the \(x\)-axis. Then the initial velocity is given by
\[
\mathbf{v} = v_0\hat{\boldsymbol{\varphi}} = v_0\hat{\jmath}
\]
Then at \(t = 0\),
\[
\mathbf{h} = m\mathbf{v} - \frac{m\alpha}{L}\hat{\boldsymbol{\varphi}} = \left(mr_0\dot{\varphi}_0 - \frac{m\alpha}{L}\right)\hat{\jmath}
\]
At an arbitrary time, dotting \(\mathbf{h}\) with \(\hat{\boldsymbol{\varphi}}\) gives
\[
\mathbf{h}\cdot\hat{\boldsymbol{\varphi}} = \mathbf{p}\cdot\hat{\boldsymbol{\varphi}} - \frac{\alpha m}{L}
\]
\[
\left(mr_0\dot{\varphi}_0 - \frac{m\alpha}{L}\right)\cos\varphi = mr\dot{\varphi} - \frac{\alpha m}{L}
\]
or, replacing \(\dot{\varphi} = \frac{L}{mr^2}\),
\[
\left(\frac{L}{r_0} - \frac{m\alpha}{L}\right)\cos\varphi = \frac{L}{r} - \frac{m\alpha}{L}
\]
\[
\frac{m\alpha}{L^2} + \left(\frac{1}{r_0} - \frac{m\alpha}{L^2}\right)\cos\varphi = \frac{1}{r}
\]
or
\[
r = \frac{L^2/m\alpha}{1 + \left(\frac{L^2}{m\alpha r_0} - 1\right)\cos\varphi}
\]
as usual. In terms of the initial energy,
\[
E = \frac{1}{2}mr_0^2\dot{\varphi}_0^2 - \frac{\alpha}{r_0} = \frac{L^2}{2mr_0^2} - \frac{\alpha}{r_0}
\]
so that
\[
0 = \frac{L^2}{2mr_0^2} - \frac{\alpha}{r_0} - E
\]
Solving the quadratic for \(\frac{1}{r_0}\),
\[
\frac{1}{r_0} = \frac{\alpha + \sqrt{\alpha^2 + \frac{2EL^2}{m}}}{2\frac{L^2}{2m}} = \frac{m\alpha}{L^2}\left(1 + \sqrt{1 + \frac{2EL^2}{m\alpha^2}}\right)
\]
Defining
\[
r_m = \frac{L^2}{m\alpha}
\]
\[
\varepsilon = \sqrt{1 + \frac{2EL^2}{m\alpha^2}}
\]
The motion is therefore given in terms of the energy and angular momentum by
\[
r = \frac{r_m}{1 + \varepsilon\cos\varphi} \tag{8.5}
\]
These curves are hyperbolas, parabolas and ellipses, all of which are conic sections, that is, curves that arise from the intersection of a plane with a cone.
To see this, consider the cone \(z^2 = \beta^2\left(x^2 + y^2\right)\). A plane with closest approach to the origin given by a fixed vector \(\mathbf{a}\) consists of all vectors, \(\mathbf{x} - \mathbf{a}\), from \(\mathbf{a}\) which are orthogonal to \(\mathbf{a}\),
\[
\left(\mathbf{x} - \mathbf{a}\right)\cdot\mathbf{a} = 0
\]
Choosing \(\mathbf{a} = \left(a, 0, b\right)\), the plane is
\[
ax + bz = a^2 + b^2
\]
Solving for \(z\) and substituting into the equation for the cone, we have
\[
z^2 = \frac{1}{b^2}\left(a^2 + b^2 - ax\right)^2 = \beta^2\left(x^2 + y^2\right)
\]
Setting \(d = a^2 + b^2\) and \(c = \beta^2b^2\) and simplifying,
\[
\left(c - a^2\right)x^2 + cy^2 + 2adx = d^2
\]
When \(c = a^2\), this is the equation of a parabola. When \(c - a^2\) is nonzero we can set \(e = \sqrt{\left|c - a^2\right|}\) and \(\epsilon = \mathrm{sign}\left(c - a^2\right)\), and rewrite the equation as
\[
\epsilon\left(ex + \frac{\epsilon ad}{e}\right)^2 + cy^2 = d^2 + \epsilon\frac{a^2d^2}{e^2}
\]
In this form we recognize the equation for an hyperbola when \(\epsilon = -1\) and an ellipse when \(\epsilon = +1\).
When an ellipse is centered at the origin, we may write its equation in the simpler form
\[
\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1
\]
To derive this form from the equation for a Kepler orbit we must convert eq.(8.5) from \(r\) and \(\varphi\), which are measured from one focus of the ellipse, to Cartesian coordinates measured from the center. The full range of \(r\), from \(r_{min}\) at \(\varphi = 0\) to \(r_{max}\) at \(\varphi = \pi\), gives the semi-major axis as
\[
a = \frac{r_m}{1 - \varepsilon^2}
\]
while the maximum value of \(y\) gives the semiminor axis,
\[
b = \frac{r_m}{\sqrt{1 - \varepsilon^2}}
\]
It is then not hard to see that the coordinates of any point are given by
\[
y = \frac{r_m\sin\varphi}{1 + \varepsilon\cos\varphi}
\]
\[
x = a\varepsilon + \frac{r_m\cos\varphi}{1 + \varepsilon\cos\varphi}
\]
Then we compute:
\[
\frac{x^2}{a^2} + \frac{y^2}{b^2} = \left(\varepsilon + \frac{\left(1 - \varepsilon^2\right)\cos\varphi}{1 + \varepsilon\cos\varphi}\right)^2 + \frac{\left(1 - \varepsilon^2\right)\sin^2\varphi}{\left(1 + \varepsilon\cos\varphi\right)^2}
\]
\[
= \frac{1}{\left(1 + \varepsilon\cos\varphi\right)^2}\left(\left(\varepsilon\left(1 + \varepsilon\cos\varphi\right) + \left(1 - \varepsilon^2\right)\cos\varphi\right)^2 + \left(1 - \varepsilon^2\right)\sin^2\varphi\right)
\]
\[
= \frac{1}{\left(1 + \varepsilon\cos\varphi\right)^2}\left(\left(\varepsilon + \cos\varphi\right)^2 + \left(1 - \varepsilon^2\right)\sin^2\varphi\right) = 1
\]
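The identity can also be verified numerically. The sketch below (with illustrative values of \(r_m\) and \(\varepsilon\)) sweeps the focal form of the orbit and checks that every point lands on the centered ellipse.

```python
import math

# Check that r = r_m/(1 + eps*cos(phi)), measured from a focus, traces
# the centered ellipse x^2/a^2 + y^2/b^2 = 1 with a = r_m/(1 - eps^2)
# and b = r_m/sqrt(1 - eps^2). Values of r_m and eps are illustrative.
rm, eps = 1.0, 0.6
a = rm / (1.0 - eps**2)
b = rm / math.sqrt(1.0 - eps**2)

worst = 0.0
for i in range(360):
    phi = 2.0 * math.pi * i / 360
    r = rm / (1.0 + eps * math.cos(phi))
    x = a * eps + r * math.cos(phi)   # shift from focus to center
    y = r * math.sin(phi)
    worst = max(worst, abs(x**2 / a**2 + y**2 / b**2 - 1.0))
```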
Exercise 8.13. Show that the Laplace-Runge-Lenz vector points in the direction of perihelion and has magnitude \(\varepsilon m\alpha\).
Exercise 8.14. Show that when \(\varepsilon > 1\), the orbit formula, eq.(8.5), describes hyperbolas.
The mass element is
\[
dm = \rho\left(\mathbf{x}'\right)d^3x'
\]
Without loss of generality, look at \(\mathbf{x}' = 0\), and define the sequence of functions
\[
f_a\left(\mathbf{x} - \mathbf{x}'\right) = \nabla^2\frac{1}{\sqrt{\left|\mathbf{x} - \mathbf{x}'\right|^2 + a^2}}
\]
\[
f_a\left(\mathbf{x}\right) = \nabla^2\frac{1}{\sqrt{r^2 + a^2}}
\]
since \(g\), being a test function, has finite integral. For the first integral,
\[
I = \lim_{a\to 0}\int^R g\left(\mathbf{x}\right)\nabla^2\frac{1}{\sqrt{r^2 + a^2}}\,d^3x
\]
\[
= -g\left(0\right)\lim_{a\to 0}\int^R \frac{3a^2}{\left(r^2 + a^2\right)^{5/2}}\,d^3x
\]
\[
= -4\pi g\left(0\right)\lim_{a\to 0}\int \frac{3a^2r^2}{\left(r^2 + a^2\right)^{5/2}}\,dr
\]
\[
= -\int G\rho\left(\mathbf{x}'\right)4\pi\delta\left(\mathbf{x} - \mathbf{x}'\right)d^3x' = -4\pi G\rho\left(\mathbf{x}\right)
\]
This is now a field equation for the gravitational potential \(\Phi\):
\[
\nabla^2\Phi\left(\mathbf{x}\right) = -4\pi G\rho\left(\mathbf{x}\right)
\]
Defining the gravitational field,
\[
\mathbf{g}\left(\mathbf{x}\right) = -\nabla\Phi\left(\mathbf{x}\right)
\]
we may write
\[
\nabla\cdot\mathbf{g}\left(\mathbf{x}\right) = 4\pi G\rho\left(\mathbf{x}\right)
\]
and using Stokes' theorem, we see that Gauss' law applies to the gravitational field:
\[
\oint_S \mathbf{g}\left(\mathbf{x}\right)\cdot\mathbf{n}\,d^2x = \int_V \nabla\cdot\mathbf{g}\left(\mathbf{x}\right)d^3x = 4\pi G\int_V \rho\left(\mathbf{x}\right)d^3x = 4\pi GM_V
\]
That is, the integral of the normal component of the field over the closed boundary surface \(S\) of a volume \(V\) is equal to \(4\pi G\) times the total mass contained in that volume.
Exercise 8.15. Use Gauss' law to find the potential inside and outside a uniform spherical ball of total mass \(M\). Prove that the gravitational field at any point outside the ball is the same as it would be if the mass \(M\) were concentrated at a point at the center.
Exercise 8.16. Notice that the gravitational field equation,
\[
\nabla\cdot\mathbf{g}\left(\mathbf{x}\right) = 4\pi G\rho\left(\mathbf{x}\right)
\]
is independent of time, so that changes in the field are felt instantaneously at arbitrary distances. We might try to fix the problem by including time derivatives in the equation for the potential,
\[
-\frac{1}{c^2}\frac{\partial^2\Phi\left(\mathbf{x}, t\right)}{\partial t^2} + \nabla^2\Phi\left(\mathbf{x}, t\right) = -4\pi G\rho\left(\mathbf{x}, t\right)
\]
Show that, in empty space where \(\rho = 0\), this equation permits gravitational wave solutions.
9. Constraints
We are often interested in problems which do not allow all particles a full range
of motion, but instead restrict motion to some subspace. When constrained
motion can be described in this way, there is a simple technique for formulating
the problem.
Subspaces of constraint may be described by relationships between the coordinates,
\[
f\left(x^i, t\right) = 0
\]
The trick is to introduce \(f\) into the problem in such a way that it must vanish in the solution. Our understanding of the Euler-Lagrange equation as the covariant form of Newton's second law tells us how to do this. Since the force that maintains the constraint must be orthogonal to the surface \(f = 0\), it will be in the direction of the gradient of \(f\), and we can write
\[
\frac{d}{dt}\frac{\partial L}{\partial\dot{x}^i} - \frac{\partial L}{\partial x^i} = \lambda\frac{\partial f}{\partial x^i}
\]
where \(\lambda = \lambda\left(x^i, \dot{x}^i, t\right)\) determines the amplitude of the gradient required to provide the constraining force. In addition, we need the constraint itself.
Remarkably, both the addition of \(\lambda\frac{\partial f}{\partial x^i}\) and the constraint itself follow as the variational equations of the augmented action
\[
S = \int\left(L + \lambda f\right)dt
\]
in which \(\lambda\) is treated as an additional independent degree of freedom: varying \(x^i\) gives the modified Euler-Lagrange equation, while varying \(\lambda\) returns the constraint, \(f = 0\). These are exactly what we require: the extra equation gives just enough information to determine \(\lambda\).
Thus, by increasing the number of degrees of freedom of the problem by one for each constraint, we include the constraint while allowing free variation of the action. In exchange for the added equation of motion, we learn that the force required to maintain the constraint is
\[
F^i_{\mathrm{constraint}} = \lambda g^{ij}\frac{\partial f}{\partial x^j}
\]
The advantage of treating constraints in this way is that we now may carry out the variation of the coordinates freely, as if all motions were possible. The variation of \(\lambda\), called a Lagrange multiplier, brings in the constraint automatically. In the end, we will have the \(N\) Euler-Lagrange equations we started with (assuming an initial \(N\) degrees of freedom), plus an additional equation for each Lagrange multiplier.
When the constraint surface is fixed in space, the constraint force never does any work, since there is never any motion in the direction of the gradient of \(f\). If there is time dependence of the surface, then work will be done. Because \(f\) remains zero, its total time derivative vanishes, so
\[
0 = \frac{df}{dt} = \frac{\partial f}{\partial x^i}\frac{dx^i}{dt} + \frac{\partial f}{\partial t}
\]
or, multiplying by \(\lambda\,dt\) and integrating,
\[
W = \int\lambda\frac{\partial f}{\partial x^i}dx^i = -\int\lambda\frac{\partial f}{\partial t}dt
\]
Thus, the Lagrange multiplier allows us to compute the work done by a moving constraint surface.
As a simple example, consider the motion of a particle under the influence of gravity, \(V = mgz\), constrained to the inclined plane
\[
f\left(x, z\right) = z - x\tan\theta = 0
\]
Because \(y\) is cyclic we immediately have
\[
p_y = m\dot{y} = mv_{0y} = \mathrm{const.}
\]
so that
\[
y = y_0 + v_{0y}t
\]
We also have conservation of energy,
\[
E = \frac{1}{2}m\dot{\mathbf{x}}^2 + mgz - \lambda\left(z - x\tan\theta\right)
\]
Varying \(x\), \(z\) and \(\lambda\) we have three further equations,
\[
0 = m\ddot{x} + \lambda\tan\theta
\]
\[
0 = m\ddot{z} + mg - \lambda
\]
\[
0 = z - x\tan\theta
\]
Differentiating the constraint twice and substituting for \(\ddot{x}\) and \(\ddot{z}\) gives \(\lambda = mg\cos^2\theta\), so that
\[
0 = \ddot{x} + g\cos\theta\sin\theta
\]
\[
0 = \ddot{z} + g\sin^2\theta
\]
From this we immediately find \(x\) and \(z\) separately:
\[
x = x_0 + v_{0x}t - \frac{1}{2}g\cos\theta\sin\theta\,t^2
\]
\[
z = z_0 + v_{0z}t - \frac{1}{2}g\sin^2\theta\,t^2
\]
The constraint force is given by
\[
F^i = \lambda g^{ij}\frac{\partial f}{\partial x^j} = mg\cos^2\theta\left(-\tan\theta, 0, 1\right) = mg\cos\theta\left(-\sin\theta, 0, \cos\theta\right)
\]
Notice that the magnitude is \(mg\cos\theta\), as expected. Moreover, since the vector
\[
w^i = \left(1, 0, \tan\theta\right)
\]
lies in the constraint surface, the constraint force is orthogonal to the surface: \(F_iw^i = 0\).
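These results are simple enough to verify numerically. The sketch below (the values of \(m\), \(g\), \(\theta\) and the initial data are illustrative choices) checks that the trajectory satisfies the constraint identically, that the constraint force has magnitude \(mg\cos\theta\), and that it is orthogonal to the surface direction \(w\).

```python
import math

# Check of the inclined-plane solution obtained with a Lagrange
# multiplier: lambda = m g cos^2(theta), trajectory in the xz-plane.
# All numerical values are illustrative.
m, g, theta = 2.0, 9.8, 0.3
lam = m * g * math.cos(theta) ** 2          # Lagrange multiplier

x0 = z0 = 0.0
v0x = 1.0
v0z = v0x * math.tan(theta)                 # initial velocity along the plane

def x(t): return x0 + v0x * t - 0.5 * g * math.cos(theta) * math.sin(theta) * t**2
def z(t): return z0 + v0z * t - 0.5 * g * math.sin(theta) ** 2 * t**2

# Constraint force restricted to the xz-plane: F = lam * (-tan(theta), 1)
Fx, Fz = -lam * math.tan(theta), lam
Fmag = math.hypot(Fx, Fz)

constraint_err = max(abs(z(t) - x(t) * math.tan(theta))
                     for t in (0.1 * i for i in range(50)))
```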
Exercise 9.1. Work the inclined plane problem directly from Newton's second law and check explicitly that the force applied by the plane is
\[
\mathbf{F} = mg\cos\theta\left(-\sin\theta, 0, \cos\theta\right)
\]
Exercise 9.2. Repeat the inclined plane problem with a moving plane. Let the plane move in the direction
\[
v^i = \left(v_1, 0, v_3\right)
\]
Find the work done on the particle by the plane. For what velocities \(v^i\) does the particle stay in the same position on the plane?
Exercise 9.3. A particle of mass \(m\) moves frictionlessly on the surface \(z = k\rho^2\), where \(\rho = \sqrt{x^2 + y^2}\) is the polar radius.
Exercise 9.4. A ball moves frictionlessly on a horizontal tabletop. The ball of
mass m is connected to a string of length L which passes through a hole in the
tabletop and is fastened to a pendulum of mass M. The string is free to slide
through the hole in either direction. Find the motion of the ball and pendulum.
Exercise 9.5. Study the motion of a spherical pendulum: a negligibly light rod of length \(L\) with a mass \(m\) attached to one end. The remaining end is fixed in space. The mass is therefore free to move anywhere on the surface of a sphere of radius \(L\), under the influence of gravity, \(-mg\hat{k}\).
3. Use the conservation laws and any needed equations of motion to solve for the motion. In particular study the following motions:
4. Beginning with your solution for motion in a horizontal plane, study small oscillations away from the plane.
10. Rotating coordinates
It frequently happens that we want to describe motion in a non-inertial frame. For example, the motion of a projectile relative to coordinates fixed to Earth is modified slightly by the rotation of Earth. The use of a non-inertial frame introduces additional terms into Newton's law. We have shown that the Euler-Lagrange equation is covariant under time-dependent diffeomorphisms, \(x^i = x^i\left(y^j, t\right)\). Here we specialize to rotating coordinates,
\[
y^i = R^i{}_j\left(t\right)x^j
\]
\[
x^i = \left[R^{-1}\right]^i{}_jy^j \tag{10.1}
\]
10.1. Rotations
Let \(R^i{}_j\) be an orthogonal transformation, so that \(R^tR = 1\), and let the rotating coordinates \(y^i\) be related to inertial (and by definition non-rotating) coordinates \(x^i\left(t\right) = x^i\left(0\right)\) by eqs.(10.1). After an infinitesimal time \(dt\),
\[
y^i\left(dt\right) = x^i + \delta x^i
\]
where \(\delta x^i\) is proportional to \(dt\). In Section 6.2.2 we showed that the change of \(x^i\) under such an infinitesimal rotation can be written as
\[
\delta x^i = \varepsilon^i{}_{jk}\omega^j\,dt\,x^k = \Omega^i{}_kx^k\,d\varphi
\]
where we write the angular velocity as \(\omega^i\,dt = \omega n^i\,dt = n^i\,d\varphi\), so that \(\Omega^i{}_k = \varepsilon^i{}_{jk}n^j\);
the magnitude and angle may be functions of time, but we assume the unit vector \(n^i\) is constant. Computing the effect of many infinitesimal rotations, we can find the effect of a finite rotation,
\[
R^i{}_j\left(t\right) = \lim_{n\to\infty}\left[\left(\delta + \Omega\,d\varphi\right)^n\right]^i{}_j, \qquad d\varphi = \frac{\varphi}{n}
\]
where
\[
\left(\delta + \Omega\,d\varphi\right)^n
\]
means the \(n\)th power of a matrix. Expanding this power using the binomial theorem,
\[
R^i{}_j\left(t\right) = \lim_{n\to\infty}\left[\sum_{k=0}^n\frac{n!}{k!\left(n - k\right)!}\delta^{n-k}\left(\Omega\,d\varphi\right)^k\right]^i{}_j
\]
\[
= \lim_{n\to\infty}\left[\sum_{k=0}^n\frac{n\left(n - 1\right)\left(n - 2\right)\cdots\left(n - k + 1\right)}{n^kk!}\left(\Omega\varphi\right)^k\right]^i{}_j
\]
\[
= \lim_{n\to\infty}\left[\sum_{k=0}^n\frac{1\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right)\cdots\left(1 - \frac{k-1}{n}\right)}{k!}\left(\Omega\varphi\right)^k\right]^i{}_j
\]
\[
= \left[\sum_{k=0}^\infty\frac{1}{k!}\left(\Omega\varphi\right)^k\right]^i{}_j = \left[\exp\left(\Omega\varphi\right)\right]^i{}_j
\]
Finally, substituting the expression for the generator gives
\[
R^i{}_k\left(t\right) = \left[\exp\left(\varphi\,\varepsilon_{(n)}\right)\right]^i{}_k, \qquad \left(\varepsilon_{(n)}\right)^i{}_k \equiv \varepsilon^i{}_{jk}n^j
\]
We can evaluate the exponential by finding powers of the Levi-Civita tensor. Squaring,
\[
\left[\left(\varepsilon_{(n)}\right)^2\right]^i{}_m = \varepsilon^i{}_{jk}n^j\varepsilon^k{}_{lm}n^l = \left(\delta^i{}_l\delta_{jm} - \delta^i{}_m\delta_{jl}\right)n^jn^l = -\left(\delta^i{}_m - n^in_m\right)
\]
Then
\[
\left[\left(\varepsilon_{(n)}\right)^3\right]^i{}_n = -\left(\delta^i{}_m - n^in_m\right)\varepsilon^m{}_{jn}n^j = -\left(\varepsilon^i{}_{jn}n^j - \varepsilon_{mjn}n^jn^mn^i\right) = -\varepsilon^i{}_{jn}n^j
\]
This pattern repeats, so that we can immediately write the whole series. Separating the magnitude and direction of \(\omega^i\) as \(\omega^i = \omega n^i\), and writing \(\left(\varepsilon_{(n)}\right)^i{}_k = \varepsilon^i{}_{jk}n^j\), we have
\[
R^i{}_k\left(t\right) = \left[\exp\left(\varphi\,\varepsilon_{(n)}\right)\right]^i{}_k
\]
\[
= \sum_{s=0}^\infty\frac{\varphi^{2s}}{\left(2s\right)!}\left[\left(\varepsilon_{(n)}\right)^{2s}\right]^i{}_k + \sum_{s=0}^\infty\frac{\varphi^{2s+1}}{\left(2s + 1\right)!}\left[\left(\varepsilon_{(n)}\right)^{2s+1}\right]^i{}_k
\]
\[
= \delta^i{}_k + \sum_{s=1}^\infty\frac{\left(-1\right)^s\varphi^{2s}}{\left(2s\right)!}\left(\delta^i{}_k - n^in_k\right) + \sum_{s=0}^\infty\frac{\left(-1\right)^s\varphi^{2s+1}}{\left(2s + 1\right)!}\varepsilon^i{}_{jk}n^j
\]
\[
= n^in_k + \left(\delta^i{}_k - n^in_k\right)\sum_{s=0}^\infty\frac{\left(-1\right)^s\varphi^{2s}}{\left(2s\right)!} + \varepsilon^i{}_{jk}n^j\sum_{s=0}^\infty\frac{\left(-1\right)^s\varphi^{2s+1}}{\left(2s + 1\right)!}
\]
\[
= n^in_k + \left(\delta^i{}_k - n^in_k\right)\cos\varphi + \varepsilon^i{}_{jk}n^j\sin\varphi
\]
so that
\[
y^i\left(t\right) = R^i{}_k\left(t\right)x^k = n^in_kx^k + \left(\delta^i{}_k - n^in_k\right)x^k\cos\varphi + \varepsilon^i{}_{jk}n^jx^k\sin\varphi
\]
This is an entirely sensible result. The rotating vector has been decomposed into three mutually orthogonal directions,
\[
\mathbf{n}, \qquad \mathbf{x} - \left(\mathbf{n}\cdot\mathbf{x}\right)\mathbf{n}, \qquad \mathbf{n}\times\mathbf{x}
\]
The first term is the projection of \(\mathbf{x}\) parallel to \(\mathbf{n}\), while the motion clearly rotates the plane orthogonal to \(\mathbf{n}\). To see this more clearly, choose \(\mathbf{n}\) in the \(z\) direction, \(\mathbf{n} = \hat{k}\). Then \(\mathbf{y}\) is given by
\[
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}
=
\begin{pmatrix} \cos\omega t & -\sin\omega t & 0 \\ \sin\omega t & \cos\omega t & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
\]
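The closed-form rotation matrix can be coded directly and compared against the explicit \(z\)-axis case. In the sketch below (a Rodrigues-style implementation; the sign of the \(\sin\varphi\) term is fixed to reproduce the \(z\)-axis matrix above, and the test axis and angle are arbitrary choices), orthogonality is also checked for a skew axis.

```python
import math

# R = n n^T + (I - n n^T) cos(phi) + [n]_x sin(phi), where [n]_x is the
# matrix of the cross product, ([n]_x)_ik = eps_ijk n_j.
def rot(n, phi):
    nx, ny, nz = n
    c, s = math.cos(phi), math.sin(phi)
    eye = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    cross = [[0.0, -nz, ny], [nz, 0.0, -nx], [-ny, nx, 0.0]]
    return [[n[i] * n[j] + (eye[i][j] - n[i] * n[j]) * c + cross[i][j] * s
             for j in range(3)] for i in range(3)]

phi = 0.7
Rz = rot((0.0, 0.0, 1.0), phi)       # should match the z-axis matrix

# Orthogonality R R^T = I for a skew unit axis
n = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)
R = rot(n, phi)
orth = max(abs(sum(R[i][k] * R[j][k] for k in range(3))
               - (1.0 if i == j else 0.0))
           for i in range(3) for j in range(3))
```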
Exercise 10.1. Show that for a general rotation \(R^i{}_j\left(\omega, \mathbf{n}, t\right)\), \(\mathbf{n}\) gives the axis of rotation while \(\omega t\) is the angle of rotation at time \(t\).
Differentiating the exponential form of \(R^i{}_k\) with respect to time, we see that
\[
\frac{d}{dt}R^i{}_k = -R^i{}_m\,\varepsilon^m{}_{jk}\,\omega^j
\]
10.2. The Coriolis theorem
Let's look at the metric and Christoffel symbol more closely. The metric term in the action is really the kinetic energy, and there may be more kinetic energy with \(y^i\) than meets the eye. For \(x^i\) we have
\[
T = \frac{1}{2}m\delta_{ij}\dot{x}^i\dot{x}^j
\]
Now, with \(y^i\) rotating, we get
\[
\dot{x}^i = \frac{d}{dt}\left(\left[R^{-1}\right]^i{}_jy^j\right) = \left[R^{-1}\right]^i{}_j\frac{dy^j}{dt} + \frac{d\left[R^{-1}\right]^i{}_j}{dt}y^j
\]
As shown above, if the rotation is always about a fixed direction in space then
\[
\frac{d}{dt}R^i{}_k = -R^i{}_m\varepsilon^m{}_{jk}\omega^j
\]
\[
\frac{d\left[R^{-1}\right]^i{}_k}{dt} = \left[R^{-1}\right]^i{}_m\varepsilon^m{}_{jk}\omega^j
\]
so
\[
\dot{x}^i = \left[R^{-1}\right]^i{}_j\dot{y}^j + \left[R^{-1}\right]^i{}_m\varepsilon^m{}_{jk}\omega^jy^k = \left[R^{-1}\right]^i{}_m\left(\dot{y}^m + \varepsilon^m{}_{jk}\omega^jy^k\right)
\]
and this is not purely quadratic in the velocity! Nonetheless, the equation of motion is the Euler-Lagrange equation. Setting \(L = T - V\), we have
\[
\frac{d}{dt}\left[m\delta_{mi}\left(\dot{y}^m + \varepsilon^m{}_{jk}\omega^jy^k\right)\right] + \frac{\partial V}{\partial y^i} - m\delta_{mn}\left(\dot{y}^m + \varepsilon^m{}_{jk}\omega^jy^k\right)\varepsilon^n{}_{li}\omega^l = 0
\]
\[
m\left(\ddot{y}_i + \varepsilon_{ijk}\dot{\omega}^jy^k + \varepsilon_{ijk}\omega^j\dot{y}^k\right) - m\varepsilon_{mli}\omega^l\dot{y}^m - m\varepsilon^m{}_{jk}\varepsilon_{mli}\omega^l\omega^jy^k + \frac{\partial V}{\partial y^i} = 0
\]
\[
m\left(\ddot{y}_i + 2\varepsilon_{ijk}\omega^j\dot{y}^k + \varepsilon_{ijk}\dot{\omega}^jy^k\right) - m\varepsilon^m{}_{jk}\varepsilon_{mli}\omega^l\omega^jy^k + \frac{\partial V}{\partial y^i} = 0
\]
Writing the result in vector notation, we have
\[
m\frac{d^2\mathbf{y}}{dt^2} + 2m\left(\boldsymbol{\omega}\times\dot{\mathbf{y}}\right) + m\dot{\boldsymbol{\omega}}\times\mathbf{y} + m\boldsymbol{\omega}\times\left(\boldsymbol{\omega}\times\mathbf{y}\right) = \mathbf{F}
\]
Alternatively, we may write
\[
m\frac{d^2\mathbf{y}}{dt^2} = \mathbf{F} - 2m\left(\boldsymbol{\omega}\times\dot{\mathbf{y}}\right) - m\dot{\boldsymbol{\omega}}\times\mathbf{y} - m\boldsymbol{\omega}\times\left(\boldsymbol{\omega}\times\mathbf{y}\right)
\]
and think of the added terms on the right as effective forces. This is the Coriolis theorem.
The meaning of the various terms is easy to see. The last term on the right is the usual centrifugal force, which acts radially away from the axis of rotation. The second term on the right, called the Coriolis force, depends only on the velocity of the particle. It is largest for a particle moving radially toward or away from the axis of rotation. The particle experiences a change in radius, and at the new radius will be moving at the wrong speed for an orbit. For example, suppose we drop a particle from a height above the equator. At the initial moment, the particle is moving with the rotation of Earth, but as it falls, it is moving faster than the surface below it, and therefore overtakes the planet's surface. Since Earth rotates from west to east, the particle will fall to the east of a point directly below it.
The third term applies if the angular velocity is changing. Suppose \(\boldsymbol{\omega}\) is increasing, so that \(\dot{\boldsymbol{\omega}}\) is in the same direction as \(\boldsymbol{\omega}\). Then the particle will tend to be left behind in its rotational motion. If Earth were spinning up, this would give an acceleration to the west.
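The eastward deflection described above can be estimated with a short integration of the effective-force equation. The sketch below drops a particle from rest above the equator, keeping only gravity and the Coriolis term (the centrifugal term merely renormalizes \(g\)); the drop height and step size are illustrative choices, and the result is compared with the classical estimate \(x_{\mathrm{east}} = \frac{1}{3}\omega gt^3\).

```python
import math

# Coriolis deflection of a particle dropped from rest above the equator.
# Local axes: x east, z up. At the equator Earth's angular velocity
# points north, Omega = (0, omega, 0), so
#   a = -g z_hat - 2 Omega x v
# with (Omega x v)_x = omega*v_z and (Omega x v)_z = -omega*v_x.
# The centrifugal term is dropped; all numbers are illustrative.
g, omega, h = 9.8, 7.292e-5, 100.0

x, z = 0.0, h
vx, vz = 0.0, 0.0
t, dt = 0.0, 1e-4
while z > 0.0:                       # semi-implicit Euler integration
    ax = -2.0 * omega * vz           # eastward Coriolis acceleration
    az = -g + 2.0 * omega * vx
    vx += ax * dt
    vz += az * dt
    x += vx * dt
    z += vz * dt
    t += dt

east = x                             # about 2.2 cm for a 100 m drop
estimate = omega * g * t**3 / 3.0
```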
11. Inequivalent Lagrangians
One of the more startling influences of quantum physics on the study of classical mechanics is the realization that there exist inequivalent Lagrangians determining a given set of classical paths. Inequivalent Lagrangians for a given problem are those whose difference is not a total derivative. While it is not too surprising that a given set of paths provides extremals for more than one functional, it is striking that some systems permit infinitely many Lagrangians for the same paths. There remain many open questions on this subject, with most of the results holding in only one dimension.
To begin our exploration of inequivalent Lagrangians, we describe classes of free particle Lagrangians and give some examples. Next we move to the theorems for 1-dim systems due to Yan, Kobussen and Leubner ([12], [13], [14], [15], [16]), including a simple example. Then we consider inequivalent Lagrangians in higher dimensions.
This solution is well-defined on any region in which the mapping between velocity and momentum is 1-1. This means that velocity ranges may be any of four types: \(v \in \left(0, \infty\right), \left(0, v_1\right), \left(v_1, v_2\right), \left(v_1, \infty\right)\). Which of the four types occurs depends on the singularities of \(f'\frac{v^i}{v}\). Since \(\frac{v^i}{v}\) is a well-defined unit vector for all nonzero \(v^i\), it is \(f'\) which determines the range. Requiring the map from \(v^i\) to \(p_i\) to be single valued and finite, we restrict to regions of \(f'\) which are monotonic. Independent physical ranges of velocity will then be determined by each zero or pole of \(f'\). In general there will be \(n + 1\) such ranges:
this dependence is linear. Among the solutions are certain special cases called "geometrically free" [18]. These arise as follows. We may factor the action into a product which, schematically, takes the form
\[
S = \int \prod_{k=0}^{d/2}\left(R^{a_kb_k}{}_{c_kd_k} - \frac{1}{2}\lambda_k\left(\delta^{a_k}{}_{c_k}\delta^{b_k}{}_{d_k} - \delta^{b_k}{}_{c_k}\delta^{a_k}{}_{d_k}\right)\right)e_{a_1b_1\cdots a_{d/2}b_{d/2}}{}^{c_1d_1\cdots c_{d/2}d_{d/2}}
\]
where \(R^{ab}{}_{cd}\) is the curvature tensor and the \(\lambda_k\) are constants. Suppose that for all \(k = 1, \ldots, n\), for some \(n\) in the range \(2 < n < d/2\), we have
\[
\lambda_k = \lambda
\]
for some fixed value \(\lambda\). Then the variational equations all contain at least \(n - 1\) factors of
\[
R^{a_kb_k}{}_{c_kd_k} - \frac{1}{2}\lambda\left(\delta^{a_k}{}_{c_k}\delta^{b_k}{}_{d_k} - \delta^{b_k}{}_{c_k}\delta^{a_k}{}_{d_k}\right)
\]
Therefore, if there is a subspace of dimension \(m > d - n + 1\) of constant curvature,
\[
R^{a_kb_k}{}_{c_kd_k} = \frac{1}{2}\lambda\left(\delta^{a_k}{}_{c_k}\delta^{b_k}{}_{d_k} - \delta^{b_k}{}_{c_k}\delta^{a_k}{}_{d_k}\right)
\]
for \(a, b = 1, \ldots, m\), then the field equations are satisfied regardless of the metric on the complementary subspace. This is similar to the case of vanishing \(f'\), where the equation of motion is satisfied regardless of the direction of the velocity,
\[
p_i = f'\frac{v_i}{v} \equiv 0
\]
as long as \(v\), but not \(v^i\), is constant.
Finally, suppose \(f'\) has a pole at some value \(v_0\). Then the momentum diverges and motion never occurs at velocity \(v_0\). Of course, this is the case in special relativity, where the action of a free particle may be written as
\[
S = \int p_\alpha\,dx^\alpha = \int\left(-E\,dt + p_i\,dx^i\right) = -mc^2\int\sqrt{1 - \frac{v^2}{c^2}}\,dt
\]
With \(f\left(v\right) = -mc^2\sqrt{1 - \frac{v^2}{c^2}}\), we have
\[
f' = \frac{mv}{\sqrt{1 - \frac{v^2}{c^2}}}
\]
which diverges as \(v \to c\).
We illustrate a special case of this formula, of the form
\[
L\left(x, \dot{x}\right) = \dot{x}\int^{\dot{x}}\frac{K\left(x, v\right)}{v^2}\,dv \tag{11.2}
\]
where \(K\) is any constant of the motion of the system. This expression is valid when the original Lagrangian has no explicit time dependence. Following Okubo [29], we prove that eq.(11.2) leads to the constancy of \(K\). The result follows immediately from the Euler-Lagrange expression for \(L\):
\[
\frac{d}{dt}\frac{\partial L}{\partial\dot{x}} - \frac{\partial L}{\partial x} = \frac{d}{dt}\left(\int^{\dot{x}}\frac{K\left(x, v\right)}{v^2}\,dv + \frac{K\left(x, \dot{x}\right)}{\dot{x}}\right) - \dot{x}\int^{\dot{x}}\frac{1}{v^2}\frac{\partial K\left(x, v\right)}{\partial x}\,dv
\]
\[
= \frac{\ddot{x}}{\dot{x}}\frac{\partial K\left(x, \dot{x}\right)}{\partial\dot{x}} + \frac{\partial K\left(x, \dot{x}\right)}{\partial x}
\]
\[
= \frac{1}{\dot{x}}\frac{dK\left(x, \dot{x}\right)}{dt}
\]
Therefore, the Euler-Lagrange equation holds if and only if \(K\left(x, \dot{x}\right)\) is a constant of the motion.
The uniqueness in 1-dim follows from the fact that a single constant of the motion is sufficient to determine the solution curves up to the initial point. The uniqueness also depends on there being only a single Euler-Lagrange equation. These observations lead us to a higher dimensional result below.
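Both claims can be tested numerically. The sketch below (oscillator units \(m = k = 1\); the lower integration limit \(v_0\) and the finite-difference step sizes are arbitrary choices, not from the text) builds \(L\) from a constant of the motion \(K\) via eq.(11.2) and checks that the Euler-Lagrange expression vanishes along \(x(t) = \cos t\), both for \(K = E\) and for the inequivalent choice \(K = E^2\).

```python
import math

# Kobussen-Yan-Leubner construction, eq.(11.2):
#   L(x, xdot) = xdot * Integral^xdot K(x, v)/v^2 dv
# For the oscillator, K = E gives (up to a total derivative) the usual
# Lagrangian; K = E^2 gives an inequivalent one with the same paths.
m, k = 1.0, 1.0
E = lambda x, v: 0.5 * m * v**2 + 0.5 * k * x**2

def make_L(K, v0=-0.2, N=4000):
    """L from eq.(11.2) with midpoint-rule quadrature; the lower limit
    v0 only changes L by a total time derivative."""
    def L(x, v):
        du = (v - v0) / N
        s = sum(K(x, v0 + (i + 0.5) * du) / (v0 + (i + 0.5) * du) ** 2
                for i in range(N))
        return v * s * du
    return L

def el_residual(L, t, h=1e-4, dt=1e-3):
    """d/dt(dL/dv) - dL/dx along x(t) = cos t, by central differences."""
    x, v = math.cos(t), -math.sin(t)
    def dLdv(tt):
        xx, vv = math.cos(tt), -math.sin(tt)
        return (L(xx, vv + h) - L(xx, vv - h)) / (2.0 * h)
    dLdx = (L(x + h, v) - L(x - h, v)) / (2.0 * h)
    return (dLdv(t + dt) - dLdv(t - dt)) / (2.0 * dt) - dLdx

L1 = make_L(E)                               # K = E
L2 = make_L(lambda x, v: E(x, v) ** 2)       # K = E^2, inequivalent
r1 = el_residual(L1, 1.0)
r2 = el_residual(L2, 1.0)
```

The evaluation point is chosen where \(\dot{x} \neq 0\), since the integrand of eq.(11.2) is singular where the velocity vanishes.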
It is interesting to notice that we can derive this form for \(L\), but with \(K\) replaced by the Hamiltonian, by inverting the usual expression,
\[
H = \dot{x}\frac{\partial L}{\partial\dot{x}} - L
\]
for the Hamiltonian in terms of the Lagrangian. First, rewrite the right side as:
\[
H = \dot{x}\frac{\partial L}{\partial\dot{x}} - L = \dot{x}^2\frac{\partial}{\partial\dot{x}}\left(\frac{L}{\dot{x}}\right)
\]
The remarkable fact is that the Hamiltonian may be replaced by any constant of the motion in this expression. Conversely, suppose we begin with the Lagrangian in terms of an arbitrary constant of motion, \(K\), according to eq.(11.2),
\[
L\left(x, \dot{x}\right) = \dot{x}\int^{\dot{x}}\frac{K\left(x, v\right)}{v^2}\,dv
\]
Then, constructing the conserved energy,
\[
\tilde{E}\left(x, \dot{x}\right) = \dot{x}\frac{\partial L}{\partial\dot{x}} - L = \dot{x}\frac{\partial}{\partial\dot{x}}\left(\dot{x}\int^{\dot{x}}\frac{K\left(x, v\right)}{v^2}\,dv\right) - \dot{x}\int^{\dot{x}}\frac{K\left(x, v\right)}{v^2}\,dv
\]
\[
= \dot{x}\left(\int^{\dot{x}}\frac{K\left(x, v\right)}{v^2}\,dv + \frac{K\left(x, \dot{x}\right)}{\dot{x}}\right) - \dot{x}\int^{\dot{x}}\frac{K\left(x, v\right)}{v^2}\,dv
\]
\[
= K\left(x, \dot{x}\right)
\]
177
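This identity is easy to probe numerically. The sketch below (the sample constant of motion $K$, the lower integration limit, and all numerical values are arbitrary illustrative choices, not from the text) builds $L$ from eq.(11.2) by trapezoidal quadrature and checks that $\dot x\,\partial L/\partial\dot x - L$ returns $K(x,\dot x)$:

```python
import math

def K(x, v):
    # sample constant of motion (hypothetical choice)
    return 0.5 * v**2 + x**2

def L(x, v, a=1.0, n=20000):
    # L(x, v) = v * integral_a^v K(x, xi)/xi^2 dxi (eq. 11.2),
    # by the trapezoidal rule; the lower limit a is arbitrary
    h = (v - a) / n
    s = 0.5 * (K(x, a) / a**2 + K(x, v) / v**2)
    for i in range(1, n):
        xi = a + i * h
        s += K(x, xi) / xi**2
    return v * s * h

x, v, eps = 0.7, 2.3, 1e-5
dLdv = (L(x, v + eps) - L(x, v - eps)) / (2 * eps)  # numerical dL/dv
energy = v * dLdv - L(x, v)
assert abs(energy - K(x, v)) < 1e-3   # E = K(x, xdot), independent of a
print(energy, K(x, v))
```

The agreement is independent of the lower integration limit, since the limit only shifts $L$ by a term linear in $\dot x$.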
The Euler-Lagrange equation resulting from $L$ is
\[
0 = \frac{d}{dt}\frac{\partial L}{\partial\dot x} - \frac{\partial L}{\partial x}
= \frac{d}{dt}\left(\frac{1}{3}m^2\dot x^3 + kmx^2\dot x\right) - km\dot x^2x + k^2x^3
= \left(m\ddot x + kx\right)\left(m\dot x^2 + kx^2\right)
\]
Either of the two factors may be zero. Setting the first to zero gives the usual equation for the oscillator, while setting the second to zero we find the same solutions in exponential form:
\[
x = Ae^{i\omega t} + Be^{-i\omega t}
\]
where $\omega = \sqrt{k/m}$.
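The factorization above is pure algebra, and can be spot-checked numerically (a sketch; the values of $m$, $k$, and the random sample points are arbitrary):

```python
import random

random.seed(1)
m, k = 1.3, 2.7
for _ in range(100):
    x, xdot, xddot = (random.uniform(-2, 2) for _ in range(3))
    # d/dt(m^2 xdot^3/3 + k m x^2 xdot) - k m x xdot^2 + k^2 x^3,
    # with the total time derivative expanded by the chain rule
    el = (m**2 * xdot**2 * xddot + 2*k*m*x*xdot**2 + k*m*x**2*xddot
          - k*m*x*xdot**2 + k**2*x**3)
    product = (m*xddot + k*x) * (m*xdot**2 + k*x**2)
    assert abs(el - product) < 1e-9
```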
\[
\frac{d}{dt}\frac{\partial L}{\partial\dot x} - \frac{\partial L}{\partial x}
= \frac{1}{\dot x}\frac{dK(x,\dot x)}{dt}
= \frac{1}{\dot x}f'\,\frac{d\phi(x,\dot x)}{dt}
\]
Setting this to zero, we have two types of solution:
\[
f'(\phi) = 0,\qquad \frac{d\phi}{dt} = 0
\]
If spurious solutions could arise from motions with $f' = 0$, those motions would have to stay at the critical point, $\phi_0$ say, of $f$. But this means that $\phi = \phi_0$ remains constant. Therefore, the only way to introduce spurious solutions is if $\frac{d\phi}{dt} = 0$ has solutions beyond the usual solutions. This may not be possible in one dimension. Finally, the inverse of the equation $\phi(x,\dot x) = \phi_0$ may not exist at critical points, so the theorem must refer only to local equivalence of the solutions for inequivalent Lagrangians.
\[
E = \frac{p^2}{2m}
\]
and we immediately find that a complete solution is characterized by the initial components of the momentum, $p_{0i}$, and the initial position, $x_0^i$. Clearly, knowledge of the momenta is necessary and sufficient to determine a set of flows. If we consider inequivalent Lagrangians,
\[
L = v\int^v\frac{f(\xi)}{\xi^2}\,d\xi = F(v)
\]
where
\[
v = \sqrt{v^iv_i}
\]
then the momenta
\[
p_{0i} = \frac{\partial L}{\partial v^i} = F'\,\frac{v_i}{v}
\]
comprise a set of first integrals of the motion. Inverting for the velocity
\[
v^i = v^i\left(p_{0j}\right)
\]
then the new Lagrangian, $\bar L$, constructed from an arbitrary, time-independent constant of the motion, $\phi(x,v)$,
\[
\bar L = v\int^v\frac{\phi(x,\xi)}{\xi^2}\,d\xi + f(x)
\]
still satisfies the same Euler-Lagrange equation,
\[
\dot x^i\left[\frac{d}{dt}\frac{\partial\bar L}{\partial\dot x^i} - \frac{\partial\bar L}{\partial x^i}\right] = 0
\]
Part III
Conformal gauge theory
We now have the tools we need to describe the most elegant and powerful formulation of classical mechanics, as well as the starting point for quantum theory. While our treatment of Hamiltonian mechanics is non-relativistic, we begin with a relativistic treatment of the related conformal symmetry because it is from that perspective that the Hamiltonian itself most naturally arises. The development follows the steps we took in the first chapters of the book, constructing the arena and a law of motion from symmetry and variational principles. The new element that gives Hamiltonian dynamics its particular character is the choice of symmetry. Whereas Lagrangian theory arises as the gauge theory of Newton's second law when we generalize from Galilean symmetry to the diffeomorphism group, Hamiltonian mechanics arises by gauging the conformal group.
Because we want to gauge the conformal symmetry of spacetime, we begin with a discussion of special relativity. Then we proceed in the following stages. First, we return to contrast the Galilean symmetry of Newton's dynamical equation with the conformal symmetry of Newtonian measurement theory. Next, we build a new dynamical theory based on the full conformal symmetry, beginning with the construction of a new space to provide the arena, and continuing with the postulation of a dynamical law. Ultimately, we show the equivalence of the new formulation to the original second law and to Lagrangian dynamics.
After completing these constructions, we develop the properties of Hamiltonian mechanics.
12.1. Spacetime
From the power point presentation, you know that spacetime is a four dimensional vector space with metric
\[
\eta_{\alpha\beta} = \begin{pmatrix} -1 & & & \\ & 1 & & \\ & & 1 & \\ & & & 1 \end{pmatrix}
\]
where the infinitesimal proper time $\tau$ and proper length $s$, given by
\[
c^2d\tau^2 = c^2dt^2 - dx^2 - dy^2 - dz^2
\]
\[
ds^2 = -c^2dt^2 + dx^2 + dy^2 + dz^2
\]
are agreed upon by all observers. The set of points at zero proper interval from a given point, $P$, is the light cone of that point. The light cone divides spacetime into regions. Points lying inside the light cone and having later time than $P$ lie in the future of $P$. Points inside the cone with earlier times lie in the past of $P$. Points outside the cone are called elsewhere.
Timelike curves always lie inside the light cones of any of their points, while spacelike curves lie outside the light cones of their points. The tangent vectors at the points of a timelike curve point into the light cone and are timelike vectors, while the tangent to any spacelike curve is spacelike. The elapsed physical time experienced travelling along any timelike curve is given by integrating $d\tau$ along that curve. Similarly, the proper distance along any spacelike curve is found by integrating $ds$.
We refer to the coordinates of an event in spacetime using the four coordinates
\[
x^\alpha = (ct, x, y, z)
\]
where $\alpha = 0, 1, 2, 3$. We may also write $x^\alpha$ in any of the following ways:
\[
x^\alpha = (ct, \mathbf{x}) = \left(ct, x^i\right) = \left(x^0, x^i\right)
\]
\[
s^2 = -c^2t^2 + x^2 + y^2 + z^2
\]
\[
c^2\tau^2 = c^2t^2 - \left(x^2 + y^2 + z^2\right)
\]
where $s$ is proper distance and $\tau$ is proper time. These intervals are agreed upon by all observers.
These quadratic forms define a metric,
\[
\eta_{\alpha\beta} = \begin{pmatrix} -1 & & & \\ & 1 & & \\ & & 1 & \\ & & & 1 \end{pmatrix}
\]
\[
s^2 = \eta_{\alpha\beta}x^\alpha x^\beta
\]
or infinitesimally
\[
ds^2 = \eta_{\alpha\beta}dx^\alpha dx^\beta
\]
A Lorentz transformation may be defined as any transformation of the coordinates that leaves the length-squared $s^2$ unchanged. It follows that
\[
y^\alpha = \Lambda^\alpha{}_\beta x^\beta,\qquad
\eta_{\rho\sigma} = \eta_{\alpha\beta}\Lambda^\alpha{}_\rho\Lambda^\beta{}_\sigma
\]
The set of points at zero proper interval, $s^2 = 0$, from a given point, $P$, is the light cone of that point. The light cone divides spacetime into regions. Points lying inside the light cone and having later time than $P$ lie in the future of $P$. Points inside the cone with earlier times lie in the past of $P$. Points outside the cone are called elsewhere.
Timelike vectors from $P$ connect $P$ to past or future points. Timelike curves are curves whose tangent vector at any point $x^\alpha(\lambda)$ is a timelike vector at $x^\alpha(\lambda)$, while spacelike curves have tangents lying outside the light cones of their points. The elapsed physical time experienced travelling along any timelike curve is given by integrating $d\tau$ along that curve. Similarly, the proper distance along any spacelike curve is found by integrating $ds$.
where $v^2$ is the usual squared magnitude of the 3-velocity. Notice that if $v^2$ is ever different from zero, then $\tau_{AB}$ is smaller than the time difference $t_B - t_A$:
\[
\tau_{AB} = \int_{t_A}^{t_B}dt\,\sqrt{1-\frac{v^2}{c^2}} \le \int_{t_A}^{t_B}dt = t_B - t_A
\]
Equality holds only if the particle remains at rest in the given frame of reference.
This difference has been measured to high accuracy. One excellent test is to study the number of muons reaching the surface of the earth after being formed by cosmic ray impacts on the top of the atmosphere. These particles have a half-life on the order of $10^{-11}$ seconds, so they would normally travel only a few centimeters before decaying. However, because they are produced in a high energy collision that leaves them travelling toward the ground at nearly the speed of light, many of them are detected at the surface of the earth.
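A quick numerical illustration (a sketch using the $0.99c$ speed and $2.2\times10^{-6}\,$s proper lifetime quoted in Exercise 12.1 below; the conclusions about detection rates also depend on production altitude and statistics):

```python
import math

c = 2.998e8            # m/s
v = 0.99 * c
tau = 2.2e-6           # muon proper lifetime in seconds (Exercise 12.1)
gamma = 1 / math.sqrt(1 - v**2 / c**2)

d_no_dilation = v * tau          # distance ignoring time dilation
d_dilated = v * gamma * tau      # distance with dilated lifetime gamma*tau
print(gamma)                # about 7.1
print(d_no_dilation / 1e3)  # about 0.65 km
print(d_dilated / 1e3)      # about 4.6 km
```

Time dilation stretches the travel distance by a factor of $\gamma \approx 7$, which is what lets a substantial fraction of muons survive the trip through the atmosphere.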
We next need a generalization of the velocity of the particle. We can get a direction in spacetime corresponding to the direction of the particle's motion by looking at the tangent vector to the curve,
\[
t^\alpha = \frac{dx^\alpha(\lambda)}{d\lambda}
\]
We can see that this tangent vector is closely related to the velocity by expanding with the chain rule,
\[
t^\alpha = \frac{dx^\alpha(\lambda)}{d\lambda}
= \frac{dt}{d\lambda}\frac{dx^\alpha}{dt}
= \frac{dt}{d\lambda}\frac{d}{dt}\left(ct, x^i\right)
= \frac{dt}{d\lambda}\left(c, v^i\right)
\]
where $v^i$ is the usual Newtonian 3-velocity. This is close to what we need, but since $\lambda$ is arbitrary, so is $\frac{dt}{d\lambda}$. However, we can define a true vector by using the proper time as the parameter. Let the world line be parameterized by the elapsed proper time, $\tau$, of the particle. Then define the 4-velocity,
\[
u^\alpha = \frac{dx^\alpha(\tau)}{d\tau}
\]
Since the coordinates of the particle transform according to the Lorentz transformation
\[
\begin{pmatrix} t' \\ x' \\ y' \\ z' \end{pmatrix} =
\begin{pmatrix} \gamma & -\frac{\gamma v}{c^2} & & \\ -\gamma v & \gamma & & \\ & & 1 & \\ & & & 1 \end{pmatrix}
\begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix}
\]
or more simply,
\[
x'^\alpha = \Lambda^\alpha{}_\beta x^\beta
\]
while the proper time $\tau$ is invariant, the 4-velocity transforms as a vector. Since $\frac{dt}{d\tau} = \gamma$, its components are
\[
u^\alpha = \gamma\left(c, v^i\right) \qquad (12.1)
\]
This is an extremely useful form for the 4-velocity. We shall use it frequently.
Since $u^\alpha$ is a 4-vector, its magnitude
\[
\eta_{\alpha\beta}u^\alpha u^\beta
\]
must be invariant! This means that the velocity of every particle in spacetime has the same particular value. Let's compute it:
\[
\eta_{\alpha\beta}u^\alpha u^\beta = -\left(u^0\right)^2 + \sum_i\left(u^i\right)^2
= -\gamma^2c^2 + \gamma^2v^2
= \frac{-c^2 + v^2}{1 - \frac{v^2}{c^2}}
= -c^2
\]
This is indeed invariant! Our formalism is doing what it is supposed to do.
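The invariance is easy to confirm numerically (a sketch in units where $c = 1$; the sample 3-velocity is arbitrary):

```python
import math

c = 1.0                      # work in units where c = 1
v = (0.3, -0.5, 0.2)         # arbitrary 3-velocity with |v| < c
v2 = sum(vi**2 for vi in v)
gamma = 1 / math.sqrt(1 - v2 / c**2)

u = [gamma * c] + [gamma * vi for vi in v]        # u^alpha = gamma*(c, v^i)
norm = -u[0]**2 + sum(ui**2 for ui in u[1:])      # eta_ab u^a u^b
print(norm)   # -1.0, i.e. -c^2, for any |v| < c
```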
Now let's look at how the 4-velocity is related to the usual 3-velocity. If $v^2 \ll c^2$, the components of the 4-velocity are just
\[
u^\alpha = \gamma\left(c, v^i\right) \approx \left(c, v^i\right) \qquad (12.2)
\]
The speed of light, $c$, is just a constant, and the spatial components reduce to precisely the Newtonian velocity. This is just right. Moreover, it takes no new information to write the general form of $u^\alpha$ once we know $v^i$; there is no new information, just a different form.
From the 4-velocity it is natural to define the 4-momentum by multiplying by the mass,
\[
p^\alpha = mu^\alpha = m\gamma\left(c, v^i\right)
\]
As we might expect, the 3-momentum part of $p^\alpha$ is closely related to the Newtonian expression $mv^i$. In general it is
\[
p^i = \frac{mv^i}{\sqrt{1-\frac{v^2}{c^2}}}
\]
Thus, while relativistic momentum differs from Newtonian momentum, they only differ at order $\frac{v^2}{c^2}$. Even for the 7 mi/sec velocity of a spacecraft which escapes Earth's gravity this ratio is only
\[
\frac{v^2}{c^2} = 1.4\times10^{-9}
\]
so the Newtonian momentum is correct to parts per billion. In particle accelerators, however, where near-light speeds are commonplace, the difference is substantial (see exercises).
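The quoted ratio is a one-line computation (a sketch; 7 mi/s is taken as roughly 11.2 km/s, the standard escape speed from Earth):

```python
v = 11.2e3       # escape speed from Earth, about 7 mi/s, in m/s
c = 2.998e8
ratio = v**2 / c**2
print(ratio)     # about 1.4e-9, matching the value quoted above
```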
Now consider the remaining component of the 4-momentum. Multiplying by $c$ and expanding $\gamma$ we find
\[
p^0c = mc^2\gamma
= mc^2\left(1 + \frac{1}{2}\frac{v^2}{c^2} + \frac{3}{8}\frac{v^4}{c^4} + \cdots\right)
\approx mc^2 + \frac{1}{2}mv^2 + \frac{3}{8}mv^2\frac{v^2}{c^2}
\]
The third term is negligible at ordinary velocities, while we recognize the second term as the usual Newtonian kinetic energy. We therefore identify $E = p^0c$. Since the first term is constant it plays no measurable role in classical mechanics but it suggests that there is intrinsic energy associated with the mass of an object. This conjecture is confirmed by observations of nuclear decay. In such decays, the mass of the initial particle is greater than the sum of the masses of the product particles, with the energy difference
\[
\Delta E = m_{initial}c^2 - \sum m_{final}c^2
\]
correctly showing up as kinetic energy.
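The three-term expansion of $\gamma$ can be verified directly (a sketch; the sample speeds are arbitrary, and the $\beta^6$ error bound reflects the size of the first neglected term):

```python
import math

m, c = 1.0, 1.0
for beta in (0.01, 0.05, 0.1):
    gamma = 1 / math.sqrt(1 - beta**2)
    exact = m * c**2 * gamma                                # p^0 c
    approx = m * c**2 * (1 + 0.5*beta**2 + (3/8)*beta**4)   # three-term expansion
    assert abs(exact - approx) < beta**6                    # error is O(beta^6)
```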
Exercise 12.1. Suppose a muon is produced in the upper atmosphere moving downward at $v = .99c$ relative to the surface of Earth. If it decays after a proper time $\tau = 2.2\times10^{-6}$ seconds, how far would it travel if there were no time dilation? Would it reach Earth's surface? How far does it actually travel relative to Earth? Note that many muons are seen reaching Earth's surface.
Exercise 12.2. A free neutron typically decays into a proton, an electron, and an antineutrino. How much kinetic energy is shared by the final particles?
Exercise 12.3. Suppose a proton at Fermilab travels at $.99c$. Compare the Newtonian energy, $\frac{1}{2}mv^2$, to the relativistic energy $p^0c$.
12.3. Acceleration
Next we consider acceleration. We define the acceleration 4-vector to be the proper-time rate of change of 4-velocity,
\[
a^\alpha = \frac{du^\alpha}{d\tau}
= \frac{dt}{d\tau}\frac{d\left(\gamma\left(c,v^i\right)\right)}{dt}
= \gamma\left(-\frac{1}{2}\gamma^3\left(-\frac{2v_ma^m}{c^2}\right)\left(c,v^i\right) + \gamma\left(0,a^i\right)\right)
= \gamma^3\frac{v_ma^m}{c^2}u^\alpha + \gamma^2\left(0,a^i\right)
\]
Is this consistent with our expectations? We know, for instance, that
\[
u^\alpha u_\alpha = -c^2
\]
Now compute $a^\alpha a_\alpha$:
\[
a^\alpha a_\alpha = \left(\gamma^3\frac{v_ma^m}{c^2}u^\alpha + \gamma^2\left(0,a^i\right)\right)a_\alpha
= \gamma^2\left(0,a^i\right)a_\alpha
\]
where the first term drops because $u^\alpha a_\alpha = 0$ (differentiate $u^\alpha u_\alpha = -c^2$). Then
\[
a^\alpha a_\alpha
= \gamma^2\left(0,a^i\right)\left(\gamma^4\frac{v_ma^m}{c^2}\left(c,v_i\right) + \gamma^2\left(0,a_i\right)\right)
= \gamma^6\frac{v_ma^m}{c^2}a^iv_i + \gamma^4a^ia_i
= \gamma^4\left(a^ia_i + \gamma^2\frac{\left(v_ma^m\right)^2}{c^2}\right)
\]
This expression gives the acceleration of a particle moving with relative velocity $v^i$ when the acceleration in the instantaneous rest frame of the particle is given by the $v^i = 0$ expression
\[
a^\alpha a_\alpha = a^ia_i
\]
We consider two cases. First, suppose $v^i$ is parallel to $a^i$. Then since $a^\alpha a_\alpha$ is invariant, the 3-acceleration is given by
\[
a^\alpha a_\alpha = \gamma^6a^ia_i
\]
or
\[
a^ia_i = a^\alpha a_\alpha\left(1-\frac{v^2}{c^2}\right)^3
\]
where $a^\alpha a_\alpha$ is independent of $v^i$. Therefore, as the particle nears the speed of light, its apparent 3-acceleration decreases dramatically. When the acceleration is orthogonal to the velocity, the exponent is reduced,
\[
a^ia_i = a^\alpha a_\alpha\left(1-\frac{v^2}{c^2}\right)^2
\]
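Both special cases follow from the general expression $\gamma^4a^ia_i + \gamma^6(v_ma^m)^2/c^2$, and can be checked numerically (a sketch with arbitrary sample vectors, in units where $c = 1$):

```python
import math

def four_accel_sq(v, a, c=1.0):
    # gamma^4 a.a + gamma^6 (v.a)^2/c^2, the invariant computed in the text
    v2 = sum(x*x for x in v)
    va = sum(x*y for x, y in zip(v, a))
    g = 1 / math.sqrt(1 - v2 / c**2)
    a2 = sum(x*x for x in a)
    return g**4 * a2 + g**6 * va**2 / c**2

g = 1 / math.sqrt(1 - 0.36)           # gamma for |v| = 0.6
v = (0.6, 0.0, 0.0)

# parallel case: a^alpha a_alpha = gamma^6 a.a
a = (2.0, 0.0, 0.0)
assert abs(four_accel_sq(v, a) - g**6 * 4.0) < 1e-9

# orthogonal case: a^alpha a_alpha = gamma^4 a.a
a = (0.0, 2.0, 0.0)
assert abs(four_accel_sq(v, a) - g**4 * 4.0) < 1e-9
```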
Then the path of extremal proper time is given by the Euler-Lagrange equation
\[
\frac{d}{d\tau}\frac{\partial}{\partial u^\beta}\left(-\frac{1}{c^2}u^\alpha u_\alpha\right) = 0
\]
Notice that
\[
P^\beta{}_\alpha = \delta^\beta_\alpha + \frac{1}{c^2}u_\alpha u^\beta
\]
is a projection operator.
Exercise 12.5. A projection operator is an operator which is idempotent, that is, it is its own square. Show that $P^\beta{}_\alpha$ is a projection operator by showing that
\[
P^\beta{}_\rho P^\rho{}_\alpha = P^\beta{}_\alpha
\]
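The idempotence can also be seen numerically (a sketch in units where $c = 1$, with an arbitrary sample velocity); it hinges on $u^\alpha u_\alpha = -c^2$:

```python
import math

c = 1.0
v = (0.4, 0.1, -0.3)
v2 = sum(x*x for x in v)
g = 1 / math.sqrt(1 - v2 / c**2)
u = [g * c] + [g * x for x in v]                 # u^alpha
eta = [-1.0, 1.0, 1.0, 1.0]                      # diagonal Minkowski metric
u_low = [eta[i] * u[i] for i in range(4)]        # u_alpha

# P^b_a = delta^b_a + u_a u^b / c^2
P = [[(1.0 if a == b else 0.0) + u_low[a]*u[b]/c**2 for b in range(4)]
     for a in range(4)]

# check idempotence: P P = P
PP = [[sum(P[a][r]*P[r][b] for r in range(4)) for b in range(4)]
      for a in range(4)]
assert all(abs(PP[a][b] - P[a][b]) < 1e-12
           for a in range(4) for b in range(4))
```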
Now we have
\[
0 = \frac{1}{c^2}\frac{du_\alpha}{d\tau}\varphi + P^\beta{}_\alpha\frac{\partial\varphi}{\partial x^\beta}
\]
This projection operator is necessary because the acceleration term is orthogonal to $u^\alpha$. Dividing by $\varphi$, we see that
\[
\frac{1}{c^2}\frac{du_\alpha}{d\tau} = -P^\beta{}_\alpha\frac{\partial\ln\varphi}{\partial x^\beta}
\]
If we identify
\[
\varphi = \exp\left(\frac{V}{mc^2}\right)
\]
then we arrive at the desired equation of motion
\[
m\frac{du_\alpha}{d\tau} = -P^\beta{}_\alpha\frac{\partial V}{\partial x^\beta}
\]
which now is seen to follow as the extremum of the functional
\[
S[x^a] = \frac{1}{c}\int_C e^{\frac{V}{mc^2}}\left(-u^\alpha u_\alpha\right)^{1/2}d\tau \qquad (12.3)
\]
Conformal transformations will appear again when we study Hamiltonian mechanics.
We can generalize the action further by observing that the potential is the integral of the force along a curve,
\[
V = -\int_C F_\alpha\,dx^\alpha
\]
The potential is defined only when this integral is single valued. By Stokes' theorem, this occurs if and only if the force is curl-free. But even for general forces we can write the action as
\[
S[x^a] = \frac{1}{c}\int_C e^{-\frac{1}{mc^2}\int_C F_\alpha dx^\alpha}\left(-u^\alpha u_\alpha\right)^{1/2}d\tau
\]
In this case, variation leads to
\[
0 = \frac{1}{c}\int_C e^{-\frac{1}{mc^2}\int_C F_\beta dx^\beta}\left(-u^\rho u_\rho\right)^{-1/2}\left(-u_\alpha\delta u^\alpha\right)d\tau
- \frac{1}{c}\int_C e^{-\frac{1}{mc^2}\int_C F_\beta dx^\beta}\left(-u^\rho u_\rho\right)^{1/2}\frac{1}{mc^2}\frac{\partial}{\partial x^\alpha}\left(\int_C F_\beta dx^\beta\right)\delta x^\alpha\,d\tau
\]
\[
= \frac{1}{c}\int_C\left(\frac{d}{d\tau}\left(\frac{u_\alpha}{c}e^{-\frac{1}{mc^2}\int_C F_\beta dx^\beta}\right)
- \frac{1}{mc^2}e^{-\frac{1}{mc^2}\int_C F_\beta dx^\beta}F_\alpha c\right)\delta x^\alpha\,d\tau
\]
and therefore,
\[
m\frac{du_\alpha}{d\tau} = P^\beta{}_\alpha F_\beta
\]
This time the equation holds for an arbitrary force.
Finally, consider the non-relativistic limit of the action. If $v \ll c$ and $V \ll mc^2$ then to lowest order,
\[
S[x^a] = \int_C e^{\frac{V}{mc^2}}d\tau
= \int_C\left(1 + \frac{V}{mc^2}\right)\frac{1}{\gamma}\,dt
= \frac{1}{mc^2}\int_C\left(mc^2 + V\right)\sqrt{1-\frac{v^2}{c^2}}\,dt
\]
\[
= -\frac{1}{mc^2}\int_C\left(mc^2\left(-1+\frac{v^2}{2c^2}\right) - V\right)dt
= -\frac{1}{mc^2}\int_C\left(-mc^2 + \frac{1}{2}mv^2 - V\right)dt
\]
Discarding the multiplier and irrelevant constant $mc^2$ in the integral we recover
\[
S_{Cl} = \int_C\left(\frac{1}{2}mv^2 - V\right)dt
\]
Since the conformal line element is a more fundamental object than the classical action, this may be regarded as another derivation of the classical form of the Lagrangian, $L = T - V$.
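The limit can be checked numerically by comparing the relativistic integrand $e^{V/mc^2}\,d\tau/dt$ with its Newtonian approximation $1 - (\frac{1}{2}mv^2 - V)/mc^2$ (a sketch; the numerical values are arbitrary, chosen so that $v \ll c$ and $V \ll mc^2$):

```python
import math

m, c = 1.0, 3.0e2          # c chosen large compared to v
v, V = 2.0, 5.0            # slow speed, weak potential (arbitrary values)
gamma = 1 / math.sqrt(1 - v**2 / c**2)

integrand = math.exp(V / (m * c**2)) / gamma     # e^{V/mc^2} dtau/dt
newtonian = 1 - (0.5 * m * v**2 - V) / (m * c**2)
print(integrand, newtonian)   # agree to second order in the small quantities
assert abs(integrand - newtonian) < 1e-7
```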
1. Write the Euler-Lagrange equations (including the one arising from the variation of $\lambda$).
\[
\lambda = -\frac{1}{2}\left(\varphi + a\right)
\]
3. Substitute $\lambda$ back into the equation of motion and show that the choice
\[
\ln\frac{\varphi - 2m + a}{\varphi_0 - 2m + a} = \frac{1}{mc^2}V
\]
gives the correct equation of motion.
4. Show, using the constraint freely, that $S_1$ is a multiple of the action of eq.(12.3).
Exercise 12.7. Consider the action
\[
S_2[x^a] = \int\left(mc^2\sqrt{1-\frac{v^2}{c^2}} + V\right)dt
\]
1. Show that $S_2$ has the correct low-velocity limit, $L = T - V$.
2. Show that the Euler-Lagrange equation following from the variation of $S_2$ is not covariant. $S_2$ is therefore unsatisfactory.
Exercise 12.8. Consider the action
\[
S_3[x^a] = \int\left(mu^\alpha u_\alpha - 2V\right)d\tau
\]
13. The symmetry of Newtonian mechanics
Recall the distinction between the Newtonian dynamical law
\[
F^i = ma^i
\]
and Newtonian measurement theory. While the second law is invariant under the global Galilean transformations of eq.(7.5),
\[
x^i(q,t) = M^i{}_jq^j + a^i + v^it
\]
the inner products of vectors are invariant under local rotations while line elements are invariant under diffeomorphisms. When we consider transformations preserving ratios of line elements, we are led to the conformal group.
To construct the conformal group, notice that the ratio of two line elements (evaluated at a given point in space) is unchanged if both elements are multiplied by a common factor. Thus, it does not matter whether we measure with the metric $g_{mn}$ or a multiple of $g_{mn}$,
\[
\tilde g_{mn} = e^{2\varphi}g_{mn}
\]
We will consider only positive multiples, but $\varphi$ may be a function of position. Clearly, any transformation that preserves $g_{mn}$ also preserves $\tilde g_{mn}$ up to an overall factor, so the conformal group includes all of the Euclidean transformations. In addition, multiplying all dynamical variables with units of $(\text{length})^n$ by the $n$th power of a given constant $\lambda$ will produce only an overall factor. Such transformations are called dilatations. For example, if $x^i$ is the position of a particle then $x^i$ is replaced by $\lambda x^i$. For a force $F^i$ we first need to express the units of $F^i$ as a power of length. To do this we need two standards of measurement. Thus, beginning with MKS units, we can use one constant, $v_0 = 1\,m/\sec$, with units of velocity to convert seconds to meters, and a second constant, $h_0 = 1\,kg\text{-}m^2/\sec$, so that $v_0/h_0$ converts kilograms to inverse meters. Any other choice is equally valid. Using these constants, $F^i$ is re-expressed as $F^i/h_0v_0^2$ and thus measured in
\[
\frac{kg\cdot m}{\sec^2}\cdot\frac{\sec^2}{kg\cdot m^4} = \frac{1}{m^3}
\]
We therefore replace $F^i$ by $F^i/\lambda^3$. In this way we may express all physical variables with powers of length only. In classical mechanics the constants $h_0$ and $v_0$ drop out of all physical predictions.
In addition to these seven transformations (three rotations, three translations, and one dilatation) there are three more transformations which are conformal. In the next section we find all conformal transformations systematically.
\[
y^i = x^i + \varepsilon^i(x)
\]
\[
\partial_j\varepsilon_i + \partial_i\varepsilon_j - \varphi\delta_{ij} = 0
\]
Notice that when $\varphi = 0$, these are just the generators of the Euclidean group. To see their form, consider
\[
\partial_j\varepsilon_i + \partial_i\varepsilon_j = 0
\]
Take another derivative and cycle the indices:
\[
\partial_k\partial_j\varepsilon_i + \partial_k\partial_i\varepsilon_j = 0
\]
\[
\partial_j\partial_i\varepsilon_k + \partial_j\partial_k\varepsilon_i = 0
\]
\[
\partial_i\partial_k\varepsilon_j + \partial_i\partial_j\varepsilon_k = 0
\]
Add the first two and subtract the third, then use the commutation of second partials,
\[
\partial_k\partial_j\varepsilon_i + \partial_k\partial_i\varepsilon_j + \partial_j\partial_i\varepsilon_k + \partial_j\partial_k\varepsilon_i - \partial_i\partial_k\varepsilon_j - \partial_i\partial_j\varepsilon_k = 0
\]
\[
2\partial_j\partial_k\varepsilon_i = 0
\]
\[
\varepsilon_i = a_i + b_{ij}x^j
\]
\[
0 = \partial_j\varepsilon_i + \partial_i\varepsilon_j = b_{ji} + b_{ij}
\]
Returning to the conformal case, taking the trace of the condition gives
\[
\delta^{ji}\left(\partial_j\varepsilon_i + \partial_i\varepsilon_j\right) - 3\varphi = 0
\]
\[
\partial_i\varepsilon^i = \frac{3}{2}\varphi
\]
so we must solve
\[
\partial_j\varepsilon_i + \partial_i\varepsilon_j - \frac{2}{3}\partial_m\varepsilon^m\delta_{ij} = 0
\]
Cycle the same way as before to get
\[
0 = \partial_k\partial_j\varepsilon_i + \partial_k\partial_i\varepsilon_j - \frac{2}{3}\partial_k\partial_m\varepsilon^m\delta_{ij}
\]
\[
0 = \partial_j\partial_i\varepsilon_k + \partial_j\partial_k\varepsilon_i - \frac{2}{3}\partial_j\partial_m\varepsilon^m\delta_{ik}
\]
\[
0 = \partial_i\partial_k\varepsilon_j + \partial_i\partial_j\varepsilon_k - \frac{2}{3}\partial_i\partial_m\varepsilon^m\delta_{kj}
\]
so that
\[
0 = \partial_k\partial_j\varepsilon_i + \partial_k\partial_i\varepsilon_j - \frac{2}{3}\partial_k\partial_m\varepsilon^m\delta_{ij}
+ \partial_j\partial_i\varepsilon_k + \partial_j\partial_k\varepsilon_i - \frac{2}{3}\partial_j\partial_m\varepsilon^m\delta_{ik}
- \partial_i\partial_k\varepsilon_j - \partial_i\partial_j\varepsilon_k + \frac{2}{3}\partial_i\partial_m\varepsilon^m\delta_{kj}
\]
\[
\partial_j\partial_k\varepsilon_i = \frac{1}{3}\partial_k\partial_m\varepsilon^m\delta_{ij} + \frac{1}{3}\partial_j\partial_m\varepsilon^m\delta_{ik} - \frac{1}{3}\partial_i\partial_m\varepsilon^m\delta_{kj}
\]
Now take a trace on $jk$:
\[
\partial_k\partial^k\varepsilon_i = -\frac{1}{3}\partial_i\left(\partial_m\varepsilon^m\right)
\]
Now look at the third derivative:
\[
\partial_n\partial_j\partial_k\varepsilon_i = \frac{1}{3}\partial_n\partial_k\partial_m\varepsilon^m\delta_{ij} + \frac{1}{3}\partial_n\partial_j\partial_m\varepsilon^m\delta_{ik} - \frac{1}{3}\partial_n\partial_i\partial_m\varepsilon^m\delta_{kj}
\]
and trace $nj$:
\[
\partial^j\partial_j\partial_k\varepsilon_i = \frac{1}{3}\partial_i\partial_k\partial_m\varepsilon^m + \frac{1}{3}\partial^j\partial_j\partial_m\varepsilon^m\delta_{ik} - \frac{1}{3}\partial_k\partial_i\partial_m\varepsilon^m
\]
\[
\partial_k\partial^j\partial_j\varepsilon_i = \frac{1}{3}\partial^j\partial_j\partial_m\varepsilon^m\delta_{ik}
\]
Combining this with the trace result above,
\[
-\frac{1}{3}\partial_k\partial_i\left(\partial_m\varepsilon^m\right) = \frac{1}{3}\partial^j\partial_j\partial_m\varepsilon^m\delta_{ik}
= -\frac{1}{9}\partial_m\partial^m\partial_n\varepsilon^n\delta_{ik}
\]
\[
\partial_k\partial_i\left(\partial_m\varepsilon^m\right) = \frac{1}{3}\partial_m\partial^m\partial_n\varepsilon^n\delta_{ik}
\]
Substitute into the third derivative equation,
\[
\partial_n\partial_j\partial_k\varepsilon_i = \frac{1}{9}\partial_m\partial^m\partial_s\varepsilon^s\left(\delta_{nk}\delta_{ij} + \delta_{nj}\delta_{ik} - \delta_{ni}\delta_{kj}\right)
\]
Now trace $nk$:
\[
\partial_k\partial_j\partial^k\varepsilon_i = \frac{1}{3}\partial_m\partial^m\partial_s\varepsilon^s\delta_{ij}
\]
Compare with the contraction on $jk$:
\[
\partial_n\partial_k\partial^k\varepsilon_i = \frac{1}{9}\partial_m\partial^m\partial_s\varepsilon^s\left(\delta_{ni} + \delta_{ni} - 3\delta_{ni}\right)
= -\frac{1}{9}\partial_m\partial^m\partial_s\varepsilon^s\delta_{ni}
\]
Together these show that
\[
\partial_j\partial_k\partial^k\varepsilon_i = 0,\qquad
\partial_m\partial^m\partial_s\varepsilon^s = 0,\qquad
\partial_n\partial_j\partial_k\varepsilon_i = 0
\]
\[
\varepsilon_i = a_i + b_{ij}x^j + c_{ijk}x^jx^k
\]
where we require $c_{ijk} = c_{ikj}$. Substituting into the original equation we have:
\[
0 = \partial_j\varepsilon_i + \partial_i\varepsilon_j - \frac{2}{3}\partial_k\varepsilon^k\delta_{ij}
= b_{ij} + 2c_{ijk}x^k + b_{ji} + 2c_{jik}x^k - \frac{2}{3}\left(b^m{}_m + 2c^m{}_{mk}x^k\right)\delta_{ij}
\]
Therefore, separating the $x$-dependent terms from the constant terms, we must have, for all $x^i$,
\[
0 = b_{ij} + b_{ji} - \frac{2}{3}b^m{}_m\delta_{ij}
\]
\[
0 = 2c_{ijk}x^k + 2c_{jik}x^k - \frac{2}{3}\,2c^m{}_{mk}x^k\delta_{ij}
\]
The first,
\[
b_{ij} + b_{ji} = \frac{2}{3}b^m{}_m\delta_{ij}
\]
shows that the symmetric part of $b_{ij}$ is determined purely by the trace $b = b^m{}_m$, while the antisymmetric part is arbitrary. If we let $\omega_{ij} = -\omega_{ji}$ be an arbitrary, constant, antisymmetric matrix then
\[
b_{ij} = \omega_{ij} + \frac{1}{3}b\delta_{ij}
\]
For the equation involving $c_{ijk}$, we can strip off the $x^k$ to write
\[
c_{ijk} + c_{jik} = \frac{2}{3}c^m{}_{mk}\delta_{ij}
\]
Once again cycling the indices, adding and subtracting:
\[
c_{ijk} + c_{jik} = \frac{2}{3}c^m{}_{mk}\delta_{ij}
\]
\[
c_{jki} + c_{kji} = \frac{2}{3}c^m{}_{mi}\delta_{jk}
\]
\[
c_{kij} + c_{ikj} = \frac{2}{3}c^m{}_{mj}\delta_{ik}
\]
so adding the first two and subtracting the third,
\[
2c_{jik} = \frac{2}{3}\left(c^m{}_{mk}\delta_{ij} + c^m{}_{mi}\delta_{jk} - c^m{}_{mj}\delta_{ik}\right)
\]
Now letting $c_k = c^m{}_{mk}$ we see that the full infinitesimal transformation may be written as
\[
\varepsilon^i = a^i + b^i{}_jx^j + c^i{}_{jk}x^jx^k
= a^i + \omega^i{}_jx^j + \frac{1}{3}bx^i + \frac{1}{3}\left(2c_kx^kx^i - c^ix^2\right) \qquad (13.1)
\]
We know that $a^i$ is a translation and it is not hard to see that $\omega_{ij}$ generates a rotation. In addition we have four more independent transformations, with parameters $b$ and $c_i$. We could find the form of the transformations these produce by compounding many infinitesimal transformations into a finite one, but if we can just find a 4-parameter family of conformal transformations that are neither translations nor rotations, then we must have all the transformations that exist. One of the necessary transformations, called a dilatation, is trivial: simply multiply $x^i$ by a positive constant,
\[
y^i = e^\lambda x^i
\]
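The general solution (13.1) can be verified against the conformal Killing condition $\partial_j\varepsilon_i + \partial_i\varepsilon_j - \frac{2}{3}\partial_k\varepsilon^k\delta_{ij} = 0$ by finite differences (a sketch; all parameter values and the sample point are arbitrary):

```python
import random

random.seed(0)
a = [0.3, -0.2, 0.5]
w = [[0, 0.4, -0.1], [-0.4, 0, 0.7], [0.1, -0.7, 0]]   # antisymmetric omega
b = 1.2
cc = [0.6, -0.3, 0.2]

def eps(x):
    # eps^i = a^i + w^i_j x^j + (1/3) b x^i + (1/3)(2 c.x x^i - c^i x^2)
    x2 = sum(t*t for t in x)
    cx = sum(ci*xi for ci, xi in zip(cc, x))
    return [a[i] + sum(w[i][j]*x[j] for j in range(3)) + b*x[i]/3
            + (2*cx*x[i] - cc[i]*x2)/3 for i in range(3)]

def grad(i, j, x, h=1e-6):          # numerical d eps_i / d x_j
    xp, xm = list(x), list(x)
    xp[j] += h; xm[j] -= h
    return (eps(xp)[i] - eps(xm)[i]) / (2*h)

x = [random.uniform(-1, 1) for _ in range(3)]
div = sum(grad(k, k, x) for k in range(3))
for i in range(3):
    for j in range(3):
        lhs = grad(i, j, x) + grad(j, i, x) - (2/3)*div*(i == j)
        assert abs(lhs) < 1e-6
```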
Under an inversion, $y^i = \frac{x^i}{x^2}$, the line element transforms as
\[
\left(ds'\right)^2 = \delta_{ij}dy^idy^j
= \delta_{ij}\left(\frac{dx^i}{x^2} - \frac{2x^ix_kdx^k}{\left(x^2\right)^2}\right)\left(\frac{dx^j}{x^2} - \frac{2x^jx_mdx^m}{\left(x^2\right)^2}\right)
\]
\[
= \left(\frac{1}{x^2}\right)^2\delta_{ij}dx^idx^j - \frac{2\left(x_mdx^m\right)^2}{\left(x^2\right)^3} - \frac{2\left(x_kdx^k\right)^2}{\left(x^2\right)^3} + \frac{4x^2\left(x_mdx^m\right)^2}{\left(x^2\right)^4}
= \left(\frac{1}{x^2}\right)^2ds^2
\]
Because the line element is changed by only a factor, inversions are conformal transformations.
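The conformal factor $1/x^2$ can be seen numerically by comparing the length of a small displacement before and after inversion (a sketch; the base point and displacement are arbitrary):

```python
import math

def invert(x):
    x2 = sum(t*t for t in x)
    return [t / x2 for t in x]

x = [0.8, -0.4, 1.1]
h = 1e-6
dx = [h, 2*h, -h]                      # small displacement
y1 = invert(x)
y2 = invert([xi + di for xi, di in zip(x, dx)])

ds  = math.sqrt(sum(d*d for d in dx))
dsp = math.sqrt(sum((p - q)**2 for p, q in zip(y2, y1)))
x2 = sum(t*t for t in x)
print(dsp / ds, 1 / x2)    # both close to 1/x^2, the local scale factor
```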
Before using inversions to construct new transformations, we note that the inverse of the origin is not defined. However, it is easy to correct this by adding a single point to 3-space. Consider $y^i = \frac{x^i}{x^2}$ to be a new coordinate patch on an extended manifold. The overlap with the $x^i$ coordinates consists of all values except $x^i = 0$ and $y^i = 0$. The new point that we want to add is the one with $y^i$ coordinate $y^i = 0$. This point is called the "point at infinity." The manifold becomes compact, and requires two coordinate patches to cover. Inversions are defined for all points of this extended manifold.
While inversion is simply a discrete transformation, we easily construct a 3-parameter family of transformations by sandwiching other transformations between two inversions. Since each transformation is separately conformal, sequences of them must also be conformal. The result gives nothing new for rotations or dilatations, but for translations we find that inverting, translating, then inverting again gives
\[
y^i = \frac{\frac{x^i}{x^2} + b^i}{\left(\frac{x^k}{x^2} + b^k\right)^2}
= \frac{x^i + x^2b^i}{1 + 2b_ix^i + b^2x^2}
\]
These transformations, which are simply translations of the point at infinity, are called special conformal transformations. Since we now have the required total of 10 independent conformal transformations, we have found the full conformal group.
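The sandwich construction can be checked directly (a sketch with arbitrary sample values; note the $2b_ix^i$ cross term in the denominator of the closed form, which the composition of maps produces automatically):

```python
def invert(x):
    x2 = sum(t*t for t in x)
    return [t / x2 for t in x]

def translate(x, b):
    return [xi + bi for xi, bi in zip(x, b)]

def special_conformal(x, b):
    # closed form: (x^i + x^2 b^i) / (1 + 2 b.x + b^2 x^2)
    x2 = sum(t*t for t in x)
    bx = sum(bi*xi for bi, xi in zip(b, x))
    b2 = sum(t*t for t in b)
    den = 1 + 2*bx + b2*x2
    return [(xi + x2*bi) / den for xi, bi in zip(x, b)]

x = [0.5, -1.2, 0.3]
b = [0.2, 0.1, -0.4]
sandwich = invert(translate(invert(x), b))   # inversion, translation, inversion
closed = special_conformal(x, b)
assert all(abs(p - q) < 1e-12 for p, q in zip(sandwich, closed))
```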
Exercise 13.1. Show that an inversion, followed by a dilatation, followed by
another inversion is just a dilatation by a different parameter. What is the
parameter?
Exercise 13.2. Show that an inversion followed by a rotation, followed by an-
other inversion, is just the original rotation.
Exercise 13.3. An infinitesimal dilatation is given by
\[
y^i = x^i + bx^i
\]
where $b$ is infinitesimal. Take a limit of infinitely many infinitesimal dilatations, with $\lim_{n\to\infty}(nb) = \lambda$, to show that a finite dilatation is given by $y^i = e^\lambda x^i$.
The argument is the same as above, with the result that
\[
\varepsilon^\alpha = a^\alpha + \omega^\alpha{}_\beta x^\beta + \frac{1}{3}bx^\alpha + \frac{1}{3}\left(2c_\beta x^\beta x^\alpha - c^\alpha x^2\right) \qquad (13.3)
\]
There are now four spacetime translations generated by $a^\alpha$, six Lorentz transformations generated by $\omega^{\alpha\beta} = -\omega^{\beta\alpha}$, one dilatation generated by $b$, and four translations of the point at infinity with parameters $c^\alpha$, giving a total of fifteen conformal transformations of spacetime.
Exercise 13.6. Starting from eq.(13.2), repeat the steps of the preceding section to derive the form of the relativistic conformal generators given in eq.(13.3).
\[
\alpha\beta - w^iw_i = 0,\qquad \beta = \frac{w^iw_i}{\alpha}
\]
Now consider the effect of conformal transformations on $x^i$ and $y^i$. Applying, respectively, rotations, dilatations, translations and special conformal transformations to $x^i$ and $y^i$, we find:
\[
x'^i = R^i{}_jx^j
\]
\[
x'^i = e^\varphi x^i
\]
\[
x'^i = x^i + a^i
\]
\[
x'^i = \frac{x^i + x^2b^i}{1 + 2b_ix^i + b^2x^2}
\]
for $x^i$ and
\[
y'^i = R^i{}_jy^j
\]
\[
y'^i = e^{-\varphi}y^i
\]
\[
y'^i = \frac{y^i + y^2a^i}{1 + 2a_iy^i + a^2y^2}
\]
\[
y'^i = y^i + b^i
\]
for $y^i$.
Next, suppose these transformations are also allowed to change $\alpha$ (and therefore $\beta$) in a way yet to be specified. For each type of conformal transformation, we will choose the way $\alpha$ and $\beta$ transform so that $w^iw_i - \alpha\beta = 0$. First, under rotations, let
\[
w'^i = R^i{}_jw^j,\qquad \alpha' = \alpha,\qquad \beta' = \beta
\]
Under dilatations, let
\[
w'^i = w^i,\qquad \alpha' = e^{-\varphi}\alpha,\qquad \beta' = e^\varphi\beta
\]
Under translations, let
\[
w'^i = w^i + \alpha a^i,\qquad \alpha' = \alpha
\]
\[
\beta' = \frac{1}{\alpha}\left(w^i + \alpha a^i\right)\left(w_i + \alpha a_i\right)
= \frac{1}{\alpha}\left(w^iw_i + 2\alpha a^iw_i + \alpha^2a^2\right)
= \beta + 2a^iw_i + a^2\alpha
\]
and for special conformal transformations let
\[
w'^i = w^i + \beta b^i,\qquad \beta' = \beta
\]
\[
\alpha' = \frac{1}{\beta}\left(w^iw_i + 2\beta b^iw_i + \beta^2b^2\right)
= \alpha + 2b^iw_i + b^2\beta
\]
In each case the quantity
\[
\alpha\beta - w^iw_i
\]
vanishes. But also, each of the transformations is linear; for example, the special conformal transformations are now given by
\[
\begin{pmatrix} w'^i \\ \alpha' \\ \beta' \end{pmatrix} =
\begin{pmatrix} \delta^i{}_j & 0 & b^i \\ 2b_j & 1 & b^2 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} w^j \\ \alpha \\ \beta \end{pmatrix}
\]
This suggests defining the flat metric
\[
s^2 = \eta_{AB}w^Aw^B = w^iw_i - \alpha\beta
\]
on the space with coordinates
\[
w^A = \left(w^i, \alpha, \beta\right)
\]
If we let
\[
\alpha = v - u,\qquad \beta = v + u
\]
then
\[
s^2 = w^iw_i + u^2 - v^2
\]
The linear transformations that preserve this metric form the pseudo-orthogonal group $O(4,1)$. Since the dimension of any orthogonal group in $n$ dimensions is $\frac{n(n-1)}{2}$, $O(4,1)$ must have dimension $\frac{5\cdot4}{2} = 10$, exactly the dimension of the conformal transformations. The conformal group of Euclidean 3-space is therefore $O(4,1)$.
We collect the matrices representing conformal transformations:
Rotations:
\[
\begin{pmatrix} w'^i \\ \alpha' \\ \beta' \end{pmatrix} =
\begin{pmatrix} R^i{}_j & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} w^j \\ \alpha \\ \beta \end{pmatrix}
\]
Dilatations:
\[
\begin{pmatrix} w'^i \\ \alpha' \\ \beta' \end{pmatrix} =
\begin{pmatrix} \delta^i{}_j & 0 & 0 \\ 0 & e^{-\varphi} & 0 \\ 0 & 0 & e^\varphi \end{pmatrix}
\begin{pmatrix} w^j \\ \alpha \\ \beta \end{pmatrix}
\]
Translations:
\[
\begin{pmatrix} w'^i \\ \alpha' \\ \beta' \end{pmatrix} =
\begin{pmatrix} \delta^i{}_j & a^i & 0 \\ 0 & 1 & 0 \\ 2a_j & a^2 & 1 \end{pmatrix}
\begin{pmatrix} w^j \\ \alpha \\ \beta \end{pmatrix}
\]
Special conformal transformations:
\[
\begin{pmatrix} w'^i \\ \alpha' \\ \beta' \end{pmatrix} =
\begin{pmatrix} \delta^i{}_j & 0 & b^i \\ 2b_j & 1 & b^2 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} w^j \\ \alpha \\ \beta \end{pmatrix}
\]
However, if we take account of the need for a standard of measurement, we find that it is the conformal group and not the Euclidean group that best describes Newtonian measurement theory. We therefore repeat the construction starting from the conformal group. The first question is which transformations to regard as equivalent. Certainly we still want to equate points of the group manifold related by rotations (or Lorentz transformations in the relativistic case), but we also want dilatational symmetry of the final space. This means that we should regard points in the group space as equivalent if they are related by dilatations as well. Special conformal transformations, on the other hand, are a type of translation and should be treated the same way as the usual translations.
Consider the Euclidean case first. There are three rotations and one dilatation, so in the 10-dimensional space of conformal transformations we treat certain 4-dimensional subspaces as equivalent. We therefore find a 6-dimensional family of such 4-dimensional subspaces. This 6-dimensional family of points is the arena for our generalization of mechanics. In the relativistic case we have six Lorentz transformations and one dilatation, accounting for seven of the fifteen conformal transformations. This will leave an 8-dimensional set of equivalent points, so the arena is 8-dimensional.
\[
v'^i = J^i{}_mv^m
\]
the connection transforms with two parts. The first part is appropriate to a $(1,2)$ tensor, while the second, inhomogeneous term contains the derivative of the Jacobian matrix,
\[
\Gamma'^i{}_{jk} = J^i{}_m\Gamma^m{}_{rs}\bar J^r{}_j\bar J^s{}_k - \bar J^s{}_j\,\partial_sJ^i{}_k
\]
The extra derivative terms cancel out, leaving just the homogeneous transformation law we need.
With the conformal group, the Christoffel connection still works to make derivatives covariant with respect to rotations, but we must also make derivatives covariant with respect to scale changes. This is not difficult to do, since scale changes are simpler than diffeomorphisms. Since a vector $v^i$ which has units of $(\text{length})^n$ will transform as
\[
v^i \rightarrow e^{n\varphi}v^i
\]
its derivative will change by
\[
\partial_m\left(e^{n\varphi}v^i\right) = ne^{n\varphi}v^i\partial_m\varphi + e^{n\varphi}\partial_mv^i
\]
We can remove the extra term involving the derivative of the dilatation by adding a vector $W_m$ to the derivative and demanding that $W_m$ change by the troublesome term. Thus, setting
\[
D_mv^i = \partial_mv^i + nW_mv^i
\]
and demanding that $W_m$ transform to $W_m - \partial_m\varphi$, we have
\[
D_mv^i \rightarrow \partial_m\left(e^{n\varphi}v^i\right) + n\left(W_m - \partial_m\varphi\right)e^{n\varphi}v^i
= ne^{n\varphi}v^i\partial_m\varphi + e^{n\varphi}\partial_mv^i + nW_me^{n\varphi}v^i - n\partial_m\varphi\,e^{n\varphi}v^i
\]
\[
= e^{n\varphi}\left(\partial_mv^i + nW_mv^i\right)
= e^{n\varphi}D_mv^i
\]
This is fine for $v^i$, but we also have to correct changes in the Christoffel connection. Start from
\[
\Gamma^i{}_{mn} = \frac{1}{2}g^{ik}\left(\partial_mg_{kn} + \partial_ng_{km} - \partial_kg_{mn}\right)
\]
and perform a dilatation. Since the metric changes according to
\[
g_{mn} \rightarrow e^{2\varphi}g_{mn}
\]
the connection changes by
\[
\Gamma^i{}_{mn} \rightarrow \frac{1}{2}e^{-2\varphi}g^{ik}\left(\partial_m\left(e^{2\varphi}g_{kn}\right) + \partial_n\left(e^{2\varphi}g_{km}\right) - \partial_k\left(e^{2\varphi}g_{mn}\right)\right)
\]
\[
= \frac{1}{2}g^{ik}\left(\partial_mg_{kn} + \partial_ng_{km} - \partial_kg_{mn}\right)
+ g^{ik}\left(g_{kn}\partial_m\varphi + g_{km}\partial_n\varphi - g_{mn}\partial_k\varphi\right)
\]
We need terms involving $W_m$ to cancel the last three terms. It is easy to see that the needed form of $\Gamma^i{}_{mn}$ is
\[
\Gamma^i{}_{mn} = \frac{1}{2}g^{ik}\left(\partial_mg_{kn} + \partial_ng_{km} - \partial_kg_{mn}\right)
+ g^{ik}\left(g_{kn}W_m + g_{km}W_n - g_{mn}W_k\right) \qquad (14.1)
\]
The new vector, $W_m$, is called the Weyl vector after Hermann Weyl, who introduced it in 1918. Weyl suggested at the time that in a relativistic model $W_\alpha$ might correspond to the electromagnetic vector potential, $A_\alpha = \left(\phi, A_i\right)$. However, this theory is unphysical because it leads to measurable changes in the sizes of atoms when they pass through electromagnetic fields. This flaw was pointed out by Einstein in Weyl's original paper. A decade later, however, Weyl showed that electromagnetism could be written as a closely related theory involving phase invariance rather than scale invariance. This was the first modern gauge theory.
Exercise 14.1. Check that the form of the connection given in eq.(14.1) is invariant under dilatation.
Combining the two covariances (rotations and dilatations) we write the covariant derivative of $v^i$ as a combination of the two forms,
\[
D_mv^i = \partial_mv^i + v^n\Gamma^i{}_{nm} + nW_mv^i
\]
where $\Gamma^i{}_{mn}$ is given by the dilatation-invariant form in eq.(14.1).
14.2. Consequences of the covariant derivative
The dilatationally and rotationally covariant derivative has a surprising property: it applies not only to vectors and higher rank tensors, but even to scalars and constants as well. The reason for this is that even constants may have dimensions, and if we rescale the units, the "constant" value will change accordingly. Mass, for example, when expressed in units of length,
\[
\left[\frac{mv_0}{h_0}\right] = \left(\text{length}\right)^{-1}
\]
has units of inverse length. Therefore, its covariant derivative is
\[
D_im = \partial_im - (-1)W_im
\]
If the vector $W_i$ is nonzero then the condition for constant mass is
\[
D_im = 0
\]
which integrates to give the position-dependent result
\[
m = m_0\exp\left(-\int W_i\,dx^i\right)
\]
This is not surprising. When we change from meters to centimeters, the value of all masses increases by a factor of 100. The presence of $W_i$ now allows us to choose position-dependent units of length, giving a continuous variation of the value of $m$. Notice, however, that the ratio of two masses transported together remains constant because the exponential factor cancels out.
There is a well-known difficulty with this factor though. Suppose we move two masses along different paths, $C_1$ and $C_2$, with common endpoints. Then their ratio is
\[
\frac{m_1}{m_2} = \frac{m_{01}\exp\left(-\int_{C_1}W_i\,dx^i\right)}{m_{02}\exp\left(-\int_{C_2}W_i\,dx^i\right)}
= \frac{m_{01}}{m_{02}}\exp\left(-\oint_{C_1-C_2}W_i\,dx^i\right)
\]
The ratio is path-independent only if this integral vanishes for all closed paths $C_1 - C_2$. By Stokes' theorem, this is the case if and only if $W_i$ is the gradient of some function $f$. Any consistent interpretation of a dilatation-covariant geometry must make a consistent interpretation of this factor. We will re-examine the consequences of the present formulation of mechanics for this factor below.
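The path dependence is easy to exhibit numerically. The sketch below (the particular non-gradient Weyl vector, the paths, and the 2-dimensional setting are all hypothetical illustrative choices) compares the exponential factors for two paths with common endpoints:

```python
import math

# a Weyl vector that is not a gradient (hypothetical example): W = (-y, 0)
def W(p):
    return (-p[1], 0.0)

def line_integral(path, n=20000):
    # integral of W_i dx^i along a polyline given by its vertices
    total = 0.0
    for p, q in zip(path, path[1:]):
        for k in range(n):
            t = (k + 0.5) / n
            pt = (p[0] + t*(q[0]-p[0]), p[1] + t*(q[1]-p[1]))
            w = W(pt)
            total += (w[0]*(q[0]-p[0]) + w[1]*(q[1]-p[1])) / n
    return total

# two paths from (0,0) to (1,1), around opposite corners of the unit square
C1 = [(0, 0), (1, 0), (1, 1)]
C2 = [(0, 0), (0, 1), (1, 1)]
ratio_factor = math.exp(-(line_integral(C1) - line_integral(C2)))
print(ratio_factor)   # differs from 1: transported masses disagree
```

For a gradient $W_i = \partial_i f$ the two integrals would coincide and the factor would be exactly 1.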
\[
x'^\alpha = x^\alpha + a^\alpha
\]
\[
x'^\alpha = \frac{x^\alpha + x^2b^\alpha}{1 + 2b_\beta x^\beta + b^2x^2}
\]
These parameters do not have the same units. Clearly, $a^\alpha$ has units of length like $x^\alpha$, but $b^\alpha$ must have units of inverse length. As a reminder of their different roles it is convenient to write the length-type coordinate with its index up and the inverse-length type coordinate with its index down, that is, $(a^\alpha, b_\alpha)$ or, as we write from now on, $\xi^A = (x^\alpha, y_\alpha)$. These become the coordinates of our new arena for mechanics, called biconformal space, which therefore contains different types of geometric object in a single space. Like the coordinates, we will write vectors in biconformal spaces as pairs of 4-vectors in which the first is a vector and the second a form:
\[
w^A = \left(u^\alpha, v_\alpha\right)
\]
Forms are written with the indices in the opposite positions. For example, the Weyl vector is usually written as a form. We write its eight components as
\[
W_A = \left(W_\alpha, S^\alpha\right)
\]
\[
W_Ad\xi^A = W_\alpha dx^\alpha + S^\alpha dy_\alpha
\]
The Weyl vector plays a central role in the dynamics of biconformal spaces.
Using the structure of the conformal group, one can write a set of differential equations which determine the form of the metric and connection, including the Weyl vector. All known solutions for biconformal spaces have the same form for the Weyl vector,
\[
W_\alpha = -y_\alpha, \qquad S^\alpha = 0
\]
so that
\[
W_A\,d\xi^A = -y_\alpha\,dx^\alpha
\]
Under a dilatational gauge transformation the Weyl vector changes by a gradient,
\[
W'_A = W_A - \partial_A \phi
\]
Show that, by a suitable choice of $\phi$, the Weyl vector may be written with components
\[
W_A\,d\xi^A = -\frac{1}{2}\,y_\alpha\,dx^\alpha + \frac{1}{2}\,x^\alpha\,dy_\alpha
\]
Recalling the non-integrability of lengths when $W_A$ is not a gradient, we compute the components of the curl. Using the symmetric form of $W_A$,
\[
\frac{\partial W_A}{\partial \xi^B} - \frac{\partial W_B}{\partial \xi^A}
=
\begin{pmatrix}
\frac{\partial W_\alpha}{\partial x^\beta} - \frac{\partial W_\beta}{\partial x^\alpha} &
\frac{\partial W_\alpha}{\partial y_\beta} - \frac{\partial S^\beta}{\partial x^\alpha} \\[6pt]
\frac{\partial S^\alpha}{\partial x^\beta} - \frac{\partial W_\beta}{\partial y_\alpha} &
\frac{\partial S^\alpha}{\partial y_\beta} - \frac{\partial S^\beta}{\partial y_\alpha}
\end{pmatrix}
=
\begin{pmatrix}
0 & -\delta_\alpha^{\ \beta} \\
\delta^\alpha_{\ \beta} & 0
\end{pmatrix}
\]
Definition 14.2. Let $\Omega^{AB}$ be the inverse to the curl of the Weyl vector,
\[
\Omega^{AB} = \begin{pmatrix} 0 & \delta^\alpha_{\ \beta} \\ -\delta_\alpha^{\ \beta} & 0 \end{pmatrix}
\]
and define the bracket of any two functions on biconformal space by
\[
\{f, g\} = \Omega^{AB}\,\frac{\partial f}{\partial \xi^A}\frac{\partial g}{\partial \xi^B}
\]
A pair of functions $f, g$ is canonically conjugate when
\[
\{f, f\} = 0, \qquad \{f, g\} = 1, \qquad \{g, g\} = 0
\]
In particular, the coordinates themselves satisfy
\[
\{x^\alpha, x^\beta\} = 0, \qquad \{x^\alpha, y_\beta\} = \delta^\alpha_{\ \beta}, \qquad \{y_\alpha, y_\beta\} = 0
\]
14.4. Motion in biconformal space
As shown above, generic spaces with local dilatational symmetry allow measurable changes of lengths along different paths. The situation is no different in biconformal spaces. If we define lengths $l_1$ and $l_2$ using the Minkowski metric $\eta_{\alpha\beta}$, then follow their motions along paths $C_1$ and $C_2$, respectively, the final ratio of their lengths is given by
\[
\frac{l_1}{l_2} = \frac{l_1^0}{l_2^0}\,\exp\left(\oint_{C_1 - C_2} W_A\,d\xi^A\right)
\]
Variation of $S$ leads to
\[
0 = \int_C \left( \delta y_\alpha\,\frac{dx^\alpha}{d\lambda} + y_\alpha\,\frac{d\,\delta x^\alpha}{d\lambda} \right) d\lambda
= \int_C \left( \delta y_\alpha\,\frac{dx^\alpha}{d\lambda} - \frac{dy_\alpha}{d\lambda}\,\delta x^\alpha \right) d\lambda
\]
so, since the $x^\alpha$ and $y_\alpha$ variations are independent, we have the Euler-Lagrange equations for a straight line:
\[
\frac{dx^\alpha}{d\lambda} = 0, \qquad \frac{dy_\alpha}{d\lambda} = 0
\]
The result is independent of parameterization. At this point we could introduce
a potential and examine the consequences, but there is a more direct way to
arrive at classical mechanics. We simply impose the invariance of the time
coordinate.
14.5. Hamiltonian dynamics and phase space
One of the distinguishing features of classical mechanics is the universality of
time. In keeping with this, we now require x0 = ct to be an invariant parameter
so that $\delta t = 0$. Then varying the corresponding canonical bracket we find
\[
0 = \delta\left\{ t, \xi^A \right\}
= \left\{ \delta t, \xi^A \right\} + \left\{ t, \delta \xi^A \right\}
= \frac{\partial t}{\partial x^0}\,\frac{\partial\left( \delta \xi^A \right)}{\partial y_0}
\]
so that variations of all of the biconformal coordinates must be independent of $y_0$:
\[
\frac{\partial\left( \delta \xi^A \right)}{\partial y_0} = 0
\]
In particular, this applies to $y_0$ itself,
\[
\frac{\partial\left( \delta y_0 \right)}{\partial y_0} = 0
\]
so that $y_0$ plays no role as an independent variable. Instead, $\delta y_0$, and hence $y_0$ itself, depends only on the remaining coordinates,
\[
y_0 = -\frac{1}{h_0 c}\,H\left( y_i, x^j, t \right)
\]
The function $H$ is called the Hamiltonian. Our inclusion of $h_0$ and $c$ in its definition gives it units of energy. Its existence is seen to be a direct consequence of the universality of classical time.
Fixing the time for all observers effectively reduces the 8-dimensional biconformal space to a 6-dimensional biconformal space in which $t$ parameterizes the motion. We call this 6-dimensional space, as well as its $6N$-dimensional generalization below, phase space. The restriction of the biconformal bracket to phase space is called the Poisson bracket.
Returning to the postulate of the previous section, the action functional becomes
\[
S = \int_C y_\alpha\,dx^\alpha
= \int_C \left( y_i\,\frac{dx^i}{dt} - \frac{1}{h_0}H \right) dt.
\]
In this form we may identify $y_i$ as proportional to the momentum canonically conjugate to $x^i$, since
\[
\frac{\partial L}{\partial \dot{x}^i} = y_i - \frac{1}{h_0}\frac{\partial H\left( x^i, y_i, t \right)}{\partial \dot{x}^i} = y_i
\]
Henceforth we identify
\[
y_i = \frac{p_i}{h_0}
\]
where the inclusion of the constant $h_0$ gives the momentum $p_i$ its usual units.
Variation of $S$ now leads to
\[
0 = \delta S
= \frac{1}{h_0}\int \left( \delta p_i\,\frac{dx^i}{dt} - \frac{dp_i}{dt}\,\delta x^i - \delta H \right) dt
= \frac{1}{h_0}\int \left( \delta p_i\,\frac{dx^i}{dt} - \frac{dp_i}{dt}\,\delta x^i - \frac{\partial H}{\partial x^i}\,\delta x^i - \frac{\partial H}{\partial p_i}\,\delta p_i \right) dt
\]
which immediately gives:
\[
\frac{dx^i}{dt} = \frac{\partial H}{\partial p_i} \qquad (14.2)
\]
\[
\frac{dp_i}{dt} = -\frac{\partial H}{\partial x^i}. \qquad (14.3)
\]
Eqs. (14.2) and (14.3) are Hamilton's equations of motion. They form the basis of Hamiltonian dynamics.
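As a minimal numerical check of eqs. (14.2) and (14.3), the sketch below (not part of the original text; the Hamiltonian, constants, and integrator are arbitrary choices) integrates Hamilton's equations for $H = p^2/2m + mgx$ and compares the result with the elementary solution of Newton's second law for free fall.

```python
import numpy as np

m, g = 2.0, 9.8

def rhs(state):
    # Hamilton's equations for H = p^2/(2m) + m g x
    x, p = state
    return np.array([p / m,      # dx/dt =  dH/dp
                     -m * g])    # dp/dt = -dH/dx

def rk4(state, dt, steps):
    # classical fourth-order Runge-Kutta integration
    for _ in range(steps):
        k1 = rhs(state); k2 = rhs(state + 0.5*dt*k1)
        k3 = rhs(state + 0.5*dt*k2); k4 = rhs(state + dt*k3)
        state = state + (dt/6)*(k1 + 2*k2 + 2*k3 + k4)
    return state

x0, v0, T = 10.0, 3.0, 1.5
xT, pT = rk4(np.array([x0, m*v0]), 1e-4, 15000)
x_newton = x0 + v0*T - 0.5*g*T**2   # x(t) = x0 + v0 t - g t^2 / 2
```

The first-order pair reproduces the familiar second-order trajectory, as the derivation above requires.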
The Lagrangian and the Hamiltonian are related by
\[
L = p_i\,\frac{dx^i}{dt} - H
\]
It follows from this relationship that all of the $x^i$ dependence of $L$ arises from $H$, and vice versa, so that
\[
\frac{\partial L}{\partial x^i} = -\frac{\partial H}{\partial x^i} \qquad (14.4)
\]
Furthermore, since the Hamiltonian depends on $x^i$ and $p_i$ while the Lagrangian depends on $x^i$ and $\dot{x}^i$, eq. (14.2) is simply a consequence of the relationship between $L$ and $H$. Combining eq. (14.4) with the definition of the conjugate momentum, $p_i = \partial L/\partial \dot{x}^i$, we can rewrite eq. (14.3) as
\[
\frac{d}{dt}\frac{\partial L}{\partial \dot{x}^i} = \frac{\partial L}{\partial x^i}
\]
Thus, Hamilton's equations imply the Euler-Lagrange equation, even though they are first-order equations and the Euler-Lagrange equation is second-order.
We also have
\[
E = \dot{x}^i\,\frac{\partial L}{\partial \dot{x}^i} - L
= \dot{x}^i p_i - p_i\,\frac{dx^i}{dt} + H
= H
\]
As for the single particle case, the invariance of time constrains $p_0$. However, since $W_A\,d\xi^A = -y_\alpha\,dx^\alpha$ is to be evaluated on $N$ different curves, there will be $N$ distinct coordinates $x^\alpha_n$ and momenta, $p^n_\alpha$. We therefore have
\[
0 = \delta\left\{ x^0_m, p^n_0 \right\}
= \left\{ \delta x^0_m, p^n_0 \right\} + \left\{ x^0_m, \delta p^n_0 \right\}
= \frac{\partial\left( \delta p^n_0 \right)}{\partial p^k_0}\,\frac{\partial x^0_m}{\partial x^0_k} \qquad (14.6)
\]
Now, since time is universal in non-relativistic physics, we may set $x^0_m = t$ for all $m$. Therefore, $\partial x^0_m/\partial x^0_k = 1$ and we have
\[
\frac{\partial\left( \delta p^n_0 \right)}{\partial p^k_0} = 0 \qquad (14.7)
\]
which implies that each $p^n_0$ is a function of spatial components only,
\[
p^n_0 = p^n_0\left( x^i_k, p^k_i \right)
\]
This means that each $p^n_0$ is sufficiently general to provide a generic Hamiltonian. Conversely, any single $N$-particle Hamiltonian may be written as a sum of $N$ identical Hamiltonians,
\[
H = \frac{1}{N}\sum_{n=1}^{N} H
\]
so that eq. (14.5) becomes
\[
S = \frac{1}{h_0}\sum_{n=1}^{N}\int_{C_n}\left( -p^n_0\,dt + p^n_i\,dx^i_n \right)
= \frac{1}{h_0}\int \left( \sum_{n=1}^{N} p^n_i\,dx^i_n - H\left( x^i_k, p^k_i \right) dt \right)
\]
Finally, phase space is the $6N$-dimensional space of all possible values of both position and momentum. Let
\[
\xi^A = \left( q^i_1, p^1_j, q^i_2, p^2_j, \ldots, q^i_N, p^N_j \right)
\]
By Stokes' theorem, this integral is equal to the surface integral of the curl of $W_A$, which in turn is given by
\[
\frac{\partial W_A}{\partial \xi^B} - \frac{\partial W_B}{\partial \xi^A}
=
\begin{pmatrix}
\frac{\partial W_\alpha}{\partial x^\beta} - \frac{\partial W_\beta}{\partial x^\alpha} &
\frac{\partial W_\alpha}{\partial y_\beta} - \frac{\partial S^\beta}{\partial x^\alpha} \\[6pt]
\frac{\partial S^\alpha}{\partial x^\beta} - \frac{\partial W_\beta}{\partial y_\alpha} &
\frac{\partial S^\alpha}{\partial y_\beta} - \frac{\partial S^\beta}{\partial y_\alpha}
\end{pmatrix}
=
\begin{pmatrix}
\frac{\partial W_\alpha}{\partial x^\beta} - \frac{\partial W_\beta}{\partial x^\alpha} &
\frac{\partial W_\alpha}{\partial y_\beta} \\[6pt]
-\frac{\partial W_\beta}{\partial y_\alpha} & 0
\end{pmatrix}
\]
Writing out the matrix of derivatives in the antisymmetric combination $\left[ U_i \right]_A \left[ V^i \right]_B - \left[ U_i \right]_B \left[ V^i \right]_A$, all of the terms involving $\partial H/\partial x^i$ and $\partial H/\partial p_i$ cancel, leaving only the constant off-diagonal blocks $\pm\delta^i_{\ j}$. This agrees with the curl of the Weyl vector, so we may write
\[
\frac{\partial W_A}{\partial \xi^B} - \frac{\partial W_B}{\partial \xi^A}
= \left[ U_i \right]_A \left[ V^i \right]_B - \left[ U_i \right]_B \left[ V^i \right]_A
\]
$p_i$. This independence is guaranteed by the collapse of $W_A\,d\xi^A$ to $W_\alpha\,dx^\alpha$, which is a derived property of biconformal spaces. Integrating the Weyl vector along any configuration-space path gives the same result as integrating it along any path that projects to that configuration-space path, because $W$ is horizontal: it depends only on $dx^\alpha$, not on $dp_\alpha$. This is an example of a fibre bundle. The possible values of momentum form a "fibre" over each point of configuration space, and the integral of the Weyl vector is independent of which point on the fibre a path passes through.
Since the integral of the Weyl vector is a function of position, independent of path, it is immediately obvious that no physical size change is measurable. Consider our original comparison of a system with a ruler. If both of them move from $x_0$ to $x_1$ and the dilatation they experience depends only on $x_1$, then at that point they will have experienced identical scale factors. Such a change is impossible to observe from relative comparison: the lengths are still the same.

This observation can also be formulated in terms of the gauge freedom. Since we may write the integral of the Weyl vector as a function of position, the integral of the Weyl vector along every classical path may be removed by the gauge transformation [104]
\[
e^{-S(x)}. \qquad (14.8)
\]
Then in the new gauge,
\[
\int W'_\alpha\,dx^\alpha = \int \left( W_\alpha - \partial_\alpha S(x) \right) dx^\alpha
= \int W_\alpha\,dx^\alpha - S(x)
= 0 \qquad (14.9)
\]
14.7. A second proof of the existence of Hamilton's principal function

Proving the existence of Hamilton's principal function $S$ provides an excellent example of the usefulness of differential forms. The Weyl vector written as a 1-form is
\[
W = p_i\,dx^i - H\,dt
\]
We can do more than this, however. One of the most powerful uses of differential forms is for finding integrability conditions. To use them in this way, we must define the exterior derivative $d$ of a 1-form. Given a 1-form,
\[
\omega = \omega_i\,dx^i
\]
we define the exterior derivative of $\omega$ to be the antisymmetric type-$(0,2)$ tensor
\[
\frac{1}{2}\left( \partial_j \omega_i - \partial_i \omega_j \right)
\]
Normally this is written as a 2-form,
\[
d\omega = \partial_j \omega_i\,dx^j\,dx^i
\]
Suppose a $p$-form $\omega$ satisfies
\[
d\omega = 0 \qquad (14.10)
\]
The $p$-form is then said to be closed. If $\omega$ may be written as the exterior derivative of another form,
\[
\omega = d\eta \qquad (14.11)
\]
it is called exact, and it follows that
\[
d\omega = 0 \qquad (14.12)
\]
since $d^2 = 0$. The converse also holds for 1-forms: suppose the 1-form $\omega$ is closed. Then there exists a function $f$ (a function is a zero-form) such that
\[
\omega = df \qquad (14.13)
\]
This is the usual condition for the integrability of a conservative force. Think of $f$ as the negative of the potential and $\omega$ as the force 1-form, $\omega = F_i\,dx^i$. Then the integrability condition
\[
d\omega = 0 \qquad (14.14)
\]
is just
\[
0 = d\omega
= dF_i\,dx^i
= \frac{\partial F_i}{\partial x^j}\,dx^j\,dx^i
= \frac{1}{2}\frac{\partial F_i}{\partial x^j}\left( dx^j\,dx^i - dx^i\,dx^j \right)
\]
\[
= \frac{1}{2}\left( \frac{\partial F_i}{\partial x^j}\,dx^j\,dx^i - \frac{\partial F_i}{\partial x^j}\,dx^i\,dx^j \right)
= \frac{1}{2}\left( \frac{\partial F_i}{\partial x^j}\,dx^j\,dx^i - \frac{\partial F_j}{\partial x^i}\,dx^j\,dx^i \right)
= \frac{1}{2}\left( \frac{\partial F_i}{\partial x^j} - \frac{\partial F_j}{\partial x^i} \right) dx^j\,dx^i
\]
This is equivalent to the vanishing of the curl of F, the usual condition for the
existence of a potential.
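This curl condition is simple to test numerically. In the sketch below (an illustration, with arbitrarily chosen example fields, not from the text), a force derived from a potential has vanishing planar curl, while the rotational field $F = (-y, x)$ does not.

```python
import numpy as np

def curl_z(F, x, y, h=1e-5):
    # dF_y/dx - dF_x/dy: the z-component of the curl of a planar field,
    # estimated by central differences
    dFy_dx = (F(x + h, y)[1] - F(x - h, y)[1]) / (2 * h)
    dFx_dy = (F(x, y + h)[0] - F(x, y - h)[0]) / (2 * h)
    return dFy_dx - dFx_dy

# conservative force from a potential: F = -grad V with V = x^2 y
F_cons = lambda x, y: np.array([-2 * x * y, -x**2])
# non-conservative rotational field
F_rot = lambda x, y: np.array([-y, x])

c1 = curl_z(F_cons, 0.7, -1.3)   # vanishes: a potential exists
c2 = curl_z(F_rot, 0.7, -1.3)    # equals 2: no potential exists
```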
The Poincaré lemma and its converse give us a way to tell when a 1-form is the differential of a function. Given the Weyl vector
\[
W = p_i\,dx^i - H\,dt \qquad (14.15)
\]
its exterior derivative is
\[
dW = dp_i\,dx^i - \frac{\partial H}{\partial x^i}\,dx^i\,dt - \frac{\partial H}{\partial p_i}\,dp_i\,dt \qquad (14.16)
\]
where the last term of $dH\,dt$ vanishes because $dt\,dt = 0$. We can rearrange this result as
\[
dW = \left( dp_i + \frac{\partial H}{\partial x^i}\,dt \right)\left( dx^i - \frac{\partial H}{\partial p_i}\,dt \right) \qquad (14.17)
\]
where we again use $dt\,dt = 0$ and also use $dx^i\,dt = -dt\,dx^i$. Since each factor is one of Hamilton's equations, this condition is clearly satisfied when the equations of motion are satisfied. Therefore, along classical paths the Weyl vector must be the exterior derivative of a function,
\[
W = dS
\]
14.8. Phase space and the symplectic form
We now explore some of the properties of phase space and Hamilton's equations.

One advantage of the Hamiltonian formulation is that there is now one equation for each initial condition. This gives the space of all $q$s and $p$s a uniqueness property that configuration space (the space spanned by the $q$s only) doesn't have. For example, consider a projectile launched from the origin. Knowing only this fact, we still don't know the path of the object; we need the initial velocity as well. As a result, many possible trajectories pass through each point of configuration space. By contrast, the initial point of a trajectory in phase space gives us both the initial position and the initial momentum. There can be only one path of the system that passes through that point.
Systems with any number of degrees of freedom may be handled in this way. If a system has $N$ degrees of freedom then its phase space is the $2N$-dimensional space of all possible values of both position and momentum. We define configuration space to be the space of all possible positions of the particles comprising the system, or the complete set of possible values of the degrees of freedom of the problem. Thus, configuration space is the $N$-dimensional space of all values of $q^i$. By momentum space, we mean the $N$-dimensional space of all possible values of all of the conjugate momenta. Hamilton's equations then consist of $2N$ first-order differential equations for the motion in phase space.
We illustrate these points with the simple example of a one-dimensional harmonic oscillator.
\[
L = T - V = \frac{1}{2}m\dot{x}^2 - \frac{1}{2}kx^2
\]
The conjugate momentum is just
\[
p = \frac{\partial L}{\partial \dot{x}} = m\dot{x}
\]
so the Hamiltonian is
\[
H = p\dot{x} - L
= \frac{p^2}{m} - \frac{1}{2}m\dot{x}^2 + \frac{1}{2}kx^2
= \frac{p^2}{2m} + \frac{1}{2}kx^2
\]
Hamilton's equations are
\[
\dot{x} = \frac{\partial H}{\partial p} = \frac{p}{m}
\]
\[
\dot{p} = -\frac{\partial H}{\partial x} = -kx
\]
\[
\frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t} = 0
\]
Note that Hamilton's equations are two first-order equations. From this point the coupled linear equations
\[
\dot{p} = -kx, \qquad \dot{x} = \frac{p}{m}
\]
may be solved in any of a variety of ways. Let's treat it as a matrix system,
\[
\frac{d}{dt}\begin{pmatrix} x \\ p \end{pmatrix}
= \begin{pmatrix} 0 & \frac{1}{m} \\ -k & 0 \end{pmatrix}
\begin{pmatrix} x \\ p \end{pmatrix} \qquad (14.18)
\]
The matrix $M = \begin{pmatrix} 0 & \frac{1}{m} \\ -k & 0 \end{pmatrix}$ has eigenvalues $\left( i\sqrt{\frac{k}{m}}, -i\sqrt{\frac{k}{m}} \right)$ and diagonalizes to
\[
\begin{pmatrix} i\omega & 0 \\ 0 & -i\omega \end{pmatrix} = AMA^{-1}
\]
where
\[
A = \frac{1}{2i\sqrt{km}}\begin{pmatrix} i\sqrt{km} & 1 \\ i\sqrt{km} & -1 \end{pmatrix}, \qquad
A^{-1} = \begin{pmatrix} 1 & 1 \\ i\sqrt{km} & -i\sqrt{km} \end{pmatrix}, \qquad
\omega = \sqrt{\frac{k}{m}}
\]
Therefore, multiplying eq. (14.18) on the left by $A$ and inserting $1 = A^{-1}A$,
\[
\frac{d}{dt}\left( A\begin{pmatrix} x \\ p \end{pmatrix} \right)
= A\begin{pmatrix} 0 & \frac{1}{m} \\ -k & 0 \end{pmatrix}A^{-1}\,A\begin{pmatrix} x \\ p \end{pmatrix} \qquad (14.19)
\]
we get decoupled equations in the new variables:
\[
\begin{pmatrix} a \\ \bar{a} \end{pmatrix}
= A\begin{pmatrix} x \\ p \end{pmatrix}
= \begin{pmatrix} \frac{1}{2}\left( x - \frac{ip}{\sqrt{km}} \right) \\[6pt] \frac{1}{2}\left( x + \frac{ip}{\sqrt{km}} \right) \end{pmatrix} \qquad (14.20)
\]
or
\[
m^2\omega^2 x^2 + p^2 = 2mE
\]
This describes an ellipse in the xp plane. The larger the energy, the larger the
ellipse, so the possible motions of the system give a set of nested, non-intersecting
ellipses. Clearly, every point of the xp plane lies on exactly one ellipse.
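A short numerical sketch (not part of the text; parameter values are arbitrary) confirms the nesting: along the exact oscillator flow, the quantity $m^2\omega^2 x^2 + p^2$ stays pinned to $2mE$.

```python
import numpy as np

m, k = 1.5, 2.4
w = np.sqrt(k / m)

def flow(x0, p0, t):
    # exact harmonic-oscillator flow in phase space
    x = x0*np.cos(w*t) + (p0/(m*w))*np.sin(w*t)
    p = p0*np.cos(w*t) - m*w*x0*np.sin(w*t)
    return x, p

x0, p0 = 0.8, -1.1
E = p0**2/(2*m) + 0.5*k*x0**2

# sample the conserved ellipse quantity m^2 w^2 x^2 + p^2 along the orbit
ellipse = [m**2 * w**2 * x**2 + p**2
           for t in np.linspace(0, 10, 200)
           for x, p in [flow(x0, p0, t)]]
```

Every initial point $(x_0, p_0)$ fixes one value of $E$, hence one ellipse, and the flow never leaves it.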
The phase-space description of classical systems is equivalent to the configuration-space solution and is often easier to interpret, because more information is displayed at once. The price we pay for this is the doubled dimension: paths rapidly become difficult to plot. To offset this problem, we can use Poincaré sections, projections of the phase-space plot onto subspaces that cut across the trajectories. Sometimes the patterns that occur on Poincaré sections show that the motion is confined to specific regions of phase space, even when the motion never repeats itself. These techniques allow us to study systems that are chaotic, meaning that the phase-space paths through nearby points diverge rapidly.
Now consider the general case of $N$ degrees of freedom. Let
\[
\xi^A = \left( q^i, p_j \right)
\]
Then Hamilton's equations combine into the single expression
\[
\frac{d\xi^A}{dt} = \Omega^{AB}\,\frac{\partial H}{\partial \xi^B}
\]
where the presence of $\Omega^{AB}$ in the last step takes care of the difference in signs on the right. Here $\Omega^{AB}$ is just the inverse of the symplectic form found from the curl of the dilatation, given by
\[
\Omega^{AB} = \begin{pmatrix} 0 & \delta^i_{\ j} \\ -\delta_j^{\ i} & 0 \end{pmatrix}
\]
Its occurrence in Hamilton's equations is an indication of its central importance in Hamiltonian mechanics. We may now write Hamilton's equations as
\[
\frac{d\xi^A}{dt} = \Omega^{AB}\,\frac{\partial H}{\partial \xi^B} \qquad (14.22)
\]
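Eq. (14.22) can be transcribed almost literally into code. The sketch below (an illustration, with an arbitrarily chosen coupled-oscillator Hamiltonian) builds $\Omega^{AB}$ for two degrees of freedom and recovers $\dot{q}^i = \partial H/\partial p_i$, $\dot{p}_i = -\partial H/\partial q^i$ from the single matrix expression.

```python
import numpy as np

N = 2  # degrees of freedom; xi = (q1, q2, p1, p2)
Omega = np.block([[np.zeros((N, N)), np.eye(N)],
                  [-np.eye(N), np.zeros((N, N))]])

def H(xi):
    # arbitrary example: two oscillators with a weak coupling
    q, p = xi[:N], xi[N:]
    return np.dot(p, p)/2 + np.dot(q, q)/2 + 0.3*q[0]*q[1]

def grad(f, xi, h=1e-6):
    # central-difference gradient dH/dxi^B
    g = np.zeros_like(xi)
    for A in range(len(xi)):
        e = np.zeros_like(xi); e[A] = h
        g[A] = (f(xi + e) - f(xi - e)) / (2*h)
    return g

xi = np.array([0.4, -0.2, 1.0, 0.7])
xidot = Omega @ grad(H, xi)     # d(xi^A)/dt = Omega^{AB} dH/dxi^B
```

The first two components are $\partial H/\partial p_i$ and the last two are $-\partial H/\partial q^i$, exactly as the block structure of $\Omega^{AB}$ dictates.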
Consider what happens to Hamilton's equations if we want to change to a new set of phase space coordinates, $\zeta^A = \zeta^A(\xi)$. Let the inverse transformation be $\xi^A(\zeta)$. The time derivatives become
\[
\frac{d\xi^A}{dt} = \frac{\partial \xi^A}{\partial \zeta^B}\,\frac{d\zeta^B}{dt}
\]
while the right side becomes
\[
\Omega^{AB}\,\frac{\partial H}{\partial \xi^B} = \Omega^{AB}\,\frac{\partial \zeta^C}{\partial \xi^B}\,\frac{\partial H}{\partial \zeta^C}
\]
Equating these expressions,
\[
\frac{\partial \xi^A}{\partial \zeta^B}\,\frac{d\zeta^B}{dt}
= \Omega^{AB}\,\frac{\partial \zeta^D}{\partial \xi^B}\,\frac{\partial H}{\partial \zeta^D}
\]
we multiply by the Jacobian matrix, $\frac{\partial \zeta^C}{\partial \xi^A}$, to get
\[
\frac{\partial \zeta^C}{\partial \xi^A}\,\frac{\partial \xi^A}{\partial \zeta^B}\,\frac{d\zeta^B}{dt}
= \frac{\partial \zeta^C}{\partial \xi^A}\,\Omega^{AB}\,\frac{\partial \zeta^D}{\partial \xi^B}\,\frac{\partial H}{\partial \zeta^D}
\]
\[
\delta^C_{\ B}\,\frac{d\zeta^B}{dt}
= \frac{\partial \zeta^C}{\partial \xi^A}\,\Omega^{AB}\,\frac{\partial \zeta^D}{\partial \xi^B}\,\frac{\partial H}{\partial \zeta^D}
\]
and finally
\[
\frac{d\zeta^C}{dt}
= \frac{\partial \zeta^C}{\partial \xi^A}\,\Omega^{AB}\,\frac{\partial \zeta^D}{\partial \xi^B}\,\frac{\partial H}{\partial \zeta^D}
\]
Defining the symplectic form in the new coordinate system,
\[
\tilde{\Omega}^{CD} \equiv \frac{\partial \zeta^C}{\partial \xi^A}\,\Omega^{AB}\,\frac{\partial \zeta^D}{\partial \xi^B}
\]
we see that Hamilton's equations are entirely the same if the transformation leaves the symplectic form invariant,
\[
\tilde{\Omega}^{CD} = \Omega^{CD}
\]
Any linear transformation $M^A_{\ B}$ leaving the symplectic form invariant,
\[
\Omega^{AB} \equiv M^A_{\ C}\,M^B_{\ D}\,\Omega^{CD}
\]
and, more generally, any coordinate transformation satisfying
\[
\Omega^{CD} \equiv \frac{\partial \zeta^C}{\partial \xi^A}\,\Omega^{AB}\,\frac{\partial \zeta^D}{\partial \xi^B}
\]
are canonical transformations. Canonical transformations preserve Hamilton's equations. If, equivalently, the inverse transformation satisfies
\[
\frac{\partial \xi^C}{\partial \zeta^A}\,\Omega^{AB}\,\frac{\partial \xi^D}{\partial \zeta^B} = \Omega^{CD}
\]
then we have
\[
\{f, g\}'
= \frac{\partial \xi^C}{\partial \zeta^A}\,\Omega^{AB}\,\frac{\partial \xi^D}{\partial \zeta^B}\,\frac{\partial f}{\partial \xi^C}\frac{\partial g}{\partial \xi^D}
= \Omega^{CD}\,\frac{\partial f}{\partial \xi^C}\frac{\partial g}{\partial \xi^D}
= \{f, g\}
\]
and the Poisson bracket is unchanged. We conclude that canonical transformations preserve all Poisson brackets.
An important special case of the Poisson bracket occurs when one of the functions is the Hamiltonian. In that case, we have
\[
\{f, H\}
= \Omega^{AB}\,\frac{\partial f}{\partial \xi^A}\frac{\partial H}{\partial \xi^B}
= \frac{\partial f}{\partial x^i}\frac{\partial H}{\partial p_i} - \frac{\partial f}{\partial p_i}\frac{\partial H}{\partial x^i}
= \frac{\partial f}{\partial x^i}\frac{dx^i}{dt} + \frac{\partial f}{\partial p_i}\frac{dp_i}{dt}
= \frac{df}{dt} - \frac{\partial f}{\partial t}
\]
or simply,
\[
\frac{df}{dt} = \{f, H\} + \frac{\partial f}{\partial t}
\]
This shows that as the system evolves classically, the total time rate of change of
any dynamical variable is the sum of the Poisson bracket with the Hamiltonian
and the partial time derivative. If a dynamical variable has no explicit time dependence, then $\partial f/\partial t = 0$ and the total time derivative is just the Poisson bracket with the Hamiltonian.
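For a dynamical variable with no explicit time dependence, the bracket alone should reproduce the total time derivative along the flow. Below is a quick finite-difference check (an illustration, not from the text) for the arbitrary choice $f = xp$ in a harmonic oscillator.

```python
m, k = 1.0, 4.0
H = lambda x, p: p**2/(2*m) + 0.5*k*x**2
f = lambda x, p: x * p            # no explicit time dependence

def poisson(f, g, x, p, h=1e-6):
    # {f, g} = df/dx dg/dp - df/dp dg/dx via central differences
    df_dx = (f(x+h, p) - f(x-h, p)) / (2*h)
    df_dp = (f(x, p+h) - f(x, p-h)) / (2*h)
    dg_dx = (g(x+h, p) - g(x-h, p)) / (2*h)
    dg_dp = (g(x, p+h) - g(x, p-h)) / (2*h)
    return df_dx*dg_dp - df_dp*dg_dx

x, p = 0.6, -0.9
bracket = poisson(f, H, x, p)
# direct total derivative along the flow: df/dt = (dx/dt) p + x (dp/dt)
direct = p*(p/m) + x*(-k*x)
```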
The coordinates now provide a special case. Since neither $x^i$ nor $p_i$ has any explicit time dependence, we have
\[
\frac{dx^i}{dt} = \left\{ x^i, H \right\}
\]
\[
\frac{dp_i}{dt} = \left\{ p_i, H \right\} \qquad (14.23)
\]
and we can check this directly:
\[
\frac{dx^i}{dt} = \left\{ x^i, H \right\}
= \sum_{j=1}^{N}\left( \frac{\partial x^i}{\partial x^j}\frac{\partial H}{\partial p_j} - \frac{\partial x^i}{\partial p_j}\frac{\partial H}{\partial x^j} \right)
= \sum_{j=1}^{N}\delta^i_{\ j}\,\frac{\partial H}{\partial p_j}
= \frac{\partial H}{\partial p_i}
\]
and
\[
\frac{dp_i}{dt} = \left\{ p_i, H \right\}
= \sum_{j=1}^{N}\left( \frac{\partial p_i}{\partial q^j}\frac{\partial H}{\partial p_j} - \frac{\partial p_i}{\partial p_j}\frac{\partial H}{\partial q^j} \right)
= -\frac{\partial H}{\partial q^i}
\]
Notice that since $q^i$, $p_i$ and $t$ are all independent, and do not depend explicitly on time,
\[
\frac{\partial q^i}{\partial p_j} = \frac{\partial p_i}{\partial q^j} = 0 = \frac{\partial q^i}{\partial t} = \frac{\partial p_i}{\partial t}.
\]
Finally, we define the fundamental Poisson brackets. Suppose $x^i$ and $p_j$ are a set of coordinates on phase space such that Hamilton's equations hold in either the form of eqs. (14.23) or of eq. (14.22). Since they themselves are functions of $(x^m, p_n)$, they are dynamical variables and we may compute their Poisson brackets with one another. With $\xi^A = (x^m, p_n)$ we have
\[
\left\{ x^i, x^j \right\}_\xi
= \Omega^{AB}\,\frac{\partial x^i}{\partial \xi^A}\frac{\partial x^j}{\partial \xi^B}
= \sum_{m=1}^{N}\left( \frac{\partial x^i}{\partial x^m}\frac{\partial x^j}{\partial p_m} - \frac{\partial x^i}{\partial p_m}\frac{\partial x^j}{\partial x^m} \right)
= 0
\]
for $x^i$ with $x^j$,
\[
\left\{ x^i, p_j \right\}_\xi
= \Omega^{AB}\,\frac{\partial x^i}{\partial \xi^A}\frac{\partial p_j}{\partial \xi^B}
= \sum_{m=1}^{N}\left( \frac{\partial x^i}{\partial x^m}\frac{\partial p_j}{\partial p_m} - \frac{\partial x^i}{\partial p_m}\frac{\partial p_j}{\partial x^m} \right)
= \sum_{m=1}^{N}\delta^i_{\ m}\,\delta^m_{\ j}
= \delta^i_{\ j}
\]
for $x^i$ with $p_j$, and finally
\[
\left\{ p_i, p_j \right\}_\xi
= \Omega^{AB}\,\frac{\partial p_i}{\partial \xi^A}\frac{\partial p_j}{\partial \xi^B}
= \sum_{m=1}^{N}\left( \frac{\partial p_i}{\partial x^m}\frac{\partial p_j}{\partial p_m} - \frac{\partial p_i}{\partial p_m}\frac{\partial p_j}{\partial x^m} \right)
= 0
\]
for $p_i$ with $p_j$. The subscript $\xi$ on the bracket indicates that the partial derivatives are taken with respect to the coordinates $\xi^A = (x^i, p_j)$. We summarize these relations as
\[
\left\{ \xi^A, \xi^B \right\}_\xi = \Omega^{AB}
\]
We summarize the results of this subsection with a theorem: Let the coordinates $\xi^A$ be canonical. Then a transformation $\zeta^A(\xi)$ is canonical if and only if it satisfies the fundamental bracket relation
\[
\left\{ \zeta^A, \zeta^B \right\}_\xi = \Omega^{AB}
\]
To see this, note that a canonical transformation preserves the symplectic form, so that
\[
\left\{ \zeta^A, \zeta^B \right\}_\xi
= \Omega^{CD}\,\frac{\partial \zeta^A}{\partial \xi^C}\frac{\partial \zeta^B}{\partial \xi^D}
= \Omega^{AB}
\]
and $\zeta^A$ satisfies the fundamental bracket relation.
In summary, each of the following statements is equivalent:
1. $\zeta^A(\xi)$ is a canonical transformation.
2. $\zeta^A(\xi)$ is a coordinate transformation of phase space that preserves Hamilton's equations.
3. $\zeta^A(\xi)$ preserves the symplectic form, in the sense that
\[
\Omega^{AB}\,\frac{\partial \xi^C}{\partial \zeta^A}\,\frac{\partial \xi^D}{\partial \zeta^B} = \Omega^{CD}
\]
14.9.1. Example 1: Point transformations.

Let the new coordinates be arbitrary functions of the old, $q^i = q^i\left( x^j \right)$, with conjugate momenta $\pi_j$ to be determined. Check each bracket:
\[
\left\{ q^i, q^j \right\}_{x,p}
= \sum_{m=1}^{N}\left( \frac{\partial q^i}{\partial x^m}\frac{\partial q^j}{\partial p_m} - \frac{\partial q^i}{\partial p_m}\frac{\partial q^j}{\partial x^m} \right)
= 0
\]
since $\partial q^j/\partial p_m = 0$. For the second bracket,
\[
\delta^i_{\ j} = \left\{ q^i, \pi_j \right\}_{x,p}
= \sum_{m=1}^{N}\left( \frac{\partial q^i}{\partial x^m}\frac{\partial \pi_j}{\partial p_m} - \frac{\partial q^i}{\partial p_m}\frac{\partial \pi_j}{\partial x^m} \right)
= \sum_{m=1}^{N}\frac{\partial q^i}{\partial x^m}\frac{\partial \pi_j}{\partial p_m}
\]
Since $q^i$ is independent of $p_m$, we can satisfy this only if
\[
\frac{\partial \pi_j}{\partial p_m} = \frac{\partial x^m}{\partial q^j}
\]
Integrating gives
\[
\pi_j = \frac{\partial x^n}{\partial q^j}\,p_n + c_j
\]
with $c_j$ an arbitrary constant. The presence of $c_j$ does not affect the value of the Poisson bracket. Choosing $c_j = 0$, we compute the final bracket:
\[
\left\{ \pi_i, \pi_j \right\}_{x,p}
= \sum_{m=1}^{N}\left( \frac{\partial \pi_i}{\partial x^m}\frac{\partial \pi_j}{\partial p_m} - \frac{\partial \pi_i}{\partial p_m}\frac{\partial \pi_j}{\partial x^m} \right)
= \sum_{m=1}^{N}\left( \frac{\partial^2 x^n}{\partial x^m\,\partial q^i}\,p_n\,\frac{\partial x^m}{\partial q^j} - \frac{\partial x^m}{\partial q^i}\,\frac{\partial^2 x^n}{\partial x^m\,\partial q^j}\,p_n \right)
\]
\[
= \sum_{m=1}^{N}\left( \frac{\partial x^m}{\partial q^j}\frac{\partial}{\partial x^m}\frac{\partial x^n}{\partial q^i} - \frac{\partial x^m}{\partial q^i}\frac{\partial}{\partial x^m}\frac{\partial x^n}{\partial q^j} \right) p_n
= \left( \frac{\partial}{\partial q^j}\frac{\partial x^n}{\partial q^i} - \frac{\partial}{\partial q^i}\frac{\partial x^n}{\partial q^j} \right) p_n
= 0
\]
Therefore, the transformation
\[
q^j = q^j\left( x^i \right), \qquad \pi_j = \frac{\partial x^n}{\partial q^j}\,p_n + c_j
\]
is canonical for any functions $q^i(x)$. This means that the symmetry group of Hamilton's equations is at least as big as the symmetry group of the Euler-Lagrange equations.
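The point-transformation brackets above are easy to verify numerically. The sketch below (an illustration, not from the text) uses the hypothetical one-dimensional choice $q = x^3 + x$ with $\pi = p\,(dq/dx)^{-1}$, and checks the fundamental brackets by central differences.

```python
# point transformation q = q(x) with pi = (dx/dq) p; check {q, pi}_{x,p} = 1
q_of = lambda x: x**3 + x           # monotonic, so dq/dx never vanishes
dq_dx = lambda x: 3*x**2 + 1

Q  = lambda x, p: q_of(x)
Pi = lambda x, p: p / dq_dx(x)      # dx/dq = 1 / (dq/dx)

def poisson(f, g, x, p, h=1e-6):
    # {f, g} = df/dx dg/dp - df/dp dg/dx via central differences
    return ((f(x+h, p) - f(x-h, p))*(g(x, p+h) - g(x, p-h))
          - (f(x, p+h) - f(x, p-h))*(g(x+h, p) - g(x-h, p))) / (4*h*h)

x, p = 0.9, 2.0
b_qq   = poisson(Q, Q, x, p)     # expect 0
b_qpi  = poisson(Q, Pi, x, p)    # expect 1
b_pipi = poisson(Pi, Pi, x, p)   # expect 0
```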
14.9.2. Example 2: Interchange of $x$ and $p$.

The transformation
\[
q^i = p_i, \qquad \pi_i = -x^i
\]
is canonical. We easily check the fundamental brackets:
\[
\left\{ q^i, q^j \right\}_{x,p} = \left\{ p_i, p_j \right\}_{x,p} = 0
\]
\[
\left\{ q^i, \pi_j \right\}_{x,p}
= \left\{ p_i, -x^j \right\}_{x,p}
= -\left\{ p_i, x^j \right\}_{x,p}
= +\left\{ x^j, p_i \right\}_{x,p}
= \delta^j_{\ i}
\]
\[
\left\{ \pi_i, \pi_j \right\}_{x,p} = \left\{ -x^i, -x^j \right\}_{x,p} = 0
\]
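Since the interchange is canonical, Hamilton's equations written with $H'(q, \pi) = H(x = -\pi, p = q)$ must trace out the same motion. The short sketch below (oscillator parameters and integrator are arbitrary choices, not from the text) evolves both systems side by side and confirms $q(t) = p(t)$, $\pi(t) = -x(t)$.

```python
import numpy as np

m, k = 1.0, 3.0
dt, steps = 1e-4, 20000

def evolve(state, rhs):
    # fourth-order Runge-Kutta
    for _ in range(steps):
        k1 = rhs(state); k2 = rhs(state + 0.5*dt*k1)
        k3 = rhs(state + 0.5*dt*k2); k4 = rhs(state + dt*k3)
        state = state + (dt/6)*(k1 + 2*k2 + 2*k3 + k4)
    return state

# original variables: dx/dt = p/m, dp/dt = -k x
rhs_xp = lambda s: np.array([s[1]/m, -k*s[0]])
# new variables: H'(q, pi) = q^2/(2m) + k pi^2/2, so
# dq/dt = dH'/dpi = k pi,  dpi/dt = -dH'/dq = -q/m
rhs_qpi = lambda s: np.array([k*s[1], -s[0]/m])

x0, p0 = 0.5, -0.2
xT, pT = evolve(np.array([x0, p0]), rhs_xp)
qT, piT = evolve(np.array([p0, -x0]), rhs_qpi)   # q0 = p0, pi0 = -x0
```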
For (a), first interchange the coordinates and momenta,
\[
q_1^i = p_i, \qquad \pi^1_i = -x^i
\]
then make a point transformation of the new coordinates, $Q^i = Q^i\left( p_j \right)$, and interchange again:
\[
q^i = P_i = -\frac{\partial p_n}{\partial Q^i}\,x^n, \qquad \pi_i = -Q^i = -Q^i\left( p_j \right)
\]
This establishes that replacing the momenta by any three independent functions of the momenta preserves Hamilton's equations.
Suppose Hamilton's equations hold in both sets of variables:
\[
\frac{dx^i}{dt} = \frac{\partial H}{\partial p_i}, \qquad
\frac{dp_i}{dt} = -\frac{\partial H}{\partial x^i}
\]
in the original system and
\[
\frac{dq^i}{dt} = \frac{\partial H'}{\partial \pi_i}, \qquad
\frac{d\pi_i}{dt} = -\frac{\partial H'}{\partial q^i}
\]
in the transformed variables. The principle of least action must hold for each pair:
\[
S = \int \left( p_i\,dx^i - H\,dt \right)
\]
\[
S' = \int \left( \pi_i\,dq^i - H'\,dt \right)
\]
so the two integrands may differ by at most a total differential:
\[
p_i\,dx^i - H\,dt = \pi_i\,dq^i - H'\,dt + df
\]
\[
df = p_i\,dx^i - \pi_i\,dq^i + \left( H' - H \right) dt
\]
For the differential of $f$ to take this form, it must be a function of $x^i$, $q^i$ and $t$, that is, $f = f\left( x^i, q^i, t \right)$. Therefore, the differential of $f$ is
\[
df = \frac{\partial f}{\partial x^i}\,dx^i + \frac{\partial f}{\partial q^i}\,dq^i + \frac{\partial f}{\partial t}\,dt
\]
Equating the expressions for $df$ we match up terms to require
\[
p_i = \frac{\partial f}{\partial x^i} \qquad (14.25)
\]
\[
\pi_i = -\frac{\partial f}{\partial q^i} \qquad (14.26)
\]
\[
H' = H + \frac{\partial f}{\partial t} \qquad (14.27)
\]
The first equation,
\[
p_i = \frac{\partial f\left( x^j, q^j, t \right)}{\partial x^i} \qquad (14.28)
\]
gives $q^i$ implicitly in terms of the original variables, while the second determines $\pi_i$. Notice that we may pick any function $q^i = q^i\left( p_j, x^j, t \right)$. This choice fixes the form of $\pi_i$ by eq. (14.26), while eq. (14.27) gives the new Hamiltonian in terms of the old one. The function $f$ is the generating function of the transformation.
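As an illustration of the generating-function recipe, take the hypothetical choice $f(x, q) = x^2 q$ (an arbitrary example, not from the text). Eqs. (14.25) and (14.26) give $p = 2xq$ and $\pi = -x^2$, so $q = p/2x$, and the fundamental bracket of the resulting transformation should equal one.

```python
# generating function f(x, q) = x^2 q (hypothetical choice):
#   p = df/dx = 2 x q   ->   q(x, p) = p / (2 x)
#   pi = -df/dq = -x^2
Q  = lambda x, p: p / (2*x)
Pi = lambda x, p: -x**2

def poisson(f, g, x, p, h=1e-6):
    # {f, g} = df/dx dg/dp - df/dp dg/dx via central differences
    return ((f(x+h, p) - f(x-h, p))*(g(x, p+h) - g(x, p-h))
          - (f(x, p+h) - f(x, p-h))*(g(x+h, p) - g(x-h, p))) / (4*h*h)

bracket = poisson(Q, Pi, x=1.3, p=0.7)   # expect {q, pi}_{x,p} = 1
```

Any differentiable $f(x, q, t)$ with invertible $\partial^2 f/\partial x\,\partial q$ would pass the same check, which is the content of the construction above.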
guaranteed that a complete solution exists, and we will assume below that we can find it. Before proving our central theorem, we digress to examine the exact relationship between the Hamilton-Jacobi equation and the Schrödinger equation.
Then cancelling the exponential,
\[
i\hbar\,\frac{\partial A}{\partial t} - A\,\frac{\partial \phi}{\partial t}
= -\frac{\hbar^2}{2m}\nabla^2 A
- \frac{i\hbar}{2m}\left( 2\nabla A\cdot\nabla\phi + A\,\nabla^2\phi \right)
+ \frac{1}{2m}A\,\nabla\phi\cdot\nabla\phi + VA
\]
Collecting by powers of $\hbar$,
\[
O\left( \hbar^0 \right):\quad -\frac{\partial \phi}{\partial t} = \frac{1}{2m}\nabla\phi\cdot\nabla\phi + V
\]
\[
O\left( \hbar^1 \right):\quad \frac{1}{A}\frac{\partial A}{\partial t}
= -\frac{1}{2m}\left( \frac{2}{A}\,\nabla A\cdot\nabla\phi + \nabla^2\phi \right)
\]
\[
O\left( \hbar^2 \right):\quad 0 = -\frac{\hbar^2}{2m}\nabla^2 A
\]
The zeroth-order term is the Hamilton-Jacobi equation, with $\phi = S$:
\[
-\frac{\partial S}{\partial t} = \frac{1}{2m}\nabla S\cdot\nabla S + V
= \frac{1}{2m}p^2 + V(x)
\]
where $p = \nabla S$. Therefore, the Hamilton-Jacobi equation is the $\hbar \to 0$ limit of the Schrödinger equation.
integral which depends on arbitrary functions. We will show below that a complete solution leads to a general solution. We use $S$ as a generating function.

Our canonical transformation will take the variables $\left( x^i, p_i \right)$ to a new set of variables $\left( \beta^i, \alpha_i \right)$. Since $S$ depends on the old coordinates $x^i$ and the new momenta $\alpha_i$, we have the relations
\[
p_i = \frac{\partial S}{\partial x^i}, \qquad
\beta^i = \frac{\partial S}{\partial \alpha_i}, \qquad
H' = H + \frac{\partial S}{\partial t}
\]
Notice that the new Hamiltonian, $H'$, vanishes because the Hamilton-Jacobi equation is satisfied by $S$. With $H' = 0$, Hamilton's equations in the new canonical coordinates are simply
\[
\frac{d\beta^i}{dt} = \frac{\partial H'}{\partial \alpha_i} = 0, \qquad
\frac{d\alpha_i}{dt} = -\frac{\partial H'}{\partial \beta^i} = 0
\]
with solutions
\[
\alpha_i = \text{const.}, \qquad \beta^i = \text{const.}
\]
The system remains at the phase space point $\left( \alpha_i, \beta^i \right)$. To find the motion in the original coordinates as functions of time and the $2s$ constants of motion, we invert the relation $\beta^i = \partial S/\partial \alpha_i$ to find
\[
x^i = x^i\left( t; \alpha_i, \beta^i \right)
\]
Therefore, solving the Hamilton-Jacobi equation is the key to solving the full mechanical problem. Furthermore, we know that a solution exists because Hamilton's equations satisfy the integrability equation for $S$.
We note one further result. While we have made use of a complete integral to solve the mechanical problem, we may want a general integral of the Hamilton-Jacobi equation. The difference is that a complete integral of an equation in $s + 1$ variables depends on $s + 1$ constants, while a general integral depends on $s$ functions. Fortunately, a complete integral of the equation can be used to construct a general integral, and there is no loss of generality in considering a complete integral. We see this as follows. A complete solution takes the form
\[
S = g\left( t, x_1, \ldots, x_s, \alpha_1, \ldots, \alpha_s \right) + A
\]
Regarding $A$ as a function of the constants, $A\left( \alpha_i \right)$, and introducing new constants $h_k$, we set
\[
\frac{\partial S}{\partial h_k}
= \frac{\partial}{\partial h_k}\left( g\left( t, x_i, \alpha_i \right) + A\left( \alpha_i \right) \right) = 0
\]
and we have
\[
A\left( \alpha_1, \ldots, \alpha_s \right) = \text{const.} - g
\]
This just makes $A$ into some specific function of $x_i$ and $t$.
Since the partials with respect to the coordinates are the same, and we haven't changed the time dependence,
\[
S = g\left( t, x_1, \ldots, x_s, h_1, \ldots, h_s \right) + A\left( h_i \right)
\]
is a general solution to the Hamilton-Jacobi equation.
Because $E = H$, the new Hamiltonian, $H'$, is zero. This means that both $q$ and $\pi$ are constant. The solution for $x$ and $p$ follows immediately:
\[
x = q + \frac{\pi}{m}\,t, \qquad p = \pi
\]
We see that the new canonical variables $(q, \pi)$ are just the initial position and momentum of the motion, and therefore do determine the motion. The fact that knowing $q$ and $\pi$ is equivalent to knowing the full motion rests here on the fact that $S$ generates motion along the classical path. In fact, given initial conditions $(q, \pi)$, we can use Hamilton's principal function as a generating function but treat $\pi$ as the old momentum and $x$ as the new coordinate to reverse the process above and generate $x(t)$ and $p$.
\[
H = \frac{p^2}{2m} + \frac{1}{2}kx^2
\]
and the Hamilton-Jacobi equation is
\[
\frac{\partial S}{\partial t} = -\frac{1}{2m}\left( S' \right)^2 - \frac{1}{2}kx^2
\]
Letting
\[
S = f(x) - Et
\]
as before, $f(x)$ must satisfy
\[
\frac{df}{dx} = \sqrt{2m\left( E - \frac{1}{2}kx^2 \right)}
\]
and therefore
\[
f(x) = \int \sqrt{2m\left( E - \frac{1}{2}kx^2 \right)}\,dx
= \int \sqrt{\alpha^2 - mkx^2}\,dx
\]
247
'2
where we have set E = 2m
. Now let mkx = ! sin y. The integral is immediate:
#
f (x ) = !2 ! mkx2dx
#
!2
= cos2 ydy
mk
!2
= (y + sin y cos y )
2 mk
Hamilton#s principal function is therefore
5 : 6
!2 ! x " x x 2
S (x, !, t) = sin"1 mk + mk 1 ! mk 2
2 mk ! ! !
!2
! t!c
2m
!2 ! x " x 2 !2
= sin"1 mk + ! ! mkx2 ! t!c
2 mk ! 2 2m
and we may use it to generate the canonical change of variable.
This time we have
\[
p = \frac{\partial S}{\partial x}
= \frac{\alpha}{2}\frac{1}{\sqrt{1 - mk\,x^2/\alpha^2}}
+ \frac{1}{2}\sqrt{\alpha^2 - mkx^2}
- \frac{mkx^2}{2\sqrt{\alpha^2 - mkx^2}}
\]
\[
= \frac{1}{\sqrt{\alpha^2 - mkx^2}}\left( \frac{\alpha^2}{2} + \frac{1}{2}\left( \alpha^2 - mkx^2 \right) - \frac{mkx^2}{2} \right)
= \sqrt{\alpha^2 - mkx^2}
\]
\[
q = \frac{\partial S}{\partial \alpha}
= \frac{\alpha}{\sqrt{mk}}\sin^{-1}\left( \sqrt{mk}\,\frac{x}{\alpha} \right)
- \frac{\alpha x}{2\sqrt{\alpha^2 - mkx^2}}
+ \frac{x}{2}\frac{\alpha}{\sqrt{\alpha^2 - mkx^2}}
- \frac{\alpha}{m}\,t
= \frac{\alpha}{\sqrt{mk}}\sin^{-1}\left( \sqrt{mk}\,\frac{x}{\alpha} \right) - \frac{\alpha}{m}\,t
\]
\[
H' = H + \frac{\partial S}{\partial t} = H - E = 0
\]
The first equation relates $p$ to the energy and position, the second gives the new position coordinate $q$, and the third equation shows that the new Hamiltonian is zero. Hamilton's equations are trivial, so that $\alpha$ and $q$ are constant, and we can invert the expression for $q$ to give the solution. Setting $\omega = \sqrt{k/m}$, the solution is
\[
x(t) = \frac{\alpha}{m\omega}\sin\left( \frac{m\omega}{\alpha}\,q + \omega t \right)
= A\sin\left( \omega t + \varphi \right)
\]
where
\[
q = A\varphi, \qquad \alpha = Am\omega
\]
The new canonical coordinates therefore characterize the initial amplitude and phase of the oscillator.
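We can confirm the consistency of this solution numerically (a sketch, with arbitrary parameter values): with $\alpha = Am\omega$ and $q = A\varphi$, the inverted expression reproduces $x(t) = A\sin(\omega t + \varphi)$, and the momentum $p = \partial S/\partial x = \sqrt{\alpha^2 - mkx^2}$ agrees with $m\dot{x}$ wherever the cosine is positive.

```python
import numpy as np

m, k = 2.0, 5.0
w = np.sqrt(k/m)
A, phi = 0.7, 0.4
alpha, q = A*m*w, A*phi                   # the new canonical constants

t = np.linspace(0, 0.6, 25)               # range chosen so cos(w t + phi) > 0
x = (alpha/(m*w)) * np.sin((m*w/alpha)*q + w*t)   # inverted HJ solution
p_hj = np.sqrt(alpha**2 - m*k*x**2)       # from p = dS/dx
p_newton = m*A*w*np.cos(w*t + phi)        # from p = m dx/dt

err_x = np.max(np.abs(x - A*np.sin(w*t + phi)))
err_p = np.max(np.abs(p_hj - p_newton))
```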
where we have set $E = \frac{\alpha^2}{2m}$. Hamilton's principal function is therefore
\[
S\left( x, \alpha, t \right) = \int \sqrt{\alpha^2 - 2mU(x)}\,dx - \frac{\alpha^2}{2m}\,t - c
\]
and we may use it to generate the canonical change of variable.

This time we have
\[
p = \frac{\partial S}{\partial x} = \sqrt{\alpha^2 - 2mU(x)}
\]
\[
q = \frac{\partial S}{\partial \alpha}
= \frac{\partial}{\partial \alpha}\left( \int_{x_0}^{x} \sqrt{\alpha^2 - 2mU(x)}\,dx \right) - \frac{\alpha}{m}\,t
\]
\[
H' = H + \frac{\partial S}{\partial t} = H - E = 0
\]
The first and third equations are as expected, while for $q$ we may interchange the order of differentiation and integration:
\[
q = \frac{\partial}{\partial \alpha}\left( \int \sqrt{\alpha^2 - 2mU(x)}\,dx \right) - \frac{\alpha}{m}\,t
= \int \frac{\partial}{\partial \alpha}\left( \sqrt{\alpha^2 - 2mU(x)} \right) dx - \frac{\alpha}{m}\,t
= \int \frac{\alpha\,dx}{\sqrt{\alpha^2 - 2mU(x)}} - \frac{\alpha}{m}\,t
\]
To complete the problem, we need to know the potential. However, even without knowing $U(x)$ we can make sense of this result by combining the expression for $q$ above with our previous solution to the same problem. There, conservation of energy gives a first integral to Newton's second law,
\[
E = \frac{p^2}{2m} + U
= \frac{1}{2}m\left( \frac{dx}{dt} \right)^2 + U
\]
Substituting into the expression for $q$,
\[
q = \int^{x} \frac{\alpha\,dx}{\sqrt{\alpha^2 - 2mU(x)}}
- \int_{x_0}^{x} \frac{m\,dx}{\sqrt{2m\left( E - U \right)}}\,\frac{\alpha}{m}
- \frac{\alpha}{m}\,t_0
\]
\[
= \int^{x} \frac{\alpha\,dx}{\sqrt{\alpha^2 - 2mU(x)}}
- \int_{x_0}^{x} \frac{\alpha\,dx}{\sqrt{\alpha^2 - 2mU(x)}}
- \frac{\alpha}{m}\,t_0
= \int^{x_0} \frac{\alpha\,dx}{\sqrt{\alpha^2 - 2mU(x)}} - \frac{\alpha}{m}\,t_0
\]
Part IV
Bonus sections
We did not have time for the following topics or their applications, but you may find them interesting or useful. The section on gauge theory is not complete, but has a full treatment of differential forms.
tor space on which the group acts. The largest class of objects on which our symmetry acts will be the class determining the covering group. This achieves the fullest realization of our symmetry. For example, while the Euclidean group $ISO(3)$ leads us to the usual formulation of Lagrangian mechanics, we can ask if we might not achieve something new by gauging the covering group, $ISpin(3) \cong ISU(2)$. This extension, which places spinors in the context of classical physics, depends only on symmetry, and therefore is completely independent of quantization.
There are numerous advantages to the spinorial extension of classical physics. After Cartan's discovery of spinors as linear representations of orthogonal groups in 1913 ([61],[62]) and Dirac's use of spinors in the Dirac equation ([63],[64]), the use of spinors for other areas of relativistic physics was pioneered by Penrose ([65],[66]). Penrose developed spinor notation for general relativity that is one of the most powerful tools of the field. For example, the use of spinors greatly simplifies Petrov's classification of spacetimes (compare Petrov [67] and Penrose ([65],[68])), and tremendously shortens the proof of the positive mass theorem (compare Schoen and Yau ([69],[70],[71]) and Witten [72]). Penrose also introduced the idea and techniques of twistor spaces. While Dirac spinors are representations of the Lorentz symmetry of Minkowski space, twistors are the spinors associated with the larger conformal symmetry of compactified Minkowski space. Their overlap with string theory as twistor strings is an extremely active area of current research in quantum field theory (see [73] and references thereto). In nonrelativistic classical physics, the use of Clifford algebras (which, though they do not provide a spinor representation in themselves, underlie the definition of the spin groups) has been advocated by Hestenes in the "geometric algebra" program [9].
It is straightforward to include spinors in a classical theory. We provide a simple example.

For the rotation subgroup of the Euclidean group, we can let the group act on complex 2-vectors, $\chi^a$, $a = 1, 2$. The resulting form of the group is $SU(2)$. In this representation, an ordinary 3-vector such as the position vector $x^i$ is written as a traceless Hermitian matrix,
\[
X = x^i\sigma_i, \qquad \left[ X \right]^a_{\ b} = x^i\left[ \sigma_i \right]^a_{\ b}
\]
where $\sigma_i$ are the Pauli matrices. It is easy to write the usual Lagrangian in terms of $X$:
\[
L = \frac{m}{4}\,\mathrm{tr}\left( \dot{X}\dot{X} \right) - V(X)
\]
where $V$ is any scalar-valued function of $X$. However, we now have the additional complex 2-vectors, $\chi^a$, available. Consider a Dirac-type kinetic term, $\bar{\chi}_a\left( i\dot{\chi}^a - \omega\chi^a \right)$, and potential
\[
V\left( \chi^a \right) = \lambda\,\bar{\chi}_a B^i\left[ \sigma_i \right]^a_{\ b}\chi^b + \ldots
\]
Notice there is no necessity to introduce fermions and the concomitant anticommutation relations; we regard these spinors as commuting variables. A simple action therefore takes the form
\[
S = \int dt\left( \frac{m}{4}\,\mathrm{tr}\left( \dot{X}\dot{X} \right)
+ \bar{\chi}_a\left( i\dot{\chi}^a - \omega\chi^a \right)
- V(X)
- \lambda\,\bar{\chi}_a B^i\left[ \sigma_i \right]^a_{\ b}\chi^b \right)
\]
The equations of motion are then
\[
m\ddot{x}^i = -\left[ \sigma^i \right]_{ab}\frac{\partial V}{\partial X_{ab}}
\]
\[
\dot{\chi}^a = -i\omega\chi^a - i\lambda B^i\left[ \sigma_i \right]^a_{\ b}\chi^b
\]
together with the complex conjugate of the second. The first reproduces the usual equation of motion for the position vector. Assuming a constant vector $B^i$, we can easily solve the second. Setting $\chi = \psi e^{-i\omega t}$, $\psi$ must satisfy
\[
\dot{\psi}^a = -i\lambda B^i\left[ \sigma_i \right]^a_{\ b}\psi^b
\]
The important thing to note here is that, while the spinors $\psi$ rotate with a single factor of $e^{i\mathbf{w}\cdot\boldsymbol{\sigma}}$, a vector such as $X$ rotates as a matrix and therefore requires two factors of the rotation,
\[
X' = e^{-i\mathbf{w}\cdot\boldsymbol{\sigma}}\,X\,e^{i\mathbf{w}\cdot\boldsymbol{\sigma}}
\]
This illustrates the 2:1 ratio of rotation angle characteristic of spin 1/2. The new degrees of freedom therefore describe classical spin, and we see that spin is best thought of as a result of the symmetries of classical physics, rather
than as a necessarily quantum phenomenon. Similar results using the covering group of the Lorentz group introduce Dirac spinors naturally into relativity theory. Indeed, as noted above, 2-component spinor notation is a powerful tool in general relativity, where it makes such results as the Petrov classification or the positivity of mass transparent.
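The 2:1 angle ratio is easy to exhibit with the Pauli matrices. In the sketch below (standard conventions, with the half-angle written explicitly rather than absorbed into $\mathbf{w}$), a rotation by $\theta$ about the $z$-axis acts on a spinor through a single factor $U = e^{-i\theta\sigma_3/2}$ carrying half-angle phases, while the vector matrix $X = x^i\sigma_i$, acted on by two factors, rotates through the full angle $\theta$.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def vec_to_matrix(v):
    return v[0]*sx + v[1]*sy + v[2]*sz      # X = x^i sigma_i

def matrix_to_vec(X):
    # invert using tr(sigma_i sigma_j) = 2 delta_ij
    return np.real(np.array([np.trace(X @ s) for s in (sx, sy, sz)])) / 2

theta = 0.8
U = np.cos(theta/2)*np.eye(2) - 1j*np.sin(theta/2)*sz   # U = exp(-i theta sz/2)

X = vec_to_matrix(np.array([1.0, 0.0, 0.0]))
Xp = U @ X @ U.conj().T          # two factors of U act on the vector
v_rotated = matrix_to_vec(Xp)    # rotated by the FULL angle theta
```

The spinor transformation involves only the single factor $U$, whose entries carry the phases $e^{\mp i\theta/2}$; the reconstructed vector comes out rotated by $\theta$ itself.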
\[
dx\,dy = -dy\,dx
\]
This gives a grading of $(-1)^p$ to all $p$-forms. Thus, if $\omega$ is a $p$-form and $\eta$ a $q$-form,
each element of the Lie algebra generating a symmetry transformation. The Lie
algebra is a vector space characterized by a closed commutator product and the
Jacobi identity. Supersymmetries are extensions of the normal Lie symmetries
of physical systems to include symmetry generators (Grassmann variables) that
anticommute. Like the grading of differential forms, all transformations of the graded Lie algebra are assigned a grading, 0 or 1, that determines whether a commutator or an anticommutator is appropriate, according to
\[
\left[ T_p, T_q \right] \equiv T_p T_q - (-1)^{pq}\,T_q T_p
\]
where $p, q \in \{0, 1\}$. Thus, two transformations which both have grading 1 have anticommutation relations with one another, while all other combinations satisfy commutation relations.
Again, there is nothing intrinsically "quantum" about such generalized symmetries, so we can consider classical supersymmetric field theories and even supersymmetrized classical mechanics. Since anticommuting fields correspond to fermions in quantum mechanics, we may continue to call such variables fermionic when used classically, even though their statistical properties may not be Fermi-Dirac. Perhaps more importantly, we arrive at a class of classical action functionals whose quantization leads directly to the Pauli or Dirac spinor equations.
The development of pseudomechanics was pioneered by Casalbuoni ([74],[75]; see also Freund [76]), who showed that it was possible to formulate an $\hbar \to 0$ limit of a quantum system in such a way that the spinors remain but their magnitude is no longer quantized. Conversely, the resulting classical action leads to the Pauli-Schrödinger equation when quantized. Similarly, Berezin and Marinov [77], and Brink, Deser, Zumino, di Vecchia and Howe [78] introduced four anticommuting variables, $\theta^\alpha$, to write the pre-Dirac action. We display these actions below, after giving a simplified example. Since these approaches moved from quantum fields to classical equations, they already involved spinor representations. However, vector versions (having anticommuting variables without spinors) are possible as well. Our example below is of the latter type. Our development is a slight modification of that given by Freund [76].
To construct a simple pseudomechanical model, we introduce a superspace
formulation, extending the usual "bosonic" 3-space coordinates x^i by three ad-
ditional anticommuting coordinates, θ^a, satisfying

{θ^a, θ^b} = 0
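Anticommuting coordinates can be modeled concretely: a Jordan-Wigner-style construction (my own illustration, not part of the text) represents each θ^a as a nilpotent matrix, so that all the anticommutators vanish identically:

```python
import numpy as np

# Model three anticommuting coordinates theta^a as nilpotent matrices,
# via a Jordan-Wigner-style tensor-product construction.
sz = np.diag([1.0, -1.0])              # sign factor
sp = np.array([[0.0, 1.0],
               [0.0, 0.0]])            # nilpotent factor: sp @ sp = 0
I2 = np.eye(2)

def kron_chain(factors):
    """Kronecker product of a list of 2x2 matrices."""
    out = np.eye(1)
    for f in factors:
        out = np.kron(out, f)
    return out

n = 3
theta = [kron_chain([sz] * a + [sp] + [I2] * (n - a - 1)) for a in range(n)]

# Verify {theta^a, theta^b} = 0 for all a, b; in particular (theta^a)^2 = 0.
for a in range(n):
    for b in range(n):
        anti = theta[a] @ theta[b] + theta[b] @ theta[a]
        assert np.allclose(anti, 0.0)
```

The sign factors `sz` to the left of each `sp` are what convert the commuting matrix product into anticommutation between different generators.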
Consider the motion of a particle described by (x^i(t), θ^a(t)), and the action
functional

S = ∫ dt [ ½ m ẋ_i ẋ^i + (i/2) θ_a θ̇^a − V(x^i, θ^b) ]

Notice that θ̇² = 0 for any anticommuting variable, so the term linear in the
velocity is the best we can do. For the same reason, the Taylor series in θ^a of
the potential V(x^i, θ^b) terminates:

V(x^i, θ^b) = V_0(x^i) + φ_a(x^i) θ^a + ½ ε_{abc} B^a(x^i) θ^b θ^c + (1/3!) ψ(x^i) ε_{abc} θ^a θ^b θ^c
Since the coefficients remain functions of x^i, we have introduced four new fields
into the problem. However, they are not all independent. If we change coordi-
nates from θ^a to some new anticommuting variables χ^a, setting

θ^a = χ^a + ε B^a_{bc} χ^b χ^c + C^a_{bcd} χ^b χ^c χ^d,     B^a_{bc} = B^a_{[bc]}

where ε is an anticommuting constant, the component functions in V(θ^b) change
according to

V = V_0 + φ_a χ^a + ( φ_a ε B^a_{bc} + ½ ε_{abc} B^a ) χ^b χ^c
      + ( ε_{afb} B^f ε B^a_{cd} + (1/3!) ψ ε_{bcd} + φ_a C^a_{bcd} ) χ^b χ^c χ^d

The cubic term can be eliminated by choosing

ε B^a_{bc} = (ψ + 6 φ_d C^d) / (4B²) · ( δ^a_b B_c − δ^a_c B_b )
while no choice of B^a_{bc} can make the second term vanish, because φ_a ε B^a_{bc} is nilpo-
tent while ½ ε_{abc} B^a is not. Renaming the coefficient functions, V takes the form

V(θ^b) = V_0 + φ_a θ^a + ½ ε_{abc} B^a θ^b θ^c
Now, without loss of generality, the action takes the form

S = ∫ dt [ ½ m ẋ^i ẋ^i + (i/2) θ^a θ̇^a − V_0 − φ_a θ^a − ½ ε_{abc} B^a θ^b θ^c ]
Varying, we get two sets of equations of motion:

m ẍ^i = − ∂V/∂x^i
      = − ∂V_0/∂x^i + (∂φ_a/∂x^i) θ^a + ½ ε_{abc} (∂B^a/∂x^i) θ^b θ^c

θ̇^a = i φ^a + i ε^a_{bc} B^b θ^c

Clearly this generalizes Newton's second law. The coefficients in the first equa-
tion depend only on x^i, so terms with different powers of θ^a must vanish sepa-
rately. Therefore, B^a and φ^a are constant and we can integrate the θ^a equation
immediately. Since [J_b]^a_c = ε^a_{cb} satisfies

[J_a, J_b]^c_d = ε^e_{ba} [J_e]^c_d
where
v^α = dx^α / dτ

u^α = v^α / √(−v²)
and λ is a Lagrange multiplier. The action, S_Dirac, is both reparameterization
invariant and Lorentz invariant. Its variation leads to the usual relativistic
mass-energy-momentum relation together with a constraint. When the system
is quantized, imposing the constraint on the physical states gives the Dirac
equation.
Evidently, the quantization of these actions is also taken to include the ex-
tension to the relevant covering group.
(θη)^* = η^* θ^*

This insures the reality property (θη^*)^* = θη^*, but this is not a necessary con-
dition for complex Grassmann numbers. For example, the conjugate of the
complex 2-form

dz ∧ dz^*

is clearly just

dz^* ∧ dz

and is therefore pure imaginary. We must therefore regard the TC symmetry
required by the proof as somewhat arbitrary.
Similarly, for time reversal, [79] requires both

t → −t
ψ → −ψ

In the quantum theory, the anticommuting variables become fermionic creation
operators, for which

a† |0⟩ = |1⟩
a† a† = 0

so each mode admits at most one quantum. Since classical states are not discrete,
there may be no such limitation. Do anticommuting classical variables therefore
satisfy Bose-Einstein statistics? If so, how do Fermi-Dirac quantum states
become Bose-Einstein in the classical limit?
The introduction of pseudomechanics has led to substantial formal work on
supermanifolds and symplectic supermanifolds. See [80], [81] and references
therein.
17. Gauge theory
Recall our progress so far. Starting from our direct experience of the world,
we have provisionally decided to work in a 3-dimensional space, together with
time. Because we want to associate physical properties with objects which
move in this space rather than with the space itself, we demand that the space
be homogeneous and isotropic. This led us to the construction of Euclidean
3-space. Next, in order to make measurements in this space, we introduced a
metric. In order to describe uniform motion, we digressed to study the functional
derivative. Vanishing functional derivative gives us a criterion for a straight line,
or geodesic.
The next question we addressed concerned the description of matter. Since
we do not want our description to depend on the way we choose to label points,
we sought quantities which are invariant under the relevant transformations. We
found that, in order to construct invariant, measurable quantities, it is useful
to introduce tensors. Equations relating tensors do not change form when we
change coordinates, and scalars formed from tensors are invariant.
We are now ready to describe motion, by which we mean the evolution in time
of various measurable quantities. Foremost among these physical observables, of
course, is the position of the particle or other object of study. But the measurable
quantities also include invariants formed from the velocity, momentum, and so on.
In order to describe the evolution of vectors and other tensors, we need to
associate a vector space with each point of the physical arena; that is, we
choose a set of reference frames. However, because the Newtonian, Lagrangian,
and Hamiltonian formulations of mechanics are based on different symmetries,
the arenas are different. Thus, while the arena for Newtonian mechanics is
Euclidean 3-space, the arena for Lagrangian mechanics may be larger or smaller
depending on the number of degrees of freedom and the number of constraints,
while the arena for Hamiltonian dynamics differs not only in dimension but
also in the underlying symmetry. As a result, when we assign a basis to the
relevant manifold, we will require different symmetries of the basis in different
formulations.
Once we know the symmetry we wish to work with and have selected the
relevant class of reference frames, we need to know how a frame at one point and
time is related to a frame at another. Our formulation will be general enough to
allow arbitrary changes of frame from place to place and time to time. The tool
that allows this is the connection. Just as the metric gives the distance between
two nearby points of a manifold, the connection tells us how two infinitesimally
separated reference frames are related.
To introduce connections for arbitrary symmetries, we need to develop two
tools: Lie algebras to describe the symmetry, and differential forms to cast the
problem in a coordinate invariant form.
We can then turn to our discussion of motion.
3. Associativity: for all a, b, c ∈ S, a ∘ (b ∘ c) = (a ∘ b) ∘ c ≡ a ∘ b ∘ c

Definition 17.2. A Lie group is a group G = {S, ∘} for which the set S is a
manifold.
Since each A can be written in terms of n2 real numbers, GL (n) has dimension
n2 (note that the nonvanishing determinant does not reduce the number of real
numbers required to specify the transformations). GL (n) provides an example of
a Lie group with more than one connected component. We can imagine starting
with the identity element and smoothly varying the parameters that define the
group elements, thereby sweeping out curves in the space of all group elements.
If such continuous variation can take us to every group element, we say the
group is connected. If there remain elements that cannot be connected to the
identity by such a continuous variation (actually a curve in the group manifold),
then the group has more than one component. GL (n) is of this form because
as we vary the parameters to move from element to element of the group, the
determinant of those elements also varies smoothly. But since the determinant
of the identity is 1 and no element can have determinant zero, we can never
get to an element that has negative determinant. The elements of GL (n) with
negative determinant are related to those of positive determinant by a discrete
transformation: if we pick any element of GL (n) with negative determinant,
and multiply it by each element of GL (n) with positive determinant, we get
a new element of negative determinant. This shows that the two components
of GL(n) are in 1 to 1 correspondence. In odd dimensions, a suitable 1 to 1
mapping is given by −1, which is called the parity transformation.
Example 17.4. The simplest subgroup of GL(n) removes the second compo-
nent to give a connected Lie group. In fact, it is useful to factor out the deter-
minant entirely, because the operation of multiplying by a constant commutes
with every other transformation of the group. In this way, we arrive at a simple
group, one in which each transformation has a nontrivial effect on some other
transformations. For a general matrix A ∈ GL(n) with positive determinant,
let

Ā = (det A)^{−1/n} A

Then det Ā = 1. Since

det(ĀB̄) = det Ā det B̄ = 1

the set of all Ā closes under matrix multiplication. We also have det Ā⁻¹ = 1,
and det 1 = 1, so the set of all Ā forms a Lie group. This group is called the
Special Linear group, SL(n), where "special" refers to the unit determinant.
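The normalization Ā = (det A)^{−1/n} A and the closure of the resulting set are easy to spot-check numerically; a minimal sketch, using randomly chosen matrices of positive determinant:

```python
import numpy as np

rng = np.random.default_rng(0)

def unit_det(A):
    """Map A in GL(n) with det A > 0 to its SL(n) representative."""
    n = A.shape[0]
    d = np.linalg.det(A)
    assert d > 0          # the construction assumes positive determinant
    return A / d ** (1.0 / n)

# Small perturbations of the identity, so the determinants stay positive.
A = np.eye(3) + 0.2 * rng.standard_normal((3, 3))
B = np.eye(3) + 0.2 * rng.standard_normal((3, 3))

Abar, Bbar = unit_det(A), unit_det(B)
assert np.isclose(np.linalg.det(Abar), 1.0)
assert np.isclose(np.linalg.det(Abar @ Bbar), 1.0)   # closure under products
```

Closure follows from det(AB) = det A · det B, which is exactly what the assertion exercises.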
Theorem 17.5. Consider the subset of GL(n; R) that leaves a fixed matrix M
invariant under a similarity transformation:

H = { A | A ∈ GL(n), A M A⁻¹ = M }

Then H is a group. The identity is in H, since

1 M 1⁻¹ = M

To see that H includes inverses of all its elements, notice that (A⁻¹)⁻¹ = A.
Using this, we start with

A M A⁻¹ = M

and multiply on the left by A⁻¹ and on the right by A to get

A⁻¹ ( A M A⁻¹ ) A = A⁻¹ M A
M = A⁻¹ M (A⁻¹)⁻¹

The last line is the statement that A⁻¹ leaves M invariant, and is therefore in
H. Finally, matrix multiplication is associative, so H is a group, concluding our
proof.
Exercise 17.1. Consider the subset of GL(n; R) that leaves a fixed matrix M
invariant under a transpose-similarity transformation:

H′ = { A | A ∈ GL(n), A M Aᵗ = M }

so adding and subtracting eqs. (17.1) and (17.2) gives two independent con-
straints on A:

A M_s Aᵗ = M_s
A M_a Aᵗ = M_a
That is, A must separately leave the Ms and Ma invariant. Therefore, the largest
subgroups of G that we can form in this way are found by demanding that M
be either symmetric or antisymmetric. These conditions define the orthogonal
and symplectic groups, respectively. We look at each case.
If M is symmetric, then we can always choose a basis for the vector space
on which the transformations act in such a way that M is diagonal; indeed we
can go further, for by rescaling the basis we can make every diagonal element
equal to +1 or −1. Therefore, any symmetric M may be put in the form

M → η_{ij} = diag( 1, …, 1, −1, …, −1 )    (17.3)

where there are p terms +1 and q terms −1. Then η provides a pseudo-metric,
since for any vector v^i, we may define

⟨v, v⟩ = η_{ij} v^i v^j = Σ_{i=1}^{p} (v^i)² − Σ_{i=p+1}^{p+q} (v^i)²

(We use summation signs here because we are violating the convention of always
summing one raised and one lowered index.) The signature of η is s = p − q
and the dimension is n = p + q. Groups preserving a (p, q) pseudo-metric are the
(pseudo-)orthogonal groups, denoted O(p, q). If we demand in addition that all
transformations have unit determinant, we have the special orthogonal groups,
SO(p, q). This class of Lie groups includes O(3, 1), which preserves the Lorentz
metric, as well as O(3), the rotations of Euclidean 3-space.
Now suppose M is antisymmetric. This case arises in classical Hamiltonian
dynamics, where we have canonically conjugate variables satisfying fundamental
Poisson bracket relations:

{q^i, q^j}_{x,p} = 0 = {p_i, p_j}_{x,p}
{p_j, q^i}_{x,p} = −{q^i, p_j}_{x,p} = δ^i_j

If we define the combined coordinate

ξ^a = (q^i, p_j)

where if i, j = 1, 2, …, n then a = 1, 2, …, 2n, then the fundamental brackets
may be written in terms of an antisymmetric matrix Ω^{ab} as

{ξ^a, ξ^b} = Ω^{ab}

where

Ω^{ab} = (   0     −δ^{ij} )  = −Ω^{ba}    (17.4)
         ( δ^{ij}     0    )

Since canonical transformations are precisely the diffeomorphisms that preserve the
fundamental brackets, the canonical transformations at any fixed point form the
Lie group which preserves Ω^{ab}. These transformations, and in general the sub-
group of GL(n) preserving any nondegenerate antisymmetric matrix, is called
the symplectic group. We have a similar result here as for the (pseudo-)orthog-
onal groups: we can always choose a basis for the vector space that puts the
invariant matrix Ω^{ab} in the form given in eq. (17.4). From the form of eq. (17.4)
we suspect, correctly, that the symplectic group is always even dimensional.
Indeed, the determinant of an antisymmetric matrix in odd dimensions is al-
ways zero, so such an invariant cannot be non-degenerate. The notation for the
symplectic groups is therefore Sp(2n).
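Both facts (that symplectic transformations preserve Ω^{ab}, and that an antisymmetric matrix in odd dimensions is degenerate) can be checked numerically; a sketch with illustrative sample matrices of my own choosing:

```python
import numpy as np

# Omega for one degree of freedom (n = 1, so 2n = 2), as in eq. (17.4)
Omega = np.array([[0.0, -1.0],
                  [1.0,  0.0]])

# A linear canonical transformation: rescale q -> 2q, p -> p/2.
A = np.diag([2.0, 0.5])
assert np.allclose(A @ Omega @ A.T, Omega)   # A preserves Omega: symplectic

# An antisymmetric matrix in odd dimensions is always degenerate:
M = np.array([[ 0.0,  1.0, -2.0],
              [-1.0,  0.0,  3.0],
              [ 2.0, -3.0,  0.0]])
assert np.isclose(np.linalg.det(M), 0.0, atol=1e-12)
```

The rescaling q → λq, p → p/λ is the simplest nontrivial canonical transformation; it preserves the bracket {q, p} exactly because the two factors of λ cancel.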
and consider those transformations that are close to the identity. Since the
identity is A(0), these will be the transformations A(θ) with θ ≪ 1. Expanding
in a Taylor series, we keep only terms to first order:

A(θ) = ( cos θ   −sin θ )
       ( sin θ    cos θ )

     = ( 1   −θ ) + O(θ²)
       ( θ    1 )

     = 1 + θ ( 0  −1 )
             ( 1   0 )

The only information here besides the identity is the matrix

( 0  −1 )
( 1   0 )
but remarkably, this is enough to recover the whole group! For general Lie
groups, we get one generator for each continuous parameter labeling the group
elements. The set of all linear combinations of these generators is a vector space
called the Lie algebra of the group, denoted by the same abbreviation in lower
case letters. The Lie algebra of SO(2) is therefore so(2). We will give the full
defining set of properties of a Lie algebra below.
To recover a general element of the group, we perform infinitely many infin-
itesimal transformations. Applying A(θ) n times rotates the plane by an angle
nθ. Writing

M = ( 0  −1 )
    ( 1   0 )

for the generating matrix, we have

A(nθ) = (A(θ))^n = (1 + θM)^n

Expanding the power on the right using the binomial expansion,

A(nθ) = Σ_{k=0}^{n} C(n,k) θ^k M^k 1^{n−k}

Now hold φ = nθ fixed and let n → ∞, so that θ = φ/n becomes infinitesimal.
Since lim_{n→∞} C(n,k)/n^k = 1/k!, this gives

A(φ) = lim_{n→∞} Σ_{k=0}^{n} C(n,k) (φ/n)^k M^k
     = Σ_{k=0}^{∞} (1/k!) φ^k M^k
     ≡ exp(φM)

where in the last step we define the exponential of a matrix to be the power
series in the second line. Quite generally, since we know how to take powers of
matrices, we can define the exponential of any matrix, M, by its power series:

exp M ≡ Σ_{k=0}^{∞} (1/k!) M^k
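The claim that this power series reproduces a rotation can be tested numerically; a sketch that sums the truncated series by hand rather than calling a library routine:

```python
import numpy as np

def mat_exp(M, terms=30):
    """Matrix exponential via the truncated power series sum_k M^k / k!."""
    result = np.zeros_like(M)
    power = np.eye(M.shape[0])   # M^0
    fact = 1.0                   # k!
    for k in range(terms):
        result = result + power / fact
        power = power @ M
        fact *= (k + 1)
    return result

theta = 0.7
J = np.array([[0.0, -1.0],
              [1.0,  0.0]])
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
assert np.allclose(mat_exp(theta * J), rotation)
```

Thirty terms is far more than needed for θ of order one; the series converges factorially fast.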
Next, we check that the exponential form of A(φ) actually is the original
class of transformations. To do this we first examine powers of the generator

M = ( 0  −1 )
    ( 1   0 )

We find

M² = ( −1   0 ) = −1,    M³ = −M,    M⁴ = 1
     (  0  −1 )

so the even and odd powers of the series collect separately:

exp(φM) = Σ_{m=0}^{∞} (1/(2m)!) M^{2m} φ^{2m} + Σ_{m=0}^{∞} (1/(2m+1)!) M^{2m+1} φ^{2m+1}

        = 1 Σ_{m=0}^{∞} ((−1)^m/(2m)!) φ^{2m} + M Σ_{m=0}^{∞} ((−1)^m/(2m+1)!) φ^{2m+1}

        = 1 cos φ + M sin φ

        = ( cos φ   −sin φ )
          ( sin φ    cos φ )
To see that this property also means that O(3) transformations preserve the
Pythagorean length of vectors, contract x^m x^n on both sides of the invariance
equation to get

[Aᵗ]^i_m η_{ij} [A]^j_n x^m x^n = η_{mn} x^m x^n

Then, defining

y^i = A^i_m x^m

we have

y^i η_{ij} y^j = η_{mn} x^m x^n

But this just says that

(y¹)² + (y²)² + (y³)² = (x¹)² + (x²)² + (x³)²

so O(3) preserves the Pythagorean norm.
Next we restrict to SO(3). Since every element of O(3) satisfies

Aᵗ A = 1

we have

1 = det(1) = det(Aᵗ) det(A) = (det(A))²

so that det A = ±1, and SO(3) consists of the elements with det A = +1. Writing
an infinitesimal transformation as A = 1 + ε and imposing AᵗA = 1 to first order
shows that the generator ε must be antisymmetric,

ε_{nm} = −ε_{mn}    (17.6)

We are dealing with 3 × 3 matrices here, but note the power of index notation!
There is actually nothing in the preceding calculation that is specific to n = 3,
and we could draw all the same conclusions up to this point for O(p, q)! For the
3 " 3 case, every antisymmetric matrix is of the form
0 a !b
A(a, b, c) = !a 0 c
b !c 0
0 1 0 0 0 !1 0 0 0
= a !1 0 0 + b 0 0 0 + c 0 0 1
0 0 0 1 0 0 0 !1 0
Notice that any three independent, antisymmetric matrices could serve as the
generators. The Lie algebra is de!ned as the entire vector space
v = v 1 J1 + v 2 J2 + v 3 J3
272
Similarly, we find that [J₂, J₃] = J₁ and [J₃, J₁] = J₂. Since the commutator is
antisymmetric, we can summarize these results in one equation,

[J_i, J_j] = ε_{ij}^k J_k    (17.8)
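Equation (17.8) is easy to verify for an explicit choice of generators; the sketch below uses the conventional antisymmetric 3 × 3 rotation generators (any three independent antisymmetric matrices, suitably normalized, would do):

```python
import numpy as np

# The three conventional rotation generators of so(3).
J1 = np.array([[0., 0.,  0.], [0., 0., -1.], [0., 1., 0.]])
J2 = np.array([[0., 0.,  1.], [0., 0.,  0.], [-1., 0., 0.]])
J3 = np.array([[0., -1., 0.], [1., 0.,  0.], [0., 0., 0.]])

def comm(A, B):
    """Matrix commutator [A, B]."""
    return A @ B - B @ A

# Cyclic structure: [J1, J2] = J3, [J2, J3] = J1, [J3, J1] = J2
assert np.allclose(comm(J1, J2), J3)
assert np.allclose(comm(J2, J3), J1)
assert np.allclose(comm(J3, J1), J2)
```

These three relations are exactly the content of [J_i, J_j] = ε_ij^k J_k once the antisymmetry of the commutator is taken into account.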
u = u^i J_i
v = v^k J_k
where for each a and each b, w_{ab} is some vector in V. We may expand each w_{ab}
in the basis as well,

w_{ab} = c_{ab}^c J_c

for some constants c_{ab}^c. The c_{ab}^c = −c_{ba}^c are called the Lie structure constants.
These constants are always real, regardless of the representation. The basis
therefore satisfies

[J_a, J_b] = c_{ab}^c J_c

which is sufficient, using linearity, to determine the commutators of all elements
of the algebra:

[u, v] = [ u^a J_a, v^b J_b ]
       = u^a v^b [J_a, J_b]
       = u^a v^b c_{ab}^c J_c
       = w^c J_c
       = w    (17.9)
Exercise 17.4. Show that if the Jacobi identity holds among all generators,
On the left, the (rs) labels tell us which generator we are talking about, while
the m and n indices are the matrix components. The net effect is that the labels
tell us which components of the matrix are nonzero. For example,

[ε^(12)]_{mn} = δ¹_m δ²_n − δ¹_n δ²_m = (  0  1  0  ⋯ )
                                        ( −1  0  0    )
                                        (  0  0  0    )
                                        (  ⋮        ⋱ )

and so on. To compute the Lie algebra, we need the mixed form of the genera-
tors,

[ε^(rs)]^m_n = η^{mk} [ε^(rs)]_{kn}
             = η^{mk} δ^r_k δ^s_n − η^{mk} δ^r_n δ^s_k
             = η^{mr} δ^s_n − η^{ms} δ^r_n
Rearranging to collect the terms as generators, and noting that each must have
free m and n indices, we get

[ ε^(uv), ε^(rs) ]^m_n = η^{vr} ( η^{mu} δ^s_n − η^{ms} δ^u_n )
                       − η^{vs} ( η^{mu} δ^r_n − η^{mr} δ^u_n )
                       − η^{ur} ( η^{mv} δ^s_n − η^{ms} δ^v_n )
                       + η^{us} ( η^{mv} δ^r_n − η^{mr} δ^v_n )
                       = η^{vr} [ε^(us)]^m_n − η^{vs} [ε^(ur)]^m_n
                         − η^{ur} [ε^(vs)]^m_n + η^{us} [ε^(vr)]^m_n    (17.10)

Finally, we can drop the matrix indices. It is important that we can do this,
because it demonstrates that the Lie algebra is a relationship among the different
generators that doesn't depend on whether the operators are written as matrices
or not. The result, valid for any o(p, q), is

[ ε^(uv), ε^(rs) ] = η^{vr} ε^(us) − η^{vs} ε^(ur) − η^{ur} ε^(vs) + η^{us} ε^(vr)    (17.11)
Exercise 17.5. By the arguments of this section, we could have written the
generators of so(3) as

[ε^(jk)]^m_n = η^{mj} δ^k_n − η^{mk} δ^j_n
so that, to linear order,

J_a x_1^a + J_a x_2^a = J_a x_3^a

This requires the generators to combine linearly under addition and scalar mul-
tiplication. Next, we require an identity operator. This just means that the
zero vector lies in the space of generators, since g(0, …, 0) = 1 = 1 + J_a 0^a. For
inverses, we have

g(x_1^a) g⁻¹(x_2^b) = 1
(1 + J_a x_1^a + ⋯)(1 + J_a x_2^a + ⋯) = 1
1 + J_a x_1^a + J_a x_2^a = 1
Since the group is closed, the product g(x) g(y) g⁻¹(x) g⁻¹(y) must itself be a
group element, g₃ = 1 + J_a z^a(x, y) + ⋯ . Expanding each factor to second order,
with g(x) = 1 + J_b x^b + ½ K_{ab} x^a x^b + ⋯ , the product is

( 1 + J_b x^b + J_b y^b + J_a J_b x^a y^b + ½ K_{bc} y^b y^c + ½ K_{ab} x^a x^b )
× ( 1 − J_d x^d − J_d y^d + J_d J_e y^d y^e + J_c J_d x^c y^d + J_c J_d x^c x^d
    − ½ K_{de} y^d y^e − ½ K_{cd} x^c x^d )

= 1 − J_d x^d − J_d y^d + J_d J_e y^d y^e + J_c J_d x^c y^d + J_c J_d x^c x^d
  − ½ K_{de} y^d y^e − ½ K_{cd} x^c x^d + ( J_b x^b + J_b y^b )( 1 − J_d x^d − J_d y^d )
  + J_a J_b x^a y^b + ½ K_{bc} y^b y^c + ½ K_{ab} x^a x^b

Collecting terms,

g₃ = 1 + J_a z^a(x, y) + ⋯
   = 1 − J_d x^d − J_d y^d + J_b x^b + J_b y^b
     + J_d J_e y^d y^e + J_c J_d x^c y^d + J_c J_d x^c x^d − J_b J_d x^b x^d
     − J_b J_d y^b x^d − J_b J_d x^b y^d − J_b J_d y^b y^d + J_a J_b x^a y^b
     + ½ K_{bc} y^b y^c + ½ K_{ab} x^a x^b − ½ K_{de} y^d y^e − ½ K_{cd} x^c x^d
   = 1 + J_c J_d x^c y^d − J_b J_d y^b x^d
   = 1 + J_c J_d x^c y^d − J_d J_c x^c y^d
   = 1 + [J_c, J_d] x^c y^d
Equating the expansion of g₃ to the collected terms, we see that for the lowest
order terms to match we must have

[J_c, J_d] x^c y^d = J_a z^a(x, y)

for some z^a. Since x^c and y^d are already small, we may expand z^a in a Taylor
series in them:

z^a = a^a + b^a_b x^b + c^a_b y^b + c_{bc}^a x^b y^c + ⋯

The first three terms must vanish, because z^a = 0 if either x^a = 0 or y^a = 0.
Therefore, at lowest order, z^a = c_{bc}^a x^b y^c and therefore the commutator must
close:

[J_c, J_d] = c_{cd}^a J_a

Finally, the Lie group is associative: if we have three group elements, g₁, g₂
and g₃, then

g₁ (g₂ g₃) = (g₁ g₂) g₃
To first order, this simply implies associativity for the generators:

J_a (J_b J_c) = (J_a J_b) J_c

From the final arrangement of the terms, we see that it is satisfied identically
provided the multiplication is associative.

Therefore, the definition of a Lie algebra is a necessary consequence of being
built from the infinitesimal generators of a Lie group. The conditions are also
sufficient, though we won't give the proof here.
The correspondence between Lie groups and Lie algebras is not one to one,
because in general several Lie groups may share the same Lie algebra. However,
groups with the same Lie algebra are related in a simple way. Our example
above of the relationship between O(3) and SO(3) is typical: these two groups
are related by a discrete symmetry. Since discrete symmetries do not partic-
ipate in the computation of infinitesimal generators, they do not change the
Lie algebra. The central result is this: for every Lie algebra there is a unique
maximal Lie group called the covering group such that every Lie group shar-
ing the same Lie algebra is the quotient of the covering group by a discrete
symmetry group. This result suggests that when examining a group symmetry
of nature, we should always look at the covering group in order to extract the
greatest possible symmetry. Following this suggestion for Euclidean 3-space and
for Minkowski space leads us directly to the use of spinors. Spinors form the
vector space on which the linear representation of the covering group acts. Thus,
we see that making the maximal use of symmetry makes the appearance of
spinors in quantum mechanics, general relativity and quantum field theory seem
natural.
Much of their importance lies in the coordinate invariance of the resulting inte-
grals.
One of the important properties of integrands is that they can all be regarded
as oriented. If we integrate a function along a curve from A to B we get a
number, while if we integrate from B to A we get minus the same number,

∫_A^B f(x) dl = −∫_B^A f(x) dl    (17.13)
A surface integral likewise changes sign if we reverse the direction of the normal
to the surface. This normal can be thought of as the cross product of two basis
vectors within the surface. If these basis vectors' cross product is taken in one
order, n has one sign. If the opposite order is taken, then −n results. Similarly,
volume integrals change sign if we change from a right- to a left-handed
coordinate system.
We can build this alternating sign into our convention for writing differential
forms by introducing a formal antisymmetric product, called the wedge product,
symbolized by ∧, which is defined to give these differential elements the proper
signs. Thus, surface integrals will be written as integrals over the products
280
dx dy, dy dz, dz dx
dx dy = !dy dx
under the interchange of any two basis forms. This automatically gives the right
orientation of the surface. Similarly, the volume element becomes
V = dx dy dz
which changes sign if any pair of the basis elements are switched.
We can go further than this by formalizing the full integrand. For a line
integral, the general form of the integrand is a linear combination of the basis
differentials,
Ax dx + Ay dy + Az dz
Notice that we simply add the different parts. Similarly, a general surface inte-
grand is
Az dx dy + Ay dz dx + Axdy dz
f (x) dx dy dz
281
Notice that the antisymmetry is all we need to rearrange any combination
of forms. In general, wedge products of even-order forms with any other forms
commute, while wedge products of pairs of odd-order forms anticommute. In
particular, functions (0-forms) commute with all p-forms. Using this, we may
interchange the order of a line element and a surface area, for if

l = A dx
S = B dy ∧ dz

then

l ∧ S = (A dx) ∧ (B dy ∧ dz)
      = A dx ∧ B dy ∧ dz
      = AB dx ∧ dy ∧ dz
      = −AB dy ∧ dx ∧ dz
      = AB dy ∧ dz ∧ dx
      = S ∧ l

but the wedge product of two line elements changes sign, for if

l₁ = A dx
l₂ = B dy + C dz

then

l₁ ∧ l₂ = (A dx) ∧ (B dy + C dz)
        = A dx ∧ B dy + A dx ∧ C dz
        = AB dx ∧ dy + AC dx ∧ dz
        = −AB dy ∧ dx − AC dz ∧ dx
        = −B dy ∧ A dx − C dz ∧ A dx
        = −l₂ ∧ l₁    (17.15)

In particular, any 1-form wedged with itself vanishes, l ∧ l = −l ∧ l = 0, and a
line element wedged with the volume element also vanishes:

l ∧ V = (A dx) ∧ (B dx ∧ dy ∧ dz)
      = AB dx ∧ dx ∧ dy ∧ dz
      = 0
Consider the effect of a change of coordinates, x = x(u, v), y = y(u, v), on the
area element:

dx ∧ dy = ( (∂x/∂u) du + (∂x/∂v) dv ) ∧ ( (∂y/∂u) du + (∂y/∂v) dv )
        = (∂x/∂v)(∂y/∂u) dv ∧ du + (∂x/∂u)(∂y/∂v) du ∧ dv
        = ( (∂x/∂u)(∂y/∂v) − (∂x/∂v)(∂y/∂u) ) du ∧ dv
        = J du ∧ dv

where

J = det ( ∂x/∂u  ∂x/∂v )
        ( ∂y/∂u  ∂y/∂v )

is the Jacobian of the coordinate transformation. This is exactly the way that an
area element changes when we change coordinates! Notice the Jacobian coming
out automatically. We couldn't ask for more: the wedge product not only
gives us the right signs for oriented areas and volumes, but gives us the right
transformation to new coordinates. Of course the volume change works, too.
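As a concrete instance, for polar coordinates x = ρ cos φ, y = ρ sin φ, the Jacobian reduces to ρ, reproducing the familiar dx ∧ dy = ρ dρ ∧ dφ; a quick numerical sketch:

```python
import numpy as np

def jacobian_polar(rho, phi):
    """det [[dx/drho, dx/dphi], [dy/drho, dy/dphi]] for x = rho cos(phi), y = rho sin(phi)."""
    J = np.array([[np.cos(phi), -rho * np.sin(phi)],
                  [np.sin(phi),  rho * np.cos(phi)]])
    return np.linalg.det(J)

# The determinant collapses to rho at every sample point.
for rho in (0.5, 1.0, 2.5):
    for phi in (0.1, 1.2, 3.0):
        assert np.isclose(jacobian_polar(rho, phi), rho)
```

The cos²φ + sin²φ identity is what makes the φ-dependence drop out of the determinant.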
Exercise 17.8. In eq.(17.15), showing the anticommutation of two 1-forms,
identify the property of form multiplication used in each step (associativity,
anticommutation of basis forms, commutation of 0-forms, etc.).
In three dimensions, with

x → x(u, v, w)
y → y(u, v, w)
z → z(u, v, w)

the new volume element is just the full Jacobian times the new volume form,

dx ∧ dy ∧ dz = J(xyz; uvw) du ∧ dv ∧ dw
So the wedge product successfully keeps track of p-dim volumes and their
orientations in a coordinate invariant way. Now any time we have an integral,
we can regard the integrand as being a differential form. But all of this can
go much further. Recall our proof that 1-forms form a vector space. Thus,
the differential, dx, of x (u, v ) given above is just a gradient. It vanishes along
surfaces where x is constant, and the components of the vector
( ∂x/∂u , ∂x/∂v )
A = Aidxi
Let's sum up. We have defined forms, written down their formal prop-
erties, and used those properties to write them in components. Then, we
defined the wedge product, which enables us to write p-dimensional integrands
as p-forms in such a way that the orientation and coordinate transformation
properties of the integrals emerge automatically.
Though it is 1-forms, A_i dx^i, that correspond to vectors, we have defined
a product of basis forms that we can generalize to more complicated objects.
Many of these objects are already familiar. Consider the product of two 1-forms:

A ∧ B = A_i dx^i ∧ B_j dx^j
      = A_i B_j dx^i ∧ dx^j
      = ½ A_i B_j ( dx^i ∧ dx^j − dx^j ∧ dx^i )
      = ½ ( A_i B_j dx^i ∧ dx^j − A_i B_j dx^j ∧ dx^i )
      = ½ ( A_i B_j dx^i ∧ dx^j − A_j B_i dx^i ∧ dx^j )
      = ½ ( A_i B_j − A_j B_i ) dx^i ∧ dx^j

The coefficients

A_i B_j − A_j B_i

are essentially the components of the cross product. We will see this in more
detail below when we discuss the curl.
The differential of a function f is

df = (∂f/∂x) dx + (∂f/∂y) dy + (∂f/∂z) dz = (∂f/∂x^i) dx^i

Since a function is a 0-form, we can imagine an operator d that differentiates
any 0-form to give a 1-form. In Cartesian coordinates, the coefficients of this
1-form are just the Cartesian components of the gradient.
The operator d is called the exterior derivative, and we may apply it to any
p-form to get a (p + 1)-form. The extension is defined as follows. First consider
a 1-form

A = A_i dx^i

We define

dA = dA_i ∧ dx^i
Similarly, since an arbitrary p-form in n-dimensions may be written as
d²x = d(dx)
    = d(1 · dx)
    = d(1) ∧ dx
    = 0
since all derivatives of the constant function f = 1 are zero. The same applies
if we apply d twice to any function:
d²f = d(df)
    = d( (∂f/∂x^i) dx^i )
    = d( ∂f/∂x^i ) ∧ dx^i
    = ( ∂²f/∂x^j ∂x^i ) dx^j ∧ dx^i
By the same argument we used to get the components of the curl, we may write
this as

d²f = ½ ( ∂²f/∂x^j∂x^i − ∂²f/∂x^i∂x^j ) dx^j ∧ dx^i
    = 0

since partial derivatives commute.
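The vanishing of d²f rests entirely on the symmetry of mixed partial derivatives, which we can illustrate with central differences on a sample function of my own choosing (the two difference quotients use the same four stencil points, so agreement is exact up to rounding):

```python
import numpy as np

def f(x, y):
    return np.sin(x) * y**3 + x * y

def mixed_partial(f, x, y, order, h=1e-4):
    """Central-difference estimate of d2f/dydx ('xy': x first) or d2f/dxdy ('yx')."""
    if order == "xy":
        gx = lambda yy: (f(x + h, yy) - f(x - h, yy)) / (2 * h)
        return (gx(y + h) - gx(y - h)) / (2 * h)
    else:
        gy = lambda xx: (f(xx, y + h) - f(xx, y - h)) / (2 * h)
        return (gy(x + h) - gy(x - h)) / (2 * h)

a, b = 0.7, 1.3
assert np.isclose(mixed_partial(f, a, b, "xy"),
                  mixed_partial(f, a, b, "yx"), atol=1e-5)
```

Both estimates converge to 3y² cos x + 1 as h → 0 for this f.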
Exercise 17.11. Prove the Poincaré Lemma: d²ω = 0, where ω is an arbitrary
p-form.
Next, consider the effect of d on an arbitrary 1-form. We have

dA = d( A_i dx^i )
   = (∂A_i/∂x^j) dx^j ∧ dx^i
   = ½ ( ∂A_i/∂x^j − ∂A_j/∂x^i ) dx^j ∧ dx^i    (17.16)

We have the components of the curl of the vector A. We must be careful
here, however, because these are the components of the curl only in Cartesian
coordinates. Later we will see how these components relate to those in a general
coordinate system. Also, recall from Section (4.2.2) that the components A_i are
distinct from the usual vector components A^i. These differences will be resolved
when we give a detailed discussion of the metric in Section (5.6). Ultimately,
the action of d on a 1-form gives us a coordinate invariant way to calculate the
curl.
Finally, suppose we have a 2-form expressed as

S = A_z dx ∧ dy + A_y dz ∧ dx + A_x dy ∧ dz

Then applying the exterior derivative gives

dS = dA_z ∧ dx ∧ dy + dA_y ∧ dz ∧ dx + dA_x ∧ dy ∧ dz
   = (∂A_z/∂z) dz ∧ dx ∧ dy + (∂A_y/∂y) dy ∧ dz ∧ dx + (∂A_x/∂x) dx ∧ dy ∧ dz
   = ( ∂A_z/∂z + ∂A_y/∂y + ∂A_x/∂x ) dx ∧ dy ∧ dz    (17.17)
so that the exterior derivative can also reproduce the divergence.
Exercise 17.12. Fill in the missing steps in the derivation of eq.(17.17).
Exercise 17.13. Compute the exterior derivative of the arbitrary 3-form, A =
f(x, y, z) dx ∧ dy ∧ dz.
The Hodge dual, *, is defined in Cartesian coordinates in three dimensions by
its action on the basis forms:

*(dx ∧ dy) = dz
*(dy ∧ dz) = dx
*(dz ∧ dx) = dy
*(dx ∧ dy ∧ dz) = 1

and we further require the star to be its own inverse,

** = 1

With these rules we can find the Hodge dual of any form in 3-dim.

Exercise 17.14. Show that the dual of a general 1-form,

A = A_i dx^i

is the 2-form

S = A_z dx ∧ dy + A_y dz ∧ dx + A_x dy ∧ dz

Exercise 17.15. Show that for an arbitrary (Cartesian) 1-form

A = A_i dx^i

we have

*d*A = div A
Exercise 17.16. Write the curl of A:

curl(A) = ( ∂A_y/∂z − ∂A_z/∂y ) dx + ( ∂A_z/∂x − ∂A_x/∂z ) dy + ( ∂A_x/∂y − ∂A_y/∂x ) dz

Exercise 17.17. Write the Cartesian dot product of two 1-forms in terms of
wedge products and duals.
We have now shown how three operations (the wedge product ∧, the exterior
derivative d, and the Hodge dual *) together encompass the usual dot and
cross products as well as the divergence, curl and gradient. In fact, they do
much more: they extend all of these operations to arbitrary coordinates and
arbitrary numbers of dimensions. To explore these generalizations, we must first
explore properties of the metric and look at coordinate transformations. This
will allow us to define the Hodge dual in arbitrary coordinates.
17.6. Transformations
Since the use of orthonormal frames is simply a convenient choice of basis, no
information is lost in restricting our attention to them. We can always return
to general frames if we wish. But as long as we maintain the restriction, we
can work with a reduced form of the symmetry group. Arbitrary coordinate
transformations (diffeomorphisms) preserve the class of frames, but only or-
thogonal transformations preserve orthonormal frames. Nonetheless, the class
of tensors remains unchanged: there is a 1-1, onto correspondence between
tensors with diffeomorphism covariance and those with orthogonal covariance.

The correspondence between general frame and orthonormal frame tensors
is provided by the orthonormal frame itself. Given an orthonormal frame

e^a = e_m{}^a dx^m

we can use the coefficient matrix e_m{}^a and its inverse to transform back and
forth between orthonormal and coordinate indices. Thus, given any vector in
an arbitrary coordinate basis,

v = v^m ∂/∂x^m
we may insert the identity in the form

δ^m_n = e_n{}^a e_a{}^m

to write

v = v^n δ^m_n ∂/∂x^m
  = v^n e_n{}^a e_a{}^m ∂/∂x^m
  = ( v^n e_n{}^a ) e_a
  = v^a e_a

The mapping

v^a = v^n e_n{}^a

is invertible because e_n{}^a is invertible. Similarly, any tensor, for example

T^{m₁…m_r}{}_{n₁…n_s}

may be converted using one factor of e_m{}^a or its inverse for each index.
Similar expressions may be written for tensors with their contravariant and
covariant indices in other orders.
Exercise 17.18. We showed in Section (3) that the components of the metric
are related to the Cartesian components by

g_{jk} = (∂x^m/∂y^j) (∂x^n/∂y^k) η_{mn}

where we have corrected the index positions and inserted the Cartesian form
of the metric explicitly as η_{mn} = diag(1, 1, 1). Derive the form of the metric in
cylindrical coordinates directly from the coordinate transformation,

x = x(ρ, φ, z) = ρ cos φ
y = y(ρ, φ, z) = ρ sin φ
z = z(ρ, φ, z) = z
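A numerical version of this exercise: form the Jacobian ∂x^m/∂y^j at a sample point and contract with η_mn; the result should be diag(1, ρ², 1). A sketch:

```python
import numpy as np

rho, phi = 1.7, 0.6   # arbitrary sample point

# Jacobian dx^m/dy^j: rows (x, y, z), columns (rho, phi, z)
Jac = np.array([[np.cos(phi), -rho * np.sin(phi), 0.0],
                [np.sin(phi),  rho * np.cos(phi), 0.0],
                [0.0,          0.0,               1.0]])

eta = np.eye(3)                    # Cartesian metric
g = Jac.T @ eta @ Jac              # g_jk = (dx^m/dy^j)(dx^n/dy^k) eta_mn
assert np.allclose(g, np.diag([1.0, rho**2, 1.0]))
```

The off-diagonal entries cancel through −ρ sin φ cos φ + ρ sin φ cos φ = 0, which is the numerical counterpart of the orthogonality of the cylindrical coordinate directions.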
Exercise 17.19. Notice that the identity matrix should exist in any coordinate
system, since multiplying any vector by the identity should be independent of
coordinate system. Show that the matrix δ^i{}_j, defined to be the unit matrix
in one coordinate system, has the same form in every other coordinate system.
Notice that the upper index will transform like a contravariant vector and the
lower index like a covariant vector. Also note that δ^i{}_j = δ_j{}^i.

Exercise 17.20. Show that the inverse to the metric transforms as a contravari-
ant second rank tensor. The easiest way to do this is to use the equation

g_{ij} g^{jk} = δ^k_i

and the result of exercise 2, together with the transformation law for g_{ij}.
Under a coordinate transformation the metric becomes

g′_{mn} = (∂x^i/∂y^m) (∂x^j/∂y^n) g_{ij}

so the determinants are related by

g′ = det g′_{mn}
   = det( (∂x^i/∂y^m) g_{ij} (∂x^j/∂y^n) )
   = det(∂x^i/∂y^m) · det g_{ij} · det(∂x^j/∂y^n)
   = J² g

Defining e_{i₁i₂…iₙ} = √g ε_{i₁i₂…iₙ}, the factors of J exactly compensate the
weight of the Levi-Civita symbol, so that e_{i…j} is a tensor. If we raise all
indices on e_{i₁i₂…iₙ}, using n copies of the inverse metric, we have

e^{j₁j₂…jₙ} = √g g^{j₁i₁} g^{j₂i₂} ⋯ g^{jₙiₙ} ε_{i₁i₂…iₙ}
            = √g g⁻¹ ε^{j₁j₂…jₙ}
            = (1/√g) ε^{j₁j₂…jₙ}
A p-form is a linear, real-valued function of p-volumes,

ω_p : V_p → R

Linearity refers to both the forms and the volumes. Thus, for any two p-forms,
ω¹_p and ω²_p, and any constants a and b,

a ω¹_p + b ω²_p

is also a p-form, while for any two disjoint p-volumes, V¹_p and V²_p,

ω_p( V¹_p + V²_p ) = ω_p( V¹_p ) + ω_p( V²_p )

In Section 3, we showed for 1-forms that these conditions specify the differential
of functions. For p-forms, they are equivalent to linear combinations of wedge
products of p 1-forms.
Let $A$ be a $p$-form in $d$ dimensions. Then, inserting a convenient normalization,
\[ A = \frac{1}{p!}\, A_{i_1 \ldots i_p}\, dx^{i_1} \wedge \cdots \wedge dx^{i_p} \]
The action of the exterior derivative, $d$, on such a $p$-form is
\[ dA = \frac{1}{p!}\, \frac{\partial A_{i_1 \ldots i_p}}{\partial x^k}\, dx^k \wedge dx^{i_1} \wedge \cdots \wedge dx^{i_p} \]
We also defined the wedge product as a distributive, associative, antisymmetric product on 1-forms:
\begin{align*}
\left(a\, dx^i + b\, dx^i\right) \wedge dx^j &= a\, dx^i \wedge dx^j + b\, dx^i \wedge dx^j \\
\left(dx^i \wedge dx^j\right) \wedge dx^k &= dx^i \wedge \left(dx^j \wedge dx^k\right) \\
dx^i \wedge dx^j &= -\, dx^j \wedge dx^i
\end{align*}
A third operation, the Hodge dual, was provisionally defined in Cartesian coordinates, but now we can write its full definition. The dual of $A$ is defined to be the $(d-p)$-form
\[ {}^{*}\!A = \frac{1}{(d-p)!\,p!}\, A_{i_1 \ldots i_p}\, e^{i_1 \ldots i_p}{}_{i_{p+1} \ldots i_d}\, dx^{i_{p+1}} \wedge \cdots \wedge dx^{i_d} \]
Notice that we have written the first $p$ indices of the Levi-Civita tensor in the superscript position to keep with our convention of always summing an up index with a down index. In Cartesian coordinates, these two forms represent the same array of numbers, but it makes a difference when we look at other coordinate systems.
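In Cartesian 3-space this definition can be exercised numerically (the sample components are our choice): the dual of a 1-form $A_i$ has components $({}^{*}\!A)_{jk} = A_i\,\varepsilon_{ijk}$, and applying the dual a second time, $({}^{**}\!A)_l = \tfrac{1}{2}({}^{*}\!A)_{jk}\,\varepsilon_{jkl}$, must reproduce the original components.

```python
from itertools import permutations

# Build the Levi-Civita symbol eps[i][j][k] in three dimensions by
# counting inversions of each permutation.
eps = [[[0] * 3 for _ in range(3)] for _ in range(3)]
for p in permutations(range(3)):
    inv = sum(1 for a in range(3) for b in range(a + 1, 3) if p[a] > p[b])
    eps[p[0]][p[1]][p[2]] = 1 if inv % 2 == 0 else -1

A = [3.0, -1.0, 2.0]                     # sample 1-form components
star_A = [[sum(A[i] * eps[i][j][k] for i in range(3)) for k in range(3)]
          for j in range(3)]             # (*A)_jk = A_i eps_ijk
AA = [0.5 * sum(star_A[j][k] * eps[j][k][l]
                for j in range(3) for k in range(3))
      for l in range(3)]                 # (**A)_l = (1/2)(*A)_jk eps_jkl
```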
Differential calculus is defined in terms of these three operations, $(\wedge, {}^{*}, d)$. Together, they allow us to perform all standard calculus operations in any number of dimensions and in a way independent of any coordinate choice.
17.8.1. Grad, Div, Curl and Laplacian
It is straightforward to write down the familiar operations of gradient and curl and divergence. We specify each, and apply each in polar coordinates, $(\rho, \varphi, z)$. Recall that the metric in polar coordinates is
\[ g_{mn} = \begin{pmatrix} 1 & & \\ & \rho^2 & \\ & & 1 \end{pmatrix} \]
its inverse is
\[ g^{mn} = \begin{pmatrix} 1 & & \\ & \frac{1}{\rho^2} & \\ & & 1 \end{pmatrix} \]
and its determinant is
\[ g = \det g_{mn} = \rho^2 \]
so the gradient, with the components of $df$ raised by $g^{mn}$ and expressed in the orthonormal basis $(\hat{\boldsymbol{\rho}}, \hat{\boldsymbol{\varphi}}, \hat{\mathbf{k}})$, is
\[ \boldsymbol{\nabla} f = \frac{\partial f}{\partial \rho}\, \hat{\boldsymbol{\rho}} + \frac{1}{\rho}\, \frac{\partial f}{\partial \varphi}\, \hat{\boldsymbol{\varphi}} + \frac{\partial f}{\partial z}\, \hat{\mathbf{k}} \]
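This component formula can be checked against a Cartesian computation (the test function $f = xy$ and the evaluation point are our choices): in polar coordinates $f = \rho^2 \cos\varphi \sin\varphi$, and the orthonormal components above must equal the Cartesian gradient $(y, x)$ projected on $\hat{\boldsymbol{\rho}}$ and $\hat{\boldsymbol{\varphi}}$.

```python
import math

# Check the polar gradient of f = x y = rho^2 cos(phi) sin(phi):
# the orthonormal components (df/drho, (1/rho) df/dphi) must match
# the Cartesian gradient (y, x) projected on rhohat, phihat.
rho, phi = 1.3, 0.6
x, y = rho * math.cos(phi), rho * math.sin(phi)

# analytic partial derivatives of f in polar coordinates
f_rho = 2 * rho * math.cos(phi) * math.sin(phi)
f_phi = rho**2 * (math.cos(phi)**2 - math.sin(phi)**2)

grad_cart = (y, x)                          # Cartesian gradient of f = x y
rhohat = (math.cos(phi), math.sin(phi))
phihat = (-math.sin(phi), math.cos(phi))
dot = lambda u, v: u[0] * v[0] + u[1] * v[1]
```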
Divergence The use of differential forms leads to an extremely useful expression for the divergence -- important enough that it goes by the name of the divergence theorem. Starting with a 1-form, $\omega = \omega_i\, dx^i$, we compute
\begin{align*}
{}^{*}d\,{}^{*}\omega &= {}^{*}d\,{}^{*}\!\left( \omega_i\, dx^i \right) \\
&= {}^{*}d\left( \frac{1}{2}\, \omega_i\, e^{i}{}_{jk}\, dx^j \wedge dx^k \right) \\
&= \frac{1}{2}\, {}^{*}d\left( \omega_i\, \sqrt{g}\, g^{in}\, \varepsilon_{njk}\, dx^j \wedge dx^k \right) \\
&= \frac{1}{2}\, {}^{*}\!\left( \frac{\partial}{\partial x^m}\left( \omega_i\, \sqrt{g}\, g^{in} \right) \varepsilon_{njk}\, dx^m \wedge dx^j \wedge dx^k \right) \\
&= \frac{1}{2}\, \frac{\partial}{\partial x^m}\left( \omega_i\, \sqrt{g}\, g^{in} \right) \varepsilon_{njk}\, e^{mjk} \\
&= \frac{1}{2}\, \frac{1}{\sqrt{g}}\, \frac{\partial}{\partial x^m}\left( \omega_i\, \sqrt{g}\, g^{in} \right) \varepsilon_{njk}\, \varepsilon^{mjk} \\
&= \frac{1}{2}\, \frac{1}{\sqrt{g}}\, \frac{\partial}{\partial x^m}\left( \omega_i\, \sqrt{g}\, g^{in} \right) 2\delta^m_n \\
&= \frac{1}{\sqrt{g}}\, \frac{\partial}{\partial x^m}\left( \omega_i\, \sqrt{g}\, g^{im} \right)
\end{align*}
In terms of the vector, rather than form, components of the original form, we may replace $\omega_i$ using $w^i = g^{ij}\omega_j$, so that
\[ {}^{*}d\,{}^{*}\omega = \frac{1}{\sqrt{g}}\, \frac{\partial}{\partial x^m}\left( \sqrt{g}\, w^m \right) = \boldsymbol{\nabla}\cdot\mathbf{w} \]
Since the operations on the left are all coordinate invariant, the expression in the middle is coordinate invariant as well. Notice that in Cartesian coordinates the metric is just $\delta_{ij}$, with determinant $1$, so the expression reduces to the familiar form of the divergence, and
\[ \boldsymbol{\nabla}\cdot\mathbf{w} = \frac{1}{\sqrt{g}}\, \frac{\partial}{\partial x^m}\left( \sqrt{g}\, w^m \right) \]
In polar coordinates we have
\begin{align*}
\boldsymbol{\nabla}\cdot\mathbf{w} &= \frac{1}{\sqrt{\rho^2}}\, \frac{\partial}{\partial x^m}\left( \sqrt{\rho^2}\, w^m \right) \\
&= \frac{1}{\rho}\left( \frac{\partial}{\partial \rho}\left( \rho\, w^\rho \right) + \frac{\partial}{\partial \varphi}\left( \rho\, w^\varphi \right) + \frac{\partial}{\partial z}\left( \rho\, w^z \right) \right) \\
&= \frac{1}{\rho}\, \frac{\partial}{\partial \rho}\left( \rho\, w^\rho \right) + \frac{\partial w^\varphi}{\partial \varphi} + \frac{\partial w^z}{\partial z}
\end{align*}
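The polar divergence formula can be tested against a Cartesian computation (the test field is our choice, not the text's): for $\mathbf{v} = (x^2, 0, 0)$ the Cartesian divergence is $2x$, and the polar coordinate components are $w^\rho = \rho^2\cos^3\varphi$, $w^\varphi = -\rho\cos^2\varphi\sin\varphi$, $w^z = 0$.

```python
import math

# Check div w = (1/rho) d(rho w^rho)/drho + dw^phi/dphi + dw^z/dz
# against div v = 2x for the Cartesian test field v = (x^2, 0, 0).
rho, phi, h = 1.9, 0.8, 1e-6
x = rho * math.cos(phi)

w_rho = lambda r, p: r**2 * math.cos(p)**3          # polar component w^rho
w_phi = lambda r, p: -r * math.cos(p)**2 * math.sin(p)  # w^phi

d_rho = ((rho + h) * w_rho(rho + h, phi)
       - (rho - h) * w_rho(rho - h, phi)) / (2 * h)  # d(rho w^rho)/drho
d_phi = (w_phi(rho, phi + h) - w_phi(rho, phi - h)) / (2 * h)
div = d_rho / rho + d_phi
```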
Curl The curl is the dual of the exterior derivative of a 1-form. Thus, if $\omega = \omega_i\, dx^i$ then
\begin{align*}
{}^{*}d\omega &= {}^{*}\!\left( \frac{\partial \omega_i}{\partial x^j}\, dx^j \wedge dx^i \right) \\
&= e^{ji}{}_{k}\, \frac{\partial \omega_i}{\partial x^j}\, dx^k \\
&= e^{ji}{}_{k}\, \frac{\partial}{\partial x^j}\left( g_{im}\, g^{mn}\, \omega_n \right) dx^k \\
&= e^{ji}{}_{k}\, g_{im} \left( \frac{\partial}{\partial x^j}\left( g^{mn}\omega_n \right) - \omega_n\, \frac{\partial g^{mn}}{\partial x^j} \right) dx^k \\
&= e_{lmk} \left( g^{lj}\, \frac{\partial w^m}{\partial x^j} - g^{lj}\, w^s\, g_{sn}\, \frac{\partial g^{mn}}{\partial x^j} \right) dx^k
\end{align*}
Now observe that
\begin{align*}
g_{sn}\, \frac{\partial g^{mn}}{\partial x^j} &= \frac{\partial}{\partial x^j}\left( g_{sn}\, g^{mn} \right) - g^{mn}\, \frac{\partial g_{sn}}{\partial x^j} \\
&= \frac{\partial \delta^m_s}{\partial x^j} - g^{mn}\, \frac{\partial g_{sn}}{\partial x^j} \\
&= -\, g^{mn}\, \frac{\partial g_{sn}}{\partial x^j}
\end{align*}
so that
\begin{align*}
{}^{*}d\omega &= e_{lmk} \left( g^{lj}\, \frac{\partial w^m}{\partial x^j} + g^{lj}\, g^{mn}\, w^s\, \frac{\partial g_{sn}}{\partial x^j} \right) dx^k \\
&= \left( e_{lmk}\, g^{lj}\, \frac{\partial w^m}{\partial x^j} + e^{jn}{}_{k}\, w^s\, \frac{\partial g_{sn}}{\partial x^j} \right) dx^k
\end{align*}
Next consider
\begin{align*}
e^{jn}{}_{k}\, \frac{\partial g_{sn}}{\partial x^j} &= e^{jn}{}_{k}\, \partial_j g_{sn} \\
&= \frac{1}{2}\, e^{jn}{}_{k} \left( \partial_j g_{sn} - \partial_n g_{sj} \right) \\
&= \frac{1}{2}\, e^{jn}{}_{k} \left( \partial_j g_{sn} - \partial_n g_{sj} + \partial_s g_{jn} \right) \\
&= e^{jn}{}_{k}\, \Gamma_{nsj}
\end{align*}
This combines to
\begin{align*}
{}^{*}d\omega &= \left( e_{lmk}\, g^{lj}\, \frac{\partial w^m}{\partial x^j} + e^{jn}{}_{k}\, w^s\, \frac{\partial g_{sn}}{\partial x^j} \right) dx^k \\
&= \left( e_{lmk}\, g^{lj}\, \frac{\partial w^m}{\partial x^j} + e^{jn}{}_{k}\, w^s\, \Gamma_{nsj} \right) dx^k \\
&= e^{j}{}_{mk} \left( \frac{\partial w^m}{\partial x^j} + g^{nm}\, w^s\, \Gamma_{nsj} \right) dx^k \\
&= e^{j}{}_{mk} \left( \partial_j w^m + w^s\, \Gamma^m{}_{sj} \right) dx^k \\
&= e^{j}{}_{mk}\, D_j w^m\, dx^k
\end{align*}
where $D_j$ denotes the covariant derivative.
Also consider the computationally direct form,
\begin{align*}
{}^{*}d\omega &= {}^{*}\!\left( \frac{\partial \omega_i}{\partial x^j}\, dx^j \wedge dx^i \right) \\
&= \frac{\partial \omega_i}{\partial x^j}\, {}^{*}\!\left( dx^j \wedge dx^i \right) \\
&= e^{ji}{}_{k}\, \frac{\partial \omega_i}{\partial x^j}\, dx^k
\end{align*}
where the dual of each basis 2-form follows from $e_{ijk} = \sqrt{g}\,\varepsilon_{ijk}$ with indices raised by the inverse metric.
To apply the formula, start with the components of the vector. In our familiar example in polar coordinates, let
\[ w^i = \left( w^\rho, w^\varphi, w^z \right) \]
The corresponding form has components $\omega_i = g_{ij} w^j = \left( w^\rho, \rho^2 w^\varphi, w^z \right)$. Therefore, the exterior derivative is
\[ d\omega = d\left( w^\rho\, d\rho + \rho^2 w^\varphi\, d\varphi + w^z\, dz \right) \]
\begin{align*}
d\omega &= \frac{\partial w^\rho}{\partial \varphi}\, d\varphi \wedge d\rho + \frac{\partial w^\rho}{\partial z}\, dz \wedge d\rho \\
&\quad + \frac{\partial}{\partial \rho}\left( \rho^2 w^\varphi \right) d\rho \wedge d\varphi + \frac{\partial}{\partial z}\left( \rho^2 w^\varphi \right) dz \wedge d\varphi \\
&\quad + \frac{\partial w^z}{\partial \rho}\, d\rho \wedge dz + \frac{\partial w^z}{\partial \varphi}\, d\varphi \wedge dz \\
&= \left( \frac{\partial}{\partial \rho}\left( \rho^2 w^\varphi \right) - \frac{\partial w^\rho}{\partial \varphi} \right) d\rho \wedge d\varphi + \left( \frac{\partial w^z}{\partial \varphi} - \frac{\partial}{\partial z}\left( \rho^2 w^\varphi \right) \right) d\varphi \wedge dz \\
&\quad + \left( \frac{\partial w^\rho}{\partial z} - \frac{\partial w^z}{\partial \rho} \right) dz \wedge d\rho
\end{align*}
The duals of the basis 2-forms are
\begin{align*}
{}^{*}\!\left( d\rho \wedge d\varphi \right) &= e^{123}\, g_{33}\, dz = \frac{1}{\rho}\, dz \\
{}^{*}\!\left( d\varphi \wedge dz \right) &= e^{231}\, g_{11}\, d\rho = \frac{1}{\rho}\, d\rho \\
{}^{*}\!\left( dz \wedge d\rho \right) &= e^{312}\, g_{22}\, d\varphi = \rho\, d\varphi
\end{align*}
so that
\begin{align*}
{}^{*}d\omega &= \frac{1}{\rho}\left( \frac{\partial}{\partial \rho}\left( \rho^2 w^\varphi \right) - \frac{\partial w^\rho}{\partial \varphi} \right) dz + \left( \frac{1}{\rho}\, \frac{\partial w^z}{\partial \varphi} - \rho\, \frac{\partial w^\varphi}{\partial z} \right) d\rho \\
&\quad + \rho \left( \frac{\partial w^\rho}{\partial z} - \frac{\partial w^z}{\partial \rho} \right) d\varphi
\end{align*}
Now, since
\[ {}^{*}d\omega = \left[ \boldsymbol{\nabla}\times\mathbf{w} \right]^i g_{ik}\, dx^k \]
we use the inverse metric on the components of ${}^{*}d\omega$ to find the contravariant components of the curl:
\begin{align*}
\left[ \boldsymbol{\nabla}\times\mathbf{w} \right]^1 &= \frac{1}{\rho}\, \frac{\partial w^z}{\partial \varphi} - \rho\, \frac{\partial w^\varphi}{\partial z} \\
\left[ \boldsymbol{\nabla}\times\mathbf{w} \right]^2 &= \frac{1}{\rho} \left( \frac{\partial w^\rho}{\partial z} - \frac{\partial w^z}{\partial \rho} \right) \\
\left[ \boldsymbol{\nabla}\times\mathbf{w} \right]^3 &= \frac{1}{\rho} \left( \frac{\partial}{\partial \rho}\left( \rho^2 w^\varphi \right) - \frac{\partial w^\rho}{\partial \varphi} \right)
\end{align*}
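A quick numerical check of the third component (with a test field of our choosing, not from the text): the rigid rotation $\mathbf{v} = (-y, x, 0)$ has Cartesian curl $(0, 0, 2)$, and in polar coordinates its coordinate components are $w^\rho = 0$, $w^\varphi = 1$, $w^z = 0$.

```python
import math

# Check [curl w]^3 = (1/rho)(d(rho^2 w^phi)/drho - dw^rho/dphi) = 2
# for the rigid-rotation field w^rho = 0, w^phi = 1, w^z = 0.
rho, h = 2.3, 1e-6
d_rho = ((rho + h)**2 * 1.0 - (rho - h)**2 * 1.0) / (2 * h)  # d(rho^2)/drho
curl_z = d_rho / rho     # w^rho = 0, so the dphi term drops out
```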
Exercise 17.21. Work out the form of the gradient, curl, divergence and Laplacian in spherical coordinates. Express your results using a basis of unit vectors.
Exercise 17.22. In an orthonormal vector basis the electric and magnetic fields and the current are
\begin{align*}
\mathbf{E} &= E^i\, \mathbf{e}_i \\
\mathbf{B} &= B^i\, \mathbf{e}_i \\
\mathbf{J} &= J^i\, \mathbf{e}_i
\end{align*}
Define equivalent forms in arbitrary coordinates by
\begin{align*}
\varepsilon &= E^i g_{ij}\, dx^j = E_j\, dx^j \\
\beta &= \frac{1}{2}\, B^i e_{ijk}\, dx^j \wedge dx^k \\
\nu &= J^i g_{ij}\, dx^j
\end{align*}
Show that Maxwell's equations,
\begin{align*}
\boldsymbol{\nabla}\cdot\mathbf{E} &= \frac{4\pi}{c}\, \rho \\
\boldsymbol{\nabla}\cdot\mathbf{B} &= 0 \\
\boldsymbol{\nabla}\times\mathbf{E} + \frac{1}{c}\, \frac{\partial \mathbf{B}}{\partial t} &= 0 \\
\boldsymbol{\nabla}\times\mathbf{B} - \frac{1}{c}\, \frac{\partial \mathbf{E}}{\partial t} &= \frac{4\pi}{c}\, \mathbf{J}
\end{align*}
may be written in terms of $\varepsilon$, $\beta$, $\nu$ and $\rho$ as
\begin{align*}
{}^{*}d\,{}^{*}\varepsilon &= \frac{4\pi}{c}\, \rho \\
d\beta &= 0 \\
d\varepsilon + \frac{1}{c}\, \frac{\partial \beta}{\partial t} &= 0 \\
{}^{*}d\,{}^{*}\beta - \frac{1}{c}\, \frac{\partial \varepsilon}{\partial t} &= \frac{4\pi}{c}\, \nu
\end{align*}
Remark 10. The third equation may be proved as follows:
\begin{align*}
d\varepsilon + \frac{1}{c}\, \frac{\partial \beta}{\partial t} &= d\left( E^i g_{ij}\, dx^j \right) + \frac{1}{c}\, \frac{\partial}{\partial t}\left( \frac{1}{2}\, B^i e_{ijk}\, dx^j \wedge dx^k \right) \\
&= \frac{\partial E_j}{\partial x^m}\, dx^m \wedge dx^j + \frac{1}{2c}\, \frac{\partial B^i}{\partial t}\, e_{ijk}\, dx^j \wedge dx^k \\
&= \frac{1}{2}\left( \frac{\partial E_j}{\partial x^m} - \frac{\partial E_m}{\partial x^j} \right) dx^m \wedge dx^j + \frac{1}{2c}\, \frac{\partial B^i}{\partial t}\, e_{ijk}\, dx^j \wedge dx^k \\
&= \frac{1}{4}\left( \frac{\partial E_l}{\partial x^n} - \frac{\partial E_n}{\partial x^l} \right) e^{inl}\, e_{ijk}\, dx^j \wedge dx^k + \frac{1}{2c}\, \frac{\partial B^i}{\partial t}\, e_{ijk}\, dx^j \wedge dx^k \\
&= \frac{1}{2}\left( \frac{1}{2}\left( \frac{\partial E_l}{\partial x^n} - \frac{\partial E_n}{\partial x^l} \right) e^{inl} + \frac{1}{c}\, \frac{\partial B^i}{\partial t} \right) e_{ijk}\, dx^j \wedge dx^k \\
&= \frac{1}{2}\left( e^{inl}\, \frac{\partial E_l}{\partial x^n} + \frac{1}{c}\, \frac{\partial B^i}{\partial t} \right) e_{ijk}\, dx^j \wedge dx^k \\
&= \frac{1}{2}\left( \boldsymbol{\nabla}\times\mathbf{E} + \frac{1}{c}\, \frac{\partial \mathbf{B}}{\partial t} \right)^{\!i} e_{ijk}\, dx^j \wedge dx^k \\
&= 0
\end{align*}
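The fourth step of the proof relies on the standard contraction identity $\varepsilon^{inl}\varepsilon_{ijk} = \delta^n_j \delta^l_k - \delta^n_k \delta^l_j$ (in Cartesian components, where $e$ and $\varepsilon$ coincide). It can be verified exhaustively:

```python
from itertools import permutations, product

# Build the Levi-Civita symbol and verify
#   eps_inl eps_ijk = delta_nj delta_lk - delta_nk delta_lj
# over all index values in three dimensions.
eps = [[[0] * 3 for _ in range(3)] for _ in range(3)]
for p in permutations(range(3)):
    inv = sum(1 for a in range(3) for b in range(a + 1, 3) if p[a] > p[b])
    eps[p[0]][p[1]][p[2]] = 1 if inv % 2 == 0 else -1

delta = lambda a, b: 1 if a == b else 0
ok = all(
    sum(eps[i][n][l] * eps[i][j][k] for i in range(3))
    == delta(n, j) * delta(l, k) - delta(n, k) * delta(l, j)
    for n, l, j, k in product(range(3), repeat=4)
)
```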
Exercise 17.23. From Maxwell's equations,
\begin{align*}
{}^{*}d\,{}^{*}\varepsilon &= \frac{4\pi}{c}\, \rho \\
d\beta &= 0 \\
d\varepsilon + \frac{1}{c}\, \frac{\partial \beta}{\partial t} &= 0 \\
{}^{*}d\,{}^{*}\beta - \frac{1}{c}\, \frac{\partial \varepsilon}{\partial t} &= \frac{4\pi}{c}\, \nu
\end{align*}
show that
\[ \frac{1}{c}\, \frac{\partial \rho}{\partial t} + {}^{*}d\,{}^{*}\nu = 0 \]
Show that this equation is the continuity equation by writing it in the usual vector notation.
Exercise 17.24. Using the homogeneous Maxwell equations
\begin{align*}
d\beta &= 0 \\
d\varepsilon + \frac{1}{c}\, \frac{\partial \beta}{\partial t} &= 0
\end{align*}
show that the electric and magnetic fields arise from a potential.
Remark 11. Start with the magnetic equation
\[ d\beta = 0 \]
By the converse of the Poincaré lemma, it follows that
\[ \beta = dA \]
for some 1-form $A$. Substitute this result into the remaining homogeneous equation,
\begin{align*}
d\varepsilon + \frac{1}{c}\, \frac{\partial}{\partial t}\, dA &= 0 \\
d\left( \varepsilon + \frac{1}{c}\, \frac{\partial A}{\partial t} \right) &= 0
\end{align*}
A second use of the converse to the Poincaré lemma shows that there exists a 0-form $\phi$ such that
\[ \varepsilon + \frac{1}{c}\, \frac{\partial A}{\partial t} = -d\phi \]
and therefore
\[ \varepsilon = -d\phi - \frac{1}{c}\, \frac{\partial A}{\partial t} \]
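A small numerical illustration of this construction (the sample potential is our choice): any field built as $\mathbf{B} = \boldsymbol{\nabla}\times\mathbf{A}$ is automatically divergence-free, consistent with $d\beta = 0$. Taking $\mathbf{A} = (-yz, xz, 0)$ gives $\mathbf{B} = (-x, -y, 2z)$, whose divergence vanishes identically.

```python
# Check that B = curl A is divergence-free for A = (-y z, x z, 0):
# componentwise, curl A = (-x, -y, 2 z), so div B = -1 - 1 + 2 = 0.
def B(x, y, z):
    return (-x, -y, 2 * z)

h = 1e-5
x, y, z = 0.3, -1.1, 0.7
divB = ((B(x + h, y, z)[0] - B(x - h, y, z)[0])
      + (B(x, y + h, z)[1] - B(x, y - h, z)[1])
      + (B(x, y, z + h)[2] - B(x, y, z - h)[2])) / (2 * h)
```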