Primer to
Introduction to Equilibrium Statistical Mechanics
October 2011 version¹

Yoshi Oono yoono@illinois.edu (3111ESB)


Physics and IGB, UIUC

These notes are for those who have not attended any undergraduate statistical thermodynamics courses except for very rudimentary ones at the 200 level; this is essentially a set of notes for a one-semester undergraduate statistical mechanics course. The notes may be used as your basic knowledge checklist; simply scan the boldface titles of entries and the index (hyperreferenced). The IESM is a critical introduction to equilibrium statistical mechanics, but this primer is rather conventional, so some critical comments are in the footnotes.²

¹ Errors and typos in version 0 (2004) were fixed thanks to Bo Liu.


² Chapter 2 covers standard elementary topics, but the author of this memo always suspects that ideal quantum gases are excessively discussed in elementary courses because it is easy to compose elementary but not-so-trivial exam questions. They are important, but other fascinating topics are squeezed out because of them. Therefore, those who do not wish to go into solid-state or low-temperature physics may browse through the ideal quantum-gas-related topics (Sections 2.12-2.14).

Contents

1 Thermodynamics Primer
1.1 Introduction
1.2 Zeroth law of thermodynamics
1.3 First law of thermodynamics
1.4 Fourth law of thermodynamics
1.5 Second law of thermodynamics
1.6 Clausius inequality
1.7 Various thermodynamic potentials
1.8 Manipulation of thermodynamic formulas
1.9 Consequences of stability of equilibrium states
1.10 Ideal rubber
1.11 Third law of thermodynamics

2 Statistical Mechanics Primer
2.1 Basic hypothesis of equilibrium statistical mechanics
2.2 Boltzmann's principle
2.3 Equilibrium at constant temperature
2.4 Simple systems
2.5 Classical statistical mechanics
2.6 Heat capacity of solid
2.7 Classical ideal gas
2.8 Open systems
2.9 Ideal particle systems: quantum statistics
2.10 Free fermion gas
2.11 Free bosons and Bose-Einstein condensation
2.12 Phonons and photons
2.13 Phase coexistence and phase rule
2.14 Phase transition

Chapter 1
Thermodynamics Primer

1.1 Introduction

1.1.1 Why do we start with thermodynamics?


When a macroscopic object is isolated and left alone for a long time, it reaches an equilibrium state. The state can be described by a set of macroscopic quantities such as temperature, pressure, volume, etc., and these quantities obey thermodynamics. Thermodynamics summarizes our empirical knowledge about macroscopic objects in equilibrium. Statistical mechanics tries to elucidate thermodynamics in terms of the statistical behavior of systems consisting of many objects obeying mechanics. However, its theoretical framework cannot be derived from mechanics; its justification comes from its consistency with empirical facts (thermodynamics).
We should not forget that the fundamental framework of statistical mechanics was established by Gibbs well before the so-called quantum revolution. Gibbs constructed a framework that is consistent with thermodynamics. Thermodynamics not only survived this revolution but, it is fair to say, served as an important guiding principle for constructing a sound theoretical framework. Statistical mechanics also survived the revolution almost intact; this is not surprising, because Gibbs built statistical mechanics relying heavily on thermodynamics. Do not forget that Gibbs was a foremost expert in thermodynamics. We can easily guess that statistical mechanics is largely independent of the actual mechanics of the microscopic world. Furthermore, if we accept the obvious fact that the ultimate judge of physics is empirical results, we could say that thermodynamics is more fundamental than the microscopic mechanical description of macroscopic objects, which is beyond our direct experimental confirmation.


It is a prejudice that a more microscopic description is more fundamental. Therefore, we start with thermodynamics. After discussing elementary thermodynamics,
we proceed to statistical mechanics. Rudiments of probability and combinatorics will
be given in due course.
1.1.2 Macroscopic objects and equilibrium state
Our empirical facts about macroscopic objects in equilibrium are usually summarized in the five laws (axioms¹) of thermodynamics. In standard approaches to thermodynamics we do not explicitly define the words "macroscopic" or "equilibrium". In standard thermodynamics, these words are implicitly defined through the fundamental laws, just as points and lines are in Euclidean geometry.
However, their intuitive meaning is as follows. We say an object is macroscopic if its halves are again macroscopic. This implies that we may ignore surface effects completely. Usually, a macroscopic object contains 10²⁰ or more molecules, and the range of intermolecular forces extends only over distances of the order of the molecular size; most molecules do not feel the surface of the object. Hence, the surface effect is almost surely negligible for ordinary macroscopic objects.
A system is said to be in an equilibrium state when all the fast processes have occurred but all the slow processes have not. This characterization of equilibrium may sound very pragmatic, but it is the honest characterization of the word "equilibrium".
1.1.3 Five fundamental laws of equilibrium thermodynamics: Summary
There are five fundamental thermodynamic laws:
[0] The existence of equilibrium states and temperature (the zeroth law).²
[1] The conservation of energy (the first law).
[2] The variational principle selecting equilibrium states (the second law).
[3] The impossibility of reaching absolute zero temperature (the third law).
[4] Thermodynamic quantities are either extensive or intensive (the fourth law).
The ordering above is not logical, but here we follow the conventional scheme. [4] is often not recognized as a fundamental law, but it is quite important. The terms intensive and extensive will be explained in 1.4.1.

¹ However, these axioms are far from sufficient to reconstruct thermodynamics mathematically. Thus, although we informally call them axioms, we should regard them as important principles.
² Strictly speaking, it is logically impossible to introduce temperature without the other laws, but in this primer we proceed intuitively, just as many rudimentary textbooks do.

1.2 Zeroth law of thermodynamics

1.2.1 Thermal equilibrium of isolated systems
A macroscopic system is said to be isolated if it has no interaction at all with its surrounding environment. If an isolated system is left undisturbed for a long time, it reaches a macroscopic state which does not change anymore. This final state is called a thermal equilibrium state.
1.2.2 Zeroth law consists of two assertions
The zeroth law consists of two assertions in the conventional exposition:
Th0a For a given isolated system there is a thermal equilibrium state.
There is a special way of making contact between two systems called thermal contact. Thermal contact is contact through a special wall which does not allow the systems to exchange work or matter, or to have any systematic macroscopic interaction (such as electromagnetic interactions).
If two systems A and B are in thermal contact and are in equilibrium as a compound system, we say A and B are in thermal equilibrium.
Th0b If systems A and B are in thermal equilibrium, and so are systems B and C, then systems A and C are in thermal equilibrium. That is, the thermal equilibrium relation is an equivalence relation.
The second assertion implies the existence of a scalar quantity called temperature (or, more precisely, an empirical temperature): there is a quantity called temperature which takes identical values for two systems in thermal equilibrium. Here we do not demonstrate this mathematically, but it should not be counterintuitive. Notice, however, that up to this point the temperature so introduced does not imply that hotter objects have higher temperatures. The definition of temperature simply tells us that hotter and colder objects have different temperatures.

1.3 First law of thermodynamics

1.3.1 Thermodynamic variables and thermodynamic space


Empirically, it is known that equilibrium states of macroscopic (and spatially homogeneous) objects can be macroscopically uniquely specified by a few variables (called
thermodynamic variables) such as temperature, volume, etc. For example, an equilibrium state of a simple fluid (ordinary liquids and gases of pure substances like
liquid water, helium gas, etc.) is uniquely specified by (P, V, M), where P is the pressure, V the volume, and M the mass or molarity of the system.


However, strictly speaking, if phase coexistence can occur, not every set of thermodynamic variables can uniquely specify the equilibrium state. Only the (internal) energy E and the work coordinates X_1, X_2, ... can, where work coordinates are thermodynamic variables that are used to express the work done to the system, whose mechanical meanings are unambiguously clear, and which are extensive (see, e.g., 1.3.8 and 1.3.9). The set {E, X_1, ...} is called the thermodynamic coordinate system. The space spanned by these thermodynamic coordinates may be called the thermodynamic space. For a given macroscopic system, each of its equilibrium states uniquely corresponds to a point in its own thermodynamic space.
Some readers might point out that there are many more macroscopic observables we can observe for a given object: shape, orientation, etc. Precisely speaking, thermodynamic states are equivalence classes of macroscopically distinguishable states according to the values of the thermodynamic coordinates.

1.3.2 Simple system and compound system


A system that is macroscopically spatially uniform (unless there is phase coexistence) and that can be thermodynamically uniquely described by a set of thermodynamic coordinates {E, X_i} with a single internal energy is called a simple system. A system that may be described as a join of simple systems is called a compound system, whose thermodynamic space may be interpreted as a direct product of the thermodynamic spaces of the constituent simple systems. In these notes, we mainly discuss simple systems.
1.3.3 Quasistatic process and path in thermodynamic space
Any (experimentally realizable) process that consists of states extremely (infinitesimally, i.e., experimentally indistinguishably) close to equilibrium states is called a quasistatic process. Quasistatic processes may be expressed as curves in the thermodynamic space. In these notes, a quasistatic process is synonymous with a process that has a curve representation in the thermodynamic space (Fig. 1.1). Whether it is reversible (retraceable) or not is not directly related to the quasistatic nature of the process.
Certainly, the processes not described as curves in the thermodynamic space are nonequilibrium processes. For example, if a process is sufficiently rapid, it cannot have any corresponding path in the thermodynamic space, because the states along the process are not infinitesimally close to equilibrium states. Only when the process is sufficiently slow are all the instantaneous macroscopic states during the process infinitesimally close to equilibrium states; in this case the path lies in the thermodynamic space. It is certain that any path corresponding to a quasistatic process can be realized reversibly. (Equilibrium) thermodynamics can tell us how to compute changes of thermodynamic quantities along any quasistatic path.
Fig. 1.1 A and B are equilibrium states. A quasistatic process connecting A and B lies in the thermodynamic space. A process from A to B need not be quasistatic; such a process cannot be described in the thermodynamic space (red).

However, whether a given quasistatic process is reversible or not depends on the context. For example, suppose a system is in contact with a cold bath across a thermally fairly well insulating wall. At each instant, the state of the system is very close to an equilibrium state, so the process is a quasiequilibrium process; but what is going on in this process is cooling, so the system + the cold bath undergoes an irreversible process. Still, there is a way to realize this temperature change reversibly for the system alone, so we can use equilibrium thermodynamics to study every state along the process.
1.3.4 State functions
If in an equilibrium state the value of a macroscopic quantity is uniquely specified
by the corresponding point in the thermodynamic space, the macroscopic quantity
is called a state function. Its value is indifferent to how the state is realized.
Once the thermodynamic space is established we may say a (univalent) function
defined on the thermodynamic space is a state function. That is, a function of the
thermodynamic coordinates is called a state function. For example, the equilibrium
volume of a system is a state function; temperature is another example.
When the initial and final equilibrium states are given, the variation of a state
function does not depend on the actual process but only on the initial and final
equilibrium states. Even if the actual process connecting these two states is not
a quasistatic process (i.e., does not lie in the state space), we can thermodynamically compute the variation of any state function during the process with the aid of
an appropriate (appropriately devised) quasistatic process connecting the same end
points.
Actually, the essence of thermodynamic computation is to devise a quasistatic
path connecting two equilibrium states that may in practice be connected by an irreversible process.
1.3.5 Joule demonstrated that energy is a state function
We have already included the internal energy E in the thermodynamic coordinates, but whether E is a state function or not was not clear before Joule. Joule experimentally proved in 1843³ that, when the initial and the final equilibrium states are specified, the necessary (mechanical and electromagnetic) work W for any process connecting these two states of a thermally isolated system⁴ is independent of the actual procedure (the actual way of supplying work) and depends only on the two ends of the process. From this we may conclude that there is a state function E whose change for a thermally isolated system is given by

ΔE = W.    (1.3.1)

Remark. A more precise statement of (a generalization of) Joule's finding is as follows. There is a special wall called an adiabatic wall such that, for a system surrounded by this wall, the work necessary to bring the system from a given initial equilibrium state to a specified final equilibrium state is independent of the actual process and depends only on the two end states of the process. Here, work is defined by mechanics and electrodynamics. □
1.3.6 Closed system, heat, and internal energy
(1.3.1) does not hold when the process is not adiabatic, even if the system does not exchange matter with its environment. A system which does not exchange matter with its environment is called a closed system. Now, even if a system is closed, if it is surrounded by an energetically "leaky" wall, the energy supplied to the system as work W may not all stay in the system, or perhaps more energy could seep through the wall into the system.
Empirically, we know that if we have two equilibrium states A and B, we can bring at least one of them into the other adiabatically (in a Dewar jar), say A → B, supplying only work W from outside. The process need not be quasistatic (perhaps we can heat the system by friction inside). In any case, in this way the energy difference of B relative to A, ΔE = E_B − E_A = W,⁵ can be measured in terms of mechanics (+ electrodynamics).
If we wish to realize the change A → B by a quasistatic and reversible process, we cannot always do so adiabatically. Now, suppose a quasistatic and non-adiabatic process A → B requires work W, which is usually smaller than ΔE (see the second law below). The deficit ΔE − W is understood as being supplied through the wall as heat
³ [1843: The first Anglo-Maori War in New Zealand; Tahiti became a French colony.]
⁴ E.g., (intuitively) any system contained in a Dewar jar.
⁵ The difference Δ due to a process is always defined as (the final quantity) − (the initial quantity).

Q: Q = ΔE − W. In this way heat is introduced. That is, we wish to keep E as a state function even for non-isolated closed systems:

ΔE = W + Q.    (1.3.2)

(1.3.2) is the conservation law of energy extended to non-mechanical processes. E is called the internal energy in thermodynamics, because we do not take into account the mechanical energy due to the motion of the system as a whole, even if the system is moving as a whole.⁶
Notice that although E is a state function, neither W nor Q is a state function; they depend explicitly on the path connecting the initial and the final equilibrium states (and the path need not be in the thermodynamic space⁷).
1.3.7 Open system and general form of the first law
When not only heat but matter can be exchanged between the system and its environment (in this case the system is called an open system), (1.3.2) does not hold
anymore. To rescue the equality, we introduce a term Z called the mass action:

ΔE = W + Q + Z.    (1.3.3)

Z is not a state function, either. Now we can summarize the first law of thermodynamics:
ThI The internal energy E defined by (1.3.3) is a state function.
The first law may be regarded as a special case of the general law of energy conservation. Strictly speaking, however, the first law only discusses processes connecting two equilibrium states.
For an infinitesimally small change, we write (1.3.3) as follows:

dE = d′W + d′Q + d′Z,    (1.3.4)

where d′ is used to emphasize that these changes are not changes of state functions (not path-independent changes).⁸

⁶ Precisely speaking, the total energy of the system observed by the co-moving and co-rotating observer is the internal energy of the system.
⁷ Notice that W is purely mechanically defined, irrespective of the nature of the process, so it is macroscopically measurable, but Q is usually not directly measurable in nonequilibrium processes; it is computed at the end to satisfy (1.3.2).
⁸ Mathematically, d′W is not a total differential (not a closed form).


1.3.8 Work due to volume change


When the change is quasistatic, W , Q and Z in (1.3.3) are determined by the equilibrium states of the system along the quasistatic path.
For example, let us consider the work required to change the system volume from V to V + dV (V is clearly a state function, so d′ is not used here). The work supplied to the system reads (see Fig. 1.2)

Fig. 1.2 Work done by volume change.

d′W = −F dl = −P dV,    (1.3.5)

where P is the pressure and the force F is given by the following formula, if the process is sufficiently slow:

F = A P,    (1.3.6)

where A is the cross section of the piston. Here we use the sign convention that the energy gained by the system is positive. Hence, in the present example, d′W should be positive when we compress the system (i.e., when dV < 0).
If the process is fast, there is not sufficient time for the system to equilibrate. For example, when we compress the system, the force actually necessary and the force given by (1.3.6) can be different; the pressure P may not even be well defined. Consequently, (1.3.5) does not hold (the work actually done is larger than that given by (1.3.5)).
1.3.9 Electromagnetic work
The electromagnetic work can be written as

d′W = H · dM,    (1.3.7)
d′W = E · dP,    (1.3.8)

where H is the magnetic field, M the magnetization, E the electric field, and P the polarization.


1.3.10 Mass action and chemical potential


The mass action is empirically written as

d′Z = Σ_i μ_i dN_i,    (1.3.9)

where N_i is the number of i-th particles (or the molarity of the i-th chemical species), and μ_i is its chemical potential.
1.3.11 The first law is not identical to the law of energy conservation
The first law is about the internal energy, which is defined only for equilibrium states. If a system
is not in equilibrium, it may have a material flow (e.g., fluid flow) that carries macroscopic kinetic
energy. Such an energy is not regarded as a part of the internal energy. Needless to say, if we count
all the energies, the conservation of energy rigorously holds for an isolated system, but the total
energy is equal to the internal energy only when the system is in equilibrium and is not moving or
rotating with respect to the observer who measures the internal energy. Thus, the conservation of
energy implies the first law, but the converse is not true.

1.4 Fourth law of thermodynamics

1.4.1 Extensive and intensive thermodynamic variables


Thermodynamic observables which are proportional to the number of particles (or
mass) in the system are called extensive quantities. For example, the internal energy U of the system is doubled when we piece together identical systems in the
identical thermodynamic state. In contrast, the temperature is not doubled by the
same procedure. Temperature is independent of the amount of mass in the system.
Thermodynamic quantities independent of the number of particles (or mass) in the
system are called intensive quantities. Temperature and pressure are examples.
Notice that the infinitesimal form of the first law of thermodynamics can be written in general as follows:

dE = d′Q + Σ_i x_i dX_i,    (1.4.1)

where the X_i are extensive quantities and the x_i are intensive quantities. The pair (x_i, X_i) is called a thermodynamically conjugate pair (with respect to energy).


1.4.2 The fourth law of thermodynamics


The fourth law claims
ThIV All thermodynamic observables are either extensive or intensive.
This empirical law is vital when we construct a statistical mechanical framework
to explain macroscopic properties of matter.
The law is also practically useful for obtaining an equation of state applicable to any amount of matter from experiments that actually use a particular amount of matter, as shown in the following example.
Example 1. This example contains thermodynamic variables we have not yet discussed at all, but what matters is only whether a particular variable is extensive or intensive, so don't worry. An empirical equation of state of a magnetic substance (2 moles) is obtained as

A = T^{1/2} M²,    (1.4.2)

where A is the Helmholtz free energy, which is an extensive variable we will discuss later, T the absolute temperature, which is intensive, and M the magnetization, which is extensive. Find the equation of state for the free energy A of N moles of the same substance.
We use the fourth law. If we express the 2-mole quantities in (1.4.2) in terms of their N-mole counterparts A and M, that is, if we replace them with 2A/N and 2M/N, we have

2A/N = T^{1/2} (2M/N)²,    (1.4.3)

so we obtain the following formula:

A = 2T^{1/2} M²/N.    (1.4.4)

□
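The scaling step can be checked symbolically; a minimal sketch in Python/sympy (the variable names are ours, not from the text):

    import sympy as sp

    T, M, N = sp.symbols('T M N', positive=True)

    # Empirical 2-mole equation of state (1.4.2): A2 = sqrt(T) * M2**2
    def A_2mol(M2):
        return sp.sqrt(T) * M2**2

    # Fourth law: A is extensive, so the N-mole free energy is (N/2) times
    # the 2-mole one, evaluated at the 2-mole magnetization M2 = (2/N)*M.
    A_Nmol = sp.simplify(N / 2 * A_2mol(2 * M / N))
    print(A_Nmol)                                               # 2*sqrt(T)*M**2/N
    assert sp.simplify(A_Nmol - 2 * sp.sqrt(T) * M**2 / N) == 0  # (1.4.4)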

1.5 Second law of thermodynamics

1.5.1 The second law of thermodynamics
We know empirically that not all conceivable processes are realizable in Nature. The second law summarizes this as follows:
ThIIcl Clausius' law: Heat cannot spontaneously be transferred from a colder to a hotter body.⁹
ThIIk Kelvin's law: A process cannot occur whose only effect is the complete conversion of heat into work. (There is no perpetuum mobile of the second kind; there is no engine which can produce work without a radiator.)
ThIIp Planck's law: In an adiabatic process, if all the thermodynamic coordinates except for E return to their original values, then ΔE ≥ 0.
1.5.2 Planck's law in thermodynamic space
The first law implies that adiabatically

dE = Σ_i x_i dX_i,    (1.5.1)

where (x_i, X_i) are conjugate pairs for work coordinates (non-thermal variables). The variables E and X_i span the thermodynamic space (1.3.1).

Fig. 1.3 The path in the thermodynamic space (axes E, X1, X2) corresponds to a quasistatic process (notice, however, that generally Planck's law does not require quasistatic processes). A vertical move implies a purely thermal process. According to Planck's law, there is no adiabatic way to move from a state to another state that is vertically below it. (The states A and B referred to in 1.5.3 and 1.5.4 lie on such a vertical line, with A below B.)

1.5.3 Planck's law, Kelvin's law and Clausius' law are equivalent
If Kelvin's law could be violated, then we could absorb heat from a colder body and then produce work. If we simply dissipate the work into heat, we could add it to a hotter body. Thus, Clausius' law would be violated. Therefore, Clausius' law implies Kelvin's law.
Conversely, if Clausius' law could be violated, we could split a body into two halves, and one of them could be made hotter than the other. This body can be used to push the state from A to B in Fig. 1.3. Thus, Kelvin's law would be violated. We have completed a demonstration of the equivalence of ThIIk and ThIIcl.
If Planck's law could be violated, then due to work alone ΔE = W < 0 would be possible (i.e., the system does work on the environment), so supplying this deficit as heat Q, we could violate Kelvin's law. Hence, Kelvin implies Planck. If Kelvin is violated, we can first absorb heat (say A to B in Fig. 1.3) and then go from B to A by converting it into work, violating Planck, so Planck implies Kelvin.

⁹ The presentation here has not defined temperature clearly yet, so strictly speaking, we cannot describe this law properly here. The author thinks Planck's law is the most elegant formulation of the second law.
1.5.4 Carathéodory's principle
Planck's law tells us that there is no way to return the system from B to A adiabatically in Fig. 1.3. Actually, notice that Planck's law tells us much more than we can illustrate in the state space. Recall that if a process is not quasistatic, we cannot draw the path corresponding to it in the state space. Still, Planck's law tells us that, irrespective of whether the processes involved are quasistatic or not, we cannot go from B to any state below it (along the constant-work-coordinate line) in the thermodynamic space.
Therefore, it is obvious that any thermodynamic state has another state that cannot be reached from the former by any adiabatic process. It is often convenient to promote this to a form of the second law:
ThIIca Carathéodory's principle: Each point in the thermodynamic space has in its every neighborhood a point which is adiabatically inaccessible.
We have already said that Planck implies Carathéodory. The converse requires more conditions; this should be obvious, because Planck requires that there is a state that cannot be accessed from a given state in a particular way, while Carathéodory claims there is a state that cannot be accessed by any means.
1.5.5 Existence of entropy
Starting with 1.5.6 we demonstrate that the second law implies the existence of entropy, the quantity that foliates the thermodynamic space into adiabats, or isentropic hypersurfaces. Roughly speaking, if the entropies of two states are identical, these states can be connected adiabatically and reversibly. If state A has larger entropy than state B, then we can never go from A to B adiabatically.
1.5.6 Adiabat expresses adiabatic accessibility limit
Choose an arbitrary point P in the thermodynamic space and a quasistatic adiabatic path connecting P and L, a line parallel to the energy axis (a constant work
coordinate line; Fig. 1.4).
Suppose the path lands on L at point Q. Can we quasistatically and adiabatically go from P to a point A or B distinct from Q on the line L? Any quasistatic path may be realized by a reversible process, so PQ can be traveled in either direction, but neither the cycle PQBP nor PAQP should be allowed. B may be reached adiabatically from P by an irreversible process, but A cannot be reached adiabatically by any means. Therefore, we conclude that the states on L that cannot be reached from P adiabatically are exactly those below Q (i.e., with internal energy lower than that of Q).
Fig. 1.4 On the line L (parallel to the E axis), B lies above Q, and A lies below Q in the region adiabatically inaccessible from P. The cycle PAQP would imply that the system can absorb heat along AQ and then convert it into work without any trace (returning to P), violating Planck's law. In contrast, PBQP is realizable, because we simply waste work into heat and discard it during BQ. Notice that PQ and QP are adiabatically allowed, and AQ, QA, BQ and QB are possible with the aid of an appropriate heat bath.

Now, moving the line L throughout the space, keeping it parallel to the energy axis, we can construct a hypersurface consisting of the points adiabatically, quasistatically and reversibly accessible from point P. This is an adiabat (the totality of the states that can be reached quasistatically from P without any heat exchange with the environment).

1.5.7 Adiabats foliate thermodynamic space
See Fig. 1.5 to understand that these sheets (= adiabats) cannot cross. This implies that we can define a state function S whose level sets are given by these sheets (S = constant defines an adiabat).

Fig. 1.5 If two adiabats cross or touch, then we can make a cycle that can be traced in either direction, because PQ and P′Q can be traced in either direction (reversibility of quasistatic processes), and so can PP′ with the aid of an appropriate heat bath. Planck's law is violated.

1.5.8 Adiabats can be parameterized by an increasing function of energy
The adiabats can have no overhang, if we regard the energy coordinate direction as the vertical direction, as can be seen from Fig. 1.6.

Fig. 1.6 Just as in Fig. 1.5, an overhang violates Planck's law.

Therefore, adiabats can be parameterized by a continuous state function S that increases monotonically with energy. This state function is essentially the entropy.
1.5.9 Relation between heat and entropy
For a given system, we have seen that we can introduce a state function S that is a monotone increasing function of E (if the other state variables are kept constant).
We can change the entropy keeping all the X_i constant; that is, we can change S by supplying or removing heat Q. Since energy is extensive, so is Q.¹⁰
Since dS > 0 is required whenever d′Q > 0, we may choose these two differentials to be proportional: d′Q = λ dS. This automatically implies that we assume S to be extensive, so the proportionality constant λ must be intensive.

¹⁰ That is, if we double the system, we must double the heat to reach the same thermodynamic state, characterized by the same intensive parameters and densities (= extensive variables per volume).

Suppose two systems are in contact through a wall that allows only the exchange of heat, and they are in thermal equilibrium. Exchange of heat d′Q between the systems is a reversible process (say, system I gains d′Q_I = d′Q and II d′Q_II = −d′Q), so this process occurs within a single adiabat of the compound system. If we write d′Q_X = λ_X dS_X (X = I or II),

0 = dS_I + dS_II = d′Q (1/λ_I − 1/λ_II).    (1.5.2)

This implies λ_I = λ_II. That is, when two systems have the same temperature, the proportionality constant λ is also the same. Hence, we may interpret the proportionality factor as a temperature (cf. the zeroth law). The temperature so introduced can be chosen as a universal temperature T called the absolute temperature. Hence, in a quasistatic process we can write

d′Q = T dS.    (1.5.3)

Remark The discussion given above is a crude version of the standard thermodynamic demonstration of the existence of an integrating factor for d′Q. The usual demonstration with the aid of a classical ideal gas must be avoided, because the classical ideal gas is not consistent with thermodynamics.

1.5.10 Infinitesimal form of the first law with the aid of entropy
Now we can write down the infinitesimal version of the first law of thermodynamics for quasistatic processes as follows:

dE = T dS − P dV + μ dN + H · dM + · · · .    (1.5.4)

This is called the Gibbs relation. Notice that each term consists of a product of an intensive factor and d[the corresponding (i.e., conjugate) extensive quantity].

1.6 Clausius inequality

1.6.1 Entropy cannot decrease in isolated systems
In the preceding section we have shown that whenever there is an irreversible change in an isolated system (more generally, an adiabatic system), the second law of thermodynamics implies that the entropy increases; more precisely, we have shown that we can introduce the concept of entropy so as to satisfy this condition. See Fig. 1.4 in 1.5.6 again. From P we can reach the portion of L no lower than Q; we have shown that the entropy can be introduced as an increasing function of E along L. In an adiabatic system, only when the change is quasistatic does the entropy stay constant. Thus, when d′Q = 0 (adiabatic),

ΔS ≥ 0.    (1.6.1)

Here, Δ implies the difference between the values of the final and of the initial equilibrium state.¹¹ This is called Clausius' inequality (for the isolated system).
1.6.2 Stability of state and evolution criterion
Since all spontaneous processes are irreversible, we may say that for a system to evolve (under an adiabatic condition) from an initial state to a final state, the entropy change (which is completely determined by these end points in equilibrium) must be positive. If not, there cannot be any spontaneous change; the system is in a thermodynamically stable state. Thus, for an isolated system

δS < 0 : the state is thermodynamically stable,    (1.6.2)
δS > 0 : the state spontaneously evolves,    (1.6.3)

where δ implies virtual changes of states. The first line above is the stability condition, and the second the evolution criterion. If δS = 0, the equilibrium state can be changed to another adiabatically and reversibly, and/or can drift to another equilibrium state.
Here, virtual changes might be taken as fictitious changes, but in an actual system these changes are realized by thermal fluctuations. Thus, if the evolution criterion δS > 0 is satisfied, in most cases the system actually moves away from the current state.
¹¹ The reader might feel this is a bit strange, because equilibrium states should not evolve further. The situation Δ indicates is as follows: initially the system is in an equilibrium state. Then some (environmental) conditions are changed (say, the volume is changed); whether this change is slow or rapid, we do not care. After this change, the system (in suitable isolation, as required by the adiabatic condition) settles down to another (new) equilibrium state (as guaranteed by the zeroth law). Δ compares this new equilibrium state to the original equilibrium state before the change.

1.6.3 Variational principle for equilibrium state
If an isolated system arrives at a stable equilibrium state, its entropy must be maximized. Therefore, the second law gives us a variational principle (the entropy maximization principle) to find the stable equilibrium state of an isolated system.
1.6.4 Extension to non-isolated system
Next, we would like to extend our inequality for isolated systems to non-isolated
systems. The following argument is a standard strategy that we use repeatedly
throughout statistical thermodynamics. To consider a system which is not isolated,
that is, a system which is interacting with its environment, we construct an isolated
system composed of the system itself (I) and its interacting environment (II) (Fig.
1.7). We assume that both systems are macroscopic, so we may safely ignore the
surface effect.

II

Fig. 1.7
The system II is the environment for the system
we are interested in I. II is sufficiently large so
no change in I significantly affects II.

reservoir
The environment is stationary: its intensive thermodynamic variables, such as temperature, are kept constant. To realize this we take a sufficiently big system (called a reservoir, like a thermostat or a chemostat) as the environmental system II. Even if a change is rather drastic for the system I itself, it is negligible for the system II, because II is very large. Therefore, we may assume that any process in the system I is a quasistatic process for the system II. This means that the entropy change of the compound system I+II is given by the sum of the entropy change of the system I, denoted by ΔS_I, and that of the environment II, denoted by ΔS_II.
1.6.5 Clausius inequality for general cases
Since the whole system I+II is isolated, the second law or Clausius' inequality (1.6.1) for isolated systems tells us that

ΔS_I + ΔS_II ≥ 0.    (1.6.4)

Let Q (> 0) be the heat transferred to the system I from the environment II. From our assumption, we have

ΔS_II = −Q/T_e,    (1.6.5)

where T_e is the temperature of the environment. The minus sign is there because II is losing heat to I. Combining (1.6.4) and (1.6.5) yields the following inequality:

ΔS_I ≥ Q/T_e.    (1.6.6)

This is Clausius' inequality for non-isolated systems. Of course, for isolated systems Q vanishes, so we recover (1.6.1).
1.6.6 Intrinsic change of entropy
If a process is quasistatic (and isothermal), then the fundamental relation between entropy and heat reads

ΔS_I|_reversible = Q/T,    (1.6.7)

where T is the temperature of the system I. For this process T_e must be identical to T. Hence the entropy change in this reversible process is solely due to the transfer of heat (i.e., solely due to the interaction with the environment).
When irreversibility occurs, the equality (1.6.7) is violated. To describe this, de Donder split the entropy change into two parts, and introduced the concept of the intrinsic change of entropy due to irreversibility:

Δ_i S ≡ ΔS − Q/T_e.    (1.6.8)

The second law reads

Δ_i S ≥ 0.    (1.6.9)

The intrinsic change is interpreted as the portion of the entropy change produced inside the system by the very irreversibility of the process.
1.6.7 Clausius inequality in terms of internal energy
Clausius' inequality can be rewritten in terms of the internal energy as follows. We have

ΔE = Q + W + Z.    (1.6.10)

Combining this with Clausius' inequality (1.6.6), we get

ΔE − W − Z ≤ T_e ΔS,    (1.6.11)

or

ΔE ≤ T_e ΔS − P_e ΔV + μ_e ΔN + · · · ,    (1.6.12)

where the quantities with subscript e all refer to the environment.

1.6.8 Equilibrium conditions for two systems in contact
As an application of the entropy maximization principle (1.6.3), let us study the equilibrium condition for two systems I and II interacting through various walls.

Fig. 1.8 The thick vertical segment is the wall that selectively allows the exchange of a certain extensive quantity between I and II.

i) Consider a rigid impermeable wall which is diathermal. The two systems in contact through this wall exchange energy (internal energy) in the form of heat. The total entropy of the system S is the sum of the entropies of the subsystems, S_I and S_II. The total internal energy E is also the sum of the subsystem internal energies E_I and E_II (extensivity). We isolate the compound system and ask for the equilibrium condition of the system. We should maximize the total entropy with respect to the variations of E_I and E_II (the variational principle for equilibrium; see 1.6.3):

δS = (∂S_I/∂E_I) δE_I + (∂S_II/∂E_II) δE_II = [∂S_I/∂E_I − ∂S_II/∂E_II] δE_I = 0,    (1.6.13)

where we have used δE = 0, i.e., δE_I = −δE_II. Hence, the equilibrium condition is

∂S_I/∂E_I = ∂S_II/∂E_II,    (1.6.14)

or T_I = T_II (recall that the Gibbs relation (1.5.4) implies (∂S/∂E)_X = 1/T).
ii) Consider a diathermal impermeable wall which is movable. In this case the two systems can exchange energy and volume. If we assume that the total volume of the system is kept constant, the equilibrium condition should be

δS = (∂S_I/∂V_I) δV_I + (∂S_II/∂V_II) δV_II = [∂S_I/∂V_I − ∂S_II/∂V_II] δV_I = 0,    (1.6.15)

and T_I = T_II; that is,

∂S_I/∂V_I = ∂S_II/∂V_II    (1.6.16)

and T_I = T_II. Therefore, P_I = P_II is also required (note that (∂S/∂V)_E = P/T).
If the wall is adiabatic, then it cannot exchange heat, so there is no way to exchange entropy. This suggests that it is convenient to use the Gibbs relation (1.5.4) directly. P_I = P_II is the condition; we cannot say anything about the temperatures.
iii) Consider a wall semi-permeable to the i-th chemical species. In this case it is natural to assume that the wall is diathermal. Hence, the two systems can exchange the molarity N_i of chemical species i and internal energy. The total number of the i-th particles is conserved, so, quite analogously to i) and ii), we get the following equilibrium conditions:

∂S_I/∂N_iI = ∂S_II/∂N_iII,   ∂S_I/∂E_I = ∂S_II/∂E_II.    (1.6.17)

That is, T_I = T_II and μ_iI = μ_iII.
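A toy illustration of case i), maximizing the total entropy over the energy partition (a minimal sketch; the fundamental relations S_X = C_X ln E_X are an assumption chosen only for simplicity):

    import sympy as sp

    E, EI = sp.symbols('E E_I', positive=True)
    CI, CII = sp.symbols('C_I C_II', positive=True)

    # Toy fundamental relations S_X = C_X*ln(E_X), so that T_X = E_X/C_X.
    S_total = CI * sp.log(EI) + CII * sp.log(E - EI)   # E_II = E - E_I

    # Entropy maximization (cf. (1.6.13)): dS_total/dE_I = 0
    EI_eq = sp.solve(sp.diff(S_total, EI), EI)[0]
    print(EI_eq)                                       # C_I*E/(C_I + C_II)

    # At the maximum the temperatures T = E/C agree, i.e. T_I = T_II:
    assert sp.simplify(EI_eq / CI - (E - EI_eq) / CII) == 0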
1.6.9 Phase coexistence condition
When two phases are in equilibrium, the phase boundary is a wall which allows exchanges of all the extensive quantities in the system. Therefore, the equilibrium condition (that is, the phase coexistence condition) for the two phases is that all the intensive quantities of both phases are identical. See, e.g., 2.13.1 and 2.13.2.

1.7 Various thermodynamic potentials

1.7.1 Helmholtz free energy


In reality, the variables S, V, N, to be controlled for the ordinary Gibbs relation
(1.5.4) are often hard to control or at least awkward. For example, to keep volume
constant may be more difficult than to keep pressure constant. Perhaps, to keep the
temperature constant is easier than the adiabatic condition.
In order to change independent variables from S, V, N, to T, V, N, , we perform the following Legendre transformation:
E E T S.

(1.7.1)

The introduced quantity E T S is called the Helmholtz free energy and is usually
written as A.12 The total differential of A reads
dA = dE T dS SdT = SdT P dV + dN + ,
12

Old literatures use F .

(1.7.2)

where we have used the Gibbs relation (1.5.4).
The Helmholtz free energy should be a good thermodynamic potential under constant T, V, N, .... When we compute dS, we regard S as a function of T, V, N, ....
Under constant T, (1.6.12) reads

ΔE − T_e ΔS = ΔA ≤ −P_e ΔV + μ_e ΔN + · · · ,    (1.7.3)

so under constant T, V, N, ..., for any change¹³

ΔA ≤ 0.    (1.7.4)

Hence, in the stable equilibrium state under constant T, V, N, ..., the Helmholtz free energy must be the global minimum (i.e., in particular, δA > 0).¹⁴
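A small symbolic illustration of the Legendre transformation (a sketch; the fundamental relation E = S²/V is a made-up example, not one from these notes):

    import sympy as sp

    S, V, T = sp.symbols('S V T', positive=True)

    E = S**2 / V                                       # toy fundamental relation E(S, V)
    S_of_T = sp.solve(sp.Eq(T, sp.diff(E, S)), S)[0]   # invert T = (dE/dS)_V = 2S/V
    A = sp.simplify((E - T * S).subs(S, S_of_T))       # Legendre transform (1.7.1)
    print(A)                                           # -T**2*V/4

    # Consistency with (1.7.2): (dA/dT)_V = -S and (dA/dV)_T = -P
    assert sp.simplify(sp.diff(A, T) + S_of_T) == 0
    P = -sp.diff(E, V)                                 # P = S**2/V**2
    assert sp.simplify(sp.diff(A, V) + P.subs(S, S_of_T)) == 0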
1.7.2 Under constant pressure and temperature: Gibbs free energy
If we wish to study a closed system under constant T and P, we should further change the independent variables from T, V, N, ... to T, P, N, .... The necessary Legendre transformation is (notice that the conjugate quantity of V is −P, not P)

A + P V = E − T S + P V ≡ G,    (1.7.5)

which is called the Gibbs free energy. We have

dG = −S dT + V dP + μ dN.    (1.7.6)

Exercise 1. Demonstrate that in the stable equilibrium state under constant T, P, N, ..., G is a minimum. □
If we wish to study a system in which S, P, and N are kept constant, the following thermodynamic potential, called the enthalpy H, is convenient, as is easily guessed:

H ≡ E + P V.    (1.7.7)

Find the total differential of H and the stability condition for the equilibrium state under constant S, P, N, ....
Remark. The Legendre transformation is a more general concept than discussed above. Convex analysis is the mathematical topic covering the general theory of the Legendre transformation. □
¹³ "Any change" here means literally any change; the change need not be small (that is why Δ is used instead of δ), and can be any local change in the system; for example, we could interpret a single simple system as a compound system and manipulate the thermodynamic variables of the constituent subsystems freely within the required overall constraints that T, V, N, ... are constant.
¹⁴ In equilibrium thermodynamics, usually, a local minimum implies the global minimum, but global minima may not be unique. If the minima are isolated in the thermodynamic space, locally δA > 0 holds, but this does not imply ΔA > 0.

1.7.3 Gibbs-Duhem relation
Let us pursue a consequence of the fourth law (1.4.2) of thermodynamics. If we increase the amounts of all the materials in the system from N_i to (1 + ε)N_i, then all the extensive quantities are multiplied by 1 + ε, and all the intensive quantities remain unaltered. Therefore, (1.5.4) now reads

d[(1 + ε)E] = T d[(1 + ε)S] − P d[(1 + ε)V] + Σ_i μ_i d[(1 + ε)N_i] + · · · ,    (1.7.8)

or

E dε = T S dε − P V dε + Σ_i μ_i N_i dε + · · · .    (1.7.9)

That is, we have

E = T S − P V + Σ_i μ_i N_i + · · · .    (1.7.10)

Combining the total differential of this formula and the Gibbs relation (1.5.4), we arrive at

S dT − V dP + Σ_i N_i dμ_i + · · · = 0.    (1.7.11)

This important relation is called the Gibbs-Duhem relation.
Exercise 1. Demonstrate the following formulas in two ways: 1) with the aid of the definitions of the various thermodynamic potentials, 2) directly from the total differential formulas such as (1.7.2) and (1.7.6), using the same logic we have just used to demonstrate the Gibbs-Duhem relation:

A = −P V + Σ_i μ_i N_i + · · · ,    (1.7.12)
H = T S + Σ_i μ_i N_i + · · · ,    (1.7.13)
G = Σ_i μ_i N_i + · · · .    (1.7.14)

□
The last formula in the exercise implies that for a simple pure fluid

μ = G/N.    (1.7.15)
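A quick symbolic check of the Euler relation (1.7.10) (a sketch; the degree-one homogeneous fundamental relation below is a made-up single-component example):

    import sympy as sp

    S, V, N = sp.symbols('S V N', positive=True)

    # A made-up fundamental relation, homogeneous of degree one in (S, V, N):
    E = S**2 / V + V**2 / N

    T = sp.diff(E, S)          # temperature
    P = -sp.diff(E, V)         # pressure
    mu = sp.diff(E, N)         # chemical potential

    # Euler relation (1.7.10): E = T*S - P*V + mu*N
    assert sp.simplify(T * S - P * V + mu * N - E) == 0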

1.8 Manipulation of thermodynamic formulas

1.8.1 Symmetry of mixed second partial derivatives
Let f be a function of x and y. If its partial derivatives f_x and f_y exist and are differentiable in a domain D, then

f_xy = f_yx    (1.8.1)

in D (Young's theorem).
Remark. More precisely, if f_x, f_y and f_xy exist and f_xy is continuous, then f_yx exists and is identical to f_xy (Schwarz's theorem). Notice that f_xx and f_yy do not necessarily exist under this condition. □
Hence, so long as the thermodynamic potentials are smooth, we may apply Young's theorem to them. The resultant equations corresponding to (1.8.1) are collectively called Maxwell's relations. Some examples follow.

dE = T dS − P dV + · · ·    (1.8.2)

gives

(∂T/∂V)_S = −(∂P/∂S)_V.    (1.8.3)

Notice the conjugate pairs appearing upstairs and in the conditions (the variables held constant). If we start from the Helmholtz free energy, we get

(∂S/∂V)_T = (∂P/∂T)_V.    (1.8.4)
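Maxwell's relations are just Young's theorem applied to a thermodynamic potential, so they can be checked mechanically; a minimal sketch of (1.8.4) with a made-up van der Waals-like free energy (the specific form of A is our assumption):

    import sympy as sp

    T, V = sp.symbols('T V', positive=True)
    a, b, R = sp.symbols('a b R', positive=True)

    # Assumed Helmholtz free energy (van der Waals-like volume dependence):
    A = -R * T * sp.log(V - b) - a / V - sp.Rational(3, 2) * R * T * sp.log(T)

    S = -sp.diff(A, T)     # entropy,  S = -(dA/dT)_V
    P = -sp.diff(A, V)     # pressure, P = -(dA/dV)_T

    # Maxwell relation (1.8.4): (dS/dV)_T = (dP/dT)_V
    assert sp.simplify(sp.diff(S, V) - sp.diff(P, T)) == 0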

1.8.2 Jacobian technique to manipulate partial derivatives
To manipulate many partial derivatives, it is very convenient to use the so-called Jacobian technique. The Jacobian for two independent variables is defined as the following determinant:

∂(X, Y)/∂(x, y) ≡ det( (∂X/∂x)_y  (∂X/∂y)_x ; (∂Y/∂x)_y  (∂Y/∂y)_x ) = (∂X/∂x)_y (∂Y/∂y)_x − (∂X/∂y)_x (∂Y/∂x)_y.    (1.8.5)

In particular, we have

∂(X, y)/∂(x, y) = (∂X/∂x)_y.    (1.8.6)

This and some simple algebraic relations (1.8.3) are the keys.
Remark. More generally, the Jacobian of n functions {f_i} of n independent variables {x_i} is defined by

∂(f_1, f_2, ..., f_n)/∂(x_1, x_2, ..., x_n) ≡ det(∂f_i/∂x_j).    (1.8.7)

□
1.8.3 Useful elementary relations involving Jacobians
From the properties of determinants, if we change the order of variables or functions, there is a sign change:

∂(X, Y)/∂(x, y) = −∂(X, Y)/∂(y, x) = ∂(Y, X)/∂(y, x) = −∂(Y, X)/∂(x, y).    (1.8.8)

If we assume that X and Y are functions of a and b, and that a and b are, in turn, functions of x and y, we have the following multiplicative relation:

[∂(X, Y)/∂(a, b)] [∂(a, b)/∂(x, y)] = ∂(X, Y)/∂(x, y).    (1.8.9)

This is a disguised chain rule. The proof of this relation is left to the reader. Use

(∂X/∂x)_y = (∂X/∂a)_b (∂a/∂x)_y + (∂X/∂b)_a (∂b/∂x)_y.    (1.8.10)

The rest is straightforward algebra.
From (1.8.9) we get at once

∂(X, Y)/∂(x, y) = 1 / [∂(x, y)/∂(X, Y)].    (1.8.11)

In particular, we have

(∂X/∂x)_Y = 1 / (∂x/∂X)_Y.    (1.8.12)

Using these relations, we can easily demonstrate

(∂x/∂y)_X = −(∂X/∂y)_x (∂x/∂X)_y    (1.8.13)

as follows:

∂(X, x)/∂(y, x) = [by (1.8.9)] [∂(y, X)/∂(y, x)] [∂(X, x)/∂(y, X)] = [by (1.8.8)] −[∂(x, X)/∂(y, X)] [∂(X, y)/∂(x, y)].    (1.8.14)

Then, use (1.8.11). A concrete example of this formula is

(∂V/∂T)_P = −(∂P/∂T)_V (∂V/∂P)_T,    (1.8.15)

which relates thermal expansivity and isothermal compressibility.
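A symbolic check of (1.8.15) (a sketch; the van der Waals equation of state is our assumed example, with V(T, P) treated implicitly):

    import sympy as sp

    T, V = sp.symbols('T V', positive=True)
    a, b, R = sp.symbols('a b R', positive=True)

    P = R * T / (V - b) - a / V**2        # assumed equation of state P(T, V)

    # Implicit function theorem: (dV/dT)_P = -(dP/dT)_V / (dP/dV)_T,
    # and (dV/dP)_T = 1/(dP/dV)_T by (1.8.12).
    dVdT_P = -sp.diff(P, T) / sp.diff(P, V)
    dVdP_T = 1 / sp.diff(P, V)

    # (1.8.15): (dV/dT)_P = -(dP/dT)_V * (dV/dP)_T
    assert sp.simplify(dVdT_P + sp.diff(P, T) * dVdP_T) == 0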


1.8.4 Maxwell's relations in terms of Jacobians
All the Maxwell's relations can be unified in the following form:

∂(X, x)/∂(Y, y) = −1,    (1.8.16)

where (x, X) and (y, Y) are conjugate pairs.

1.9 Consequences of stability of equilibrium states

1.9.1 Origin of definite signs of many second derivatives
Clausius' inequality was interpreted as an evolution criterion for equilibrium states in 1.6.2. From this inequality we derived other inequalities in terms of various thermodynamic potentials. These inequalities dictate the signs of many derivatives of thermodynamic quantities.

1.9.2 The second differential must have a definite sign¹⁵
Let us start with (1.6.12). This is an evolution criterion: if this inequality is satisfied, the system evolves spontaneously. Therefore, if the equilibrium state under consideration is stable (or neutral), we must have

δE ≥ T_e δS − P_e δV + μ_e δN + · · · + x_ie δX_i + · · · ,    (1.9.1)

where δ implies virtual changes of variables and · · · denotes other first-order differential terms. On the other hand, the expansion of E to second order reads

δE = T δS − P δV + μ δN + · · · + x_i δX_i + · · · + (1/2) Σ_{i,j} (∂²E/∂X_i∂X_j) δX_i δX_j + higher-order differential terms.    (1.9.2)

Since the system is in equilibrium with the environment with T_e, P_e, μ_e, etc., T = T_e, P = P_e, etc. hold. Therefore, combining these two formulas, we conclude that

(1/2) Σ_{i,j} (∂²E/∂X_i∂X_j) δX_i δX_j ≥ 0    (1.9.3)

for any {δX_i}. That is, the Hessian matrix (∂²E/∂X_i∂X_j) is positive semidefinite (if the system is really stable, positive definite).¹⁶

¹⁵ This is a convex analysis topic: E is a convex function of the extensive variables.
¹⁶ If the internal energy E is twice differentiable. E is always a C¹ function: the intensive variables are continuous functions, but twice differentiability may not be guaranteed.
1.9.3 Simple consequence of positive semidefiniteness of the Hessian matrix
A necessary and sufficient condition for a matrix to be positive definite is that all its principal minors are positive. In particular, (1.9.3) implies that all the diagonal elements are non-negative:

∂²E/∂X_i² ≥ 0.    (1.9.4)

For example,

∂²E/∂S² > 0 ⇔ (∂T/∂S)_V > 0 ⇔ C_V > 0,    (1.9.5)
∂²E/∂V² > 0 ⇔ −(∂P/∂V)_S > 0 ⇔ κ_S > 0,    (1.9.6)

where C_V (≡ T(∂S/∂T)_V) is the specific heat under constant volume, and κ_S (≡ −(∂V/∂P)_S/V) is the adiabatic compressibility.
That the positivity of these quantities implies the stability of the system is intuitively understandable. Suppose C_V < 0. Then, if the system gains energy (as heat), the temperature of the system decreases. Consequently, the system becomes a heat sink, and sucks up all the energy of the universe.
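A concrete check of (1.9.3)-(1.9.6) (a sketch; the monatomic-gas-like fundamental relation, with all constants dropped, is our assumption):

    import sympy as sp

    S, V = sp.symbols('S V', positive=True)

    # Made-up fundamental relation E(S, V) (Sackur-Tetrode-like, constants dropped):
    E = sp.exp(2 * S / 3) / V ** sp.Rational(2, 3)

    H = sp.hessian(E, (S, V))
    m1 = sp.simplify(H[0, 0])      # = (4/9)E > 0       -> (1.9.5), C_V > 0
    m2 = sp.simplify(H.det())      # = (8/27)E**2/V**2 > 0 -> positive definite
    print(m1.is_positive, m2.is_positive)   # True True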
1.9.4 More general consequences of positive semidefiniteness of the Hessian matrix
Generally, the above-mentioned necessary and sufficient condition for the positive definiteness of the matrix (∂²E/∂X_i∂X_j) implies

∂(x_i, x_j, ..., x_l)/∂(X_i, X_j, ..., X_l) > 0,    (1.9.7)

where x_i is the conjugate variable of X_i:

x_i ≡ (∂E/∂X_i)_{X_1, X_2, ..., X_l (except X_i)}.    (1.9.8)

In particular, we have

∂(T, −P)/∂(S, V) > 0, i.e., ∂(T, P)/∂(S, V) < 0.    (1.9.9)

Notice that, whenever you use general formulas, the conjugate of V is not P but −P.
1.9.5 Use of other thermodynamic potentials
We can start with any stability inequality. For example, we may start from the inequality (1.7.4) for the Helmholtz free energy. In this case the temperature must be kept constant at T_e. The general condition corresponding to (1.9.3) is

(1/2) Σ_{i,j} (∂²A/∂Y_i∂Y_j)|_{T_e, other Y's} δY_i δY_j ≥ 0,    (1.9.10)

where the Y_i are the extensive natural variables of A.
Example 1. Show that any specific heat C_a is positive if the system is stable, where the suffix a implies that a is kept constant; a can be any variable, intensive or extensive. The specific heat is always defined by

C_a ≡ T (∂S/∂T)_a = (d′Q/dT)_a.    (1.9.11)

If a is extensive, the derivative is just related to a diagonal element of (∂²E/∂X_i∂X_j); notice that (∂²E/∂S²)_Y = T/C_Y. If a is intensive, we use

∂(S, x)/∂(T, x) = [by (1.8.9)] [∂(S, X)/∂(T, x)] [∂(S, x)/∂(S, X)],    (1.9.12)

where X is the extensive conjugate variable of x. The first factor on the RHS (= right-hand side) is just a 2×2 case of (1.9.7), while the second factor is its 1×1 case. Thus we may conclude, for example, that C_P is positive. □
Generally, we can show

(∂X/∂x)_{...} > 0,    (1.9.13)

where (X, x) is a conjugate pair, and ... denotes various constraints.
1.9.6 Le Chatelier's principle
Suppose an equilibrium state is disturbed by applying a small change in X. Le Chatelier's principle asserts that the direct effect of this change on the conjugate variable x occurs in the direction that eases the effect of the change in X. This is an interpretation of

(∂x_i/∂X_i)_{variables other than X_i} ≥ 0.    (1.9.14)

For example, suppose we introduce heat into a system in equilibrium. This is interpreted as increasing S of the system. If the system temperature went down (i.e., δT < 0), then more heat could flow in. This would cause a further decrease of the system temperature, and the situation would run away. Certainly, such an equilibrium state cannot be stable. Therefore, increasing S (i.e., δS ≥ 0) must imply δT ≥ 0 if the equilibrium is stable. This implies the positivity of the specific heat.
1.9.7 Le Chatelier-Braun's principle
Suppose an equilibrium state is disturbed by applying a small change in X. Le Chatelier-Braun's principle asserts that the indirect effect of this change through y occurs in the direction that eases the effect of the change in X. This is an interpretation of

(∂x/∂X)_Y ≥ (∂x/∂X)_y,    (1.9.15)

or we may write

(δx)_Y ≥ (δx)_y.    (1.9.16)

Let us demonstrate (1.9.15):

(∂x/∂X)_y = ∂(x, y)/∂(X, y) = [∂(x, y)/∂(X, Y)] [∂(X, Y)/∂(X, y)]    (1.9.17)
= [(∂x/∂X)_Y (∂y/∂Y)_X − (∂x/∂Y)_X (∂y/∂X)_Y] (∂Y/∂y)_X    (1.9.18)
= (∂x/∂X)_Y − (∂Y/∂y)_X (∂x/∂Y)_X (∂y/∂X)_Y    (1.9.19)
= (∂x/∂X)_Y − (∂Y/∂y)_X [∂(x, X)/∂(Y, X)] (∂y/∂X)_Y    (1.9.20)
= (∂x/∂X)_Y − (∂Y/∂y)_X [∂(x, X)/∂(Y, y)] [∂(Y, y)/∂(Y, X)] (∂y/∂X)_Y    (1.9.21)
= (∂x/∂X)_Y − (∂Y/∂y)_X [(∂y/∂X)_Y]²,    (1.9.22)

where Maxwell's relation (1.8.16) of 1.8.4 was used in the last step. We realize that the subtracted term in (1.9.22) is non-negative, since (∂Y/∂y)_X ≥ 0 by (1.9.13). Therefore, we obtain (1.9.15).
For example, suppose we introduce heat into a system in equilibrium. This is interpreted as increasing S of the system. Then the system temperature increases. First, let us keep the system volume constant; (δT)_V denotes the temperature increase under constant volume. Now, instead, we allow the system to change its volume under constant P. The temperature change (δT)_P under this condition should be smaller:

(∂T/∂S)_P ≤ (∂T/∂S)_V.    (1.9.23)

That is,

C_P ≥ C_V.    (1.9.24)
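A symbolic check of (1.9.24) (a sketch; we assume the van der Waals equation of state and the standard identity for C_P − C_V, which follows from the Jacobian technique of 1.8):

    import sympy as sp

    T, V = sp.symbols('T V', positive=True)
    a, b, R = sp.symbols('a b R', positive=True)

    P = R * T / (V - b) - a / V**2     # assumed equation of state

    # C_P - C_V = -T*(dP/dT)_V**2 / (dP/dV)_T, which is >= 0 wherever
    # (dP/dV)_T < 0 (mechanical stability; cf. (1.9.6)).
    gap = sp.simplify(-T * sp.diff(P, T)**2 / sp.diff(P, V))

    # Dilute limit a -> 0, b -> 0 recovers the familiar C_P - C_V = R per mole:
    print(sp.simplify(gap.subs({a: 0, b: 0})))   # prints R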

1.10 Ideal rubber

1.10.1 Ideal rubber band
A rubber band is made of many flexible chain molecules. In an idealization we assume that no energy change is required to alter the conformations of the molecules, and that the molecules do not interact with each other.

Fig. 1.9 A polymer chain (a) and a chain of dancing children (b) (b after N. Saito).

Due to thermal motion, the monomers making up each polymer tend to point in random spatial directions. This means that the chains spontaneously coil up. Thus if a rubber band, which may be regarded as a bundle of these chains, is stretched, it resists stretching. As illustrated in Fig. 1.9, the monomers (arrows in a) are like children dancing hand-in-hand as in b. The two end flag poles are surely pulled inward.
The internal energy of a rubber band can be written as (for simplicity, we consider a 1D stretch)

dE = T dS + F dL,    (1.10.1)

where F is the tensile force and L the total length of the rubber band. For ideal rubber, no volume change occurs upon stretching.
1.10.2 Entropic elasticity of a rubber band
Under a constant stretching force F, the length should become shorter if the temperature is raised. Hence, we assume

    (∂L/∂T)_F < 0.    (1.10.2)

Thermodynamics cannot demonstrate this inequality. It should come from empirical data or from more microscopic considerations (see the sketch below). Many interesting conclusions follow from this single inequality and the general thermodynamic framework.
The inequality (1.10.2) is opposite to that of a usual substance, which expands upon heating. (1.10.2) is a signature of entropic elasticity: elasticity due to the moving around of the microscopic constituents. Usual solids relax due to thermal motion, because their elasticity comes from the energetic interactions among constituents, so the moving around that causes thermal expansion weakens their elastic constants.
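The sketch promised above: a microscopic toy model supporting (1.10.2) is the one-dimensional freely jointed chain (N links of length a, each pointing ±x, non-interacting). A standard calculation gives ⟨L⟩ = N a tanh(F a/kB T), so at fixed F the length decreases as T rises. All parameter values below are illustrative assumptions:

```python
import numpy as np

kB = 1.380649e-23            # J/K
N_links, a = 1000, 5e-10     # number of links, link length (m) -- assumed
F = 1e-12                    # stretching force (N) -- assumed

for T in (200.0, 300.0, 400.0):
    # <L> = N a tanh(F a / kB T) for the 1D freely jointed chain
    L = N_links * a * np.tanh(F * a / (kB * T))
    print(f"T = {T:5.0f} K   <L> = {L:.3e} m")
# <L> decreases monotonically with T at fixed F, i.e. (dL/dT)_F < 0.
```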
1.10.3 Upon stretching entropy decreases
What happens to the entropy of the rubber band, if we stretch it under constant temperature?

    (∂S/∂F)_T = (∂L/∂T)_F (< 0),    (1.10.3)

which holds thanks to a Maxwell relation, is enough to answer the question.
Stretching certainly hinders the motion of the chain, or of the children in Fig. 1.9 in 1.10.1. The above inequality tells us that entropy indicates the randomness of the microscopic constituents of a macroscopic system.
1.10.4 Adiabatic stretching raises temperature
What happens to the temperature, if we adiabatically (i.e., under constant S) stretch the band? We wish to know the sign of

    (∂T/∂F)_S = ∂(T,S)/∂(F,S) = [∂(T,S)/∂(F,L)] · [∂(F,L)/∂(F,S)] = −(∂L/∂S)_F,    (1.10.4)

where (1.8.16) was used.
The relation tells us that we can answer the above question through answering the
question about the length change of a band, when entropy is increased under constant
stretching force. The latter question may not be intuitively easy. Let us use the idea
that entropy indicates the randomness of the chain (or how many different shapes the


chain can assume relatively easily). Increasing entropy requires easier movement of
the chain. Then, L should decrease. Therefore, (1.10.4) implies that the temperature
of the chain should rise upon increasing the stretching force.
Let us perform a formal calculation:

    (∂T/∂F)_S = ∂(T,S)/∂(F,S) = [∂(T,S)/∂(F,T)] · [∂(F,T)/∂(F,S)] = −(T/C_F)(∂S/∂F)_T > 0.    (1.10.5)

C_F is the specific heat under constant force, which is positive (see (1.9.11)). We have also used (1.10.3). Thus, the rubber band indeed becomes warm.
Perform an experiment! You can easily feel this temperature increase by touching
a rapidly stretched rubber band (a thick one such as used to bind broccoli in the
grocery store) with your lip. If you rapidly relax the stretched rubber band after it
equilibrates with the room temperature, you can again feel with your lip that the
band has cooled off.
In the above inequality F can be replaced with L:

    (∂T/∂L)_S = (∂T/∂F)_S (∂F/∂L)_S > 0.    (1.10.6)

The second factor on the RHS is positive, since it is a diagonal element (cf. (1.9.13)).
What happens if we stretch the rubber band under constant temperature? Let us study the entropy:

    (∂S/∂L)_T = ∂(S,T)/∂(L,T) = [∂(S,T)/∂(S,L)] · [∂(S,L)/∂(L,T)] = −(C_L/T)(∂T/∂L)_S < 0,    (1.10.7)

where C_L is the specific heat under constant length, and we have used (1.10.6).
Exercise 1 How about the following derivatives?

    (∂L/∂S)_F,  (∂F/∂S)_L,  (∂F/∂S)_T.    (1.10.8)

□

1.11 Third law of thermodynamics

1.11.1 Nernst's law and the third law
Nernst empirically found that all the derivatives of entropy S vanish as T → 0

(Nernst's law). For example,

    (∂S/∂P)_T = −(∂V/∂T)_P → 0.    (1.11.1)

All specific heats vanish as T → 0. Notice that these observations contradict the ideal gas law. Nernst concluded that entropy becomes constant (independent of any thermodynamic variables) in the T → 0 limit. Later, Planck chose this constant to be zero (the concept of absolute entropy). We will discuss absolute entropy later in light of statistical mechanics.
S = 0 at T = 0 is sometimes chosen as the third law of thermodynamics. We adopt the following:
ThIII The reversible change of entropy ΔS vanishes in the T → 0 limit.
1.11.2 Adiabatic cooling
To cool a system we often use adiabatic cooling. During a reversible adiabatic process
entropy is kept constant. Therefore, if entropy depends not only on T but also on
some thermodynamic variable a, there should be a way to decrease T by changing
a, since
S(T, a) = const.
(1.11.2)
We already know a good example. When we relax a stretched rubber band sufficiently
rapidly, the temperature of the band decreases. The mechanism for this cooling is
almost identical to the often employed adiabatic demagnetization method to get very
low temperatures.
A piece of magnetic material contains many spins (or microscopic magnetic moments¹⁷). If we connect these spins in a head-to-tail fashion, we get a polymer chain. Stretching the chain corresponds to aligning the spins, which we can accomplish with an external magnetic field. Hence, if we remove the external magnetic field, the temperature of the system decreases.
More formally, we can compute

    (∂T/∂H)_S = [∂(T,S)/∂(H,T)] · [∂(H,T)/∂(H,S)] = −(T/C_H)(∂S/∂H)_T > 0,    (1.11.3)

where C_H is the heat capacity under constant magnetic field, and (∂S/∂H)_T < 0 has been used:

    (∂S/∂H)_T = (∂M/∂T)_H < 0.    (1.11.4)
¹⁷ However, the interactions aligning the moments are not moment-moment interactions but electron exchange interactions.

This inequality must be assumed within thermodynamics, but an intuitive microscopic understanding is not hard. Hence, (1.11.3) implies that if H is decreased, T decreases.
1.11.3 Absolute zero is unattainable
Since the derivatives of S become smaller as we come closer to absolute zero (Nernst's law), any cooling method becomes inefficient sufficiently close to T = 0. Thus, we cannot reach T = 0.¹⁸

¹⁸ However, this does not imply the third law.


Chapter 2
Statistical Mechanics Primer
Rudimentary probability and combinatorics are summarized in Appendices 2.A and
2.B.

2.1 Basic hypothesis of equilibrium statistical mechanics

2.1.1 Phase space

We describe a given macroscopic system microscopically in terms of mechanics. At
every instant, the system takes a definite microscopic state. The totality of all the
admissible microscopic states is called the phase space of the system. Microstates
are the elementary events in the terminology of probability theory (2.A.8).
If we assume that the system can be described by classical mechanics, every microscopic state is designated by the positions and velocities (momenta) of all the particles constituting the system. Therefore, we introduce a 6N-dimensional space called the phase space; a microstate corresponds uniquely to a point in this space.
Quantum mechanically, the phase space may be identified with the vector space
Quantum mechanically, the phase space may be identified with the vector space
spanned by all the eigenstates of the Hamiltonian of the system.1
¹ Precisely speaking, only the direction of the vector matters, so the phase space is a collection of rays.


2.1.2 Why statistical descriptions?


We cannot expect to be able to describe a macroscopic system completely at the
microstate level. At best we hope for a kind of statistical description of the system. This is because our macroscopic observations are not instantaneous, and also
because macroscopic objects can be regarded as an ensemble of statistically (more
or less) independent subsystems which are again macroscopic. This is empirically
guaranteed by the fourth law of thermodynamics (1.4.1).
To develop a probabilistic description, we must know the probabilities of elementary events, i.e., microstates. Even if we assume that the world is completely describable by mechanics, we cannot derive the necessary fundamental probabilities from
mechanics alone. Thus, we postulate a general probabilistic law about microscopic
events, whose justification comes from the success of the framework a posteriori.
2.1.3 Principle of equal probability
Consider an isolated system. The fundamental postulate of equilibrium statistical
mechanics is:
Principle of equal probability: All microstates (i.e., elementary events) are equally
probable.
Of course, the said microstates must be the ones compatible with the constraints
imposed on the macroscopic system.
As is mentioned above, there is no justification of this postulate from the atomistic
mechanical picture of the world; invariably, something extramechanical creeps into
the derivation.
We must not forget that there cannot be any truly isolated system in this universe. A famous argument by E. Borel goes as follows: if one gram of matter is displaced by 1 cm on Sirius (8.6 light years away), the gravitational field around us changes by about one part in 10^100. This tiny change is, however, enough to completely destroy the intrinsic mechanical behavior (say, particle trajectories in classical mechanics) of the system after about 1 ns.

2.2 Boltzmann's principle

2.2.1 Probability and entropy
In statistical mechanics any macroscopic state is interpreted as a set of microstates which give the same macroscopic observable values (the same thermodynamic quantities). If the system is isolated, all the microstates are equally likely (2.1.3), so they have the same probability P of being observed.
We have learned the interpretation that entropy is a measure of microscopic disorder (cf. 1.10.3, 1.10.4 in Chapter 1). If a macrostate has more microstates that
are compatible with it, then its entropy should be larger. Therefore, in any case it is
sensible to assume that entropy S is a function of P : S = S(P ). This is the crucial
point Boltzmann realized about 100 years ago.
2.2.2 Boltzmann's principle
Let us derive Boltzmann's principle: the entropy of a macrostate is given by

    S = kB log Ω,    (2.2.1)

where kB is the Boltzmann constant, and Ω is the number of microscopic states (or the phase volume of the set of microstates) compatible with the macrostate of an isolated system.
A crucial observation is that entropy is an extensive quantity. If we form a compound system by combining two systems I and II already in thermal equilibrium
with each other, the entropy of the compound system is the sum of that of each
component.
The interaction introduced by the contact of the two systems is, for macroscopic
systems, a very weak one. In any case, the effect is confined to the boundary layer
whose thickness is microscopic. The probability to observe a microstate of the compound system can be computed by simply multiplying the probabilities to observe
the microstate for each subsystem. In other words, the two subsystems may be regarded statistically independent (cf. (2.A.12)).
Combining the above considerations, we arrive at the following functional relation:

    S(P_I P_II) = S(P_I) + S(P_II),    (2.2.2)

where the suffixes denote the subsystems.
Assume that S is a sufficiently smooth function. We conclude from this relation that S is proportional to log P. The fundamental postulate tells us that P = 1/Ω. Therefore, we have arrived at (2.2.1). The proportionality coefficient kB must be positive, because entropy should be larger for larger E, which corresponds to larger Ω (see 1.5.5).
2.2.3 Equilibrium state is the most probable state
We know that the equilibrium state corresponds to the maximum entropy for an
isolated system (1.6.3). Formula (2.2.1) implies that the equilibrium state is the


most probable macrostate (meaning that it corresponds to the largest number of


microstates). Thermodynamic irreversibility is due to the change of less likely macrostates into more probable macrostates. Usually, an ordered state has fewer compatible microstates than a disordered state, so spontaneous processes increase the microscopic disorder of a system. In this way the origin of irreversibility is intuitively understood.
2.2.4 Calculation of intensive quantities
Once we know the number Ω of microscopic configurations of an isolated system as a function of the energy E, the volume V and the number (or molarity) of particles N, we can compute T, P, and μ with the aid of thermodynamic relations. It is convenient to rewrite the Gibbs relation (1.3.5) as follows (be careful about the signs):

    dS = (1/T)dE + (P/T)dV − (μ/T)dN − (H/T)dM + ···.    (2.2.3)

From this we get

    ∂S/∂E = 1/T,    (2.2.4)
    ∂S/∂V = P/T,    (2.2.5)
    ∂S/∂N = −μ/T.    (2.2.6)

These derivatives are called the conjugate intensive variables (with respect to entropy).
2.2.5 Example: Schottky defects
Let us consider an isolated crystal with point defects (vacancies) on the lattice sites (Schottky defects). To create one such defect we must move an atom from a lattice point to the surface of the crystal. The energy cost for this is assumed to be ε. Although the number n of vacancies is macroscopic, we may still assume it to be very small compared to the number N of all lattice sites. Hence, we may assume that the volume of the system is constant. The energy of the system is a macroscopic (thermodynamic) variable which completely specifies the macrostates.
We must compute Ω(E) as a function of the total energy E, which is given by

    E = nε.    (2.2.7)

The average of E is the internal energy E. We may consider Ω as a function of n. Obviously,

    Ω(n) = N!/(n!(N − n)!).    (2.2.8)
To compute this, we use Stirling's formula to evaluate N! asymptotically:

    N! ≈ (N/e)^N,    (2.2.9)

or

    log N! ≈ N log N − N.    (2.2.10)
Boltzmann's principle gives us

    S = kB log Ω(n) ≈ kB [N log N − n log n − (N − n) log(N − n)].    (2.2.11)

Using (2.2.4), we get

    1/T = ∂S/∂E = (1/ε)(dS/dn) = (kB/ε) log[(N − n)/n].    (2.2.12)

If the temperature is sufficiently low or ε is sufficiently large, so that ε/kB T ≫ 1, the above formula reduces to

    ε/kB T ≈ log(N/n).    (2.2.13)

Hence, under this low temperature condition, the internal energy E reads

    E = Nε e^{−ε/kB T}.    (2.2.14)

The constant volume specific heat CV of the system can be obtained as

    CV = dE/dT = N kB (ε/kB T)² e^{−ε/kB T}.    (2.2.15)
Exercise 1. Find the formula for CV correct for all T (a numerical sketch follows). Notice that CV has a peak when kB T is of order ε. □
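Here is a minimal numerical sketch (an aside, assuming Python/NumPy). Solving (2.2.12) for n gives n = N/(e^{ε/kB T} + 1) exactly, hence E = Nε/(e^{ε/kB T} + 1) and CV = N kB (ε/kB T)² e^{ε/kB T}/(e^{ε/kB T} + 1)²:

```python
import numpy as np

def schottky_cv(t, eps=1.0):
    """C_V per site in units of kB, with t = kB*T (kB = 1)."""
    x = eps / t
    return x**2 * np.exp(x) / (np.exp(x) + 1.0)**2

for t in (0.1, 0.2, 0.417, 1.0, 5.0):
    print(f"kB T/eps = {t:5.3f}   C_V/(N kB) = {schottky_cv(t):.4f}")
# The peak, ~0.44 N kB, indeed sits near kB T ~ 0.42 eps: of order eps.
```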

2.3 Equilibrium at constant temperature

2.3.1 System + thermostat is considered as isolated

Let us consider a closed system in a thermostat (or a heat bath). In this case, the
total energy E of the system is no longer constant, but can fluctuate. Instead, the
temperature T is kept constant. We assume V and N are also kept constant (the


system is in a rigid container). To study this system we use the same trick used in
thermodynamics (see 1.6.4). We embed our system (I) in the heat bath (II), and
assume that the composite system is an isolated equilibrium system. The theory
developed in the preceding section then applies.
The total energy E0 of the composite system is given by

    E0 = E_I + E_II.    (2.3.1)

The number of microscopic states for system I (resp., II) with energy E_I (resp., E_II) is denoted by Ω_I(E_I) (resp., Ω_II(E_II)). Thermal contact is a very weak interaction, so the two systems are statistically independent. Hence, the number of microstates for the composite system with energy E_I in I and E_II in II is given by

    Ω_I(E_I) Ω_II(E_II).    (2.3.2)

The total number Ω(E0) of microstates for the composite system must be the sum of this product over all the ways to distribute the energy between I and II. Therefore, we get

    Ω(E0) = Σ_{0 ≤ E ≤ E0} Ω_I(E) Ω_II(E0 − E).    (2.3.3)

2.3.2 Derivation of the canonical distribution
The fundamental postulate of equilibrium statistical mechanics (2.1.3) implies that the probability for system I to have energy E (or more precisely, energy between E and E + dE) is given by

    P(E_I ≈ E) = Ω_I(E) Ω_II(E0 − E) / Ω(E0).    (2.3.4)

Consider the equilibrium state of the composite system; the subsystems must also be in thermal equilibrium. We may use Boltzmann's principle (2.2.2) to rewrite Ω_II(E_II) as²

    Ω_II(E_II) = exp(S_II(E_II)/kB).    (2.3.5)

System II is huge compared with I. Expand the entropy as follows:

    S_II(E0 − E) = S_II(E0) − E (∂S_II/∂E_II) + (1/2)E² (∂²S_II/∂E_II²) + ···    (2.3.6)

² Strictly speaking, the thermodynamic entropy S is defined only for equilibrium states, so although S(E) is defined, a general S(E) has not been defined. We must say that this general S is defined by (2.3.5); as a function of E it agrees with the thermodynamic entropy when E is the equilibrium internal energy. Here, we may interpret S(E_II) as the entropy of system II in equilibrium with internal energy E_II.

and denote the temperature of the heat bath (i.e., system II) by T:

    ∂S_II/∂E_II = 1/T.    (2.3.7)

The most probable E should be close to the internal energy of system I, so, due to the extensivity of the internal energy, it should be of order N_I, the total number of particles in system I. The second derivative in (2.3.6) should be of order 1/N_II, so the ratio of the third term to the second term in (2.3.6) is of order N_I/N_II, which is negligibly small. Thus, we can streamline (2.3.4) as

    P(E_I ≈ E) ∝ Ω_I(E) e^{−βE},    (2.3.8)

where the standard notation

    β = 1/kB T    (2.3.9)

is used.
is used.
2.3.3 Canonical partition function
To compute the probability we need the normalization constant for (2.3.8),

    Z = Σ_E Ω_I(E) e^{−βE},    (2.3.10)

which is called the canonical partition function. The sum may become an integral. A more microscopic expression is also possible:

    Z = Σ_{all microstates} e^{−βE} = Tr e^{−βH}.    (2.3.11)

If we decompose the sum as follows, we can easily understand this formula:

    Σ_{all microstates} = Σ_E Σ_{all microstates with energy E},    (2.3.12)

but

    Σ_{all microstates with energy E} e^{−βE} = Ω(E) e^{−βE}.    (2.3.13)

The probability distribution we have obtained,

    P(E) = (1/Z) Ω_I(E) e^{−βE},    (2.3.14)

is called the Gibbs-Boltzmann distribution.


2.3.4 Calculation of internal energy
Once the canonical partition function is known, the internal energy of the system can be obtained easily:

    E = ⟨E⟩ = Σ_E P(E) E = (1/Z) Σ_E E Ω_I(E) e^{−βE} = −∂ log Z(β)/∂β,    (2.3.15)

where Z (cf. (2.3.11)) is explicitly written as a function of β. The formula should be easily understood from the corresponding general formula (2.A.17) for the generating function. Indeed, the canonical partition function is the generating function of the energy.
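As an aside, (2.3.15) is easy to check numerically; the following minimal sketch uses an arbitrary made-up four-level spectrum (the levels and β below are illustrative assumptions, not from the text):

```python
import numpy as np

levels = np.array([0.0, 1.0, 1.5, 3.0])   # made-up energy levels (kB = 1)
beta = 0.7

def logZ(b):
    return np.log(np.sum(np.exp(-b * levels)))

# direct Gibbs-Boltzmann average of the energy:
w = np.exp(-beta * levels)
E_direct = np.sum(levels * w) / np.sum(w)

# -d log Z/d beta by a centered finite difference:
h = 1e-6
E_deriv = -(logZ(beta + h) - logZ(beta - h)) / (2 * h)

print(E_direct, E_deriv)   # the two numbers agree to ~1e-10
```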
2.3.5 Calculation of Helmholtz free energy
Boltzmann's principle tells us that Ω(E) = exp(S(E)/kB), an increasing function of E. Due to the extensivity of entropy, S(E)/kB is of order N. The energy of the system is also of the same order. Hence, Ω(E) e^{−βE} is sharply peaked around the most probable value of E, which should be very close to the internal energy E, the average of E.
This knowledge can be used to evaluate the canonical partition function Z:

    Z = Σ_E Ω(E) e^{−βE} = Σ_E e^{−β(E − T S(E))} ≈ e^{−β(E − T S(E))},    (2.3.16)

where E on the far right is the internal energy, identified with the most probable value of the energy. The error in log Z, which is extensive (i.e., of order N), is of order log N; the estimate is thus extremely accurate.
From this we conclude the following equation of immense importance:

    A = −kB T log Z.    (2.3.17)

This is a consequence of Boltzmann's principle, but practically it is much more useful than the principle itself.
It turns out that (2.3.15) is a thermodynamically well-known formula:

    (∂(A/T)/∂(1/T))_V = E, or equivalently −T² (∂(A/T)/∂T)_V = E,    (2.3.18)

the Gibbs-Helmholtz formula.
2.3.6 Example: Schottky defects revisited
This is a continuation of 2.2.5. With Ω(n) known, it is easy to compute Z:

    Z = Σ_n Ω(n) e^{−βnε} = (1 + e^{−βε})^N,    (2.3.19)

where we have used the binomial theorem (2.B.6). From this the Helmholtz free energy immediately follows:

    A = −N kB T log(1 + e^{−βε}).    (2.3.20)

We can get the entropy by differentiation:

    S = −∂A/∂T = N kB log(1 + e^{−βε}) + (Nε/T) e^{−βε}/(1 + e^{−βε}).    (2.3.21)

Compare this with the formula obtained directly from Boltzmann's principle. □

2.4 Simple systems

2.4.1 Non-interacting 1D harmonic oscillators
Consider a collection of N 1-dimensional harmonic oscillators which do not interact with each other at all.
Let us first examine a single oscillator of (angular) frequency ω. Elementary quantum mechanics tells us that the energy of the system is quantized as

    ε = (n + 1/2) ħω,  n = 0, 1, 2, ···.    (2.4.1)

Each eigenstate is nondegenerate. Thus, if we specify the quantum number n, the state of a single oscillator is completely specified. The canonical partition function for a single oscillator reads

    Z1 = Σ_{n=0}^∞ exp[−β(n + 1/2)ħω].    (2.4.2)

Using (1 − x)^{−1} = 1 + x + x² + x³ + ··· (|x| < 1), we get

    Z1 = e^{−βħω/2}(1 − e^{−βħω})^{−1} = [2 sinh(βħω/2)]^{−1}.    (2.4.3)

There are N independent oscillators, so the canonical partition function for the system should be

    Z = Z1^N.    (2.4.4)

You can honestly proceed as follows, too. The state of the system should be uniquely specified (cf. Fig. 2.1 in 2.4.5) if we know all the quantum numbers of the oscillators {n1, n2, ···, nN}; we may identify this table with the microstate. The energy E of the microstate {n1, n2, ···, nN} is given by the sum of the energies of the individual oscillators:

    E = Σ_{i=1}^N (n_i + 1/2) ħω.    (2.4.5)

The canonical partition function is, by definition,

    Z = Σ_{n1=0}^∞ Σ_{n2=0}^∞ ··· Σ_{nN=0}^∞ exp[−β(Nħω/2 + Σ_{i=1}^N n_i ħω)] = (e^{−βħω/2} Σ_{n=0}^∞ e^{−βnħω})^N.    (2.4.6)

Thus, we have arrived at (2.4.4).

From (2.4.4) we obtain

    A(N) = N kB T log[2 sinh(βħω/2)]    (2.4.7)

and

    E = (Nħω/2) coth(βħω/2).    (2.4.8)

Exercise 1. Compute S and CV (a symbolic sketch follows). Notice that A(2N) = 2A(N); that is, the fourth law of thermodynamics is satisfied. We will return to this model when we study the specific heat of insulators (2.6.3). □
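A symbolic sketch of one possible route for Exercise 1 (assuming sympy; the symbols are those of (2.4.7) and (2.4.8)):

```python
import sympy as sp

T, N, kB, hw = sp.symbols('T N k_B hbar_omega', positive=True)
u = hw / (2 * kB * T)

A = N * kB * T * sp.log(2 * sp.sinh(u))   # (2.4.7)
S = -sp.diff(A, T)                        # entropy
E = sp.simplify(A + T * S)                # internal energy, E = A + T S
CV = sp.simplify(sp.diff(E, T))           # heat capacity

# numerical spot check against (2.4.8): E = (N hbar omega/2) coth(beta hbar omega/2)
vals = {T: 1.3, N: 2, kB: 1, hw: 0.7}
print((E - N * hw / (2 * sp.tanh(u))).subs(vals).evalf())   # ~0
print(sp.simplify(A.subs(N, 2 * N) - 2 * A))                # 0: A(2N) = 2A(N)
```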
2.4.2 Ideal gas
Consider a gas consisting of N identical noninteracting particles. Each particle can
have internal degrees of freedom which may be thermally excited. The gas is rare
enough, so that we may ignore any quantum effect due to nondistinguishability of
identical particles. For this to be true the average de Broglie wave length of each
particle must be much smaller than the average interparticle distance. The de Broglie
wave length is, on average,
    λ ≈ h/√(m kB T),    (2.4.9)

where m is the mass of the particle, and h is Planck's constant. The mean particle distance is (V/N)^{1/3}, so the condition we want is

    N/V ≪ (m kB T/h²)^{3/2}.    (2.4.10)
When this inequality is satisfied, we say the gas is classical. Notice that the dynamics of internal degrees of freedom such as vibration and rotation need not be


classical (cf. 2.7.4). Since there are no interactions between particles, each particle
cannot sense the density. Consequently, the internal energy of the system must be a
function of T only: E = E(T ). This is a good characterization of ideal gases.
2.4.3 Quantum calculation of one particle
Let us first compute the number of microstates allowed for a single particle in a box of volume V. To this end we solve the Schrödinger equation in a cube with edges of length L:

    −(ħ²/2m) Δψ = Eψ,    (2.4.11)

where Δ is the Laplacian, and a homogeneous Dirichlet boundary condition ψ = 0 at the wall is imposed. As is well known, the eigenfunctions are

    ψ_k ∝ sin kx x sin ky y sin kz z,    (2.4.12)

and the boundary condition requires the following quantization condition:

    k ≡ (kx, ky, kz) = (π/L)(nx, ny, nz) ≡ (π/L)n.    (2.4.13)

Here, nx, ··· are positive integers, 1, 2, ···. The eigenfunction ψ_k belongs to the eigenvalue (energy) k²ħ²/2m.
The number of states with wavevector magnitude in the range k to k + dk is

    #{k | k < |k| < k + dk} = #{n | (L/π)k < |n| < (L/π)(k + dk)}
      = (1/8)·4πn² dn = (1/8)(L/π)³·4πk² dk = (V/2π²) k² dk.    (2.4.14)

The factor 1/8 is required because the relevant k are only in the first octant.
Now we can compute the canonical partition function for a single particle using its definition:

    Zt = Σ_{nx>0, ny>0, nz>0} exp(−βE) ≈ (V/2π²) ∫₀^∞ k² dk exp(−βk²ħ²/2m).    (2.4.15)

The integration is readily performed:

    Zt(V) = V (2πm kB T/h²)^{3/2}.    (2.4.16)

The important point of this result is that Zt ∝ V.
If a particle has internal degrees of freedom, we must multiply by the corresponding internal partition function Zi to get the one-particle partition function Z1:

    Z1(V) = Zt(V) Zi.    (2.4.17)

For later convenience V is explicitly specified. This formula can easily be understood, because the energy of a particle is the sum of the kinetic energy of its translational motion and the energy of its internal degrees of freedom, E = Et + Ei, and the two terms share no common variables. Zi does not depend on V, because the internal degrees of freedom are not affected by the external environment (if the system is dilute enough).
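As a sanity check of (2.4.15) and (2.4.16), here is a minimal numerical sketch (an aside; the mass, temperature, and volume below are arbitrary illustrative choices):

```python
import numpy as np
from scipy.integrate import quad

h = 6.62607015e-34; hbar = h / (2 * np.pi); kB = 1.380649e-23
m, T, V = 6.64e-27, 300.0, 1.0        # He-atom mass (kg), 300 K, 1 m^3 -- assumed

# substitute u = k/k_th with k_th = sqrt(2 m kB T)/hbar to get a clean integral
k_th = np.sqrt(2 * m * kB * T) / hbar
integral, _ = quad(lambda u: u**2 * np.exp(-u**2), 0, np.inf)  # = sqrt(pi)/4
Zt_numeric = V / (2 * np.pi**2) * k_th**3 * integral
Zt_closed = V * (2 * np.pi * m * kB * T / h**2) ** 1.5

print(Zt_numeric / Zt_closed)          # 1.0 (to numerical precision)
```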
2.4.4 Gibbs paradox
The partition function Z of the system consisting of N identical particles may seem to be

    Z = Z1^N,    (2.4.18)

just as in (2.4.4), because the particles do not interact.
If we assume that all the particles are distinguishable, like the ordinary objects we observe daily on our length scale, a microstate is specified by a table of integer vectors {n_i}_{i=1}^N,³ where n_i is the vector appearing in (2.4.13) for the i-th particle.
Then, the canonical partition function should read

    Z = Σ_{n1} Σ_{n2} ··· Σ_{nN} exp[−β(E1 + E2 + ··· + EN)],    (2.4.19)

where E_i is the energy of the i-th particle. The sum can readily be done and we get (2.4.18). Let us compute the Helmholtz free energy of the system. The fundamental relation (2.3.17) gives

    A(N, V) = −N kB T log Z1(V).    (2.4.20)

Now, prepare two identical systems, each of volume V with N particles. The free energy of each system is given by A(N, V). Next, combine these two systems to make a single system. The resultant system has 2N particles and volume 2V, so its free energy should be A(2N, 2V). The fourth law of thermodynamics (1.4.2) requires that

    A(2N, 2V) = 2A(N, V).    (2.4.21)

³ And by the internal states of each particle; here for simplicity we ignore the internal degrees of freedom.


Unfortunately, as you can easily check, this is not satisfied by (2.4.20). Thus we must conclude that (2.4.18) is wrong. This is the famous Gibbs paradox. Since the fourth law is an empirical fact, we must modify (2.4.18) to

    Z = f(N) Z1^N,    (2.4.22)

where f(N) is an as yet unspecified function of N. (2.4.21) requires

    2 log f(N) = 2N log 2 + log f(2N),    (2.4.23)

or

    f(N)² = 2^{2N} f(2N)  (more generally, f(N)^α = α^{αN} f(αN)).    (2.4.24)

The general solution to this functional equation is

    f(N) = (f(1)/N)^N ∝ (N!)^{−1}.    (2.4.25)

Therefore, thermodynamics forces us to write

    Z = (1/N!) Z1^N,    (2.4.26)

where we have discarded an unimportant multiplicative factor.
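A minimal numerical illustration of the paradox and its resolution (the constant c and the particle number below are arbitrary; units with kB T = 1):

```python
import numpy as np
from scipy.special import gammaln          # log(N!) = gammaln(N + 1)

c = 1e26          # stands for (2 pi m kB T/h^2)^{3/2}; value is arbitrary
N, V = 1.0e6, 1.0

def A_dist(N, V):                          # A from Z = Z1^N  (Z1 = c*V)
    return -N * np.log(c * V)

def A_indist(N, V):                        # A from Z = Z1^N / N!
    return -N * np.log(c * V) + gammaln(N + 1)

print(A_dist(2 * N, 2 * V) - 2 * A_dist(N, V))      # = -2N log 2 ~ -1.39e6
print(A_indist(2 * N, 2 * V) - 2 * A_indist(N, V))  # ~ -7.5, only O(log N)
```

With the 1/N! factor the violation of extensivity drops from O(N) to the thermodynamically negligible O(log N).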


2.4.5 What can we distinguish, and what not?
Why do we need the 1/N! for gases and not for oscillators? The most important difference is that while each oscillator cannot move in space but simply sits on, say, a lattice point, each gas particle can move around. See Fig. 2.1.
The combinatorial interpretation of N! (2.B.17) implies that configurations (1) and (2) are distinguishable, but (3) and (4) are not.
We must conclude that each gas particle is indistinguishable from the other particles. A configuration of gas particles is just like a pattern on a TV screen; the pixels are indistinguishable. Thermodynamics has forced us to abandon naive realism; molecules are not the same as the macroscopic objects whose distinguishability we take for granted.

2.5 Classical statistical mechanics

[Fig. 2.1 Left: harmonic oscillators (numbers are quantum numbers); configurations (1) and (2) are distinguishable. Right: gas particles (numbers denote 'different' particles); configurations (3) and (4) are not.]

2.5.1 Classical formulation of the canonical partition function
In classical mechanics a many-body system consisting of N point particles can be completely (microscopically) described by a table of all the instantaneous positions and momenta of the particles {q_i, p_i}_{i=1}^N, where q_i is the position of the i-th particle, and p_i the momentum of the i-th particle. The vector (q_1, q_2, ···, q_N, p_1, p_2, ···, p_N) spans the phase space. It is natural to interpret the sum in the definition of the canonical partition function as an integral over the phase space. Thus the classical canonical partition function would be written

    Z = ∫ dq_1 dq_2 ··· dq_N dp_1 dp_2 ··· dp_N e^{−βH},    (2.5.1)

where H is the total energy (the Hamiltonian) of the system. If the system consists of itinerant particles, Z must be divided by N! as discussed in 2.4.5. Compare this formula with the partition function for the ideal gas already computed as (2.4.15). The integration over space (which gives V) and the integration over k in that formula can be rewritten as

    (V/2π²) ∫₀^∞ k² dk = (1/h³) ∫ dq ∫ dp,    (2.5.2)

so Z becomes

    Z = (1/N! h^{3N}) ∫ dq_1 ··· dq_N dp_1 ··· dp_N e^{−βH},    (2.5.3)

or, introducing the phase space volume element

    dΓ ≡ dq_1 dq_2 ··· dq_N dp_1 dp_2 ··· dp_N,    (2.5.4)

we may write

    Z = (1/h^{3N} N!) ∫ dΓ e^{−βH}.    (2.5.5)


2.5.2 Quantum-classical correspondence
To understand why this is the correct choice, we must reflect upon what is actually distinguishable as a distinct microstate. Heisenberg's uncertainty principle tells us that the errors (mean square errors) in the coordinate x and its conjugate momentum p_x must satisfy

    δx δp_x ≈ h.    (2.5.6)

For a one-dimensional system consisting of a single particle, the phase space of the system is a plane spanned by x and p_x. Due to the uncertainty principle we can at best distinguish cells of area h (√h × √h squares). Hence, the number of distinguishable microstates in a domain of phase area A should be A/h.
This can easily be generalized to many-body systems. The number of microstates in a volume element dq_1 ··· dq_N dp_1 ··· dp_N should be dq_1 ··· dq_N dp_1 ··· dp_N / h^{3N}. Since the energy does not strongly depend on the phase position within such a cell, when the summation is replaced with integration we may replace the energy with the Hamiltonian H (= total energy). Thus the classical canonical partition function is defined by (2.5.5).
In fact, the formula (2.5.5) without Planck's constant was introduced by Gibbs before the advent of quantum mechanics. Later, it was realized that the correction factor 1/h^{3N} makes the transition between quantum and classical mechanics smooth. However, in most cases this factor gives an uninteresting additive term to the Helmholtz free energy, so we may often ignore it.

2.6 Heat capacity of solid

2.6.1 Harmonic solids


Consider a crystal made of N atoms, having 3N mechanical degrees of freedom.
Small displacements of atoms around their mechanical equilibrium positions should
be a kind of harmonic oscillation. Thus, we may regard the crystal as a set of 3N
independent harmonic oscillators (modes) of various frequencies (due to coupling
among atoms). As we have already shown, the partition function of the total system
is the product of the partition function for each harmonic mode.
2.6.2 Classical harmonic solids
Treating the system completely classically and using the definition of the classical


partition function (2.5.5), we get for a single mode of frequency ω

    Z1 = (1/h) ∫ dp ∫ dq e^{−β(p²/2m + mω²q²/2)} = kB T/ħω.    (2.6.1)

The contribution of this oscillator to the internal energy is readily obtained as

    E1 = kB T.    (2.6.2)

This is independent of the frequency of the oscillator (equipartition of energy), so the total internal energy of the crystal is simply

    E = 3N kB T.    (2.6.3)

If the volume is kept constant, the frequencies are also kept constant. Therefore, the constant volume specific heat CV is given by

    CV = 3N kB,    (2.6.4)

which is independent of temperature, a contradiction to the third law of thermodynamics (1.11.1).


Actual experimental results can be summarized as

    CV ∝ T³    (2.6.5)

at low temperatures. At higher temperatures (2.6.4) is correct and is called the Dulong-Petit law.
2.6.3 Quantum harmonic solids
If we take quantum effects into account, the energy of each oscillator must be discrete.
Most importantly, there is a gap between the ground state and the first excited state.
Therefore, at low enough temperatures, excitation becomes prohibitively difficult,
and the specific heat vanishes. Thus the quantum effect is the key to understanding
the third law of thermodynamics (1.11.1).
This point was recognized by Einstein: he treated an ensemble of 3N identical 1D
oscillators quantum mechanically (Einstein model) as a model of solid. Since the 1D
Einstein model was already studied in 2.4.1, here we have only to replace N with
3N in (2.4.7), (2.4.8), etc., so the internal energy is

    E = (3Nħω/2) coth(βħω/2) = 3N [ħω/2 + ħω/(e^{βħω} − 1)].    (2.6.6)

Hence, the specific heat is

    CV = 3N kB (ħω/kB T)² e^{βħω}/(e^{βħω} − 1)².    (2.6.7)

At sufficiently high temperatures (ħω/kB T ≪ 1) quantum effects should not be important. As expected, we recover the classical result (2.6.4):

    CV ≈ 3N kB.    (2.6.8)

For sufficiently low temperatures (ħω/kB T ≫ 1), (2.6.7) reduces to

    CV ≈ 3N kB (ħω/kB T)² e^{−βħω}.    (2.6.9)

Thus, CV vanishes at T = 0, and the third-law behavior is exhibited.


However, CV goes to zero exponentially fast, at variance with the empirical law (2.6.5) mentioned above. It is a rule that whenever there is a finite energy gap Δ between the ground and first excited states, the specific heat behaves like exp(−Δ/kB T) at low temperatures. The empirical result implies that there is no finite energy gap in a real crystal.
2.6.4 Real harmonic solids: density of states
Now think about an actual three dimensional crystal. It is very hard to displace
every other atom (i.e., a sublattice relative to the other sublattice), but it is easy to
propagate sound waves, which are long wavelength (relative to the atomic spacing)
vibrations. Thus, an actual crystal should contain low frequency oscillators (modes).
Denote by f(ω)dω the number of vibrational modes with angular frequencies between ω and ω + dω. f(ω) is called the density of states. The total number of modes must be identical to the number of degrees of freedom, 3N (ω_D = the highest allowed frequency):

    ∫₀^{ω_D} f(ω) dω = 3N.    (2.6.10)

Now consider a 3D lattice with a³N atoms, but with lattice spacing 1/a of the original spacing. It has the same shape and size as the original crystal. The highest frequency becomes aω_D (Fig. 2.3). Hence,

    ∫₀^{aω_D} f(ω) dω = 3a³N.    (2.6.11)

[Fig. 2.3 Highest frequency modes: shrinking the lattice spacing by 1/a (a = 3 shown) scales the minimum wavelength down and the maximum frequency up by a.]

Differentiating this equation with respect to a, we get

    ω_D f(aω_D) = 9N a².    (2.6.12)

If the actual microscopic details do not matter, then this relation should hold for any a,⁴ so we conclude

    f(x) ∝ x².    (2.6.13)

(You should have noticed that the 2 in this formula is actually d − 1, d being the spatial dimensionality.) N is extensive, but ω_D is not (it is volume independent), so f must be proportional to V. Thus, we may conclude that f(ω) = γω²V, where γ is a proportionality constant, which we can fix with the aid of (2.6.10):

    γ ω_D³ V/3 = 3N.    (2.6.14)

Thus the density of states is

    f(ω) = 9N ω²/ω_D³.    (2.6.15)

2.6.5 Debye model
The total Helmholtz free energy of a 3d crystal is given by

    A = Σ_ω A_ω = ∫₀^{ω_D} dω A_ω f(ω),    (2.6.16)

where (see (2.4.7))

    A_ω = kB T log[2 sinh(βħω/2)].    (2.6.17)

If we use (2.6.15) for the density of states f, the model of the solid is called the Debye model. The Gibbs-Helmholtz formula gives for this model

    E = (9/8) Nħω_D + (9N/ω_D³) ∫₀^{ω_D} [ħω³/(e^{βħω} − 1)] dω.    (2.6.18)

⁴ Needless to say, this assumption is not true. However, if we are interested in the relatively low frequency modes (long-wavelength modes), it is not a bad assumption. This is enough for our purpose.

The high temperature limit is obtained by noting e^{βħω} − 1 ≈ βħω:

    E ≈ (9N/ω_D³) ∫₀^{ω_D} kB T ω² dω + const. = 3kB T N + const.,    (2.6.19)

which of course agrees with the classical result.
In the low temperature limit the integral in the internal energy can be approximated as

    ∫₀^{ω_D} [ħω³/(e^{βħω} − 1)] dω ≈ ∫₀^∞ [ħω³/(e^{βħω} − 1)] dω ∝ T⁴.    (2.6.20)

From these formulas we can get

    CV ≈ 3N kB    for ħω_D ≪ kB T,
    CV ∝ T³      for ħω_D ≫ kB T.    (2.6.21)
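A numerical sketch of the Debye heat capacity interpolating between the two limits of (2.6.21). Differentiating (2.6.18) gives the standard closed form CV = 9N kB (T/T_D)³ ∫₀^{T_D/T} x⁴eˣ/(eˣ − 1)² dx with T_D ≡ ħω_D/kB; the code below (assuming scipy) evaluates it:

```python
import numpy as np
from scipy.integrate import quad

def debye_cv(t):
    """C_V/(3 N kB) as a function of t = T/T_D."""
    integrand = lambda x: x**4 * np.exp(x) / np.expm1(x)**2
    return 3.0 * t**3 * quad(integrand, 0.0, 1.0 / t)[0]

for t in (0.05, 0.1, 0.3, 1.0, 3.0):
    print(f"T/T_D = {t:4.2f}   C_V/(3N kB) = {debye_cv(t):.4f}")
# Low T: C_V/(3N kB) ~ (4 pi^4/5)(T/T_D)^3 (the T^3 law); high T: -> 1 (Dulong-Petit).
```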

Warning. Do not confuse the word 'state' as used in 'density of states' and in 'number of microscopic states'. In the former, 'state' means an eigenmode of the system Hamiltonian, while in the latter it means an eigenstate of the system Hamiltonian. Thus, even a single mode has many different excitation states. □

2.7 Classical ideal gas

2.7.1 Free energy of the classical ideal gas
We have already discussed the translational part ZT of the partition function for the classical ideal gas (2.4.16). The partition function of a classical ideal gas consisting of N molecules is given by (see (2.4.26))

    Z = (1/N!) ZT(T, V)^N Zi(T)^N,    (2.7.1)

where the independent thermodynamic variables are explicitly specified. From this formula it follows that

    A(T, V, N) = −N kB T log[(eV/N)(2πm kB T/h²)^{3/2} Zi(T)],    (2.7.2)

where we have used Stirling's formula.
As concluded in 2.4.3 without any computation, the Gibbs-Helmholtz equation


gives the internal energy independent of the volume of the system (or the density N/V of the gas):

    E(N, T) = Ei + (3/2) N kB T,    (2.7.3)

where Ei is the contribution from the internal degrees of freedom. The equation of state is found to be

    P = −(∂A/∂V)_T = N kB T/V.    (2.7.4)

This is the well-known equation of state of the classical ideal gas.
Remark. Independence of the internal energy E from V, i.e.,

    (∂E/∂V)_T = 0,    (2.7.5)

implies

    (∂P/∂T)_V = P/T    (2.7.6)

(demonstrate this), so

    P f(V) = T,    (2.7.7)

where f is a function of V. But since P and T are both intensive, this relation implies

    P f(v) = T,  with v = V/N.    (2.7.8)

The equation of state of the ideal gas (2.7.4) is an example. □

2.7.2 Fundamental equation of state
It is clear from the derivation of the equation of state that the information about the internal degrees of freedom is completely lost from it.
On the other hand, the entropy preserves this information, as is clear from

    S = (E − A)/T = (3/2) N kB + N kB log[(eV/N)(2πm kB T/h²)^{3/2} Zi(T)].    (2.7.9)

From this, S can be written in terms of V and E (try it). The function S = S(E, V) is called by Gibbs the fundamental equation (of state). In contrast to an ordinary equation of state such as (2.7.8), this equation contains the information not only of the equation of state but also of the heat capacity.
The constant volume heat capacity CV can be obtained from (2.7.9) as

    CV = (3/2) N kB + CV,int,    (2.7.10)

where CV,int is the contribution of the internal degrees of freedom:

    CV,int ≡ (∂Ei/∂T)_V,  with Ei = N kB T² (d log Zi/dT).    (2.7.11)

2.7.3 Absolute entropy
The entropy obtained above contradicts the third law of thermodynamics, because S ∝ log T (→ −∞). However, since S involves −log ρ + (3/2) log T, by making the density ρ of the gas sufficiently low (ρ → 0) we can indefinitely lower the temperature below which the unphysical nature of the ideal gas emerges. In this sense, the classical ideal gas is a useful idealization of a real dilute gas.
Combining the equation of state (2.7.4) and the formula for the entropy (2.7.9), we can derive

    log P = (5/2) log T + 5/2 + log[kB^{5/2} (2πm/h²)^{3/2}] − S/(N kB).    (2.7.12)

The entropy in this formula can be measured, if the ideal gas (i.e., a sufficiently dilute gas) is in equilibrium with a condensed phase. Thus, we can directly check whether the value of the absolute entropy is correct or not.
2.7.4 Contribution of internal degrees of freedom
The partition function Zi(T) for the internal degrees of freedom is not easily computed, since we have to take into account the nature of the nuclei in the molecule.⁵
There are essentially four important internal degrees of freedom: nuclear spins, electrons, molecular rotation, and vibration.
Since the energy gaps between different states of nuclear spins are usually very
small, we may assume that all the states are equally populated except at very low
temperatures. Hence, this gives Zi a constant multiplicative factor (spin multiplicity) independent of T . That is, nuclear spins do not contribute to the heat capacity
except for extremely low temperatures.
The electronic degrees of freedom have very large energy gaps between the ground and first excited states (~5 eV, the order of ionization potentials), so they are virtually frozen up to a few thousand K (kB = 8.62 × 10⁻⁵ eV/K, so 5 eV corresponds to about 58,000 K). Thus electrons do not contribute to the heat capacity, either.
Vibrational degrees of freedom have energy gaps (or energy quanta) of order 0.1
⁵ Here, we discuss only qualitative features, except for this complication, which occurs for light homonuclear diatomic molecules.


eV, so they cannot fully contribute at room temperature. If T < 100 K, we may usually totally ignore the vibrational degrees of freedom.
Rotational degrees of freedom have small energy quanta, of order 10 K or less, except for light diatomic molecules such as H2, D2, T2, etc.; for polyatomic gases, we may always treat the rotational degrees of freedom classically (except for methane). Thus the heat capacity of most classical ideal gases consists of translational and rotational contributions.
2.7.5 Momentum distribution of classical ideal gas particles
Denote the density distribution function for momentum p by f(p); its meaning is

    f(p) dp = Prob(a particle has momentum p such that its i-th component (i = x, y, z) lies between p_i and p_i + dp_i).    (2.7.13)

We can use the relation between the indicator and the probability (2.A.14):

    f(p₀) = ⟨δ(p − p₀)⟩,    (2.7.14)

where ⟨ ⟩ is the equilibrium ensemble average:

    ⟨ ⟩ ≡ (1/h^{3N} N!) ∫ dq_1 ··· dq_N dp_1 ··· dp_N (···) e^{−β Σ p_i²/2m} / (ZT^N/N!),    (2.7.15)

and the irrelevant internal degrees of freedom have been ignored. Thus we get

    f(p) = (2πm kB T)^{−3/2} exp(−p²/2m kB T).    (2.7.16)

This is the Maxwell distribution.


Exercise 1. Compute the average square velocity. Also compute the average square of the relative velocity of two arbitrary molecules (a numerical sketch follows). □
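A Monte Carlo sketch for Exercise 1 (assuming NumPy; units with kB = m = T = 1). Each velocity component of (2.7.16) is Gaussian with variance kB T/m, so ⟨v²⟩ = 3kB T/m and, for two independent molecules, ⟨(v₁ − v₂)²⟩ = 6kB T/m:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**6
# each velocity component is Gaussian with variance kB T/m = 1:
v1 = rng.normal(0.0, 1.0, size=(n, 3))
v2 = rng.normal(0.0, 1.0, size=(n, 3))

print((v1**2).sum(axis=1).mean())          # ~3 = 3 kB T/m
print(((v1 - v2)**2).sum(axis=1).mean())   # ~6 = 6 kB T/m
```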

2.8 Open systems

2.8.1 Open system
When a system can exchange not only energy but particles with its environment,
we call it an open system. Let us find an equilibrium distribution function for the
number of particles in the system as well as energy. The strategy here is parallel to
the one adopted for the study of closed systems. Embed our system in a reservoir


of energy and chemical species. Then, consider the system (I) and reservoir (II, this
time it is not only a heat bath but a chemostat) as a single isolated system (see
1.6.4); Boltzmanns principle may then be applied to the composite system. We
discuss a system consisting of a single chemical species first, and later generalize our
result to multicomponent systems.
2.8.2 System + chemostat at constant temperature
The total energy E0 and the total number of particles N0 of the composite system are given by

    E0 = E_I + E_II,    (2.8.1)
    N0 = N_I + N_II.    (2.8.2)

The number of microscopic states for system I (II) with energy E_I (E_II) and particle number N_I (N_II) is denoted by Ω_I(E_I, N_I) (Ω_II(E_II, N_II)). We assume the interaction between the two systems to be very weak, so both systems are statistically independent. Hence, the number of microstates for the composite system with energy E_I in I and E_II in II, and with N_I particles in I and N_II in II, respectively, is given by

    Ω_I(E_I, N_I) Ω_II(E_II, N_II).    (2.8.3)

The total number Ω(E0, N0) of microstates for the composite system must be the sum of this product over all the ways to distribute energy and molarity between I and II. Therefore, we get

    Ω(E0, N0) = Σ_{0 ≤ E ≤ E0, 0 ≤ N ≤ N0} Ω_I(E, N) Ω_II(E0 − E, N0 − N).    (2.8.4)

2.8.3 Probability of the system macrostates
This entry is quite parallel to 2.3.2.
The fundamental postulate of equilibrium statistical mechanics (2.1.3) implies that the probability for system I to have energy E and molarity N (or more precisely, energy between E and E + dE, and molarity between N and N + dN) is given by

    P(E_I ≈ E, N_I ≈ N) = Ω_I(E, N) Ω_II(E0 − E, N0 − N) / Ω(E0, N0).    (2.8.5)

Now consider the equilibrium state of the composite system. The subsystems must also be in equilibrium. We may use Boltzmann's principle to rewrite Ω_II(E_II, N_II) as

    Ω_II(E_II, N_II) = exp[S_II(E_II, N_II)/kB].    (2.8.6)

Since the reservoir (II) is huge compared to the system (I), the entropy may be expanded as

    S_II(E0 − E, N0 − N) = S_II(E0, N0) − E (∂S_II/∂E_II) − N (∂S_II/∂N_II) + second order terms.    (2.8.7)

Denote the temperature of the reservoir by T:

    ∂S_II/∂E_II = 1/T,    (2.8.8)

and introduce the chemical potential μ:

    ∂S_II/∂N_II = −μ/T.    (2.8.9)

The most probable E and N should be close to their macroscopically observable values for system I, so that, due to the extensivity of the internal energy and the particle number, they should be of order N_I. Thus, as was discussed previously, the second order terms in (2.8.7) are of order N_I/N_II, which are negligibly small.
2.8.4 Grand canonical partition function
Now, the consideration in 2.8.3 allows us to streamline (2.8.5) as

    P(E_I ≈ E, N_I ≈ N) ∝ Ω_I(E, N) e^{−βE + βμN}.    (2.8.10)

To get the probability we need the normalization constant Ξ:

    Ξ = Σ_{E,N} Ω_I(E, N) e^{−βE + βμN},    (2.8.11)

which is called the grand canonical partition function (or grand partition function). The sum over E may become an integral. A more microscopic expression is also possible:

    Ξ = Σ_{all microstates} e^{−βE + βμN}.    (2.8.12)

The probability density distribution calculated according to the grand canonical formalism reads

    P(E, N) = (1/Ξ) Ω_I(E, N) e^{−βE + βμN}.    (2.8.13)


2.8.5 Relation between thermodynamics and the grand partition function
The next task is to find the relation between what has been obtained and thermodynamics. We use the same idea as was used for the canonical partition function (2.3.5):

    Ξ = Σ_{E,N} Ω(E, N) e^{−βE+βμN} = Σ_{E,N} e^{−β(E − μN − T S(E,N))} ≈ e^{−β(E − μN − T S(E,N))},    (2.8.14)

where in the rightmost expression E and N are the most probable values of the energy and the particle number, respectively. The error in this approximation for log Ξ, which is of order N, is of order log N; the estimate is thus extremely accurate, just as before. From this we get the following relation:

    T S − E + μN = kB T log Ξ.    (2.8.15)

Using

    E = T S − P V + μN,    (2.8.16)

we finally conclude that

    P V/T = kB log Ξ.    (2.8.17)

P V/T is sometimes called Kramers' q-potential. From this the equation of state directly follows.
The Gibbs free energy can be easily obtained as

    G = Nμ,    (2.8.18)

as was already discussed thermodynamically (1.7.3). Thus the easiest way to compute the Helmholtz free energy from the grand partition function is via

    A = G − P V.    (2.8.19)

2.9 Ideal particle systems: quantum statistics

2.9.1 Specification of a microstate for indistinguishable particle systems
Let i (= 1, 2, ···) denote the i-th one-particle state, and n_i the number (occupation number) of particles in this state. Since all the particles are indistinguishable, the table of occupation numbers {n1, n2, ···} should be sufficient to specify an elementary microstate completely. Thus we may identify this table with the microstate.
Let ε_i be the energy of the i-th one-particle state. The total energy E and the total number of particles N can be written as

    E = Σ_{i=1}^∞ ε_i n_i,    (2.9.1)

and

    N = Σ_{i=1}^∞ n_i.    (2.9.2)

2.9.2 Grand partition function of an indistinguishable particle system
Let us compute the grand canonical partition function (2.8.12) for the system in 2.9.1:

    Ξ(β, μ) = Σ_{n1, n2, ···} e^{−βE + βμN}.    (2.9.3)

Using the microscopic descriptions of E and N ((2.9.1) and (2.9.2)), we can rearrange the summation as

    Ξ = Π_i Ξ_i,    (2.9.4)

where

    Ξ_i ≡ Σ_{n_i} exp[−β(ε_i − μ)n_i].    (2.9.5)

This quantity may be called the grand canonical partition function for the one-particle state i. As was warned in 2.6.5, the term 'state' is used here in the sense of 'mode' (the i-th mode is occupied by n_i particles).
2.9.3 There are only fermions and bosons in the world
In the world it seems that there are only two kinds of particles:
bosons: there is no upper bound for the occupation number;
fermions: the occupation number can be at most 1 (the Pauli exclusion principle).
This is an empirical fact. Electrons, protons, etc., are fermions, and photons,
phonons (= quanta of sound wave) are bosons.
There is the so-called spin-statistics relation that the particles with half odd integer spins are fermions, and those with integer spins are bosons. The rule applies also
to compound particles such as hydrogen atoms. Thus, H and T are bosons, but their
nuclei are fermions. D and 3 He are fermions. 4 He is a boson, and so is its nucleus.
A complication mentioned about the heat capacity of ideal classical gases in 2.7.4


is due to these differences.


2.9.4 Average occupation number for bosons
For bosons, any number of particles can occupy the same one-particle state, so

    Ξ_i = Σ_{n=0}^∞ e^{−β(ε_i − μ)n} = [1 − e^{−β(ε_i − μ)}]^{−1}.    (2.9.6)

The mean occupation number of the i-th state is given by

    ⟨n_i⟩ = Σ_{n=0}^∞ n e^{−β(ε_i − μ)n} / Ξ_i,    (2.9.7)

so we conclude

    ⟨n_i⟩ = kB T (∂ log Ξ_i/∂μ)_T = 1/(e^{β(ε_i − μ)} − 1).    (2.9.8)

This distribution is called the Bose-Einstein distribution.
Notice that the chemical potential μ must be smaller than the ground state energy to maintain the positivity of the average occupation number.
2.9.5 Average occupation number for fermions
For fermions no one-particle state can be occupied by more than one particle, so the sum over the occupation number is merely the sum for n = 0 and n = 1:

    Ξ_i = 1 + e^{−β(ε_i − μ)}.    (2.9.9)

Thus, the mean occupation number is given by

    ⟨n_i⟩ = kB T (∂ log Ξ_i/∂μ)_T = 1/(e^{β(ε_i − μ)} + 1).    (2.9.10)

This distribution is called the Fermi-Dirac distribution.
It is very important to recognize the qualitative features of the Fermi-Dirac distribution function (see Fig. 2.3).
2.9.6 Classical limit of the occupation number
In order to obtain the classical limit, we must take the occupation number → 0 limit to avoid quantum interference among particles (cf. 2.4.2). The chemical potential is a measure of the strength of the chemostat to push particles into the system. Thus, we must make the chemical potential extremely small: μ → −∞.

[Fig. 2.3 The expected occupation number vs. ε: the cliff at ε = μ has a width of order kB T, passes through 1/2 at ε = μ, and is symmetric around this point. μ is called the Fermi potential. The symmetry noted in the figure is the so-called particle-hole symmetry.]

In this limit both the Bose-Einstein (2.9.8) and Fermi-Dirac (2.9.10) distributions reduce to the Maxwell-Boltzmann distribution, as expected:

    ⟨n_i⟩ ≈ N e^{−βε_i},    (2.9.11)

where N = e^{βμ} is the normalization constant determined by the total number of particles in the system. (A numerical comparison follows.)
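A minimal numerical comparison of (2.9.8), (2.9.10) and (2.9.11); β and μ below are arbitrary illustrative values deep in the classical regime:

```python
import numpy as np

beta, mu = 1.0, -5.0
eps = np.array([0.0, 1.0, 2.0, 4.0])
x = beta * (eps - mu)

bose = 1.0 / np.expm1(x)               # Bose-Einstein (2.9.8)
fermi = 1.0 / (np.exp(x) + 1.0)        # Fermi-Dirac (2.9.10)
maxwell = np.exp(-x)                   # Maxwell-Boltzmann (2.9.11)

print(np.c_[bose, fermi, maxwell])
# All three columns agree to within ~0.7% because e^{beta(eps - mu)} >> 1 here.
```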

2.10 Free fermion gas

2.10.1 Free electron model of a metal
As an application of the Fermi-Dirac distribution (2.9.10), let us consider a free electron gas. Electrons are negatively charged particles, so the Coulomb interaction among them is very strong. However, in a metal, the positive charges on the lattice neutralize the total charge, and due to the screening effect even the charge density fluctuations do not interact strongly. Thus, the free electron model of metals is at least a good zeroth order model of real metals.
2.10.2 We can apply grand canonical theory to closed systems
We apply the grand canonical scheme to a piece of metal. Since this is not an open system, it might appear that the grand canonical scheme is not applicable. In the large system limit (thermodynamic limit), however, the use of the grand canonical scheme to compute thermodynamic quantities is fully justified mathematically.⁶

⁶ This is thanks to the so-called ensemble equivalence.


Intuitive understanding of this fact is not hard. A macroscopic piece of metal in


equilibrium should be uniform, so we can get all the thermodynamic properties of
the whole piece from a tiny (but still macroscopic) part. For this tiny portion, the
rest of the specimen acts as a reservoir. Or, we may rely on the typical statistical-mechanical assertion that the properties of a macroscopic system are virtually independent of the boundary conditions.
In a piece of metal the number of electrons N is fixed (or rather, the electron density is fixed), so a correct electron chemical potential must be chosen to be consistent
with this density.
2.10.3 Total number of particles
The total number of particles must be

    N = Σ ⟨n_i⟩.    (2.10.1)

The sum is over all the different one-particle states of the free electron. The state of a free electron is completely fixed by its spin (up or down) and its momentum p. We already know the density of states for free particles (cf. 2.4.3), so the density of states f for the free electron should read

    f(p) dp = 2V (4πp² dp/h³),    (2.10.2)

where p ≡ |p|, V is the volume of the system, and the factor 2 comes from the spin. It is convenient to write everything in terms of the energy

    ε = p²/2m,    (2.10.3)

where m is the mass of the electron. The density of states reads

    f(ε) dε = (4πV/h³)(2m)^{3/2} ε^{1/2} dε.    (2.10.4)

Thus, we have arrived at

    N = (4πV/h³)(2m)^{3/2} ∫₀^∞ [ε^{1/2}/(e^{β(ε−μ)} + 1)] dε,    (2.10.5)

which implicitly fixes μ as a function of T, V and N.


2.10.4 Free fermion system at T = 0
At T = 0, the Fermi-Dirac distribution is a step function:

    ⟨n(ε)⟩ = Θ(ε_F − ε),    (2.10.6)

where ε_F = μ is the Fermi energy. Therefore, it is easy to explicitly compute the integral in (2.10.5) to get

    N = (4πV/h³)(2m)^{3/2} (2/3) ε_F^{3/2},    (2.10.7)

or

    ε_F = (h²/2m)(3N/8πV)^{2/3}.    (2.10.8)

For ordinary metals, ε_F is of the order of a few eV (for copper 7.00 eV, for gold 5.90 eV). This kinetic energy corresponds to an electron speed of order 1% of the speed of light.
Exercise 1. Compute the internal energy at T = 0. (Answer: 3Nε_F/5; this is the lowest possible energy for the system. A numerical sketch follows.) □
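A numerical sketch around (2.10.8) and Exercise 1, assuming the standard conduction-electron density of copper (n ≈ 8.5 × 10²⁸ m⁻³, one electron per atom):

```python
import numpy as np

h = 6.62607015e-34; me = 9.1093837e-31; e = 1.602176634e-19
n = 8.5e28                           # conduction electrons per m^3 (assumed, Cu)

eps_F = (h**2 / (2 * me)) * (3 * n / (8 * np.pi)) ** (2.0 / 3.0)   # (2.10.8)
print(eps_F / e)                     # ~7.0 eV, matching the value quoted above
print(0.6 * eps_F / e)               # mean energy per electron at T = 0: 3 eps_F/5
v_F = np.sqrt(2 * eps_F / me)
print(v_F / 2.998e8)                 # ~0.005, i.e. about 0.5% of the speed of light
```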
2.10.5 Equation of state of the free fermion gas
The equation of state reads

    P V/kB T = log Ξ = ∫ dε f(ε) log(1 + e^{−β(ε−μ)}),    (2.10.9)

and the easiest way to obtain the Helmholtz free energy is through (2.8.19):

    A = Nμ − kB T ∫ dε f(ε) log(1 + e^{−β(ε−μ)}).    (2.10.10)
2.10.6 Specific heat of the fermion gas
Let us intuitively discuss the electronic heat capacity of metals at low temperatures. We may assume that the Fermi-Dirac distribution is (almost) a step function. From Fig. 2.3 in 2.9.5 we can infer that the number of thermally excitable electrons is ∝ N kB T/ε_F. Each carries an extra energy of order kB T, so their energetic contribution is E ∝ N (kB T)²/ε_F. Hence, CV ∝ T at low temperatures. At sufficiently low temperatures this linear term dominates the heat capacity of metals (where T³ ≪ T).

2.11 Free bosons and Bose-Einstein condensation

2.11.1 Total number of particles of a free boson gas
Let us take the ground state energy of the system to be the origin of energy.⁷ Then the chemical potential cannot be positive. The total number of particles in the system of free bosons is given by

    N = Σ_i 1/(e^{β(ε_i − μ)} − 1).    (2.11.1)

If T is sufficiently small, the terms corresponding to the energy levels near the ground state can become very large, so in general it is dangerous to approximate (2.11.1) by an integral with the aid of a smooth density of states (see Fig. 2.4). (In the case of fermions, each term cannot be larger than 1, so there is no problem at all with this approximation.)
[Fig. 2.4 Expected occupation number for T > Tc and for T < Tc (N0: ground-state condensate; N1: the rest). If T < Tc, where Tc is the Bose-Einstein condensation temperature (2.11.2), then the ground state is occupied by N0 = O[N] particles, so the approximation of (2.11.1) by an integral becomes grossly incorrect.]

2.11.2 Bose-Einstein condensation
Let us try to approximate (2.11.1) by integration in 3d. In this case the density of states has the form f(ε) = Cε^{1/2}, C being a constant independent of T:

    N ≥ N1 ≡ C ∫₀^∞ [ε^{1/2}/(e^{β(ε−μ)} − 1)] dε ≤ C(kB T)^{3/2} ∫₀^∞ [z^{1/2}/(e^z − 1)] dz.    (2.11.2)

The second equality holds when μ = 0. The integral is finite, so N1 can be made indefinitely close to 0 by reducing T. However, the system should have N bosons independently of T, so there must be a temperature Tc at which

    N = N1,    (2.11.3)

and below it

    N > N1.    (2.11.4)

This temperature is called the Bose-Einstein condensation temperature; below it the continuum approximation breaks down.

⁷ The ground state energy must be finite for a system to exist as stable matter.
What actually happens is that a macroscopic number N0 (= N − N1) of particles falls into the lowest-energy one-particle state (see Fig. 2.4 in 2.11.1). This phenomenon is called Bose-Einstein condensation.
2.11.3 Number of particles that do not undergo condensation
Now let us study N0, the number of particles in the ground state, more closely. From (2.11.2), we know that N1 is an increasing function of μ, but we cannot increase μ indefinitely; μ must be non-positive. Hence, at or below a certain particular temperature Tc, μ vanishes. Then the equality holds in (2.11.2), so that Tc is fixed by the condition

    N = N1 = C(kB Tc)^{3/2} ∫₀^∞ [z^{1/2}/(e^z − 1)] dz.    (2.11.5)

Hence, we get for T < Tc (Fig. 2.5)

    N1 = N (T/Tc)^{3/2}.    (2.11.6)

[Fig. 2.5 The ratio N1/N of non-condensate atoms rises as (T/Tc)^{3/2} and has a singularity at the Bose-Einstein condensation point Tc.]

Remark. There is no Bose-Einstein condensation in one- and two-dimensional spaces,
because the corresponding integral in (2.11.2) does not converge in these cases.
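As a numerical check of (2.11.5) and (2.11.6): the integral ∫₀^∞ z^{1/2}/(e^z − 1) dz equals Γ(3/2)ζ(3/2) ≈ 2.315, and the condensate fraction below Tc follows immediately. A minimal sketch (standard SciPy calls):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma, zeta

# The integral fixing Tc in (2.11.5): int_0^inf z^{1/2}/(e^z - 1) dz
val, _ = quad(lambda z: np.sqrt(z) / np.expm1(z), 0, np.inf)
print(val, gamma(1.5) * zeta(1.5))   # both ~2.315: the closed form checks out

# Condensate fraction below Tc from (2.11.6): N0/N = 1 - (T/Tc)^{3/2}
for t in (0.25, 0.5, 0.75, 1.0):     # t = T/Tc
    print(f"T/Tc = {t}: N0/N = {1 - t**1.5:.3f}")
```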
2.11.4 Internal energy and specific heat of ideal Bose gas
The Bose-Einstein condensate does not contribute to the internal energy, so we may use
the continuum approximation to compute the internal energy. Below Tc we may
set μ = 0, so
\[
E = \int_0^\infty d\varepsilon\, f(\varepsilon)\, \frac{\varepsilon}{e^{\beta\varepsilon} - 1}. \tag{2.11.7}
\]
Therefore, for T < Tc
\[
E \propto T^{5/2}. \tag{2.11.8}
\]
Of course, E must be extensive, so
\[
E = N k_B T_c \left(\frac{T}{T_c}\right)^{5/2} \times \mathrm{const.}, \tag{2.11.9}
\]
where the constant is, according to a more detailed computation, 3ζ(5/2)/2ζ(3/2) ≈ 0.77.
From this the low-temperature heat capacity is
\[
C_V \propto \left(\frac{T}{T_c}\right)^{3/2} \tag{2.11.10}
\]
(more precisely, C_V ≃ 1.93 N k_B (T/T_c)^{3/2}). This goes to zero with T, as required by the third law (1.11.1).

2.12 Phonons and photons

2.12.1 Phonons
Phonons are quanta of sound waves. A harmonic mode with angular frequency ω
has states with energy (see 2.4.1)
\[
\varepsilon_n = \left(n + \frac{1}{2}\right)\hbar\omega \quad (n = 0, 1, 2, \ldots). \tag{2.12.1}
\]
When a harmonic mode has energy ε_n, we say there are n phonons of that harmonic
mode. The canonical partition function of the phonon system has the same structure
as the grand canonical partition function of free bosons with a zero chemical potential
(see 2.4.1, if we ignore the contribution of the zero-point energy).⁸ Therefore, the
average number of phonons of a harmonic mode is given by
\[
\langle n \rangle = \frac{1}{e^{\beta\hbar\omega} - 1}. \tag{2.12.2}
\]

⁸ This is a formal mathematical relation; do not read too much physics into it.


2.12.2 Phonon contribution to internal energy

The phonon contribution to the internal energy of a system may be computed just
as we did for the Debye model (2.6.5). We need the density of states (i.e., the phonon
spectrum) f(ω). The internal energy of all the phonons is given by
\[
E = \sum_{\mathrm{modes}} \langle n(\omega) \rangle \hbar\omega = \int d\omega\, f(\omega)\, \frac{\hbar\omega}{e^{\beta\hbar\omega} - 1}. \tag{2.12.3}
\]
This is the internal energy without the contribution of the zero-point energy. The latter
contribution is a mere constant shifting the origin of energy, so it is thermodynamically irrelevant.
The approximation of the sum in (2.12.3) by an integral is always allowed, because
the factor ħω removes the difficulty encountered in the Bose-Einstein condensation.
The total number of phonons of a given mode diverges as ω → 0 (infrared catastrophe), but this is quite harmless, since these phonons do not carry much energy.
2.12.3 Photons
Photons are quanta of light, i.e., of the electromagnetic wave. Photons have spin 1, but they
travel at the speed of light, so the z-component of the spin takes only the values ±1. This corresponds to the polarization vector of light. As with phonons, there is no constraint on
the number of photons. Photons result from quantization of electromagnetic waves,
so mathematically they behave just as phonons do. Thus, formally, we may treat photons as bosons with two internal states and with μ = 0. Hence, the number of
photons of a given mode reads exactly the same as (2.12.2).
2.12.4 Planck's radiation law: derivation
Let f(ω)dω be the number of modes with angular frequency between ω and ω + dω
(the photon spectrum). The internal energy dE and the number dN of photons in
this range are given by
\[
dE = 2 f(\omega)\, \frac{\hbar\omega}{e^{\beta\hbar\omega} - 1}\, d\omega, \tag{2.12.4}
\]
\[
dN = 2 f(\omega)\, \frac{1}{e^{\beta\hbar\omega} - 1}\, d\omega. \tag{2.12.5}
\]
The factor 2 comes from the polarization states.
A standard way to obtain the density of states is to study the wave equation
governing the electromagnetic waves, but here we use a shortcut. The magnitude of
the photon momentum is p = ħk = ħω/c, so
\[
\int \frac{d^3p\, d^3q}{h^3} = \frac{V}{h^3}\, 4\pi p^2\, dp = \frac{V}{2\pi^2 c^3}\, \omega^2\, d\omega, \tag{2.12.6}
\]
i.e.,
\[
f(\omega) = \frac{V \omega^2}{2\pi^2 c^3}. \tag{2.12.7}
\]
Therefore, the photon energy per unit volume reads
\[
u(T, \omega)\, d\omega = dE/V = \frac{\hbar \omega^3}{\pi^2 c^3}\, \frac{1}{e^{\beta\hbar\omega} - 1}\, d\omega. \tag{2.12.8}
\]
This is called Planck's radiation law.


2.12.5 Planck's radiation law: qualitative features
It is important to know some qualitative features of this law (Fig. 2.6).

[Fig. 2.6: u(T, ω) vs. ω, comparing the Rayleigh-Jeans (green), Wien (red), and Planck (black) curves.]
Fig. 2.6 Classical electrodynamics gives the Rayleigh-Jeans formula (2.12.10) (green); this
is the result of equipartition of energy, and because there are so many UV modes the density is not
integrable (the total energy diverges). Wien reached (2.12.11) empirically (red). Planck arrived at
his formula (black) originally by interpolating these results. Notice that the peak position is
proportional to the temperature.

Planck's law can explain why the spectrum blue-shifts as the temperature increases; this
was not possible within the classical theory.
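The proportionality of the peak position to T can be checked directly: writing x = βħω, the maximum of the Planck density solves x = 3(1 − e^{−x}), whose root x ≈ 2.821 is quickly found numerically. A small sketch:

```python
import numpy as np
from scipy.optimize import brentq

# Peak of u(T, omega) ~ x^3/(e^x - 1) with x = hbar*omega/(k_B T):
# setting du/dx = 0 reduces to x = 3(1 - e^{-x}).
x_max = brentq(lambda x: 3 * (1 - np.exp(-x)) - x, 1.0, 5.0)
print(x_max)  # ~2.821, so hbar*omega_peak = 2.821 k_B T (Wien displacement)
```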
2.12.6 Total energy of radiation field
The total energy density u(T) of a radiation field is obtained by integration:
\[
u(T) = \int_0^\infty d\omega\, u(T, \omega). \tag{2.12.9}
\]
With Planck's law (2.12.8) this is always finite. If the limit ħ → 0 is taken (the
classical limit), we get
\[
u(T, \omega) = \frac{k_B T\, \omega^2}{\pi^2 c^3} \quad \bigl(= 2 f(\omega) k_B T / V\bigr), \tag{2.12.10}
\]
which is the formula obtained by classical physics. Upon integration, the classical
limit gives an infinite u(T). The reason for this divergence is obviously due to the


contribution from the high-frequency modes. Thus this difficulty is called the ultraviolet catastrophe, which destroyed classical physics. Empirically, Wien proposed
\[
u(T, \omega) \simeq \frac{\hbar \omega^3}{\pi^2 c^3}\, e^{-\beta\hbar\omega}. \tag{2.12.11}
\]
This formula can be obtained from Planck's law in the limiting case ħω ≫ k_B T.
Using Planck's law of radiation, we immediately get
\[
u(T) \propto T^4, \tag{2.12.12}
\]
which is called the Stefan-Boltzmann law. This was derived purely thermodynamically by Boltzmann before the advent of quantum mechanics (2.12.8). The proportionality constant contains ħ, so it was impossible to obtain the
factor theoretically at the time (Stefan obtained it experimentally).
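Indeed, integrating (2.12.8) over ω gives u(T) = (k_B⁴ T⁴/π²c³ħ³) ∫₀^∞ x³/(e^x − 1) dx, and the dimensionless integral equals π⁴/15, which is easy to confirm numerically; a quick check:

```python
import numpy as np
from scipy.integrate import quad

# Dimensionless Stefan-Boltzmann integral: int_0^inf x^3/(e^x - 1) dx = pi^4/15
val, _ = quad(lambda x: x**3 / np.expm1(x), 0, np.inf)
print(val, np.pi**4 / 15)  # both ~6.4939, hence u(T) = a T^4 with a containing hbar
```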
2.12.7 Radiation pressure
Photons may be treated as ideal bosons with μ = 0. If μ = 0, then A = −PV, so
the equation of state is immediately obtained as
\[
\frac{PV}{k_B T} = \log \Xi = \log Z = -\int d\omega\, f(\omega)\, \log\bigl(1 - e^{-\beta\hbar\omega}\bigr). \tag{2.12.13}
\]
The density of states has the form f(ω) = Cω², where C is a constant independent of T. We use
\[
\int d\omega\, C\omega^2 \log(1 - e^{-\beta\hbar\omega}) = \frac{C}{3}\int d\omega\, (\omega^3)' \log(1 - e^{-\beta\hbar\omega}) = -\frac{C}{3}\int d\omega\, \omega^3\, \frac{\beta\hbar}{e^{\beta\hbar\omega} - 1}, \tag{2.12.14}
\]
where we have used integration by parts. This implies that
\[
-\int d\omega\, f(\omega) \log(1 - e^{-\beta\hbar\omega}) = \frac{\beta}{3} \int d\omega\, f(\omega)\, \frac{\hbar\omega}{e^{\beta\hbar\omega} - 1}, \tag{2.12.15}
\]
or, with the aid of (2.12.3) and (2.12.13),
\[
\frac{PV}{k_B T} = \frac{\beta}{3} E. \tag{2.12.16}
\]
That is, we have obtained
\[
P = u(T)/3. \tag{2.12.17}
\]
Exercise 1. Derive the formula corresponding to (2.12.17) for D-dimensional space.


2.12.8 Thermodynamic derivation of Stefan-Boltzmann law

Now, assuming (2.12.17), we can derive the Stefan-Boltzmann law purely thermodynamically. Since A = −PV as noted in 2.12.7,
\[
A = E - TS = -\frac{V}{3}\, u(T). \tag{2.12.18}
\]
Since E = V u(T), we can get S as
\[
S = \frac{4V}{3T}\, u(T). \tag{2.12.19}
\]
We use (∂E/∂S)_V = T, or
\[
du(T) = T\, d\left(\frac{4u(T)}{3T}\right). \tag{2.12.20}
\]
That is,
\[
T\, du = 4u\, dT, \tag{2.12.21}
\]
which implies the Stefan-Boltzmann law (2.12.12).

2.13 Phase coexistence and phase rule

2.13.1 Coexistence of two phases

Consider a one-component fluid system consisting of two coexisting phases. The
system is isolated as a whole. The phase boundary allows exchange of energy, volume,
and particles. Then the maximum entropy principle (cf. 1.6.8) tells us that the
equilibrium conditions for these coexisting phases are
\[
T^{\mathrm{I}} = T^{\mathrm{II}}, \quad P^{\mathrm{I}} = P^{\mathrm{II}}, \quad \mu^{\mathrm{I}} = \mu^{\mathrm{II}}, \tag{2.13.1}
\]
where we use the usual symbols, and the superscripts denote the two phases I and II. We
can rewrite the last equality in (2.13.1) as
\[
\mu^{\mathrm{I}}(T, P) = \mu^{\mathrm{II}}(T, P). \tag{2.13.2}
\]
This functional relation determines a curve, called the coexistence curve, in the T-P
diagram.
Along this line
\[
G = N^{\mathrm{I}} \mu^{\mathrm{I}} + N^{\mathrm{II}} \mu^{\mathrm{II}}. \tag{2.13.3}
\]
Thus, without changing the value of G, any mass ratio of the two coexisting phases
is realizable (as we know well from water and ice).


2.13.2 Number of possible coexisting phases for a pure substance

How many phases can coexist at a given T and P? Suppose we have X coexisting
phases. The following conditions must be satisfied:
\[
\mu^{\mathrm{I}}(T, P) = \mu^{\mathrm{II}}(T, P) = \cdots = \mu^{X}(T, P). \tag{2.13.4}
\]
We believe that in the generic case the μ's are sufficiently functionally independent. To
be able to solve for T and P, we can allow at most two independent relations. That
is, at most three phases can coexist at a given T and P for a pure substance.
For a pure substance, if three phases coexist, T and P are uniquely fixed. This
point on the T-P diagram is called the triple point. The Kelvin scale of temperature
is defined so that the triple point of water is at T = 273.16 K; t = T − 273.15 is the
temperature in Celsius.

[Fig. 2.8: a generic T-P phase diagram; coexistence curves separate the phases, meet at the triple point, and the liquid-gas curve ends at the critical point.]
Fig. 2.8 A generic phase diagram for a pure fluid. You must be able to specify which zone
corresponds to which phase: solid, liquid, or gas. The phase diagram for water near T = 273 K
and 1 atm has a slight difference from this. What is it?
2.13.3 Gibbs phase rule
Consider the more general case of a system consisting of c chemically independent
components (i.e., the number of components we can change independently). For
example, H3O+ in pure water should not be counted, if we count H2O among the
independent chemical components.
Suppose there are φ coexisting phases. The equilibrium conditions are:
(1) T and P must be common to all the phases;
(2) the chemical potentials of the c chemical species must be common to all the
phases.
To specify the composition of a phase we need c − 1 variables, because we need only
the concentration ratios. Thus, the chemical potential of a chemical species depends
on T, P and c − 1 mole fractions, which are not necessarily common to all the phases.
That is, the μ's are functions of c + 1 variables, and we have 2 + φ(c − 1) unknown variables.
We have φ − 1 equalities among the chemical potentials in different phases for each
chemical species, so the number of equalities we have is (φ − 1)c. Consequently,
in the generic case we can choose f = 2 + φ(c − 1) − c(φ − 1) = c − φ + 2 variables
freely. This number f is called the number of thermodynamic degrees of freedom.
We have arrived at the Gibbs phase rule:
\[
f = c - \varphi + 2. \tag{2.13.5}
\]
As astute readers have probably sensed already, the derivation is not watertight.
Rigorously speaking, we cannot derive the phase rule from the fundamental laws of
thermodynamics.
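The bookkeeping in the rule is easy to mechanize; a toy sketch (the function name is mine):

```python
def gibbs_degrees_of_freedom(c: int, phases: int) -> int:
    """Thermodynamic degrees of freedom f = c - phi + 2 (Gibbs phase rule)."""
    f = c - phases + 2
    if f < 0:
        raise ValueError("more phases than can generically coexist")
    return f

print(gibbs_degrees_of_freedom(1, 1))  # 2: single-phase pure fluid, T and P free
print(gibbs_degrees_of_freedom(1, 2))  # 1: along a coexistence curve
print(gibbs_degrees_of_freedom(1, 3))  # 0: triple point, T and P fixed
```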
2.13.4 Clapeyron-Clausius relation
For a pure substance, as we have seen, the chemical potentials of coexisting phases
must be identical. Before and after the phase transition from phase I to II, or vice
versa, there is no change of the Gibbs free energy:
\[
\Delta G_{\mathrm{CC}} = 0, \tag{2.13.6}
\]
where CC means 'along the coexistence curve' and Δ implies the difference across
the coexistence curve (say, from phase I to phase II). This should be true even if we change
T and P simultaneously along the coexistence curve. Hence, along CC,
\[
-\Delta S\, dT + \Delta V\, dP = 0 \;\Longrightarrow\; \left(\frac{\partial P}{\partial T}\right)_{\mathrm{CC}} = \frac{\Delta S}{\Delta V}. \tag{2.13.7}
\]
At the phase transition ΔH = TΔS, where ΔH is the latent heat and T the phase
transition temperature. Thus, we can rewrite (2.13.7) as
\[
\left(\frac{\partial P}{\partial T}\right)_{\mathrm{CC}} = \frac{\Delta H}{T\, \Delta_{\mathrm{I \to II}} V}, \tag{2.13.8}
\]
where Δ_{I→II} X denotes X^{II} − X^{I}. This relation is called the Clapeyron-Clausius
relation.
If we may assume that one phase is an ideal gas phase, and if we ignore the volume
of the other phase, then
\[
\Delta V \simeq V_G = NRT/P, \tag{2.13.9}
\]
where N is the mole number of the substance. Therefore, we can integrate (2.13.8)
as
\[
P \propto \exp\left(-\frac{L}{NRT}\right), \tag{2.13.10}
\]
where L is the latent heat (heat of evaporation). This gives the vapor pressure of
the condensed phase.
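For instance, (2.13.10) implies P(T₂)/P(T₁) = exp[−(L/R)(1/T₂ − 1/T₁)] per mole of substance. A rough numerical sketch for water, assuming a T-independent molar heat of evaporation L ≈ 40.7 kJ/mol (a textbook value) and anchoring at the boiling point:

```python
import math

R = 8.314      # J/(mol K)
L = 40_660.0   # J/mol, heat of evaporation of water (assumed constant)

def vapor_pressure(T, T_ref=373.15, P_ref=101_325.0):
    """Integrated Clapeyron-Clausius estimate anchored at the boiling point."""
    return P_ref * math.exp(-(L / R) * (1.0 / T - 1.0 / T_ref))

print(f"{vapor_pressure(353.15):.0f} Pa")  # ~48 kPa at 80 C (measured: ~47.4 kPa)
print(f"{vapor_pressure(298.15):.0f} Pa")  # ~3.7 kPa at 25 C (measured: ~3.2 kPa)
```

The small overestimate at room temperature comes from treating L as constant; it actually grows as T decreases.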

2.14 Phase transition

2.14.1 Phase transition as singularity of free energy

When the free energy A becomes singular, e.g., when it becomes nondifferentiable
or ceases to have higher-order derivatives, we say the system exhibits a phase transition. When A itself becomes nondifferentiable, the phase transition is called a first-order phase transition. Other phase transitions are collectively called second-order
phase transitions, continuous phase transitions, or higher-order phase transitions.
Typical behavior of the Gibbs free energy G is illustrated below.

[Fig. 2.8: G vs. T for a pure substance, with solid (S), liquid (L), and gas (G) branches crossing at the melting and boiling points.]
Fig. 2.8 Typical behavior of the Gibbs free energy for a pure substance. The free energy loses
differentiability at first-order phase transition points.

First-order phase transitions often depend strongly on the details of the individual
material, so a general theory is hard to construct. For second-order phase transitions,
or critical phenomena, long-wavelength fluctuations become very important, so that
there are many features independent of the details of individual systems (microscopic
details). Thus, there is a nice set of general theories for second-order phase transitions.
2.14.2 Typical second order phase transition
A typical second-order phase transition is the one from the paramagnetic to the
ferromagnetic phase.
A magnet can be understood as a lattice of spins interacting with each other
locally in space. The interaction between two spins has a tendency to align them
in parallel. At higher temperatures, due to vigorous thermal motion, this interaction
cannot quite establish order among the spins, but at lower temperatures the entropic effect
becomes less significant, so the spins order globally. There is a special temperature Tc
below which this ordering occurs. We say an order-disorder transition occurs at this
temperature.
The Ising model is the simplest model of this transition. At each lattice point is
a (classical) spin which takes only +1 (up) or −1 (down). A nearest-neighbor spin


pair has the following interaction energy:
\[
-J \sigma_i \sigma_j, \tag{2.14.1}
\]
where J is called the coupling constant, which is positive in our example (the ferromagnetic case). We assume all the spin-spin interaction energies are superposable, so the
total energy of the system for a lattice is given by
\[
H = -\sum_{\langle i,j \rangle} J \sigma_i \sigma_j - \sum_i h \sigma_i, \tag{2.14.2}
\]
where ⟨i, j⟩ denotes the nearest-neighbor pairs, and h is the external magnetic field.
The partition function for this system reads
\[
Z = \sum_{\{\sigma_i = \pm 1\}} e^{-\beta H}. \tag{2.14.3}
\]
Here, the sum is over all spin configurations.


2.14.3 Necessity of thermodynamic limit
If the lattice size is finite, the sum in (2.14.3) is a finite sum of positive terms. Each
term in this sum is analytic in T and h, so the sum itself is analytic in T and
h. Furthermore, Z cannot be zero, because each term in the sum is strictly positive.
Therefore, its logarithm is analytic in T and h: the free energy of the finite lattice
system cannot exhibit any singularity. That is, there is no phase transition for this
system. Strictly speaking, there is no phase transition for any finite system, unless
each spin has infinitely many states.
Even in the actual systems we study experimentally, there are only a finite number
of atoms, but this number is huge. Thus, the question of phase transitions from the
statistical physics point of view is: is there any singularity in A in the large-system
limit? The large-system limit, with proper caution not to let the surface area grow
faster than order V^{2/3}, where V is the system volume, is called the thermodynamic limit. Strictly speaking, phase transitions can occur only in this limit.
2.14.4 Spatial dimensionality is crucial
For the existence of a phase transition, not only the system size but also the spatial
dimensionality of the system is crucial.
Let us consider the one-dimensional Ising model (the Ising chain), whose total energy
reads
\[
H = -J \sum_{-\infty < i < +\infty} \sigma_i \sigma_{i+1}. \tag{2.14.4}
\]


We have ignored the external magnetic field for simplicity. Compare the energies
of the following two spin configurations (+ denotes an up spin and − a down spin)
(Fig. 2.9):

++++++++++++++++++++++
++++++---------+++++++

Fig. 2.9 An Ising chain with a spin-flipped island of length L.

The bottom one has a larger energy than the top by 2J × 2 (one 2J for each end of
the down-spin island). However, this energy difference is independent of the size L of
the island. Therefore, so long as T > 0 there is a finite chance of making big down-spin islands amidst the ocean of up spins. If a down-spin island becomes large, there
is a finite probability for a large lake of up spins on it. This implies that no ordering
is possible for T > 0.
As you can easily guess, there is no ordered phase in any one-dimensional lattice
system with local interactions for T > 0.
2.14.5 In 2-space the Ising model exhibits a phase transition
Consider the two-dimensional Ising model with h = 0. Imagine there is an ocean of
up spins (Fig. 2.10). To make a circular down-spin island of radius L, we need an
energy of order 4πJL more than the completely ordered phase.

[Fig. 2.10: a down-spin island of radius L in an ocean of up spins.]
Fig. 2.10 Peierls' argument illustrated.

This energy depends on L, making the formation of a larger island harder. That is,
to destroy the global order we need a macroscopic amount of energy, so for sufficiently
low temperatures the ordered phase cannot be destroyed spontaneously. Of course,
small local islands can be made, but they never become very large. Hence, we may
conclude that a phase transition is possible for a two-dimensional system with local
interactions even at T > 0. The above argument is known as Peierls' argument,
and can be made rigorous.⁹

⁹ There is at least one more crucial factor governing the existence of a phase transition: the
spin dimension, i.e., the number of components of each spin. Ising spins cannot point in different
directions, only up or down (their spin dimension is 1). However, true atomic magnets can orient in
any direction (their spin dimension is 3). This freedom makes ordering harder. Actually, in 2D space
ferromagnetic ordering at T > 0 by spins with a larger spin dimension than Ising spins is impossible.

2.14.6 Long-range interactions

What happens if the range of interactions is not finite? Peierls' argument is still
applicable. Obviously, if each spin can interact with all the spins in the system uniformly, an ordered phase is possible even in one-dimensional space. If the coupling
constant J decays more slowly than 1/r², then an order-disorder phase transition is still
possible at a finite temperature in one-dimensional space.
Exercise 1. Intuitively explain the last statement.
We have learned that for phase transitions the system size, the dimensionality of space,
and the range of interactions are crucial.
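The two-dimensional case is easy to explore numerically. Below is a minimal Metropolis Monte Carlo sketch of the Ising model (2.14.2) with h = 0 (the lattice size, temperatures, and sweep count are illustrative choices, not tuned values). Starting from the ordered state, |m| stays near 1 well below Tc ≈ 2.27 J/k_B and decays toward 0 above it:

```python
import numpy as np

rng = np.random.default_rng(0)

def ising_metropolis(L=32, T=1.5, sweeps=400, J=1.0):
    """Single-spin-flip Metropolis dynamics for the 2D Ising model (h = 0)
    on an L x L lattice with periodic boundary conditions; k_B = 1 units."""
    s = np.ones((L, L), dtype=int)   # start fully ordered to shorten equilibration
    for _ in range(sweeps * L * L):  # one sweep = L*L attempted flips
        i, j = rng.integers(0, L, size=2)
        # Sum of the four nearest neighbors (periodic boundaries).
        nn = (s[(i + 1) % L, j] + s[(i - 1) % L, j]
              + s[i, (j + 1) % L] + s[i, (j - 1) % L])
        dE = 2.0 * J * s[i, j] * nn  # energy cost of flipping spin (i, j)
        if dE <= 0 or rng.random() < np.exp(-dE / T):
            s[i, j] = -s[i, j]
    return s

for T in (1.5, 3.0):  # below and above Tc = 2/log(1 + sqrt(2)) ~ 2.269
    m = abs(ising_metropolis(T=T).mean())
    print(f"T = {T}: |m| ~ {m:.2f}")   # ~1 (ordered) vs. small (disordered)
```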


Appendix 2A: Rudiments of probability

2.A.7 Introductory examples
Suppose we have a jar containing 5 white balls and 2 black balls. What is your degree
of confidence, on a 0-1 scale, that you will pick a white ball out of the jar? We
expect that, on average, 5 times out of 7 we will take out a white ball. Hence,
it is sensible to say that our confidence in the above statement is 5/7.
What do we mean when we say that there will be a 70% chance of rain tomorrow?
In this case, in contrast to the preceding example, we cannot repeat tomorrow again
and again. However, the meaning seems clear.
We conclude that the probability of an event should be a measure of confidence
in the occurrence of the event. A particular measure may be realistic or unrealistic,
but this is not a concern of probability theory. In probability theory we study the
consequences of the general abstract definition of probability.
2.A.8 Elementary events
An event which cannot (or need not) be analyzed further into a combination of events
is called an elementary event. For example, 'to rain tomorrow' is an elementary event
(if you only ask whether to prepare raingear), but 'to rain or to snow tomorrow' is a
combination of two elementary events.
Denote by Ω the totality of elementary events allowed in the situation or the
system under study. Any (compound) event under consideration can be identified
with a subset of Ω. When we say an event corresponding to a subset A of Ω occurs,
we mean that one of the elements in A occurs.
2.A.9 Probability is a volume of confidence
Let us denote the probability of A by P(A). Since probability should measure
the degree of our confidence on a 0-1 scale, we demand that
\[
P(\Omega) = 1; \tag{2.A.1}
\]
something must happen. Then, it is also sensible to assume
\[
P(\emptyset) = 0. \tag{2.A.2}
\]
Now, consider two mutually exclusive events, A and B. This means that whenever
an elementary event in A occurs, no elementary event in B occurs, and vice versa.
Hence, A ∩ B = ∅. Thus, it is sensible to demand
\[
P(A \cup B) = P(A) + P(B), \quad \text{if } A \cap B = \emptyset. \tag{2.A.3}
\]
For example, if you know that with 30% chance it will rain tomorrow and with 20%
chance it will snow tomorrow (excluding sleet), then you can be sure that with 50%


chance it will rain or snow tomorrow.

We already know quantities satisfying (2.A.3): length, area, and volume. Thus,
probability is a kind of volume measuring one's confidence.¹⁰
Remark. The concept of volume is abstracted as 'measure' in mathematics; a measure
with total mass equal to unity is called a probability measure.
Example 1. We throw three fair coins. Find the probability of having at least two
heads.
In this case the elementary events are the outcomes of one trial, say HHT (H = head,
T = tail). Thus there are 8 elementary events, and we have
\[
\Omega = \{\mathrm{HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}\}. \tag{2.A.4}
\]
The word 'fair' means that all elementary events are equally likely. Hence, the
probability of any elementary event should be 1/8. The event A, defined by having
at least 2 H's, is given by A = {HHH, HHT, HTH, THH}. Of course, elementary
events are mutually exclusive, so P(A) = 1/2.
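Such finite sample spaces can be enumerated directly; a tiny sketch of the coin example:

```python
from itertools import product

# All 8 elementary events for three fair coins, each equally likely.
omega = list(product("HT", repeat=3))
A = [w for w in omega if w.count("H") >= 2]  # at least two heads
print(len(A) / len(omega))                   # 0.5, as computed by hand
```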
2.A.10 Some rudimentary facts about probability
It is easy to check that
\[
P(A \cup B) \le P(A) + P(B), \tag{2.A.5}
\]
\[
A \subset B \Longrightarrow P(A) \le P(B). \tag{2.A.6}
\]
Denoting Ω \ A by A^c (the complement of A), we get
\[
P(A^c) = 1 - P(A). \tag{2.A.7}
\]

Example 1. There are r people in a room. What is the probability of having at
least two persons sharing the same birthday?
Let A_r be the event that there is at least one such pair. Then A_r^c = 'all the people
have distinct birthdays'. It is easier to compute P(A_r^c). Assume, for simplicity, that
one year consists of 365 days, and that the human birth rate is uniform throughout the
year. We get
\[
P(A_r^c) = \left(1 - \frac{1}{365}\right)\left(1 - \frac{2}{365}\right)\cdots\left(1 - \frac{r-1}{365}\right). \tag{2.A.8}
\]
This rapidly converges to 0: P(A_30) = 1 − P(A_30^c) ≃ 0.706, and P(A_50) ≃ 0.97.
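The product in (2.A.8) is one line of code; a quick check of the quoted numbers:

```python
def p_shared_birthday(r: int) -> float:
    """P(A_r): probability that at least two of r people share a birthday."""
    p_distinct = 1.0
    for k in range(1, r):
        p_distinct *= 1.0 - k / 365.0  # the k-th person avoids k earlier birthdays
    return 1.0 - p_distinct

print(f"{p_shared_birthday(30):.3f}")  # 0.706
print(f"{p_shared_birthday(50):.3f}")  # 0.970
```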
¹⁰ Why is such a subjective quantity objectively meaningful? Because our subjectivity is selected
to match objectivity through phylogenetic learning. Those whose subjectivity is not well matched
to objectivity have been selected out during the past 4 billion years.


2.A.11 Conditional probability

Suppose we know for sure that an elementary event in B has occurred. Under this
condition, what is the probability of the occurrence of the event A? For this we need the
concept of conditional probability. We write this conditional probability as P(A|B),
and define it through
\[
P(A \cap B) = P(A|B)\, P(B). \tag{2.A.9}
\]

2.A.12 Statistical independence

When the occurrence of an elementary event in a set (i.e., an event) A has nothing to do with that in a set B, we say the two events A and B are (statistically)
independent.¹¹ Since knowing about the event B does not help us obtain more
information about A if A and B are independent, we should have
\[
P(A|B) = P(A). \tag{2.A.10}
\]
It follows from (2.A.9) that
\[
P(A \cap B) = P(A|B)\, P(B) = P(A)\, P(B). \tag{2.A.11}
\]
We use this as the definition of the (statistical) independence of two events A and B:

Two events A and B are said to be (statistically) independent if
\[
P(A \cap B) = P(A)\, P(B). \tag{2.A.12}
\]
For example, when we throw two fair dice a and b and ask the probability for a
to show a number less than or equal to 2 (event A) and for b a number larger than
3 (event B), we only have to know the probability of each event, A = {1, 2} and
B = {4, 5, 6}. Thus, the answer is P(A) P(B) = 1/3 × 1/2 = 1/6.
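A brute-force check of this product rule over the 36 equally likely outcomes:

```python
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))         # 36 outcomes for dice (a, b)
AB = [(a, b) for a, b in outcomes if a <= 2 and b > 3]  # events A and B together
print(len(AB) / len(outcomes))                          # 1/6, = P(A) P(B)
```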
2.A.13 Expectation value
Suppose a probability P is given on a set of events Ω = {ω}, and there is an
observable F such that F(ω) is its value when the elementary event ω actually
occurs. The expectation value (= average) of F with respect to the probability P is
written as E_P(F) or ⟨F⟩_P and is defined by
\[
E_P(F) \equiv \langle F \rangle_P \equiv \sum_{\omega \in \Omega} P(\{\omega\})\, F(\omega). \tag{2.A.13}
\]
Often the suffix P is omitted. The sum often becomes an integration when we study
events which are specified by a continuous parameter.
¹¹ The concepts 'independent' and 'uncorrelated' should not be confused. 'Uncorrelated' often
means that the correlation vanishes.


2.A.14 Indicator
The indicator χ_A of a set A is defined by
\[
\chi_A(\omega) \equiv \begin{cases} 1 & \text{if } \omega \in A, \\ 0 & \text{if } \omega \notin A. \end{cases} \tag{2.A.14}
\]
Notice that
\[
\langle \chi_A \rangle_P = P(A). \tag{2.A.15}
\]
This is a very important relation for the computation of probabilities. The formula
implies that if we can write down an event as a set-theoretical formula, we can compute its probability by summation (or, more generally, by integration).
2.A.15 Generating function
It is often convenient to introduce a generating function of a probability distribution P with respect to an observable F:
\[
\Phi(t) \equiv \sum_{\omega \in \Omega} e^{tF(\omega)}\, P(\omega). \tag{2.A.16}
\]
From this definition follows
\[
\left.\frac{d \log \Phi(t)}{dt}\right|_{t=0} = \langle F \rangle_P. \tag{2.A.17}
\]
That is, if we know the generating function, we can compute the expectation value
by differentiation.
2.A.16 Variance measures fluctuation
δF ≡ F − ⟨F⟩_P describes the fluctuation around the expectation value, and ⟨δF²⟩_P is a
measure of fluctuation called the variance of F:
\[
\langle \delta F^2 \rangle_P = \langle F^2 \rangle_P - \langle F \rangle_P^2. \tag{2.A.18}
\]
Since
\[
\frac{d}{dt}\left(\frac{1}{\Phi}\frac{d\Phi}{dt}\right) = -\left(\frac{1}{\Phi}\frac{d\Phi}{dt}\right)^2 + \frac{1}{\Phi}\frac{d^2\Phi}{dt^2} \;\xrightarrow{t=0}\; -\langle F \rangle_P^2 + \langle F^2 \rangle_P, \tag{2.A.19}
\]
we get
\[
\left.\frac{d^2}{dt^2} \log \Phi(t)\right|_{t=0} = \langle F^2 \rangle_P - \langle F \rangle_P^2. \tag{2.A.20}
\]


Example 1. Let {X_i}_{i=1}^N be a set of independently and identically distributed
(iid¹²) random variables. Their expectation value is M and their variance is V.
The expectation value of the sum Y_N ≡ Σ_{i=1}^N X_i is given by NM, and its variance
by NV. Therefore, the relative fluctuation of Y_N, defined as √⟨δY_N²⟩/⟨Y_N⟩, is
√(NV)/NM ∝ 1/√N. This implies that Y_N clusters relatively more tightly as N
increases. This is the reason for the law of large numbers.

¹² This is a standard abbreviation.
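The 1/√N decay of the relative fluctuation is easy to see in a simulation; a minimal sketch with uniform random variables (M = 1/2, V = 1/12, so the relative fluctuation is exactly 1/√(3N)):

```python
import numpy as np

rng = np.random.default_rng(0)

for N in (100, 1_000, 10_000):
    # 1000 independent realizations of Y_N = X_1 + ... + X_N, X_i ~ Uniform(0, 1)
    Y = rng.random((1000, N)).sum(axis=1)
    rel = Y.std() / Y.mean()   # sample estimate of the relative fluctuation
    print(f"N = {N:>6}: {rel:.2e}  (theory 1/sqrt(3N) = {1/np.sqrt(3*N):.2e})")
```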


Appendix 2B: Rudiments of combinatorics

In statistical mechanics we must be able to compute the number of elementary events
(i.e., microscopic events) under various constraints, so we should know the rudiments of
combinatorics. C. L. Liu, Introduction to Combinatorial Mathematics (McGraw-Hill)
is a nice introduction to the subject with many (practical) examples.
2.B.17 Sequential arrangement of distinguishable objects: nPr
Suppose there is a set of n distinguishable objects. How many ways are there to make
sequential arrangements of r objects taken from this set? This number is denoted
by nPr ≡ P(n, r).
There are two ways to get an explicit formula for this number.
(i) There are n ways to select the first object. To choose the second object,
there are n − 1 ways, because we have already taken out the first one. Here, the
distinguishability of each object is crucial. In this way we arrive at
\[
P(n, r) = n(n-1)\cdots(n-r+1) = \frac{n!}{(n-r)!}, \tag{2.B.1}
\]
where n! = 1 · 2 · 3 ··· (n−1) · n; 'n factorial' is the number of ways n distinguishable
objects can be arranged in a sequence.
(ii) This derivation is an interpretation of the rightmost formula in (2.B.1). For each
arrangement of r objects in a linear order, there are (n − r)! ways to complete an
arrangement of all n objects. The total number of ways of arranging n objects is
n!, so we must factor out (n − r)!.
2.B.18 Selection of distinguishable objects: binomial coefficient
Under the same distinguishability condition, we now disregard the order in the arrangement of the r objects. That is, we wish to answer the question: how many ways
are there to choose r objects from a set of n distinguishable objects?
Since we disregard the ordering within each arrangement of r objects, the answer
should be
\[
{}_nC_r \equiv \binom{n}{r} \equiv \frac{{}_nP_r}{r!} = \frac{n!}{(n-r)!\, r!}. \tag{2.B.2}
\]
The number \(\binom{n}{r}\) is called the binomial coefficient, for a reason made clear by (2.B.6).
Exercise 1. Show the following equalities and give combinatorial explanations:
\[
{}_nP_r = \binom{n}{r}\, {}_rP_r, \tag{2.B.3}
\]
\[
\binom{n}{r} = \binom{n-1}{r-1} + \binom{n-1}{r}. \tag{2.B.4}
\]
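Python's standard library has these counts built in (math.perm and math.comb), so the identities can be spot-checked directly:

```python
import math

n, r = 10, 4
assert math.perm(n, r) == math.comb(n, r) * math.perm(r, r)              # (2.B.3)
assert math.comb(n, r) == math.comb(n - 1, r - 1) + math.comb(n - 1, r)  # (2.B.4)
print(math.perm(n, r), math.comb(n, r))  # 5040 210
```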
2.B.19 Multinomial coefficient
Suppose there are k species of particles, with q_i particles of the i-th species.
We assume that particles of the same species are not distinguishable. The total
number of particles is n ≡ Σ_{i=1}^k q_i. How many ways are there to arrange these
particles in a one-dimensional array?
If we assume that all the particles are distinguishable, the answer is n!. However,
the particles of the same species cannot be distinguished, so we need not worry which
particle of the i-th species is chosen first. Hence, we have overcounted the number of ways by the
factor q_i! for the i-th species. The same holds for all species. Thus we arrive
at
\[
\frac{n!}{q_1!\, q_2! \cdots q_{k-1}!\, q_k!}. \tag{2.B.5}
\]
This is called the multinomial coefficient (2.B.21).

2.B.20 Binomial theorem

Consider the n-th power of x + y. There exists an expansion formula called the
binomial expansion:
\[
(x + y)^n = \sum_{r=0}^n \binom{n}{r}\, x^{n-r} y^r. \tag{2.B.6}
\]
This can be seen easily as follows. We wish to expand the product of n factors of (x + y):
\[
\underbrace{(x+y)(x+y)(x+y)\cdots(x+y)}_{n}. \tag{2.B.7}
\]
As an example, take the term x² y^{n−2}. To produce this term by expanding the above
product, we must choose 2 x's from the n factors (x + y). There are \(\binom{n}{2}\) ways to do this, so the
coefficient must be \(\binom{n}{2}\).
2.B.21 Multinomial theorem
There is a generalization of (2.B.6) to the case of more than two variables, called
the multinomial expansion. It can be understood from (2.B.5):
\[
(x_1 + x_2 + x_3 + \cdots + x_m)^n = \sum_{q_1 + q_2 + \cdots + q_m = n,\; q_i \ge 0} \frac{n!}{q_1!\, q_2! \cdots q_m!}\, x_1^{q_1} x_2^{q_2} \cdots x_m^{q_m}. \tag{2.B.8}
\]

2.B.22 Arrangement of indistinguishable objects in distinguishable boxes

Consider n indistinguishable objects. We wish to distribute them into r distinguishable boxes. How many distinguishable arrangements can be made?
Since the boxes are distinguishable, we arrange them in a fixed sequence, and then
distribute the indistinguishable objects.

[Fig. 2B.1: n indistinguishable balls on a line, separated into boxes by r − 1 bars.]

Hence, the problem is equivalent to counting the number of arrangements of n indistinguishable balls and r − 1 indistinguishable bars on a line (Fig. 2B.1). Apply
(2.B.5) to obtain the answer:
\[
\binom{n + r - 1}{n} = \frac{(n + r - 1)!}{n!\, (r - 1)!}. \tag{2.B.9}
\]
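A brute-force enumeration confirms (2.B.9) for small cases (the balls-and-bars count equals the number of ways to write n as an ordered sum of r non-negative integers):

```python
import math
from itertools import product

def count_arrangements(n: int, r: int) -> int:
    """Count solutions of q_1 + ... + q_r = n with q_i >= 0 by brute force."""
    return sum(1 for q in product(range(n + 1), repeat=r) if sum(q) == n)

for n, r in [(5, 3), (4, 4), (6, 2)]:
    assert count_arrangements(n, r) == math.comb(n + r - 1, n)
    print(n, r, math.comb(n + r - 1, n))  # 21, 35, 7
```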

Index

absolute entropy, 36
absolute temperature, 18
adiabat, 17
adiabatic cooling, 36
adiabatic demagnetization, 36
binomial coefficient, 87
binomial expansion, 88
binomial theorem, 88
Boltzmann constant, 41
Boltzmann's principle, 41
Bose-Einstein condensation, 70
Bose-Einstein distribution, 65
boson, 64
canonical partition function, 45
Carathéodory's principle, 16
chemical potential, 62
Clapeyron-Clausius relation, 77
classical gas, 48
Clausius inequality, 20, 22
Clausius' law, 14
closed system, 10
coexistence curve, 75
compound system, 8
conditional probability, 84
conjugate variables, 42
convex analysis, 25
Debye model, 56
Dulong-Petit's law, 54
Einstein model, 54
elementary event, 82
ensemble equivalence, 66
entropic elasticity, 34
entropy, 18-20
entropy maximization principle, 21
equilibrium state, 6
equipartition of energy, 54
event, 82
evolution criterion, 20
expectation value, 84
extensive quantity, 13
Fermi energy, 68
Fermi-Dirac distribution, 65
fermion, 64
ferromagnetic phase transition, 78
first law, 11
first order phase transition, 78
fluctuation, 85
fourth law, 14, 26
free electron gas, 66
fundamental equation (of state), 58
generating function, 85
Gibbs, 5
Gibbs free energy, 25, 63
Gibbs paradox, 51
Gibbs phase rule, 77
Gibbs relation, 19
Gibbs-Boltzmann distribution, 45
Gibbs-Duhem relation, 26
Gibbs-Helmholtz formula, 46
grand canonical partition function, 62
grand partition function, 62
heat, 11
Helmholtz free energy, 24, 46, 63
ideal gas, 58
independent events, 84
indicator, 85
intensive quantity, 13
internal degrees of freedom, 59
internal energy, 11
Ising model, 78
Jacobian technique, 27
Kelvin's law, 15
Kramers' q, 63
laws of thermodynamics, 6
Le Chatelier's principle, 31
Le Chatelier-Braun's principle, 32
Legendre transformation, 24
macroscopic object, 6
mass action, 11
Maxwell distribution, 60
Maxwell's relation, 27
Maxwell's relation in terms of Jacobians, 29
measure, 83
multinomial coefficient, 88
multinomial expansion, 88
multinomial theorem, 88
Nernst's law, 36
open system, 60
order-disorder phase transition, 78
Peierls' argument, 80
phase coexistence condition, 24
phase space, 39, 52
phase transition, 78
phonon, 71
photon, 72
Planck's law, 15
Planck's radiation law, 73
principle of equal probability, 40
probability measure, 83
probability theory, 82
quasistatic process, 8
reservoir, 21
rubber band, 33
Schottky defect, 42
second law, 14
second order phase transitions, 78
sign convention, of energy exchange, 12
simple system, 8
spin-statistics relation, 64
stability condition, 20
state function, 9
Stefan-Boltzmann law, 74, 75
Stirling's formula, 43
temperature, 7
thermal equilibrium, 7
thermodynamic conjugate pair, 13
thermodynamic coordinates, 8
thermodynamic degrees of freedom, 77
thermodynamic limit, 66, 79
thermodynamic space, 8
third law, 36, 54, 59, 71
triple point, 76
variance, 85
work coordinates, 8
Young's theorem, 27
zeroth law, 7
