
Lectures on Statmech 402b 2009


R. Shankar. Not for duplication or circulation.

The problem

Imagine a system with a very large number of degrees of freedom, typically 10^{23}. First let this be a gas of molecules. At a microscopic level, the complete state of the system is given by listing the (q, p) for each molecule. As time progresses these change according to Hamilton's equations.
We are neither interested in, nor capable of, following the system at this level of detail. Instead we will try to describe it by some gross macroscopic variables. In this case it can be the volume V, energy U and number N. A question we can ask is: what will be the pressure exerted by the gas for this value of (U, V, N)?
If one side of the container is a piston with weights on it, the pressure is due to molecules bouncing off it. The force they exert will be rapidly varying and uneven over very short times (at a molecular level). What we are after is the average over a period of time that is short on the macroscopic scale but long on the microscopic scale.
The answer we will get on a given sample upon making a pressure measurement at some
given time thus depends on the microscopic situation. Suppose we had a rule for calculating
the odds for every microscopic situation. We could then give the probabilities for getting the
corresponding value of pressure. We could find the average of these numbers and also the
variance. Recall that if an experiment can have several outcomes labeled i with normalized probabilities p_i, i.e.,

\sum_i p_i = 1    (1)

the average value for a variable v is

\bar{v} = \sum_i p_i v_i    (2)

where v_i is the value v takes when the system is in state i, and the root-mean-square or rms deviation of v is

\Delta v = \sqrt{\sum_i p_i (v_i - \bar{v})^2}    (3)

In case \bar{v} = 0, \Delta v = \sqrt{\overline{v^2}}. A sharp distribution will have a small \Delta v.


Thus it appears that even if we knew all the p_i's we could only give the odds for what
we would observe on a given system. The mean would correspond to the average over many
measurements of the same system or over many many systems with the same (U, V, N ). The
latter collection of systems is called the microcanonical ensemble. In general a microcanonical
ensemble corresponding to an isolated system is made of a large or infinite number of systems
with the same macroscopic parameters but uniformly distributed over all possible microscopic
configurations consistent with these. Note that an ensemble is a mental concept used to define
averages and does not exist as a real entity.
Now recall the relation from the thermodynamics of an ideal gas:

U = \frac{3}{2}PV \quad\mbox{or}\quad P = \frac{2U}{3V}    (4)


which suggests that U and V determine the pressure uniquely. How do we reconcile this with the fact that P is a statistical variable with a mean and fluctuations of size ΔP around the mean?
The answer will turn out to be that P will have a very sharp probability distribution, so that any significant deviations from the mean are utterly unlikely (but not impossible). If we neglect the fluctuations around the mean we regain thermodynamics.
To see how this happens we first need the probabilities for various possible states of the
gas. This cannot be computed, but is postulated. It is the only postulate we will need and
it applies to gases and non-gases alike.
Postulate: Every allowed microscopic state of an isolated system will occur
with the same probability.
For the gas this means that every configuration of molecules has equal weight as long as
there are N molecules with energy U (since the particle number and energy of an isolated
system are conserved) and the molecular coordinates do not leave the box.
Using this postulate we will regain Equation 4, but where P will stand for P̄, the average. Besides reproducing thermodynamics as an average result, statistical mechanics also predicts that P will vary by small amounts, and calculates the variation ΔP. This in turn will come from the fact that most of the microscopic configurations will yield nearly the same pressure P̄, with a few outliers yielding slightly different numbers.
To understand statistical mechanics, it is better, for pedagogical reasons, to consider a system simpler than a gas, to which we now turn our attention.

The magnetic system

Consider a line of atomic magnets labeled 1, 2, ..., N, where N is assumed to be even. Unlike real magnets, which can point in any direction, these will be allowed to point only up or down an applied magnetic field B, with magnetic moment μ. We shall associate with these two sign choices a dimensionless Ising spin s_i = ±1. The Ising spin i has an energy

\epsilon_i = -\mu s_i B.    (5)

The system as a whole can have total spin

s = \sum_{i=1}^{N} s_i    (6)

which runs between ±N. The corresponding total energy U = -\mu s B runs between ∓NμB:

U = -N\mu B,\ -(N-2)\mu B,\ \ldots,\ +N\mu B    (7)

as s runs between ±N. Note that if we flip one up spin to down, the total spin drops by 2, so that (N being even) the only allowed values for the total spin are even.
Let us consider the case B = 0. Each spin still has two values s_i = ±1, but all 2^N microstates have the same energy U = 0, and all are accessible and equally probable. This is however not true for the total magnetic moment

M = \mu\sum_i s_i    (8)


which is a macroscopic observable, that is, one that can be measured with macroscopic
probes like induction coils. Let us see why M does not have a flat probability distribution.
To the macrostate with M = Nμ there corresponds just one microstate, with all spins up. If we consider M = (N - 2)μ, obtained by flipping one spin, there are N candidates, depending on which spin we choose for flipping. If we flip two spins there are N choices for the first, N - 1 for the second, leading to a total of N(N - 1)/2 choices, where we divide by 2 because choosing say spin number 19 to flip first and spin number 32 to flip next gives the same state we get by choosing them in the opposite order. In general if we flip n spins the number of ways is

\Omega(N, n) = \frac{N(N-1)\cdots(N-n+1)}{n!}    (9)

where we divide by n!, the number of permutations among the flipped spins which differ only in the order in which they are chosen, which is irrelevant to the microstate in question. Multiplying top and bottom of Eqn. 9 by (N - n)! we obtain a nicer looking result

\Omega(N, n) = \frac{N!}{n!(N-n)!}    (10)

It is easily seen that Ω(N, n) initially grows very rapidly with n, since the more spins we flip, the more ways there are to choose them. This changes when n crosses N/2, and we begin to approach the state with all spins down. Indeed there is just one way to have all spins flipped down. This symmetry under the exchange n ↔ N - n is evident in Eqn. 10. When n spins are flipped the total magnetic moment is

M(n) = (N - 2n)\mu    (11)

and there are Ω(N, n) ways for this to happen. Thus although every microstate is equally likely, macrostates defined by M are not equally likely: states with M ≈ 0, where the ups nearly equal the downs, are far more likely than states with M ≈ ±Nμ.
Consider the case of N = 100. Of the total of 2^{100} states, the single macrostate M = 0, with multiplicity

\Omega(100, 50) = \frac{100!}{50!\,50!}    (12)

accounts for roughly 8%, and states with -4μ ≤ M ≤ +4μ together account for about 38%. If we include in addition macrostates with M lying within 10% of the maximum moment Nμ, they occur with 73% probability. We still see a 10% deviation from the mean with a 27% chance because N is not really large.
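These percentages are easy to check directly. The following small Python sketch is an addition to these notes, not part of them; it simply counts binomial configurations for N = 100:

```python
from math import comb

N = 100
total = 2**N

# probability of the single macrostate M = 0 (50 up, 50 down)
p_M0 = comb(N, 50) / total                                        # ~0.08

# probability that |M| <= 4*mu, i.e. 48..52 spins up
p_within_4 = sum(comb(N, k) for k in range(48, 53)) / total       # ~0.38

# probability that |M| <= 0.1*N*mu, i.e. 45..55 spins up
p_within_10pct = sum(comb(N, k) for k in range(45, 56)) / total   # ~0.73

print(p_M0, p_within_4, p_within_10pct)
```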
To see a really sharp distribution encountered in real thermodynamic systems, we need to consider N of the order of Avogadro's number, 10^{23}. For this it is useful to invoke Stirling's formula:

\ln N! = N\ln N - N + \ldots    (13)

which works for large N. A simple way to understand this formula is to replace the sum by an integral as follows:

\ln N! = \ln N + \ln(N-1) + \cdots + \ln 1 \simeq \int_1^N \ln x\, dx = N\ln N - N + 1 \simeq N\ln N - N    (14)
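As a quick numerical sanity check (again an added sketch, not in the original notes), one can compare ln N! with N ln N - N and watch the relative error shrink as N grows:

```python
from math import lgamma, log

# lgamma(N + 1) = ln(N!) without overflow
for N in (10, 100, 1_000, 1_000_000):
    exact = lgamma(N + 1)
    stirling = N * log(N) - N
    print(N, exact, stirling, (exact - stirling) / exact)
```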


We will often use a similar result:

\sum_{n=n_1}^{n_2} f(n) \simeq \int_{n_1}^{n_2} f(x)\, dx    (15)

where on the LHS f(n) is defined only for integer n and on the right hand side x is continuous. The LHS is a sum of areas of rectangles of base 1 and height f(n), while on the right hand side we have the area under a function f(x) which varies continuously with x and coincides with f(n) at integer values of x. If f varies smoothly with n, the two areas will be close. However, before we can do that we need to be able to extend the function of n, defined for integer n, to a function where n is continuous. This is easy if, say, f(n) = n^2, but not if f(n) = n!. (Not easy, but not impossible, as we will see when we learn about the Gamma function.) Luckily the Stirling approximation ln n! ≈ n ln n - n allows us to let n be continuous in n ln n - n.
Continuing,

\ln\Omega(N, n) \simeq \ln\left(\frac{N!}{n!(N-n)!}\right) = N\ln N - N - n\ln n + n - (N-n)\ln(N-n) + N - n = N\ln N - n\ln n - (N-n)\ln(N-n).    (16)

Treating n as a continuous variable, we find ln Ω reaches its maximum when

\frac{d\ln\Omega(N, n)}{dn} = -\ln n + \ln(N-n) = 0    (17)

which gives the most probable value for n as

n^* = \frac{N}{2}    (18)

as anticipated. The maximum value of Ω is then

\Omega(N, n^*) = \exp\left[N\ln N - 2\,\frac{N}{2}\ln\frac{N}{2}\right] = \exp[N\ln 2] = 2^N    (19)

Note that the number of states associated with n^* = N/2 is 2^N, which is all of the states! This is of course not exactly correct and comes from the Stirling approximation. It does however show something real: that the most probable configuration and its immediate neighbors essentially saturate the sum over states. To see this clearly let us expand ln Ω in a Taylor series around n^* = N/2, writing n = n^* + s/2. We write the deviation from n^* as s/2 since the corresponding total spin is s.

\ln\Omega(N, n^* + s/2) = N\ln 2 + \left.\frac{d\ln\Omega(N,n)}{dn}\right|_{n=n^*=\frac{N}{2}}\left(\frac{s}{2}\right) + \frac{1}{2}\left.\frac{d^2\ln\Omega(N,n)}{dn^2}\right|_{n=n^*=\frac{N}{2}}\left(\frac{s^2}{4}\right) + \ldots    (20)

The linear term is zero (since we are at a maximum) and the quadratic term (with which we stop) has a coefficient

\frac{1}{2}\left.\frac{d^2\ln\Omega(N,n)}{dn^2}\right|_{n=n^*=\frac{N}{2}} = -\frac{2}{N}    (21)


We have, to this approximation (upon exponentiating the expression for ln Ω),

\Omega(N, n^* + s/2) = 2^N e^{-s^2/2N}    (22)

This formula tells us how Ω falls with s: beyond |s| ≈ √N, the exponential dies rapidly. Now a width Δs ≈ √N seems like a large value for the deviation of s, but not if we compare it to its maximum value of N. Indeed

\frac{\Delta s}{N} \simeq \frac{1}{\sqrt{N}}.    (23)

This is an approximate formula, since the Stirling approximation has been made. However it works very well. Consider \overline{s^2}. Its exact value is

\overline{s^2} = \frac{\sum_{s=-N}^{N} s^2\,\frac{N!}{\left(\frac{N-s}{2}\right)!\left(\frac{N+s}{2}\right)!}}{\sum_{s=-N}^{N}\frac{N!}{\left(\frac{N-s}{2}\right)!\left(\frac{N+s}{2}\right)!}}    (24)

Using Mathematica we can evaluate it and find that it equals N, for say N = 50.
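The same check takes a few lines of Python (an added sketch; the sum runs over values of s with the same parity as N, so the factorials are of integers):

```python
from math import comb

N = 50
# multiplicity of total spin s: number of ways to have (N+s)/2 up spins
weights = {s: comb(N, (N + s) // 2) for s in range(-N, N + 1, 2)}
norm = sum(weights.values())                         # equals 2**N
s2_mean = sum(s * s * w for s, w in weights.items()) / norm
print(s2_mean)   # 50.0, i.e. equal to N
```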


If we use the approximate form and treat s as continuous we find

\overline{s^2} = \frac{\int_{-\infty}^{\infty} s^2\, 2^N e^{-s^2/2N}\, ds}{\int_{-\infty}^{\infty} 2^N e^{-s^2/2N}\, ds} = N    (25)

where we have taken the limits to ±∞ since the Gaussian function falls off very rapidly from its maximum and because with these limits we can do the integral exactly. The denominator is needed because the probability distribution in the numerator is not normalized. This exact agreement between the sum and the integral using the Stirling approximation is an accident. For example in the case of \overline{s^4} there are small differences between the approximate and exact results: for N = 50, (\overline{s^4})^{1/4} computed the two ways comes out in the ratio of 1.0302.
Since you may not be familiar with this integral we digress to consider it and a related
one.
First we need the result

I(\alpha) = \int_{-\infty}^{\infty} e^{-\alpha x^2}\, dx = \sqrt{\frac{\pi}{\alpha}}    (26)

which is shown as follows:

I(\alpha)\, I(\alpha) = \int_{-\infty}^{\infty} e^{-\alpha x^2}\, dx \int_{-\infty}^{\infty} e^{-\alpha y^2}\, dy    (27)
 = \int_0^{\infty}\!\!\int_0^{2\pi} e^{-\alpha r^2}\, r\, dr\, d\theta    (28)
 = 2\pi\int_0^{\infty} e^{-\alpha z}\,\frac{dz}{2}    (29)
 = \frac{\pi}{\alpha}    (30)


where we have made two copies of I with two different dummy variables, changed to polar coordinates, and gone from r to z = r^2. Eqn. 26 follows upon taking square roots of both sides.
Differentiating both sides of Eq. 26 with respect to α we find

\int_{-\infty}^{\infty} x^2 e^{-\alpha x^2}\, dx = \frac{1}{2}\sqrt{\frac{\pi}{\alpha^3}}    (31)

You can clearly differentiate more times to get integrals involving higher even powers of x.
Returning to our problem, all this means

\int_{-\infty}^{\infty} e^{-s^2/2N}\, ds = \sqrt{2\pi N}    (32)

and

\int_{-\infty}^{\infty} s^2 e^{-s^2/2N}\, ds = N\sqrt{2\pi N}    (33)

The mean value of s is of course zero by symmetry, and the average of s^2 is

\overline{s^2} = \frac{1}{\sqrt{2\pi N}}\int_{-\infty}^{\infty} s^2 e^{-s^2/2N}\, ds = N    (34)

which means the rms deviation is

\Delta s = \sqrt{N}    (35)

and

\frac{\Delta s}{N} = \frac{1}{\sqrt{N}}.    (36)

This simple example illustrates how statistical mechanics really works. An experimentalist who has a macroscopic sample of Ising magnets wants to know what the total moment of the system will be when probed. It could be any number between ±Nμ, and we can only give odds. Usually when this is so, we need to repeat the experiment many times to verify that the probability distribution is correct. However, in the case of statistical mechanics, the probability distribution is so sharply peaked around the macroscopic variable that we can confront the mean with what is going to be measured on an individual system in the lab.
The difference between statistical mechanics and thermodynamics is that the state of
equilibrium in thermodynamics is unchanging and described by just the mean, while in
statistical mechanics the equilibrium state includes fluctuations which are calculable.

Enter entropy

In thermodynamics the entropy S was introduced as a state variable the change in which is given by

\Delta S = \frac{\Delta Q_{rev}}{T}    (37)

where the subscript in Q_{rev} reminds us the heat transfer must be reversible, and done arbitrarily slowly, keeping the system close to equilibrium. The second law said

For an isolated system, ΔS ≥ 0.

An example was the free expansion of a gas from V_1 to V_2, which was associated with an increase

\Delta S = N k\ln\frac{V_2}{V_1}    (38)

Recall that even though no heat flowed in the free expansion, ΔS > 0, because to find the actual change in entropy between the initial and final equilibrium states we had to find a reversible path connecting them. Along this path at fixed T (since U was unchanged in free expansion) the system absorbed heat from a reservoir as it expanded very slowly to go from V_1 to V_2.
It was however not clear from all this what S stood for or why it always went up for processes we considered irreversible (like free expansion). Furthermore, only formulas for entropy differences were given; entropy did not seem to be absolutely defined.
The great insight from statistical mechanics is a formula for the entropy of an isolated
system with some known macroscopic parameters:

S = k\ln\Omega    (39)

where Ω is the number of available microstates compatible with the macroscopic parameters. For example, for a gas defined by some (U, V, N), Ω(U, V, N) is the number of configurations of the N atoms with the given total energy and volume. Counting these configurations is a little tricky, so consider the magnetic system first.
If all we know is that there are N spins,

S = k\ln\Omega(N) = k\ln 2^N = N k\ln 2    (40)

In the Stirling approximation (valid for large N), the number of microstates associated with the most probable state, the state with the greatest number of microstates, i.e., with M = 0, is

\Omega(N, N/2) = \frac{N!}{\left(\frac{N}{2}!\right)^2}    (41)

In the Stirling approximation, it has a log of

\ln\frac{N!}{\left(\frac{N}{2}!\right)^2} = N\ln 2    (42)

Thus in the limit N → ∞, the S associated with all possible states and the S associated with the most probable state are indistinguishable. Let me elaborate. The function Ω has a maximum at n = N/2. It is very sharp, but let me assume it has a width equal to N itself. The total area is then no more than 2^N N, and on taking logs,

\ln\left(2^N N\right) = N\ln 2 + \ln N = N\left(\ln 2 + \frac{\ln N}{N}\right) \simeq N\ln 2 \quad\mbox{as } N\to\infty.    (43)

It is generally true that in the thermodynamic limit N → ∞ we can compute S either from all the allowed states or from the ones associated with the highest probability. For the same reason, there is little difference between the exact mean of any macroscopic variable and its value in the most probable state.
In short, the most probable state is so overwhelmingly probable that we can saturate all averages with it.

Spins in a field

Let us now turn on a magnetic field B so that U = -μsB. Now the system is stuck at one value of s by energy conservation. It will freely roam over the

\Omega\left(N, \frac{N-s}{2}\right) = \frac{N!}{\left(\frac{N-s}{2}\right)!\left(\frac{N+s}{2}\right)!}    (44)

microstates with that s. We prefer the approximation, valid for large N and s ≪ N, in which s is treated as continuous:

\Omega\left(N, \frac{N-s}{2}\right) = 2^N e^{-s^2/2N}    (45)

The corresponding entropy is

S(U(s)) = k\ln\Omega = N k\ln 2 - \frac{k s^2}{2N}    (46)

Imagine the system starts in a state with all spins up. (This can be arranged by placing it in a very strong field.) It has an entropy S = 0. Suppose we suddenly turn off the field. We are now allowing it to explore all values of s, since they all have the same energy. It will very quickly evolve into the states with s ≈ 0, for these have the overwhelming odds and multiplicities.
This is the reason we say that the entropy will increase in a spontaneous process when a constraint is removed. The constraint here was the magnetic field, which kept the system at a fixed s (s = N in our example, but it could have been any other value) since the corresponding energy U(s) was conserved. Turning off the field lifts the constraint and the system will now search over all available states to find a state of maximum multiplicity. The process is irreversible, since the system will not, of its own, go back to all spins up within any reasonable time. (One can estimate that it will take many times the age of the universe for such a fluctuation to occur if N is macroscopic.)
We have seen the analogous situation in an ideal gas which is initially trapped in a subvolume V_1 of a container of full volume V_2 by a partition. It has an entropy S(U, V_1) related to how many microstates are allowed with this energy and volume. If the partition is now removed, the gas can roam all over V_2. There are clearly more things it can do now, and its entropy will go up. In fact we can easily see by how much it goes up.

Entropy for an ideal gas

Since the spins had a discrete and countable number of states, there was no problem calculating the entropy for any macrostate. Consider however a gas molecule.



Figure 1: The big circle depicts all available microstates. The region with the brick design dominates and represents the most probable macrostate (and its neighbors, in terms of properties). The hatched and grey regions represent all other macrostates. A system starting out in an improbable state (where it was held by constraints) will begin to wander when the constraints are removed and almost certainly reach the most probable configuration. Occasionally it may wander back to different ones, since all states are in principle possible, but macrostates differing in any significant way from the most probable one are overwhelmingly improbable. In practice the most probable configurations dominate the allowed region far more than is indicated in the figure.
Its microstate is given by the pair (q, p), its coordinate and momentum. (We are considering one dimension for simplicity.) Every point in the q-p plane, called the phase space, represents a state of the system. Since the number of points in any tiny region is infinite, so will be the entropy, no matter what is going on. So what one does is to divide the phase space into cells of size dp dq = h, where h is some small but arbitrary number. The number of states associated with any macrostate will then be, for a gas of N molecules in 3d of energy U and volume V,
\Omega(U, V, N) = \int\prod_{i=1}^{N}\frac{d^3p_i\, d^3r_i}{h^3}\;\chi(U, V)    (47)

where the function χ(U, V) is nonzero only over regions compatible with the macroscopic parameters. For example it is zero if any coordinate lies outside the box of volume V or if the total kinetic energy does not equal the prescribed U. Since h is arbitrary in classical mechanics, the entropy S will have an h-dependent additive term and only changes in S will be independent of h. Later, when I give you a quick summary of quantum mechanics, you will see that the problem goes away because microstates are discrete and countable, as in the spin problem.
In general the integral above cannot be done since U depends on the momenta and
positions of the molecules, the latter entering via the potential energy. For the ideal gas


there are no interactions and U is simply the sum of kinetic energies

U = \sum_{i=1}^{N}\frac{p_{ix}^2 + p_{iy}^2 + p_{iz}^2}{2m}    (48)

The integral over all coordinates gives a factor V^N. The momentum integrals are limited to those that obey Eq. 48. If we imagine a momentum space of dimension 3N, the momentum state of the entire gas is given by a point with coordinates p_{1x}, p_{1y}, ..., p_{Nz} and the energy is just

U = \frac{p^2}{2m}    (49)

where p is the length of a vector in 3N dimensions with components p_{1x}, p_{1y}, ..., p_{Nz}. Thus the allowed points lie on a sphere of radius

p = \sqrt{2mU}    (50)
The area of a sphere of radius r in D dimensions is

A = \frac{2\pi^{D/2} r^{D-1}}{\Gamma\left[\frac{D}{2}\right]}    (51)

where

\Gamma(z) = \int_0^{\infty} x^{z-1} e^{-x}\, dx    (52)

You can show by integrating by parts that

\Gamma(z) = (z-1)\Gamma(z-1)    (53)

so that if z = n, an integer, then

\Gamma(n) = (n-1)!    (54)

If n is a half-integer, we will need to know Γ(1/2): for example Γ(5/2) = (3/2)(1/2)Γ(1/2). We evaluate it from its definition, using the substitution x = y^2:

\Gamma\left(\frac{1}{2}\right) = \int_0^{\infty} x^{-1/2} e^{-x}\, dx = 2\int_0^{\infty} e^{-y^2}\, dy = \sqrt{\pi}    (55)
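As an added aside, Eq. 51 is easy to test against the familiar low-dimensional cases using the Gamma function from the Python standard library:

```python
from math import pi, gamma

def sphere_area(D, r=1.0):
    # surface area of a sphere of radius r in D dimensions, Eq. 51
    return 2 * pi ** (D / 2) * r ** (D - 1) / gamma(D / 2)

print(sphere_area(2), 2 * pi)   # circumference of a circle
print(sphere_area(3), 4 * pi)   # area of an ordinary sphere
print(sphere_area(4))           # 2*pi^2 for D = 4
```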

Returning to our problem,

\Omega(U, V, N) = A\, V^N\,\frac{U^{3N/2}}{\left(\frac{3N}{2}\right)!}    (56)

where A is a constant independent of (U, V, N). (We have also neglected the difference between N and N - 1.) Thus the entropy of a classical gas is

S(U, V, N) = N k\ln V + \frac{3Nk}{2}\ln U - \frac{3Nk}{2}\ln\frac{3N}{2} + c    (57)
 = N k\left(\ln V + \frac{3}{2}\ln\frac{U}{N} + c'\right)    (58)


where c and c' are constants independent of (U, V, N) and the Stirling approximation has been invoked.
It follows readily from Eq. 58 that when a gas expands freely against a vacuum (so that U is constant) from V_1 to V_2, the change in S is

S_2 - S_1 = N k\ln\frac{V_2}{V_1}    (59)

in accordance with the result derived in thermodynamics. However, now we understand the increase in S as the increase of available phase space for each molecule. I do not know whom I admire more, those who got the expression for the change in S before we knew about molecules or those who found the microscopic basis behind it.
Eqn. 58 has one defect. Entropy, being an extensive quantity, must double if I double (U, V, N), i.e., make two copies of the gas and glue them together. This will happen if S is N times a function of V/N and U/N. We see that this is not true for the V term. This leads to the Gibbs paradox. Imagine two adjacent copies of a gas, each with (U, V, N), separated by a partition. Now open the partition. The above formula will show an increase of S by 2Nk ln 2. This makes no sense. While it is true that the molecules which used to be confined to one of the boxes can now travel over both, the gas looks the same, with a roughly 50-50 split. Had the gases been different, the mixing would produce visible effects and an increase in S would be reasonable. Indeed in the case of identical gases, one can insert the partition back and go back to what used to be, i.e., the process is reversible. Had the gases been different, we could not unmix them, in which case an increase in S makes sense.
Gibbs, who brought up the paradox, also provided the way to fix this problem: divide Ω in Eqn. 56 by N!, so that the S which meets the extensivity requirement is

S(U, V, N) = N k\left(\ln\frac{V}{N} + \frac{3}{2}\ln\frac{U}{N} + c\right).    (60)
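The role of the 1/N! is easy to see numerically. The following added sketch (constants absorbed into an arbitrary additive constant c) compares the entropy of a doubled system 2×(U, V, N) with twice the original entropy, using Eq. 58 (no N!) and Eq. 60 (with N!):

```python
from math import log

def S_no_gibbs(U, V, N, k=1.0, c=0.0):
    # Eq. 58: S = Nk( ln V + (3/2) ln(U/N) + c )
    return N * k * (log(V) + 1.5 * log(U / N) + c)

def S_gibbs(U, V, N, k=1.0, c=0.0):
    # Eq. 60: S = Nk( ln(V/N) + (3/2) ln(U/N) + c )
    return N * k * (log(V / N) + 1.5 * log(U / N) + c)

U, V, N = 1.0, 1.0, 1000
print(S_no_gibbs(2*U, 2*V, 2*N) - 2 * S_no_gibbs(U, V, N))  # = 2*N*ln 2, not extensive
print(S_gibbs(2*U, 2*V, 2*N) - 2 * S_gibbs(U, V, N))        # = 0, extensive
```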

Quantum mechanics gives a partial explanation for dividing by N !. In classical mechanics


if we exchange two identical particles, say Moe and Joe, we get a new configuration: Moe
being here and Joe being there is different from Joe being here and Moe being there, even if
they are identical. Suppose someone exchanges Moe and Joe when I am not looking. I will
not know, but the change is real, since someone else could have been watching and observed
the swap. In other words, even though the particles are identical, exchanging them produces
a new configuration since the particle exchange can be followed. On the other hand, in
quantum theory, particles do not have definite trajectories and no specific location except
when they are observed. In this case the exchange of identical particles has no operational
meaning. Consequently the N! permutations of particles, which correspond to different microstates in classical mechanics, collapse into one configuration in quantum theory.
While quantum mechanics explains the division by N !, it goes beyond that. Identical
particles are classified into bosons and fermions. Fermions have additional restrictions on
the allowed configurations: no quantum state can be occupied by more than one fermion.
Thus the classical results will not equal the quantum ones even after dividing by N ! except


in the case of a very dilute gas where there is only a very slim chance of two particles trying
to occupy the same state. We will return to this point later.
For now, let us assume Eq. 60. Knowledge of S(U, V, N), called the FUNDAMENTAL RELATION, constitutes the most complete knowledge of the system. Everything follows from it. Now you might say, "What about PV = NkT? Where is T and where is P?" We will now see how all of these emerge from the fundamental relation S(U, V, N).

Enter T, P and μ

Imagine two systems with their own (U_{10}, V_{10}, N_{10}) and (U_{20}, V_{20}, N_{20}). They are separated by a partition or barrier that is immobile, cannot conduct heat or allow particles to cross over. The systems are each in thermal equilibrium and have entropies S_1(U_{10}, V_{10}, N_{10}) and S_2(U_{20}, V_{20}, N_{20}). Note that at this stage the individual energies, volumes and particle numbers are separately conserved, as is the total:

U = U_{10} + U_{20}    (61)
V = V_{10} + V_{20}    (62)
N = N_{10} + N_{20}.    (63)

Suppose we now allow the partition to conduct heat. Energy can now be exchanged,
keeping the total constant. What decides how the total energy will be shared? The law
of increasing entropy provides the answer: the systems will exchange energy till they reach
a point where the total entropy is a maximum. Let us examine this more slowly, bearing
in mind that the combined system is still isolated and all its microstates are still equally
probable.
Initially the number of states allowed to the combined system was

\Omega_0 = \Omega_1(U_{10}, V_{10}, N_{10})\,\Omega_2(U_{20}, V_{20}, N_{20})    (64)

Taking logs we get

S_0 = S_{10} + S_{20}    (65)

This additivity is what we expect of an extensive quantity.
Once energy transfer is allowed, only the total energy (and not the individual ones) is conserved, and the number of microstates becomes a sum over all possible ways of sharing energy keeping the total constant:

\Omega = \sum_{U_1}\Omega_1(U_1, V_{10}, N_{10})\,\Omega_2(U_2 = U - U_1, V_{20}, N_{20})    (66)

(The sum could be an integral, without affecting our argument.) What will the system do? Recall the magnetic case in zero field where all spin states are allowed. We could write the allowed states as a sum over sectors of definite total spin s as follows:

\Omega = \sum_{s=-N}^{+N}\frac{N!}{\left(\frac{N-s}{2}\right)!\left(\frac{N+s}{2}\right)!}    (67)


We saw that the sum was dominated by the most probable configuration s = 0 and its immediate neighbors. Indeed for computing S we could just take the log of this term. (We saw that if we multiplied the biggest height by some width even as large as N, to get the total area, the effect on S was negligible in the limit N → ∞.)
The same applies to Eq. 66. As we increase U_1, Ω_1(U_1) will rise rapidly and Ω_2(U_2 = U - U_1) will fall very rapidly, and the product will have a very sharp maximum which will dominate the sum and indeed saturate it.
To find the macrostate corresponding to the most probable configuration in the sum, we take the U_1 derivative of the log of the multiplicity at each U_1 and equate it to zero:

\left.\frac{\partial\ln\Omega_1(U_1, V_{10}, N_{10})}{\partial U_1}\right|_{V_1,N_1} dU_1 + \left.\frac{\partial\ln\Omega_2(U_2 = U - U_1, V_{20}, N_{20})}{\partial U_2}\right|_{V_2,N_2}\frac{dU_2}{dU_1}\, dU_1 = 0    (68)

Noting that dU_2/dU_1 = -1 since U_2 = U - U_1, we find the most probable macrostate obeys

\left.\frac{\partial\ln\Omega_1(U_1, V_{10}, N_{10})}{\partial U_1}\right|_{V_1,N_1} = \left.\frac{\partial\ln\Omega_2(U_2 = U - U_1, V_{20}, N_{20})}{\partial U_2}\right|_{V_2,N_2}    (69)

In terms of the entropies we could write this as

\frac{\partial S_1(U_1)}{\partial U_1} = \frac{\partial S_2(U_2)}{\partial U_2}    (70)

where arguments of S that do not vary, like volume and particle number, are suppressed.
The equality in Eqn. 70 must refer to the equality of the temperatures of the two systems when heat transfer is allowed. But the derivative itself need not be the temperature; it could be any function of it. To find out which, let us turn to the ideal gas, where we had

S(U, V, N) = N k\left(\ln\frac{V}{N} + \frac{3}{2}\ln\frac{U}{N} + c\right).    (71)

We find

\left.\frac{\partial S}{\partial U}\right|_{V,N} = \frac{3Nk}{2U}    (72)

Comparing this to U = \frac{3}{2}NkT we see that

\left.\frac{\partial S}{\partial U}\right|_{V,N} = \frac{1}{T}    (73)

which is a very profound relation, relating the absolute temperature to the derivative of the log of the number of states with respect to energy. We can equivalently write

\left.\frac{\partial U}{\partial S}\right|_{V,N} = T    (74)

which comes from the First Law of thermodynamics:

dU = T\,dS - P\,dV \qquad T = \left.\frac{\partial U}{\partial S}\right|_{V,N} \qquad P = -\left.\frac{\partial U}{\partial V}\right|_{S,N}    (75)


We can also rewrite the above relation with S viewed as a function of U and V:

dS = \frac{1}{T}\,dU + \frac{P}{T}\,dV \qquad\Rightarrow\qquad \left.\frac{\partial S}{\partial U}\right|_{V,N} = \frac{1}{T}, \qquad \left.\frac{\partial S}{\partial V}\right|_{U,N} = \frac{P}{T}    (76)

which agrees with Eq. 73.


Three subtle points.
First, only if we saturate the sum in Eq. 66 by the dominant term can we write the total entropy as a sum S = S_1 + S_2. We have seen that even if we multiply the dominant height by some width of order U, S will change only by a term of order ln U, while its overall scale is proportional to N and U.
Secondly, let us consider the process of letting the separation become heat conducting. This is an irreversible process, unless by chance both sides were at the same temperature. Once hot and cold have mixed spontaneously we cannot undo it without external intervention. The system, originally restricted to a state with some number of microstates, now has the option of exploring many more and will typically end up in the most probable sector. Figure 1 depicts the situation.
Finally, imagine what happens when the barrier becomes conducting. There will be a
period of nonequilibrium. We really cannot say what the entropy of each half is even if it
has a definite energy since this energy may not even be uniformly distributed within the
volume. Yet people often say the entropy is rising in real time when the barrier is allowed
to conduct. This makes sense if the barrier allows heat to flow very slowly. Suppose we let
some energy flow and then stop the flow by inserting an insulating wall. The two systems
get to equilibrate and have well defined entropies. The odds are very high that this state
will have a bigger S than before since the evolution to a macrostate with more microstates
is overwhelmingly more likely. Suppose we allow some more heat to flow and repeat the
process. Then at every stage we have a well defined entropy and soon it will reach the
maximum and after that waiting will change nothing. (This is analogous to letting the free
expansion take place in stages with a piston that provides the right counter force and can
be locked in place after every small expansion.)
So we finally understand, in microscopic terms, why heat flows from a hot body to a cold body. The change in S when heat flows is

dS = \left(\frac{1}{T_1} - \frac{1}{T_2}\right) dU_1    (77)

If dS > 0, it means that if T1 > T2 , then dU1 < 0, i.e., the hot body loses energy. In other
words the hot body may lose some states when it loses energy but the cold body is able to
more than make up that loss by the increase in its states. In other words, heat flows the way
it does to increase the number of states for the combined system.

6.1 The pressure

Imagine now that the barrier or partition between the systems is heat conducting and movable. By similar arguments, there will be volume changes until we have a situation with

\left.\frac{\partial S_1(U_1, V_1, N_1)}{\partial V_1}\right|_{U_1,N_1} = \left.\frac{\partial S_2(U_2, V_2, N_2)}{\partial V_2}\right|_{U_2,N_2}    (78)


This must correspond to the situation where the pressures have equalized. To verify this, consider the ideal gas again, for which

S(U, V, N) = N k\left(\ln\frac{V}{N} + \frac{3}{2}\ln\frac{U}{N} + c\right)    (79)

and take the V-derivative to find

\left.\frac{\partial S}{\partial V}\right|_{U,N} = \frac{Nk}{V}    (80)

Upon comparing to PV = NkT we find that

\left.\frac{\partial S}{\partial V}\right|_{U,N} = \frac{P}{T}    (81)

again in agreement with Eqn. 76 of thermodynamics.


This means Eqn. 78 refers to the equality of P/T, and hence that of just P, since we already know the T's are equal. (Had the separating barrier been movable but nonconducting, we could not deduce from the equality of P/T the equality of P and T. It looks like we cannot predict what will happen in this case. This is correct. To see this, consider a case of two gases. If the barrier is released and free to move, it will oscillate indefinitely in the ideal frictionless case, unless heat transfer across it allows it to find a stable equilibrium position. If friction is involved in damping the oscillations, the answer will depend on the details of the friction.)
Finally consider a case where the systems can exchange heat, volume and particles. By similar arguments we will get the following condition for equilibrium:

\left.\frac{\partial S_1(U_1, V_1, N_1)}{\partial N_1}\right|_{U_1,V_1} = \left.\frac{\partial S_2(U_2, V_2, N_2)}{\partial N_2}\right|_{U_2,V_2}    (82)

The derivative ∂S/∂N is not a very familiar thing. To get a feel for it, let us go back to thermodynamics and consider U(S, V, N). For small changes of its arguments we have

dU = T\,dS - P\,dV + \mu\,dN    (83)

which defines

\mu = \left.\frac{\partial U}{\partial N}\right|_{S,V}    (84)

called the chemical potential; it is the energy increase due to adding one particle at fixed volume and entropy. We will get a feeling for it later. For now, note that if we rewrite Eq. 83 in terms of dS,

dS = \frac{1}{T}\,dU + \frac{P}{T}\,dV - \frac{\mu}{T}\,dN    (85)
At equilibrium the chemical potentials for both sides must be equal if particles can flow
freely.
For an ideal gas we find

\mu = kT\ln\left[\frac{P}{kT\,(cT)^{3/2}}\right]    (86)

where c is a constant which involves the particle mass and the cell size h in phase space (to which we have not paid much attention).



Figure 2: Our system of interest is in contact with the huge reservoir. The two of them form
the isolated system to which the law of equal probabilities applies. The figure shows three
levels of the system: the ground state 0 and two more, labeled i and j. The figure and the
derivation refer to a discrete set of energy states. If energy varies continuously, sums must
be replaced by integrals.

The canonical ensemble

We are now going to switch from an isolated system to one in thermal equilibrium with
a reservoir of fixed temperature T , a common situation. The system need not be small,
the reservoir just needs to be arbitrarily large, so that its temperature is not affected by
what the system is doing. For example the system could be a cylinder of gas immersed in a
huge swimming pool at fixed T . Since only the combined energy of system and reservoir is
constant, the system can have a range of energies, unlike the isolated one.
The question we ask is this: what are the odds that the system will be in a state i whose energy is ε_i?
We do not need a new postulate here. We will apply the original one of equal probabilities
for every microstate of an isolated system, choosing for our isolated system the combination
of our system of interest and the reservoir, as shown in Figure 2. Let U_0 be the total energy of the system and reservoir, and let the lowest energy state of the system be 0.
Let Ω(U_R) be the number of microstates the reservoir can be in if its energy is U_R. Consider two cases: one where the system is in a particular microstate i of energy ε_i and the reservoir is in any one of Ω(U_0 - ε_i) microstates, and the other where the system is in a particular microstate j with energy ε_j and the reservoir is in any one of Ω(U_0 - ε_j) microstates. What can we say about the probabilities P(i) and P(j) for these two outcomes?
Since the system is in a definite microstate in either case (i or j), the number of states the entire system can have is just the corresponding Ω(U_R). From the microcanonical postulate, the ratio of probabilities for the two cases is:

\frac{P(i)}{P(j)} = \frac{\Omega(U_0 - \epsilon_i)\cdot 1}{\Omega(U_0 - \epsilon_j)\cdot 1}    (87)

where the 1's denote the number of microstates open to the system under the conditions specified.


Since ε_i ≪ U_R, we Taylor expand ln Ω:

\ln\Omega(U_0 - \epsilon_i) = \ln\Omega(U_0) - \beta\epsilon_i + \ldots \quad\mbox{where}    (88)

\beta = \left.\frac{\partial\ln\Omega(U_R)}{\partial U_R}\right|_{U_0} = \frac{1}{kT}    (89)

Bear in mind that we will often refer to 1/kT as β.
We drop higher derivatives because they correspond to the rate of change of β = 1/kT of the reservoir as the system moves up and down in energy, and the temperature of an ideal reservoir, by definition, is a constant unaffected by the system. It is like saying that my jumping into or out of the Atlantic ocean will not alter its temperature.
It follows upon exponentiating both sides of Eqn. 88 that

\frac{P(i)}{P(j)} = \frac{\Omega(U_0)\, e^{-\beta\epsilon_i}}{\Omega(U_0)\, e^{-\beta\epsilon_j}} = \frac{e^{-\beta\epsilon_i}}{e^{-\beta\epsilon_j}}    (90)

that is to say, the relative probability of the system being in state i of energy ε_i is e^{-βε_i}. This is called the canonical distribution, and e^{-βε_i} is called the Boltzmann weight of state i.
Now P(i), the absolute probability that the system will be in a state i that has energy ε_i, is

P(i) = \frac{e^{-\beta\epsilon_i}}{\sum_i e^{-\beta\epsilon_i}} = \frac{e^{-\beta\epsilon_i}}{Z}    (91)

where we have defined the partition function

Z = \sum_i e^{-\beta\epsilon_i}.    (92)

Once we have P(i), we can get the average of any variable v which takes a value v(i) in state i:

\bar{v} = \frac{\sum_i v(i)\, e^{-\beta\epsilon_i}}{Z}    (93)

Consider the interesting case of the average energy, which we will denote by U:

U = \frac{\sum_i \epsilon_i e^{-\beta\epsilon_i}}{Z} = -\frac{d\ln Z}{d\beta}    (94)

so that if Z is known in closed form, we just need to take its logarithmic derivative with respect to β.
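As an illustration (an added sketch, not part of the notes), take a two-level system with arbitrary energies: Z, the probabilities and U = -d ln Z/dβ can be checked against the weighted average directly.

```python
import numpy as np

eps = np.array([0.0, 1.0])      # two energy levels (arbitrary units)
kT = 0.7
beta = 1.0 / kT

weights = np.exp(-beta * eps)   # Boltzmann weights
Z = weights.sum()               # partition function, Eq. 92
P = weights / Z                 # probabilities, Eq. 91

U_direct = (eps * P).sum()      # weighted average of the energy

# U = -d ln Z / d beta, checked with a small finite difference
db = 1e-6
Z_plus = np.exp(-(beta + db) * eps).sum()
U_deriv = -(np.log(Z_plus) - np.log(Z)) / db

print(P, U_direct, U_deriv)     # the two estimates of U agree
```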
Consider a gas of atoms in a box. We will treat it quantum mechanically. All you need to know is that in a box of sides (L_x, L_y, L_z), the only allowed energies for each atom are

\epsilon = \frac{\hbar^2\pi^2 n_x^2}{2mL_x^2} + \frac{\hbar^2\pi^2 n_y^2}{2mL_y^2} + \frac{\hbar^2\pi^2 n_z^2}{2mL_z^2}    (95)

where n_x etc. are positive integers. The state i of the gas tells us how many atoms are in each possible quantum state of the type given above, and P(i) = e^{-βε_i}/Z is the corresponding probability for i to happen.


Suppose we squeeze or expand the box, say by varying L_x. The levels will move and the atoms will move with them, provided the variation is slow enough. In this case the change in energy of the gas is (∂ε_i/∂V) dV if it is in state i, and the weighted average of the change is

\delta U = \sum_i\frac{\partial\epsilon_i}{\partial V}\, dV\,\frac{e^{-\beta\epsilon_i}}{Z} = -\frac{1}{\beta}\frac{\partial\ln Z}{\partial V}\, dV \equiv -P\, dV    (96)

where we have defined the pressure by δU = -P dV, so that

\frac{P}{kT} = \frac{\partial\ln Z}{\partial V}    (97)

Let us apply these ideas first to a gas of atoms treated by classical means. Let us focus
on just one molecule and imagine the rest of the gas as the reservoir it is in contact with.
(In classical mechanics we can keep track of the molecule even if it is constantly bumping
into other identical ones.) The states of this molecule are continuous and given by (p, r).
We cannot assign a finite probability to each such point in phase space since there are
an infinite number of them in any tiny region. Instead we assign a probability density which
is defined as follows. Suppose x is a continuous variable. We define the probability density
P (x) as follows
P (x)dx is the probability x will lie within dx of a point x

(98)

If P(x) is normalized, we will demand

\int_{-\infty}^{\infty} P(x)\, dx = 1,    (99)

assuming x can take any value. Even if P(x) is not normalized, it can still be used to get relative probabilities. For example

\frac{\int_1^2 P(x)\, dx}{\int_{11}^{12} P(x)\, dx}    (100)

gives the ratio of the odds of finding it between 1 ≤ x ≤ 2 and 11 ≤ x ≤ 12.
Let us turn to our gas atom and ask for the non-normalized probability density that it is in a tiny cell of volume d^3p d^3r centered at position r and momentum p. The energy of this state is ε(p, r) = p^2/2m. Note that the energy, and hence P, are independent of r as long as it lies inside the box of volume V. Let us assume this is the case and focus on the p dependence:

P(\mathbf{p})\, d^3p = \mathcal{N}\, e^{-\beta p^2/2m}\, d^3p    (101)

where the constant \mathcal{N} may be chosen to normalize P if we want. Let us do that. We write

\mathcal{N}\int e^{-\beta p^2/2m}\, d^3p = \mathcal{N}\int e^{-\beta p^2/2m}\, p^2\, dp\,\sin\theta\, d\theta\, d\phi    (102)
 = 4\pi\mathcal{N}\,\frac{\sqrt{\pi}}{4}\left(\frac{2m}{\beta}\right)^{3/2} = \mathcal{N}\left(\frac{2\pi m}{\beta}\right)^{3/2}.    (103)

For this to equal 1, we need \mathcal{N} = \left(\frac{\beta}{2\pi m}\right)^{3/2}. So

P(\mathbf{p})\, d^3p = \left(\frac{\beta}{2\pi m}\right)^{3/2} e^{-\beta p^2/2m}\, d^3p    (104)

is the normalized probability density that the atom has a momentum lying in the tiny cell of volume d^3p. Suppose we want the probability density P(p) that the momentum has a magnitude p but any direction. We should keep all points within a spherical shell of radius p and thickness dp:

P(p)\, dp = \sqrt{\frac{2}{\pi}}\left(\frac{\beta}{m}\right)^{3/2} p^2\, e^{-\beta p^2/2m}\, dp    (105)

It is more common to rewrite this result in terms of the speed v rather than p, and obtain what is called the Maxwell-Boltzmann velocity distribution:

P(v)\, dv = \sqrt{\frac{2}{\pi}}\,(\beta m)^{3/2}\, v^2\, e^{-\beta m v^2/2}\, dv    (106)

The main feature to note is that the function initially rises due to the v^2 factor (the area of a sphere of radius v, which contains points of a given speed v) and is eventually brought down by the exponential Boltzmann factor, which says that each of these points becomes less likely as the kinetic energy grows. The function has a maximum at v_m = √(2kT/m).
When we first studied the ideal gas we made the simplifying assumption that all atoms had a common velocity v. We see that in reality there is a probability distribution (which has been verified experimentally). We also said K = \frac{3}{2}kT was the kinetic energy of the atoms. We see this corresponds to the average K coming from Eqn. 105:

\bar{K} = \overline{\frac{p^2}{2m}} = \sqrt{\frac{2}{\pi}}\left(\frac{\beta}{m}\right)^{3/2}\int_0^{\infty}\frac{p^2}{2m}\, p^2\, e^{-\beta p^2/2m}\, dp = \frac{3}{2\beta} = \frac{3kT}{2}    (107)
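A quick numerical cross-check of Eqs. 105-107 (an added sketch in arbitrary units with m = kT = 1): the speed distribution integrates to 1 and the mean kinetic energy comes out to (3/2)kT.

```python
import numpy as np
from scipy.integrate import quad

m, kT = 1.0, 1.0
beta = 1.0 / kT

def P_speed(v):
    # Maxwell-Boltzmann speed distribution, Eq. 106
    return np.sqrt(2.0 / np.pi) * (beta * m) ** 1.5 * v**2 * np.exp(-beta * m * v**2 / 2)

norm, _ = quad(P_speed, 0, np.inf)                                # should be 1
mean_K, _ = quad(lambda v: 0.5 * m * v**2 * P_speed(v), 0, np.inf)
print(norm, mean_K)   # 1.0 and 1.5 = (3/2) kT
```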

The average energy can be deduced with a lot less effort as follows. Since U = -\frac{\partial\ln Z}{\partial\beta}, we need just the β dependence of Z, which may be extracted as follows:

Z = A\int_0^{\infty} p^2 e^{-\beta p^2/2m}\, dp = A'\,\beta^{-3/2}\int_0^{\infty} z^2 e^{-z^2}\, dz    (108)

where z^2 = \beta p^2/2m, and A and A' do not depend on β. Thus

\ln Z = -\frac{3}{2}\ln\beta + \mbox{const}    (109)

Eq. 107 follows since the constant does not depend on β.
You can amuse yourself by computing Z in cartesian coordinates. It will be the cube of a gaussian integral proportional to β^{-1/2}.


The harmonic oscillator: classical and quantum

We have seen that the harmonic oscillator with Hamiltonian

H = \frac{p^2}{2m} + \frac{1}{2}m\omega^2 x^2    (110)

describes any system near a point of stable equilibrium, because the potential energy can be expanded as

U(x) = U(x_0) + \left.\frac{dU}{dx}\right|_{x_0}(x - x_0) + \frac{1}{2}\left.\frac{d^2U}{dx^2}\right|_{x_0}(x - x_0)^2 + \ldots    (111)

If x = x_0 is a point of equilibrium, the force -dU/dx vanishes there. Shifting the origin to x_0, redefining the zero of potential energy at U(x_0), and dropping higher derivatives because x - x_0 is small, we find

U(x) = \frac{1}{2}\left.\frac{d^2U}{dx^2}\right|_{x_0} x^2 \equiv \frac{1}{2}m\omega^2 x^2    (112)

where we have used the fact that ω^2 = k/m with k = d^2U/dx^2 evaluated at x_0.
When N degrees of freedom are oscillating near equilibrium (recall the case of two coupled masses) we have seen that by changing variables we can make H a sum over N non-interacting oscillators with frequencies ω_i, i = 1, ..., N.
Thus the thermodynamics of an oscillator covers a huge variety of problems. Let us first do one and then turn to the many that describe the normal modes of some system.

8.1 One oscillator-classical

The partition function is, by definition,

Z(\beta) = \int\!\!\int\frac{dx\, dp}{h}\,\exp\left[-\beta\left(\frac{p^2}{2m} + \frac{m\omega^2}{2}x^2\right)\right]    (113)

and readily evaluated to be

Z(\beta) = \frac{2\pi}{\beta\omega h} = \frac{2\pi kT}{\omega h}    (114)

The average energy is given by

U = -\frac{\partial\ln Z}{\partial\beta} = \frac{1}{\beta} = kT    (115)

Recall that the mean kinetic energy of a molecule with three degrees of freedom was \frac{3}{2}kT. It is clear that in a problem where H is a sum over d quadratic terms, the average energy is U = \frac{d}{2}kT, since the gaussian integrals factorize, each giving a factor proportional to 1/\sqrt{\beta} times constants like ω, m, h, etc., which drop out when we take the log and then differentiate it.
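A numerical check of Eq. 115 (an added sketch; the constants m, ω, h are set to 1 and are irrelevant, as argued above): evaluate ln Z from Eq. 114 and confirm U = -d ln Z/dβ = kT.

```python
import numpy as np

m = w = h = 1.0   # arbitrary constants; they drop out of U

def lnZ(beta):
    # ln of Eq. 114: Z = 2*pi / (beta * w * h)
    return np.log(2 * np.pi / (beta * w * h))

kT = 2.5
beta = 1.0 / kT
db = 1e-6
U = -(lnZ(beta + db) - lnZ(beta - db)) / (2 * db)
print(U)   # ~2.5 = kT, independent of m, w, h
```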


Consider now a solid in three dimensions. In equilibrium every atom has its place. When disturbed, the atoms vibrate. Assuming first that the atoms vibrate independently of each other at some frequency ω, with

H = \frac{p_x^2}{2m} + \frac{p_y^2}{2m} + \frac{p_z^2}{2m} + \frac{m\omega^2 x^2}{2} + \frac{m\omega^2 y^2}{2} + \frac{m\omega^2 z^2}{2}    (116)

the internal energy of the solid is

U = 3N kT    (117)

and the specific heat of the solid is

C = 3k \mbox{ per molecule, or } 3R \mbox{ per mole.}    (118)

(To be careful we must call it C_V.) Note that the answer, the Law of Dulong and Petit, is independent of the details of the solid (the mass of the atoms m, the spring constant k, etc.). This law is found to work at high temperatures but breaks down as T → 0, when C → 0 for all solids.
The resolution was provided by Einstein using quantum theory. So we will now take a crash course in that subject on a need-to-know basis.

8.2 Oscillator-quantum version

An oscillator in classical mechanics, like a mass-spring system, can have any position and any momentum and hence any energy starting from zero. In quantum theory the energy levels of systems are typically quantized. For a single oscillator of frequency ω, the allowed energies are discrete (quantized) and labeled by an integer n:

\epsilon_n = \left(n + \frac{1}{2}\right)\hbar\omega    (119)

where

\hbar = \frac{h}{2\pi} \simeq 10^{-34}\ \mbox{J s}    (120)

is related to Planck's original constant h. Note that the lowest energy of the quantum oscillator is ħω/2. The classical state of zero energy, in which the oscillator sits at rest at the origin, is forbidden by the Heisenberg uncertainty principle, which disallows a state of exactly defined position and momentum.
The quantum partition function Z_Q is now a sum over n:

Z_Q(\beta) = \sum_{n=0}^{\infty} e^{-\beta(n+\frac{1}{2})\hbar\omega}    (121)
 = e^{-\frac{1}{2}\beta\hbar\omega}\sum_{n=0}^{\infty} e^{-n\beta\hbar\omega}    (122)
 = \frac{e^{-\frac{1}{2}\beta\hbar\omega}}{1 - e^{-\beta\hbar\omega}}    (123)
 = \frac{1}{2\sinh\frac{\beta\hbar\omega}{2}} = \frac{1}{2\sinh\frac{\hbar\omega}{2kT}}    (124)

in deriving which I have used

\sum_{n=0}^{\infty} x^n = \frac{1}{1-x}    (125)

valid for |x| < 1, which is the case here since x = e^{-\beta\hbar\omega}, and

\sinh x = \frac{e^x - e^{-x}}{2}.    (126)

The internal energy is

U = -\frac{\partial\ln Z}{\partial\beta} = \frac{\hbar\omega}{2}\,\frac{\cosh\frac{\hbar\omega}{2kT}}{\sinh\frac{\hbar\omega}{2kT}}    (127)

When T → 0, cosh and sinh both blow up exponentially with a ratio of unity, so that U → ħω/2. This makes sense: as T → 0, the oscillator is stuck in the ground state, the Boltzmann factor killing all excited states.
As T → ∞, if we use sinh x ≈ x and cosh x ≈ 1, we find

U = kT    (128)

as in the classical case. The corresponding specific heat

C = k \mbox{ (per oscillator)}    (129)

agrees with the classical case and with high-temperature data. At low temperatures we find, for 3N degrees of freedom (i.e., a solid in three dimensions),

C = 3N k\left(\frac{\theta_E}{T}\right)^2 e^{-\theta_E/T}    (130)

where θ_E = ħω/k is called the Einstein temperature. The details are left as an exercise. Note that despite the 1/T^2 in front, C actually vanishes very fast as T → 0 because of the e^{-θ_E/T} factor, which vanishes exponentially fast in 1/T. When the oscillator had continuous energies, as in the classical case, its specific heat never vanished, since it could absorb energy at arbitrarily low T. In the quantum case, unless kT ≈ ħω, the oscillator is stuck in its ground state. Thus Einstein managed to make the specific heat vanish by invoking the quantization of energies for the oscillators. However, a careful study shows that C(T → 0) ∼ T^3, which is a slower decay than the exponential. Debye managed to fix this by showing that in reality when 3N degrees of freedom vibrate, they are not equivalent to 3N oscillators at the same frequency, as Einstein had assumed. Recall that even in the case of two masses, there were two normal modes and two normal frequencies. In the two normal modes, the masses oscillated in step or exactly out of step. In a solid the atomic vibrations occur in the form of waves of definite wavelength and frequency. (This is what we call sound in a solid.) Finding all 3N frequencies is a difficult process. However, at low frequencies there is an approximation that Debye used that led to the T^3 fall-off. The key ingredient was that the oscillators had a range of frequencies going all the way down to zero, though the number of oscillators at frequency ω vanished as ω^2, which led to the T^3 law.
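For completeness, here is an added sketch of the specific heat per oscillator that follows by differentiating Eq. 127 with respect to T, namely C/k = x^2 e^x/(e^x - 1)^2 with x = θ_E/T. It shows the Dulong-Petit value at high T and the strong suppression at low T (the temperatures chosen are arbitrary multiples of θ_E):

```python
import numpy as np

def c_einstein(T, theta_E=1.0, k=1.0):
    # specific heat per oscillator in the Einstein model
    x = theta_E / T
    return k * x**2 * np.exp(x) / (np.expm1(x))**2

for T in (10.0, 1.0, 0.1, 0.05):        # in units of theta_E
    print(T, c_einstein(T))
# ~k at high T (Dulong-Petit); falls off as x^2 e^{-x} at low T, cf. Eq. 130
```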


The free energy F

The internal energy U is a function of S and V:

dU = T\,dS - P\,dV    (131)

with T and -P as its two partial derivatives.
Consider the function F = U - ST, called the free energy. We find

dF = dU - S\,dT - T\,dS = T\,dS - P\,dV - S\,dT - T\,dS = -S\,dT - P\,dV    (132)

so that F is now a function of (T, V) with -S and -P as the corresponding partial derivatives:

S = -\left.\frac{\partial F}{\partial T}\right|_V \qquad P = -\left.\frac{\partial F}{\partial V}\right|_T.    (133)

In other words, S and T have exchanged roles as independent variable and corresponding partial derivative when we switch from U to F. Mathematically this is exactly what happens when we go from the Lagrangian L(q, \dot q) with

p = \frac{\partial L}{\partial\dot q}    (134)

to

H = p\dot q - L    (135)

with

\dot q = \frac{\partial H}{\partial p}    (136)

Recall however that we needed to get rid of \dot q in favor of (q, p) in passing from L to H. Likewise here we are supposed to use

T = \left.\frac{\partial U}{\partial S}\right|_V    (137)

which expresses T as a function of S and V, invert it to get S(T, V), and then form the free energy, which is a function of (T, V):

F(T, V) = U(S(T, V), V) - S(T, V)\, T.    (138)

We saw that the internal energy U of thermodynamics became the average energy Ū of statistical mechanics. How does F appear in statistical mechanics? We reason as follows:

U = F + ST = F - T\left.\frac{\partial F}{\partial T}\right|_V = F + \beta\frac{\partial F}{\partial\beta} = \frac{\partial(\beta F)}{\partial\beta}    (139)

But recall that

U = -\frac{\partial\ln Z}{\partial\beta}    (140)


which implies

\beta F = -\ln Z    (141)
Z = e^{-\beta F}    (142)

If you are careful you will note that we did not bother to hold V fixed in deriving U = -\partial\ln Z/\partial\beta starting with

Z = \sum_i e^{-\beta\epsilon_i}    (143)

assuming that the derivative brought down the energy. However, in the quantum problem we saw that ε depends on the volume, and we do not want to touch that dependence, so the correct formula is

U = -\left.\frac{\partial\ln Z}{\partial\beta}\right|_V    (144)

Recall our showing that P/kT = \partial\ln Z/\partial V. If Z = e^{-\beta F}, this implies P/kT = -\beta\,\partial F/\partial V, i.e., P = -\partial F/\partial V, in accordance with Eq. 133.
The free energy is a very useful function, since it depends on (T, V), which we can control easily, unlike U which depends on S, which cannot be so easily controlled. Also, in the free energy we sum over all configurations regardless of their energy, and finally, once we have done the sum, S is obtained by taking the T derivative. Rather than find S for a particular problem, let us start with

F = -\frac{1}{\beta}\ln\left[\sum_i e^{-\beta\epsilon_i}\right]    (145)

and see what S = -\left.\frac{\partial F}{\partial T}\right|_V tells us. I leave it as an exercise to show that

S = -k\sum_i P(i)\ln P(i) \qquad\mbox{where}\quad P(i) = \frac{e^{-\beta\epsilon_i}}{Z}    (146)

is the absolute probability for finding the system in state i. Along the way you will need to use \sum_i P(i) = 1.
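The relation is easy to verify numerically for a small level scheme (an added sketch with arbitrary energies): compute S once from Eq. 146 and once as -∂F/∂T from Eq. 145.

```python
import numpy as np

eps = np.array([0.0, 0.3, 1.0])                               # arbitrary energy levels
k = 1.0

def F(T):
    return -k * T * np.log(np.sum(np.exp(-eps / (k * T))))   # Eq. 145

T = 0.8
P = np.exp(-eps / (k * T)); P /= P.sum()                      # Eq. 91
S_gibbs = -k * np.sum(P * np.log(P))                          # Eq. 146

dT = 1e-6
S_deriv = -(F(T + dT) - F(T - dT)) / (2 * dT)                 # S = -dF/dT
print(S_gibbs, S_deriv)                                       # the two agree
```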
Recall that for an isolated system with definite energy U we defined the entropy as S = k ln Ω(U). In the case of a system in contact with a reservoir at temperature T, all energies are possible, and the S above is the proper definition. Note that if we apply it to the case where the system is isolated and has just one energy, and all Ω(U) accessible states are equally likely with probability P(i) = 1/Ω(U), we find

S = -k\sum_{i=1}^{\Omega(U)}\frac{1}{\Omega(U)}\ln\frac{1}{\Omega(U)} = k\ln\Omega.    (147)

Here is another way to see why Z = e^{-\beta F}, where F = U - ST. Consider a very large system. Suppose we group all states of energy E (and hence the same Boltzmann weight) and write Z as a sum over energies rather than states; we obtain

Z = \sum_E e^{-\beta E}\,\Omega(E)    (148)


where Ω(E) is the number of states of energy E. For a system with N degrees of freedom, Ω(E) grows very fast with N (as E^{3N/2} for an ideal gas) while e^{-βE} falls exponentially. The product then has a very sharp maximum at some E^* and we may write

Z = A\, e^{-\beta\left(E^* - kT\ln\Omega(E^*)\right)}    (149)

where A is some prefactor which measures the width of the peak and is typically some power of N. If we take the logarithm of both sides we obtain (upon dropping ln A compared to the ln of the exponential)

-\ln Z = \beta F = \beta\left(E^* - kT\ln\Omega(E^*)\right) \equiv \beta\left(E^* - S(E^*)\, T\right)    (150)

where S(E^*) is the entropy

S(E^*) = k\ln\Omega(E^*).    (151)

In the limit N → ∞, the maximum at E^* is so sharp that the system has essentially that one energy, which we may identify also with the mean U, and write

F = U - ST.    (152)

The free energy thus captures the struggle between energy and entropy. At low temperatures a few states at the lowest E dominate Z because the Boltzmann factor favors them. At higher temperatures, their large numbers (entropy) allow states at high energy to overcome the factor e^{-βE} which suppresses them individually.

Heat and work

Consider a system with some energy levels ε_i at temperature T. Its mean energy is

U = \sum_i \epsilon_i P(i)    (153)

where P(i) is the probability of being in state i. Consider an infinitesimal change in U. It can be due to a change in ε_i or in P(i):

\delta U = \sum_i P(i)\,\delta\epsilon_i + \sum_i \epsilon_i\,\delta P(i) = \delta W + \delta Q    (154)

We have identified the two terms as above because when we move the walls of the container very slowly the levels move, carrying the particles with them, so that the change in the energy of the level is the change in the system energy. The occupation probabilities are not changed. Thus the first term is the work done. The second, in which the walls and hence the ε_i are fixed but the occupation probabilities change, describes heat input by placing the system against a hotter or colder reservoir.


Quantum gases

Consider a box with just one atom, in contact with a reservoir at temperature T. Its partition function is

Z(1) = \sum_i e^{-\beta\epsilon_i}    (155)

where ε_i are the levels of the single atom, called single-particle levels. For example, in a box of length L in one dimension they correspond to some choice of n = 1, 2, ... in

\epsilon_n = \frac{\hbar^2\pi^2 n^2}{2mL^2}

For pedagogical purposes, let us imagine there are just two energy levels ε_a and ε_b, so that

Z(1) = e^{-\beta\epsilon_a} + e^{-\beta\epsilon_b}.    (156)

If we have 2 atoms, and they do not interact so that the energies for the two-particle system are just sums of single-particle energies, and they are distinguishable, I claim the total partition function is

Z(2) = Z(1)^2 = \left(e^{-\beta\epsilon_a} + e^{-\beta\epsilon_b}\right)\left(e^{-\beta\epsilon_a} + e^{-\beta\epsilon_b}\right)    (157)

You can see this is correct because all four states of the two-particle system, (a, a), (a, b), (b, a), (b, b), with energies 2ε_a, ε_a + ε_b, ε_b + ε_a, 2ε_b, appear once and with the right Boltzmann weight. In general, for N distinguishable particles with the same set of energy levels, we have

Z(N) = Z(1)^N    (158)

Suppose the N atoms are identical, say all hydrogen. The answer above is wrong because of
the following rules from quantum mechanics.
For identical atoms, configurations obtained by exchanging them are viewed as one and the same. Thus (a, b) and (b, a) are the same configuration and should not be counted twice. The quantum state of a many-particle system is fully specified by saying how many particles are in each single-particle state.
Particles in nature are either bosons (pions, photons, etc.) or fermions (electrons, protons, etc.). A single-particle quantum state can be occupied by no more than one fermion (Pauli principle). There is no such restriction for bosons.
Consider the result Z(2) = Z(1)^2 for two identical fermions. It is clearly wrong since it allows the doubly-occupied states (a, a) and (b, b), in violation of the Pauli principle. In addition it counts (a, b) and (b, a) as two states. Had the particles been bosons, the doubly-occupied states would be allowed but the singly-occupied one would be double counted.
There is no simple way to get the right answer for the N-particle problem by manipulating Z(1)^N, except in one limiting case. Suppose the gas is so dilute that there is negligible chance

of two atoms being in the same state. Now all the generated states are allowed for bosons and fermions alike, but overcounted by a factor of N!. Thus in the dilute limit we may use

Z(N) \simeq \frac{Z(1)^N}{N!}    (159)

as we did earlier in addressing the Gibbs paradox.
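The counting issue can be seen by brute force for the two-level, two-particle example above. The added sketch below enumerates the allowed two-particle states for distinguishable particles, bosons and fermions, and compares the resulting partition functions with Z(1)^2 and Z(1)^2/2! (the energies are arbitrary):

```python
import numpy as np
from itertools import product, combinations_with_replacement, combinations

eps = {'a': 0.0, 'b': 1.0}     # two single-particle levels (arbitrary)
beta = 1.0

def boltz(states):
    return sum(np.exp(-beta * sum(eps[s] for s in st)) for st in states)

Z1 = sum(np.exp(-beta * e) for e in eps.values())

Z_dist  = boltz(product(eps, repeat=2))                      # (a,a),(a,b),(b,a),(b,b)
Z_bose  = boltz(combinations_with_replacement(eps, 2))       # (a,a),(a,b),(b,b)
Z_fermi = boltz(combinations(eps, 2))                        # (a,b) only

print(Z_dist, Z1**2)             # equal: Eq. 157
print(Z_bose, Z_fermi, Z1**2/2)  # neither quantum count equals Z(1)^2/2!
```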


We now ask how to obtain Z(N ) for the general non-dilute quantum case.

Grand canonical distribution

The way to get Z(N) is to forget about the particles and focus on the single-particle levels. The quantum state of a many-particle system is fully specified by saying how many particles are in each single-particle state. Consider as an example a system of particles in a box. The levels for one particle are labeled by an integer n:

\epsilon_n = \frac{\hbar^2\pi^2 n^2}{2mL^2}    (160)

These are called single-particle levels. Now suppose there are many particles in the box. To specify what they are doing, that is, the many-particle state, we need to say how many are in each single-particle level. For example the string I = (3, 4, 1, 0, 0, 0, ...), where I serves as a state label, says there are 3 particles in the lowest level (n = 1), 4 particles in the next level (n = 2), one in n = 3 and none in any of the higher levels. (I use I and E_I to label a many-particle state and its energy, and i and ε_i to label a single-particle state.) The energy of the state I is E_I = n_1ε_1 + n_2ε_2 + ....
Our goal is to see how many particles will be in a given level i on average.
For this we need the notion of a grand canonical distribution. Consider a system in contact with a reservoir with which it can exchange heat as well as particles. Let U_0 and N_0 be the total energy and number of particles, and let Ω_R(U_R, N_R) be the number of states available to the reservoir when it has energy U_R and N_R particles. Consider two states of the system: a state I(N) with N particles and energy E_{I(N)}, and a state 0 with no particles and zero energy (which we take to be the lowest possible value).
Reasoning as in the canonical case,

\frac{P(I(N))}{P(0)} = \frac{1\cdot\Omega_R(U_0 - E_{I(N)}, N_0 - N)}{1\cdot\Omega_R(U_0, N_0)}    (161)

Once again we Taylor expand:

\ln\Omega_R(U_0 - E_{I(N)}, N_0 - N) = \ln\Omega_R(U_0, N_0) - \beta E_{I(N)} + \beta\mu N + \ldots    (162)

where we have used the definition of the chemical potential, Eq. 85. The probability for the system to be in an N-particle state I of energy E_{I(N)} is

P(I(N)) = \frac{e^{-\beta(E_{I(N)} - \mu N)}}{\mathcal{Z}}    (163)


where the grand partition function is

\mathcal{Z} = \sum_N\sum_{I(N)} e^{-\beta(E_{I(N)} - \mu N)}.    (164)

The double sum means the following. First pick some N, the number of particles. For each N, sum over all the levels I(N) available to the N-particle system.
In the microcanonical case we deal with a system with fixed U and N. In the canonical case the system has a fixed temperature T because it is in equilibrium with a reservoir. It can have any energy, and the average energy U is controlled by T or β. If the system is big, U will be very sharply peaked. In the grand canonical case, the system has a variable number of particles and a variable energy, the averages of which can be dialed by varying β and μ. If the system is big, deviations from the averages will be negligible. Note for future use that

\bar{N} = \frac{\sum_N\sum_{I(N)} N\, e^{-\beta(E_{I(N)} - \mu N)}}{\sum_N\sum_{I(N)} e^{-\beta(E_{I(N)} - \mu N)}} = \frac{1}{\beta}\frac{\partial\ln\mathcal{Z}}{\partial\mu}    (165)

Let us now consider the quantum gas of fermions. We want to know how many fermions will occupy a single-particle state i of energy ε_i. To this end let us consider the level i as the system, exchanging particles with a reservoir, which will be just the gas itself. Clearly

\mathcal{Z}_F = 1 + e^{-\beta(\epsilon_i - \mu)}    (166)

where the two terms represent N = 0 and N = 1. The mean occupation number of state i is

\bar{N}_F(i) = \frac{0\cdot 1 + 1\cdot e^{-\beta(\epsilon_i - \mu)}}{1 + e^{-\beta(\epsilon_i - \mu)}} = \frac{1}{e^{\beta(\epsilon_i - \mu)} + 1}    (167)

To see how μ comes into play, consider very small T or very large β. If ε_i > μ, the denominator diverges and N̄_F → 0, while if ε_i < μ, the denominator reduces to 1 and N̄_F → 1. In other words, all states with energies less than μ are almost certainly occupied and all those above μ are empty. If the system is to have say 500 particles as T → 0, we must choose μ equal to the 500-th energy level. This energy is called the Fermi energy. For example, for particles in a box in one dimension with

\epsilon_n = \frac{\hbar^2\pi^2 n^2}{2mL^2}    (168)

we choose μ = ε(500) if we want the system to have 500 fermions. At T > 0 there will be some particles above ε = μ, states below μ will not be filled with unit probability, and μ itself will vary with T to ensure a certain value of N̄.
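As a concrete illustration of this last point (an added sketch, in units where ħ^2π^2/2mL^2 = 1 and k = 1), one can solve numerically for the μ that keeps the average particle number at 500 as T is raised; at very low T it sits essentially at the 500-th level, and it drifts as T grows.

```python
import numpy as np
from scipy.optimize import brentq

levels = np.arange(1, 5001) ** 2        # eps_n = n^2, units of hbar^2 pi^2 / (2 m L^2)

def n_total(mu, T):
    x = np.clip((levels - mu) / T, -700, 700)   # avoid overflow in exp
    return np.sum(1.0 / (np.exp(x) + 1.0))      # sum of Fermi-Dirac occupations, Eq. 167

N_target = 500
eps_F = levels[N_target - 1]                    # the 500-th level

for T in (1.0, 1e3, 1e5):
    mu = brentq(lambda m: n_total(m, T) - N_target, 0.0, 2.0 * eps_F)
    print(T, mu / eps_F)    # ~1 at low T; mu shifts with T to keep N fixed at 500
```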
Consider finally bosons. Now we sum over all possible integer values of N to obtain

\mathcal{Z}_B = 1 + e^{-\beta(\epsilon_i - \mu)} + e^{-2\beta(\epsilon_i - \mu)} + e^{-3\beta(\epsilon_i - \mu)} + \ldots = \frac{1}{1 - e^{-\beta(\epsilon_i - \mu)}}    (169)

The mean occupation number is

\bar{N}_B(i) = \frac{1}{e^{\beta(\epsilon_i - \mu)} - 1}    (170)


In contrast to fermions, for bosons μ must lie below the lowest single-particle energy level, the ground state. If we choose that to be at 0, μ must be negative. As μ approaches 0 from below, N̄_B(i) gets arbitrarily large for the ground state. One must stop at the point where the average occupation summed over all states adds up to the total number we want in the system.
Bose systems exhibit the following interesting behavior, first pointed out by Einstein. As μ → 0, we of course expect the number in the ground state to shoot up. But we would also expect the number in the next higher level to shoot up as well. What happens is that the number in the ground state can be a macroscopic fraction (like 1/2) of the total number, while those in the higher levels will be fractions of order 1/N. This is called Bose condensation. I have merely stated this fact and made no effort to fully explain it.
Note that when β(ε_i - μ) is large, both distributions reduce to that of the classical gas, e^{-β(ε_i - μ)}.
