
Notes for a Course in

Statistical Thermodynamics
Third Edition
Paul J. Gans
Department of Chemistry
New York University
Version 6.12
October 27, 2008
Credits and Notices
This material was created by Paul J. Gans as both text and notes for the lectures in his course
G25.2600, Statistical Thermodynamics, given at New York University starting in 2003.

The first version of the text was written in the summer of 2003 and first used in class in the Fall of
2003. Since then several versions have been produced. The first versions were titled Computational
Statistical Thermodynamics. Since the fourth version these notes are simply titled Notes for a
Course in Statistical Thermodynamics, as the detailed computational aspects have been dropped
as impractical in a one-semester course. The new title is felt to more accurately reflect the subject
matter and the order of presentation.
This version was typeset using the LaTeX typesetting system on a SuSE Linux-based computer
using pdfTeX Version 3.141592-1.40.3 (Web2C 7.5.6) and LaTeX2e <2005/12/01>. This program
produces a pdf file directly from the LaTeX source.
Contents copyright © 2003–2008 by Paul J. Gans. All rights reserved including but
not limited to mechanical, electronic, and photographic reproduction and distribution.
Preface
This is a work in progress and probably always will be. What you see now is the sixth version of
these notes. But it is only the third edition, as I have adopted the convention that an edition changes
only on a major change in the subject matter.

Each previous version was used in class, and students' reactions as well as teachability were noted.
This is one reason for the constant change. The other is my enduring, and surely endless, search for
the perfect arrangement of material and the perfect explanation of ideas.
The target audience is first-year graduate students in chemistry. In the actual event many are.
Others were students in our computational biology program. Still others were in biochemistry.
Some few came from Medical Schools, both ours and other major schools in the New York area.
And, to their credit, each year some of our senior undergraduate chemistry majors have taken the
course and have always done very well indeed.
Of course, as a work in introductory statistical thermodynamics, a great deal of the content is
predictable. The main idea is that statistical thermodynamics is no longer only a paper and pencil
subject. Today most of the work in the field is done using computers and the theorist does calculations
unimagined just 20 (if not 10) years ago. To that end it might have been best to introduce
classical statistical mechanics first, since almost all computations are based on classical models, even
if parts of calculations such as potential energy surfaces often come from quantum calculations.
However, the target audience finds classical mechanics even more foreign than statistical mechanics,
so that was not done.
Experience has shown that the mathematical background of many students is not up to the
requirements of the typical course in statistical thermodynamics. This is true even including students
whose background is in chemistry or physics. As a result many of those students find the material
daunting. They are learning the needed mathematics at the same time they are learning the concepts
of statistical thermodynamics, and the two together, especially at the start of the course, are
a bit overwhelming.
The first version of the course was clearly experimental. It was difficult and did not suit the
needs of most of the students. These notes were at that time titled Computational Statistical
Thermodynamics, which did little to reassure the audience.
To make matters worse, the next version attempted to put more emphasis on models and computations.
It included simple computer programs written to show how results could be obtained from
those models.
Of course the students had no common computer language. So a simplified dialect of C was invented
for the description of the algorithms. Indeed at one point the author was even tempted to resurrect
Algol, but luckily resisted the urge. The idea was that each student could translate material
presented into a computer language he or she did know.

Needless to say, in practice the idea failed miserably, having only enthusiasm and little practicality
going for it.
The third version tried teaching classical mechanics first. This led students, even those in chemistry
and physics, into unfamiliar byways.
The fourth version, retitled Statistical Thermodynamics in the grand tradition of such works,
returned to the time-honored standard order of topics, with semi-quantum statistical thermodynamics
discussed first, some chapters on classical statistical thermodynamics inserted after, and a last
section on systems with interactions finishing off the course. Further, it was no longer a text, but
in fact a reprise of the actual material taught in class.

The fifth version had a changed title to more accurately reflect the contents and the order of
presentation.¹
For this, the sixth version, once again the content has been extensively revised. To be sure, many
remains of earlier versions can be found, but even those have been rewritten. Material has been
somewhat reordered, presentation has been, one hopes, improved, many diagrams sadly lacking in
earlier versions have been added, and some new material introduced.
The aim at this point is to take students as quickly as possible to usable material. Earlier versions
suffered from too much front-loaded theory. The theory is still there, but the initial chapters are
now somewhat ad hoc, with justification and proofs coming later. One hopes that this will work.
The author is aware that there is somewhat more material here than can be taught in one semester.
While early chapters rely on previous chapters for basic ideas, later chapters are more or less
independent. And in any case sections can be omitted in many places. But most importantly, many of
the less important points in each chapter can be left for the student to read and assimilate on his
or her own.
Some of the material, particularly that presented in the Appendices, is designed for independent study.
It is primarily mathematical and is present to aid students unfamiliar with such material. This is
regarded as a feature of these notes.
That said, the author is heavily indebted to those who have written before. A list of works consulted
appears at the end of the book. Most of these are older works, since it was felt that in many
cases the older treatments of the more classical material were not only better, but contained fewer
misconceptions and strange ideas than many current works.
In addition the help of friends and family must also be acknowledged. Professor Mark Tuckerman
was very helpful, and I also thank Mr. J. Benjamin Abrams, who suffered through the first version
of this work, for a number of useful conversations.
My wife Gail put up with this; one son-in-law, Prof. John Crocker, thought me passing strange;
the other, Victor Mather, kindly acted as though I was perfectly normal; and my children, Susan
and Abbey, more used to my hermit-like ways, forbore criticism. And my grandchildren, Josephine,
Eva, and Hannah, to whom I dedicate this work, were mercifully unaware of grandpa's multiyear
obsession.
To all of these my thanks and apologies, but it isn't over yet; versions seven and eight may be needed
to clean up all sorts of awkwardnesses, reduce the number of errors, and generally tidy things up.
¹ The author's preference in a text would be to group the introductory theoretical material together at the start,
thus allowing the interconnections among ensembles and their relationships to various dimensionless thermodynamic
potentials to be stressed. But that is hard food for the new student.
Contents
Preface iii
A Note on Notation xiii
1. The Nature of Statistical Thermodynamics 1
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Thermodynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.3 The Classical Description of a System . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.4 The Quantum Mechanical Description of a System . . . . . . . . . . . . . . . . . . . . 4
1.5 Boltzmann's Trajectory Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.6 The Gibbs Ensemble Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.7 The Equivalence of Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.8 The Basic Program of Statistical Thermodynamics . . . . . . . . . . . . . . . . . . . 8
2. The Microcanonical Ensemble 10
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2 Occupation Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3 The Principle of Democratic Ignorance . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.4 A Subproblem: The Multinomial Coefficient . . . . . . . . . . . . . . . . . . . . . . 12
2.5 The Maximization of W . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.5.1 Finding the Maximum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.5.2 Lagrange's Method of Undetermined Multipliers . . . . . . . . . . . . . . . . . 14
2.5.3 The Final Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.6 Thermodynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.7 The Most Simple Spin System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.8 Appendix: The Gamma Function and Stirling's Approximation . . . . . . . . . . 23
3. The Canonical Ensemble 26
3.1 The Ensemble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.2 The Most Probable Occupation Numbers . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.3 The Thermodynamics of the Canonical Ensemble . . . . . . . . . . . . . . . . . . . . 29
3.3.1 Energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.3.2 Pressure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.3.3 Chemical Potential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.3.4 The Canonical Partition Function and the Identification of Beta . . . . . . . . 31
3.4 The Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.5 Degeneracy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.6 A Simple Spin System in an External Field . . . . . . . . . . . . . . . . . . . . . . . . 34
4. Independent Subsystems in the Canonical Ensemble 38
4.1 Fundamentals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.2 Independent Energies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.3 Single Particle Canonical Partition Functions . . . . . . . . . . . . . . . . . . . . . . . 39
4.4 Thermodynamics of Canonical Systems of Independent Subsystems . . . . . . . . . . 41
5. The Ideal Monatomic Gas 43
5.1 The Partition Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
5.2 The Evaluation of the Partition Function . . . . . . . . . . . . . . . . . . . . . . . . . 44
5.3 The Degeneracy Calculation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
5.4 The Thermodynamics of a Monatomic Ideal Gas . . . . . . . . . . . . . . . . . . . . . 47
5.5 The Electronic Partition Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
5.6 Appendix: The Euler-Maclaurin Summation Formula . . . . . . . . . . . . . . . . . . 54
6. Ideal Polyatomic Gases 57
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
6.2 Vibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
6.2.1 Vibration in Diatomic Molecules . . . . . . . . . . . . . . . . . . . . . . . . . . 60
6.2.2 Vibration in Polyatomic Molecules . . . . . . . . . . . . . . . . . . . . . . . . . 63
6.3 Rotation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
6.3.1 Rotation in Diatomic Molecules . . . . . . . . . . . . . . . . . . . . . . . . . 64
6.3.2 Evaluation of the Rotational Partition Function . . . . . . . . . . . . . . . . . 65
6.3.3 Rotation in Polyatomic Molecules . . . . . . . . . . . . . . . . . . . . . . . . . 69
6.4 The Electronic Partition Function in Polyatomic Molecules . . . . . . . . . . . . . . . 71
6.5 The Thermodynamics of Polyatomic Molecules . . . . . . . . . . . . . . . . . . . . . . 72
6.5.1 Translation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
6.5.2 Vibration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.5.3 Rotation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.6 Appendix: Homonuclear Rotation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
6.6.1 Singlets and Triplets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
6.6.2 Rotational Symmetry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
7. The Grand Canonical Ensemble 78
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
7.2 The Equations for the Ensemble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
7.3 Solution of the Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
7.4 Thermodynamics of the Grand Canonical Ensemble . . . . . . . . . . . . . . . . . . . 81
7.5 Entropy and Probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
7.6 The Relationship to the Canonical Ensemble . . . . . . . . . . . . . . . . . . . . . . . 83
7.7 A Direct Consequence of Having Independent Subsystems . . . . . . . . . . . . . . . 83
7.8 A General Result for Independent Sites . . . . . . . . . . . . . . . . . . . . . . . . . . 85
8. The Equivalence of Ensembles 89
8.1 Expansion of the Grand Canonical Partition Function . . . . . . . . . . . . . . . . . . 89
8.2 Generalized Laplace Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
8.3 Transformations Among Ensembles . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
8.3.1 The Maximum Term Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
8.3.2 A Cautionary Example: The Isobaric-Isothermal Ensemble . . . . . . . . . . . 93
8.4 Summary: The Relationship Among Ensembles . . . . . . . . . . . . . . . . . . . . . 95
8.5 Appendix: Legendre and Massieu Transforms . . . . . . . . . . . . . . . . . . . . . . 97
8.5.1 Euler's Theorem on Homogeneous Functions . . . . . . . . . . . . . . . . . . . 97
8.5.2 The Legendre Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
8.5.3 The Massieu Transformations and Dimensionless Equations . . . . . . . . . . . 99
9. Simple Quantum Statistics 103
9.1 Quantum Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
9.2 Simple Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
9.3 The Ideal Gas Limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
10. The Ideal Crystal 108
10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
10.2 The Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
10.2.1 Common Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
10.2.2 The Einstein Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
10.2.3 The Debye Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
10.3 The One-Dimensional Crystal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
10.4 Appendix: Behavior of the Debye Function . . . . . . . . . . . . . . . . . . . . . . . 119
10.5 Appendix: Differentiating Functions Defined by Integrals . . . . . . . . . . . . . 121
11. Simple Lattice Statistics 123
11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
11.2 Langmuir Adsorption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
11.3 The Grand Canonical Site Partition Function . . . . . . . . . . . . . . . . . . . . . . 125
11.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
11.4.1 The Langmuir Adsorption Isotherm Again . . . . . . . . . . . . . . . . . . . . 126
11.4.2 Independent Pairs of Sites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
11.4.3 Brunauer-Emmett-Teller Adsorption . . . . . . . . . . . . . . . . . . . . . . . 127
11.5 Lattice Gas in One Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
12. Ideal Quantum Gases 134
12.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
12.2 Weakly Degenerate Ideal Fermi-Dirac Gas . . . . . . . . . . . . . . . . . . . . . . . . 134
12.3 Strongly Degenerate Ideal Fermi-Dirac Gas . . . . . . . . . . . . . . . . . . . . . . . 140
12.3.1 Absolute Zero . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
12.3.2 The Merely Cold Ideal Fermi-Dirac Gas . . . . . . . . . . . . . . . . . . . . . 141
12.4 The Weakly Degenerate Bose-Einstein Gas . . . . . . . . . . . . . . . . . . . . . . . 145
12.5 The Strongly Degenerate Bose-Einstein Gas . . . . . . . . . . . . . . . . . . . . . . . 146
12.6 The Photon Gas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
12.7 Appendix: Operations with Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
12.7.1 Powers of Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
12.7.2 Reversion of Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
12.8 Appendix: The Zeta Function and Generalizations . . . . . . . . . . . . . . . . . . . 155
12.8.1 The Zeta Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
12.8.2 The Dirichlet Eta Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
12.8.3 The Polylogarithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
13. Classical Mechanics 160
13.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
13.2 Definitions and Notations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
13.3 Energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
13.4 Newton's Laws of Motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
13.5 Making Mechanics More Simple . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
13.5.1 Coordinate Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
13.5.2 The Lagrangian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
13.6 The Center of Mass . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
13.7 The Hamiltonian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
13.7.1 Hamilton's Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
13.7.2 More on Legendre Transformations . . . . . . . . . . . . . . . . . . . . . . . . 176
13.7.3 Phase Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
13.7.4 Properties of the Hamiltonian . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
13.7.5 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
13.8 Appendix: Standard Coordinate Systems . . . . . . . . . . . . . . . . . . . . . . . . 182
13.8.1 Polar Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
13.8.2 Cylindrical Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
13.8.3 Spherical Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
14. Classical Statistical Mechanics 185
14.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
14.2 Liouville's Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
14.2.1 Incompressible Flow in Phase Space . . . . . . . . . . . . . . . . . . . . . . . 190
14.2.2 Conservation of Extension in Phase . . . . . . . . . . . . . . . . . . . . . . . 190
14.3 The Virial Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
15. The Classical Microcanonical Ensemble 193
15.1 The Specification of a System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
15.2 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
15.3 The Microcanonical Ensemble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
15.4 The Thermodynamics of the Microcanonical Ensemble . . . . . . . . . . . . . . . . . 200
15.5 The Number of Microstates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
15.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
15.7 Appendix: Volume of an n-Dimensional Hypersphere . . . . . . . . . . . . . . . . . 203
16. The van der Waals Gas 205
16.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
16.2 The Approximate Derivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
17. Real Gases 210
17.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
17.2 Virial Coefficients and Configuration Integrals . . . . . . . . . . . . . . . . . . . 211
17.3 Evaluating the Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
17.4 The Third Virial Coefficient . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
17.5 The Lennard-Jones Potential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
18. Wide versus Deep 224
18.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
18.2 The Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
18.3 But What Does It All Mean? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
History 227
Fundamental and Derived Physical Constants 229
List of Works Consulted 230
List of Tables
1.1 The Common Potentials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Total derivatives for some of the common thermodynamic potentials . . . . . . . . 3
5.1 Derivatives for the Evaluation of the Translational Partition Function . . . . . . . . 44
5.2 for Various Monatomic Gases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
5.3 Θ_t for Various Monatomic Gases . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
5.4 Monatomic Gas Electronic States . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.5 The First Few Bernoulli Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
6.1 Θ_v for Selected Diatomic Molecules . . . . . . . . . . . . . . . . . . . . . . . . . . 62
6.2 Θ_r for Selected Diatomic Molecules . . . . . . . . . . . . . . . . . . . . . . . . . . 66
6.3 Derivatives for the Evaluation of the Rotational Partition Function . . . . . . . . . . 66
6.4 Names for Rigid Rotators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
1 Fundamental and Derived Physical Constants . . . . . . . . . . . . . . . . . . 229
List of Figures
2.1 Spin entropy per spin as a function of the fraction of up spins . . . . . . . . . . . . 22
2.2 The Gamma Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.1 Schematic Canonical Ensemble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.2 Energy of a Noninteracting Spin System. . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.3 Heat Capacity of a Spin System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
6.1 Typical Potential Energy for a Diatomic Molecule . . . . . . . . . . . . . . . . . . 59
6.2 Typical Potential Energy for a Diatomic Molecule showing the difference between D_o
and D_e . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
6.3 Graph of C_V/k vs T/Θ_v . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
10.1 Graph of C_V vs T/Θ for an Einstein Crystal . . . . . . . . . . . . . . . . . . . . 111
10.2 Graph of C_V vs T/Θ_D for a Debye Crystal . . . . . . . . . . . . . . . . . . . . 114
11.1 Graph of θ vs. p for Langmuir Adsorption . . . . . . . . . . . . . . . . . . . . . . 125
11.2 Isotherm for Pairs of Sites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
11.3 Plot of the BET Isotherm for r = 200 . . . . . . . . . . . . . . . . . . . . . . . . . . 128
11.4 Pressure-Area Isotherm for the One-Dimensional Lattice Gas . . . . . . . . . . . . . 133
18.1 Model Potential Energy Surface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
A Note on Notation
Statistical thermodynamics is a subject that uses a large number of symbols. It essentially exhausts
the Latin alphabet, even with symbols being denoted by capital or bold face letters. The Greek
alphabet, capitals as well as lower case, is also pressed into service, but it is still not enough.
The reason for this rich set of symbols is that statistical thermodynamics uses the notations of
quantum mechanics, classical mechanics, and classical thermodynamics in addition to its own symbols.
Further, what can be printed in a book such as this one (using bold face for instance) does not often
translate to symbols that can be easily written on a blackboard or on a written page.
Since these are notes for a course to be taught in an actual classroom, some of the choices made for
symbols were made to enable decent presentation on a blackboard.²
Certain horrible choices had to be made, even though standard symbols were always preferred if
possible.
Symbols for both probability and for pressure are needed. Both are usually denoted by the letter
p. Since these quantities sometimes appear in the same equations, an arbitrary differentiation has
had to be made. Pressure is thus denoted by p (lower case) and probabilities by P (upper case). On
some occasions there is also need for a symbol for momentum. Luckily context usually provides
enough hints to allow disambiguation, so p (lower case) is also used for momentum.
In physics the potential energy is usually denoted by V or sometimes by U. In chemistry the volume
of a system is invariably denoted by V while U is used for the (thermodynamic) internal energy of
a system. Here I have chosen to denote the volume by V , the thermodynamic internal energy by U,
and the potential energy by V. The quantum mechanical potential energy operator is denoted by
U. On the other hand, to add confusion, the physical energy of a system in general will be denoted
by E.
In addition a symbol is needed for the mechanical kinetic energy of a system, often denoted by T,
which is customarily used for the temperature of a system. Here T will be used for the temperature
and 𝕋 (a blackboard face) for the kinetic energy.³
Several problems arise in dealing with classical mechanics as well. In particular the momentum is
almost universally denoted by p. I have kept that notation and trust that context will allow that p
to be distinguished from the pressure p. Further, the classical Hamiltonian is invariably H, but that
is the thermodynamic enthalpy. Here I've used H for both the Hamiltonian and the total energy of
a system. Since only conservative systems are handled here, this is technically permissible, but a
stylistic horror.
Another problem is how to represent quantum mechanical operators. The choice that I have made
was to indicate the Hamiltonian operator as well by H. The potential energy operator then was
rendered as U, which matches its use as the mechanical potential energy. Context should make clear
the intended meaning.
Yet other instances of possible confusion come with the Helmholtz free energy, denoted by A in the
chemical literature. This is not to be confused with the number of systems in an ensemble, 𝒜. Note
that this latter A is in a calligraphic face. The calligraphic face was chosen because it has been used
for this purpose in other texts.
² Though those time-sanctified teaching devices are now being replaced by white boards, which the author regards
as educational abominations equivalent to writing on walls with paintbrushes.
³ Curiously, this font is known as Blackboard and, while fine in a book, it is not so easy to draw on a blackboard.
And a is used in two senses. In the first it is the number of members of an ensemble that are in
the jth state. In the second it is the activity as used in chemistry. But here there should be no
confusion possible since the difference in usage is quite easily understood from context.
1. The Nature of Statistical Thermodynamics
1.1 Introduction
This course deals with statistical thermodynamics, which is a major branch of a more general field
called statistical mechanics. Both of these deal with the calculation of the macroscopic properties
of a system from the properties of the microscopic constituents of the system.

Of the two, statistical mechanics is more general. It deals with both time-dependent properties and
equilibrium properties. Statistical thermodynamics is more specialized. It deals with the computation
of equilibrium thermodynamic properties.

These notes deal only with the latter.

In discussing statistical thermodynamics we will need material not only from equilibrium thermodynamics,
but also from both classical and quantum mechanics. These are briefly reviewed below
from the standpoint of what we will need here in this text.
1.2 Thermodynamics
Thermodynamics deals with macroscopic systems at equilibrium. It knows nothing about atoms or
molecules. The thermodynamic state of a system is defined by specifying a relatively few variables.¹
Typical variables of thermodynamics are internal energy,² temperature, pressure, and the like. The
exact number of such variables needed to define the state of a system is given by the Gibbs Phase
Rule and depends on the number of independent components of the system and the number of
phases present in the system.
Thermodynamics is a simple, self-contained system that depends on several axioms (called laws in
thermodynamics) and a certain amount of experimental information such as heat capacity data that
cannot itself be calculated from thermodynamics.
There are five major common thermodynamic potentials. These are the entropy S, the internal
energy U, the enthalpy H, the Gibbs free energy G, and the Helmholtz free energy A.³ Each of
these has a set of natural variables such that when a thermodynamic potential is expressed in terms
of its natural variables, all other thermodynamic properties can be derived from the potential. In
addition, a thermodynamic potential points the way to equilibrium since its value is an extremum⁴
when the system is allowed to come to equilibrium with its natural variables held constant.
As an example, if one knew the internal energy in terms of the entropy, volume, and number of
particles present, then the temperature T is given by (∂U/∂S)_{N,V}, the pressure p by −(∂U/∂V)_{S,N},
and the chemical potential μ by (∂U/∂N)_{S,V}. Further, the energy reaches a minimum at constant
S, V, and N.

Another example: The natural variables for the entropy are U, V, and N. In an isolated system
these are held constant⁵ and the entropy increases for any spontaneous change in such a system.

¹ By defining the state of a system we mean that specifying these few variables is sufficient to fix the values of all
other variables of the system.
² The term internal energy means the energy contained inside the system and not including any energies derived
from the position or motion of the system as a whole.
³ These are not the only possible thermodynamic potentials, but they are the ones that occur most often in actual
use.
⁴ An extremum is a maximum or a minimum.
Here is a listing of the five common potentials, their natural variables, and the direction in which
they move for a spontaneous change:

    S(U, V, N)   ΔS ≥ 0
    G(T, p, N)   ΔG ≤ 0
    A(T, V, N)   ΔA ≤ 0
    U(S, V, N)   ΔU ≤ 0
    H(S, p, N)   ΔH ≤ 0

Table 1.1: The Common Potentials

Each of these potentials possesses a total derivative. For example the total derivative of the entropy
in terms of its natural variables⁶ is:

\[ dS = \left(\frac{\partial S}{\partial U}\right)_{V,N} dU + \left(\frac{\partial S}{\partial V}\right)_{U,N} dV + \left(\frac{\partial S}{\partial N}\right)_{U,V} dN . \qquad (1.2.1) \]
or

\[ dS = \frac{1}{T}\, dU + \frac{p}{T}\, dV - \frac{\mu}{T}\, dN . \qquad (1.2.2) \]
In statistical thermodynamics it is often very useful to put the total derivative in dimensionless
terms.⁷ For the entropy this is:

\[ \frac{dS}{k} = \frac{1}{kT}\, dU + \frac{p}{kT}\, dV - \frac{\mu}{kT}\, dN , \qquad (1.2.3) \]
where k is Boltzmann's constant and not R, the gas constant, since N is the number of particles
and not the number of moles. It is also clear that in some sense the proper thermodynamic
temperature variable is not T, but kT. This will often be seen in the following chapters.
While there is a total derivative for any thermodynamic function for any set of independent variables,
it is a simple fact that there can be only one total derivative of a given function for a given set of
independent variables.⁸
Table 1.2 below gives a few total derivatives of the common potentials in dimensionless form.
1.3 The Classical Description of a System
We will need to discuss some topics in classical mechanics later in this text. For now let us simply
review how classical mechanics deals with the motion of a system of N particles.
⁵ Constraints are supplied by the walls of the system. Here they are adiabatic so that no heat can enter, rigid so
that the volume cannot change, and impenetrable so that the number of particles cannot change.
⁶ All thermodynamic functions possess total derivatives in terms of whatever variables are chosen as independent.
The natural variables cause the corresponding potential to be special.
⁷ How to do this will be extensively discussed in Section 8.5.
⁸ This, as it should, discounts changes of scale. One can always change units by multiplying through by a constant
factor. But this does not result in a truly different equation.
\[ d\left(\frac{S}{k}\right) = \left(\frac{1}{kT}\right) dU + \left(\frac{p}{kT}\right) dV - \left(\frac{\mu}{kT}\right) dN \]

\[ d\left(\frac{G}{kT}\right) = U\, d\left(\frac{1}{kT}\right) + V\, d\left(\frac{p}{kT}\right) + \left(\frac{\mu}{kT}\right) dN \]

\[ d\left(\frac{A}{kT}\right) = U\, d\left(\frac{1}{kT}\right) - \left(\frac{p}{kT}\right) dV + \left(\frac{\mu}{kT}\right) dN \]

\[ \left(\frac{1}{kT}\right) dU = \frac{dS}{k} - \left(\frac{p}{kT}\right) dV + \left(\frac{\mu}{kT}\right) dN \]

Table 1.2: Total derivatives for some of the common thermodynamic potentials in terms of their
natural variables
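The sign pattern in the table is easy to check; here, for instance, is a quick verification of the second row. Writing G = U + pV − TS, so that G/kT = U/kT + pV/kT − S/k, and differentiating each product gives

\[ d\left(\frac{G}{kT}\right) = U\, d\left(\frac{1}{kT}\right) + \frac{dU}{kT} + V\, d\left(\frac{p}{kT}\right) + \frac{p}{kT}\, dV - \frac{dS}{k} . \]

Substituting Equation (1.2.3) for dS/k cancels the dU and dV terms, leaving exactly the tabulated result. The other rows can be verified in the same way.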
In principle this is quite simple. Newton pointed out that F = ma, where F is the (vector) force
on particles of mass m and a is the (vector) acceleration felt by those particles. Both F and a are
vectors having 3N components in three-dimensional space, three for each particle.
Now recognizing that a = d²x/dt², where x is the coordinate vector, we then have the second order
differential equation:

\[ m\, \frac{d^2 x}{dt^2} = F , \qquad (1.3.1) \]
which is really a set of 3N differential equations, one for each component of each particle's position.
If we assume that all the forces in our system are conservative⁹ then the force can be computed from
the potential energy U of the system:¹⁰

\[ F = -\frac{dU(x)}{dx} , \qquad (1.3.2) \]

where the notation is shorthand for 3N such derivatives, one for each spatial component of each
particle.
If the potential energy is then known, the force can be computed and Equation (1.3.1) can then be
integrated twice to obtain the position of the particles as a function of time.¹¹

This integration will require two constants of integration per equation, normally taken as the initial
velocities¹² v and the initial positions x.

We then require six initial conditions per particle or 6N initial conditions overall. In principle this
results in our knowing the positions and velocities of all the particles at any time t,¹³ as long as we
know the potential energy function for the N particles.
⁹ Meaning not only that there is no friction or similar dissipative force, but more directly that energy is
conserved in the system.
¹⁰ Consult H. Goldstein, Classical Mechanics, Addison-Wesley, 1953 or any equivalent book on classical mechanics
for a proof of this.
¹¹ Leaving aside the technical difficulties of integrating something of the order of 10²³ second order differential
equations...
¹² If the masses are constant, which they will be as long as we are dealing with atoms and molecules, then the initial
momentum p = mv will supply the initial velocity.
¹³ I can't resist pointing out that the final integrated equations not only give the positions and velocities of all the
particles for any time t after the initial time, they also allow the calculation of the positions and velocities of all the
particles for any time t before the initial time. Thus the history of this system is known for all time.
In principle, if we know all the positions and all the velocities of all the particles, we can compute
the value of any mechanical variable at that time.
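To make the twice-integrated picture concrete, here is a minimal numerical sketch of this program for a single particle in one dimension. The harmonic potential U(x) = ½kx² and all parameter values here are illustrative assumptions, and the velocity Verlet scheme stands in for the formal double integration:

    import numpy as np

    # Illustrative assumptions: one particle, one dimension, harmonic
    # potential U(x) = 0.5*k*x**2, so that F(x) = -dU/dx = -k*x.
    m, k = 1.0, 1.0

    def force(x):
        return -k * x

    def velocity_verlet(x, v, dt, nsteps):
        """Integrate m d2x/dt2 = F(x) from initial position x and velocity v."""
        traj = [(x, v)]
        f = force(x)
        for _ in range(nsteps):
            x = x + v * dt + 0.5 * (f / m) * dt**2    # position update
            f_new = force(x)
            v = v + 0.5 * (f + f_new) * dt / m        # velocity update
            f = f_new
            traj.append((x, v))
        return np.array(traj)

    # Two initial conditions per degree of freedom: x(0) and v(0).
    traj = velocity_verlet(x=1.0, v=0.0, dt=0.01, nsteps=1000)
    energy = 0.5 * m * traj[:, 1]**2 + 0.5 * k * traj[:, 0]**2
    print(energy.min(), energy.max())   # total energy is conserved closely

Knowing x(t) and v(t) at every step is exactly the "positions and velocities at any time t" promised above, from which any mechanical variable can be computed.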
1.4 The Quantum Mechanical Description of a System
To complement the discussion of classical mechanics, we give a quick examination of the equivalent
quantum mechanical description of a system of N identical particles¹⁴ in a fixed volume at
equilibrium.¹⁵ This description is given by the system's wave function Ψ. This state is usually a mixed
state; a linear combination of all the possible pure quantum mechanical states which satisfy the
external conditions imposed on the system. Each of these pure states has its own wave function:

\[ \psi_i(q_1, q_2, \ldots, q_N) . \qquad (1.4.1) \]

The wave function ψ_i is a function of the coordinates q of the N particles in the system and the
specific quantum state i represented by that wave function.
The overall system is then in the mixed state given by the linear combination of these pure states:

\[ \Psi = c_1 \psi_1 + c_2 \psi_2 + \cdots , \qquad (1.4.2) \]

where the c_i's are constants.
In more common terminology, a mixed state is one that is degenerate. The pure states are
non-degenerate.
We expect the system to be in all these states simultaneously until we examine it.¹⁶ When we do
examine the system we know that we will find it in state i with probability¹⁷

\[ P_i = \frac{c_i^2}{\sum_k c_k^2} . \qquad (1.4.3) \]
A typical large system has a very large number of possible pure quantum states ψ_i. Thus the mixed
state corresponding to the collection of all the pure states with the same energy will clearly be
degenerate.
An example of this in even a very small system is the field-free hydrogen atom, where the degeneracy
of the nth electronic level is n², where n is the principal quantum number. Only the ground state,
n = 1, is non-degenerate.
Indeed, the n = 2 level is quadruply degenerate, the possible pure states being (in traditional chemist
notation) 2s, 2p_x, 2p_y, and 2p_z. And the c's in Equation (1.4.3) are all equal, so each of these
states is found with probability 1/4.
Associated with the system is a set of operators A_j such that to every state i of the system and for
every property of interest ⟨A_ji⟩ there is an operator A_j satisfying

\[ A_j\, \psi_i(q_1, q_2, \ldots, q_N) = \langle A_{ji} \rangle\, \psi_i(q_1, q_2, \ldots, q_N) , \qquad (1.4.4) \]

where ⟨A_ji⟩ is the expected value of the property corresponding to A_j.
¹⁴ This description can fairly easily be extended to systems containing groups of non-identical particles, but we
won't introduce that additional complication here.
¹⁵ By equilibrium we mean that the wave function(s) of the system are not functions of the time.
¹⁶ There is no magic in this. Until we have an idea as to which state the system actually is in, we must treat all
compatible states as possible.
¹⁷ We here run into one of a number of notational difficulties. We need symbols for the probability and for the
pressure. Both are usually denoted by the letter p. I've made the arbitrary choice of denoting the pressure by p
and probabilities by P. See the Note on Notation on page xiii.
A typical example is the energy in state i, for which the operator is the Hamiltonian operator H:

\[ H = -\frac{\hbar^2}{2m} \sum_{k=1}^{N} \frac{\partial^2}{\partial q_k^2} + U(q_1, \ldots, q_N) , \qquad (1.4.5) \]

where U(q_1, …, q_N) is the potential energy operator of the system and the corresponding energies
⟨E_i⟩ satisfy

\[ H\, \psi_i(q_1, q_2, \ldots, q_N) = \langle E_i \rangle\, \psi_i(q_1, q_2, \ldots, q_N) . \qquad (1.4.6) \]
Note that the term U(q_1, …, q_N) contains the implicit constraints on the system, such as walls. For
instance a free particle has U identically zero everywhere, while particles in a box of side L have the
potential

\[ U = \begin{cases} 0 & 0 \le q_1, q_2, \ldots, q_N \le L \\ \infty & \text{otherwise} \end{cases} \qquad (1.4.7) \]
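A tiny numerical illustration of Equation (1.4.3): for the four degenerate n = 2 hydrogen states discussed above, any set of equal coefficients gives each state a probability of 1/4. The coefficient values below are arbitrary and chosen only for the sketch:

    # P_i = c_i**2 / sum_k c_k**2, Eq. (1.4.3).
    # Arbitrary equal coefficients for the four degenerate n = 2 states
    # (2s, 2p_x, 2p_y, 2p_z); any equal values yield P_i = 1/4.
    c = [0.5, 0.5, 0.5, 0.5]
    norm = sum(ck**2 for ck in c)
    P = [ci**2 / norm for ci in c]
    print(P)   # [0.25, 0.25, 0.25, 0.25]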
1.5 Boltzmann's Trajectory Method
In the late 19th century Ludwig Boltzmann developed what can be called the trajectory method of
determining macroscopic properties from microscopic properties. It is a classical method depending
on Newton's Laws and using their language.
Consider a system of N particles contained in a volume V. Each particle has 3 space coordinates,
x, y, and z, and three momentum coordinates, p_x, p_y, and p_z. At any moment in time those six
variables have specific values. And those values can be represented as a point in a six-dimensional
graph. Over time this point moves in the graph, tracing out a path called a trajectory. The
trajectory is clearly continuous since at any instant the variables can only change infinitesimally from
their values the instant before.
Such a graph is called a μ-space graph.¹⁸ It is easy to imagine not one but all N particles of the
system being plotted on the same graph. This produces a swarm of points moving in seemingly random
motion.
In fact the trajectories are bounded in the space coordinates since all must remain within the volume
of the system. And they are bounded in the momentum coordinates since no particle can have a
momentum larger in magnitude than √(2mE), where E is the total initial energy of the particles.¹⁹
The result is that we'd have N trajectories being traced out in a bounded region of phase space.
A clearer picture can be gotten by switching to a slightly different graph. This graph has 6N
coordinates,²⁰ one for each component of position and momentum for each particle in the system.
The entire system is now represented by a single point on this graph. This graph is called phase
space or sometimes Γ-space.²¹
The single point in phase space representing the entire system moves with time in phase space,
tracing out a system trajectory. At each point in phase space we can compute the value of a
physical property of the system from the known coordinates and momenta at that instant of time.
To get an average value of any property, all we need do is add up the values of the property at each
point along the trajectory and divide by the total number of points.
¹⁸ With μ standing, perhaps, for micro.
¹⁹ And indeed, a particle can only have this magnitude of momentum if all the other particles have zero momentum.
²⁰ Making it very very hard to even imagine.
²¹ Perhaps Γ stands for grand as in large.
Of course this can't be done because there are an uncountable infinity of points along the trajectory
and the time spent at each point is zero. As a result for many years no progress was made in this
computation.
Ludwig Boltzmann got around this by a neat trick.
Boltzmann conceptually divided phase space up into separate small but finite cells he called
microstates.²² While finite in size, the microstates were to be of a size small enough so that the value
of a property anywhere in one would be essentially constant.²³ With the microstates finite in size, the
system would spend a small, but finite, amount of time in each microstate.
He could then compute averages using the following definition:

\[ \langle X \rangle = \lim_{t \to \infty} \frac{1}{t} \sum_i X_i\, t_i , \qquad (1.5.1) \]

where X_i is the value of the property in cell i and t_i is the amount of time that the system spent in
cell i. The result is then ⟨X⟩, the average value of the property along the trajectory.
Boltzmann then identified this average ⟨X⟩ with the thermodynamic property X.
However, it turns out that it is very difficult to compute anything with Equation (1.5.1) using paper
and pencil classical mechanical methods. The one major exception is that Boltzmann did, in fact,
succeed in computing the properties of an ideal gas using what we'd today call the microcanonical
ensemble.
One can also use Equation (1.5.1) in a modified way to compute properties using a computer and
numerical integration. Modifications are needed because in a system with any reasonable N, only a
very small portion of a trajectory can be followed and one ends up hoping that it is a representative
portion.
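As a concrete illustration of Equation (1.5.1), the sketch below computes a coarse-grained time average from residence times in each cell; the cell values and times are invented purely for illustration:

    import numpy as np

    # Coarse-grained time average, Eq. (1.5.1): <X> = (1/t) sum_i X_i t_i.
    # X[i] is the (essentially constant) property value in cell i and
    # t[i] the time the trajectory spent there; the numbers are made up.
    X = np.array([1.0, 2.0, 4.0])        # property value in each visited cell
    t = np.array([0.5, 0.3, 0.2])        # residence time in each cell

    X_avg = np.sum(X * t) / np.sum(t)    # time-weighted trajectory average
    print(X_avg)                         # 1.9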
Quantum mechanics does not change this much. As we've seen, the state of the system is a
superposition of many degenerate wave functions.
And quantum mechanics tells us that the system will be in state i with probability P_i given by

\[ P_i = \frac{c_i^2}{\sum_j c_j^2} , \qquad (1.5.2) \]

where the c_i are the coefficients in Equation (1.4.2) on page 4. We can now let P_i play the role
of t_i/t in the Boltzmann formula, Equation (1.5.1), and calculate system properties from it.
1.6 The Gibbs Ensemble Method
Gibbs²⁴ had an insight into this situation. Instead of following a single system along its trajectory
in phase space, he proposed that one create (mentally, naturally) a collection of macroscopically
identical systems all having the same fixed external values of N, V, and U, but which would of
course be in different microscopic states.
²² There's that word again! Here it refers to a small volume in phase space and not to a quantum mechanical state.
Boltzmann, of course, worked before quantum mechanics was invented.
²³ This is called coarse graining. Coarse graining is a process in which we divide a continuous quantity into small
regions or cells. It is quite interesting in that without it all sorts of mathematical difficulties would arise in Boltzmann's
method.
²⁴ Yes, that Gibbs.
In other words let there be a very very large number 𝒜 of systems in the collection, where 𝒜 is
greater in size than Ω, the degeneracy of the system energy state E corresponding to the externally
fixed energy U. Since Ω can be huge, 𝒜 must be much much more huge, but, and this is important,
still finite.
Gibbs then assumed that a_i of the systems in the collection were in the small Boltzmann microstate
i. Then he assumed that the ratio a_i/𝒜 corresponded to the fraction of time we could expect to
find the macroscopic system in state i. Thus

\[ P_i = \frac{a_i}{\sum_i a_i} \qquad (1.6.1) \]
and any property X could be found from X_i, the value of X associated with the microstate i, using:

\[ \langle X \rangle = \sum_i P_i X_i . \qquad (1.6.2) \]
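In the same spirit as the trajectory-average sketch in Section 1.5, Equations (1.6.1) and (1.6.2) translate into a few lines; the occupation numbers and property values here are invented for illustration:

    import numpy as np

    # Ensemble average, Eqs. (1.6.1) and (1.6.2):
    #   P_i = a_i / A,  <X> = sum_i P_i X_i.
    # a[i] is a made-up number of ensemble members in microstate i and
    # X[i] the property value associated with that microstate.
    a = np.array([3.0, 5.0, 2.0])    # occupation numbers; A = 10
    X = np.array([1.0, 2.0, 4.0])    # property value for each microstate

    P = a / a.sum()                  # probability of each microstate
    print(np.sum(P * X))             # 2.1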
Did Gibbs gain anything by this? Yes. It turns out to be much more convenient to do theoretical
calculations using the Gibbs idea. But it remains more convenient to do computer calculations
using the Boltzmann plan.²⁵ Gibbs termed this collection of systems an ensemble. This particular
ensemble, with fixed N, V, and U, is called the microcanonical ensemble. There are other ensembles
as well, and we shall meet many of them.
1.7 The Equivalence of Methods
Do the Boltzmann and Gibbs methods give different answers or are they equivalent?
Perhaps an analogy is useful here. Imagine that we have a Super Duper High Speed Fantastically
Magnifying Movie Camera²⁶ and we use it to photograph a microscopic system as it evolves in time
along its trajectory in phase space. In other words, we are following a system in time as Boltzmann
would do it.
What we have is a movie, a series of frames that capture the dynamics of the system. We can (in
theory) do a calculation of the property A for each frame in the movie and then average the values
to get ⟨A⟩, the equilibrium value of A.
Now the Gibbs method is analogous to cutting up that movie into individual frames and then mixing
them up. This would then be an ensemble of frames. We can still perform the same frame-by-frame
analysis and should come up with the same answer, assuming of course that the trajectory will
eventually cover all physically possible points in phase space.
But perhaps a real trajectory does not cover all points in phase space? The Gibbs method does not
use only points along the trajectory. It doesn't know what the trajectory is. It uses all the available
points in phase space.²⁷
What happens if the actual system can not (or does not) go through all the possible points in phase
space? Then clearly the two methods will not necessarily give the same answers. And there are such
systems! Here's one example:
Imagine a system of N particles in a cubical box of volume V with total energy E. In other words
what is called a microcanonical system. Now if all of the molecules in the system have y- and
²⁵ However, there is another way of computationally handling some systems in which the equations of motion are
not integrated. This method is called Monte Carlo and it really is an application of the Gibbs idea to computations.
²⁶ Digital, of course.
²⁷ If the energy is fixed, then the only points available to either the trajectory or the ensemble must be points
corresponding to that fixed energy.
z-components of velocity exactly equal to zero the molecules will then simply bounce back and
forth between the walls perpendicular to the x-axis and never hit the other four sides of the box
even though there is no energetic reason why they should do this. It is all a matter of the initial
conditions.
On the other hand, a Gibbsian approach will have molecules in all possible initial states, most of
which will not have zero y- and z-velocity components. The Boltzmann approach will not.
Of course this example is a bit contrived but the point is clear. The initial conditions on a system
may very well restrict it to move in only a portion of the available phase space.
All we need to do, of course, is to ensure that any system we want to study does not have such
restrictions. But sadly, we don't know how to check for this. Indeed, mathematical theory says that
we cannot in general tell.²⁸
In practice however, the two methods, Boltzmann and Gibbs, give identical results except in
situations (such as the example given above) where the number of attainable sections of phase space
are notably fewer than expected because of strange initial conditions. Such situations rarely (the
author is tempted to say never) arise in practice.
1.8 The Basic Program of Statistical Thermodynamics
Since Dalton's time at the start of the 19th century we have understood that the basic building
blocks of chemistry are atoms.²⁹ For a century after Dalton it was believed that atoms exactly
obeyed Newton's Laws in their behavior. And so it followed that thermodynamics ought, in some
way, to be deducible from Newton's Laws.
It was also clear that the 10²³ or so quantities needed to describe a typical macroscopic system using
Newton's laws were somehow reduced to only three, four, or five properties needed to describe a
system in equilibrium thermodynamics.
The development of quantum mechanics did not change this view. Again the multitude of individual
pieces of information that would be obtained if we could solve the quantum mechanical equations
of an equilibrium system of 10²³ particles must somehow boil down to the few properties of
equilibrium thermodynamics.
Similarly it was clear that no individual particle in a microscopic system of identical particles is
special in the sense that it alone determines the macroscopic properties of that system. We know
from experience that adding a particle or two to such a system will not sensibly change it. We then
must conclude that all the particles present contribute in some manner to an averaging process that
results in the thermodynamic properties of the system.
The fundamental problem of statistical mechanics is thus to discover how to do the averaging over
the manifold properties of the particles of a system to obtain the thermodynamic properties from
them.
The basic program in statistical thermodynamics can be summarized as follows. We attempt to
compute some property X of a macroscopic system. We start with the independent variables of the
system (perhaps N, V , and U as in the example above) and then determine what the microstates
of the system actually are.
²⁸ The famous ergodic hypothesis deals with this. It claims, in essence, that the two approaches are identical.
Sadly it was only proven in a weakened form. And it is false in general. Indeed, our example has shown that.
²⁹ In fact, mass was the first physical quantity ever quantized. We can thank Dalton for that.
We then associate with each microstate i a value X_i of the property X that we wish to compute.
Then we compute the average value of X by:

\[ \langle X \rangle = \sum_i X_i P_i , \qquad (1.8.1) \]

where P_i, the probability of finding the system in state i, is calculated either from classical mechanics
or quantum mechanics, depending on how we have approached the problem.

We now assert that the computed average ⟨X⟩ is, in fact, the macroscopic value of X we'd observe
in a system with the given values of N, V, and E.

Of course, this is checked by experiment. And, it works!
2. The Microcanonical Ensemble
2.1 Introduction
As remarked in Chapter 1, the preferred method for doing theoretical calculations is the Gibbsian
Ensemble method. We illustrate this here with a simple but not fantastically useful example.¹ We
consider a single-phase, single-component system with a fixed number of particles N, a fixed volume
V, and a fixed energy E. It goes without saying that these values are somewhat restricted. After
all, neither N nor V can be negative and their ratio, the density, cannot be so great as to leave the
range of chemical processes and enter the realm of nuclear physics.
The system in the volume V has energy E which is highly degenerate. The system is in one of the
degenerate energy levels of the system.
We will apply the Gibbsian approach to calculate the probability that the system is in a particular
degenerate state.
To apply the Gibbsian approach we mentally construct a huge number, 𝒜, of macroscopic replicas
of our system. Each will have the same identical values of N, V, and E, and each will be in one or
another of the degenerate energy levels.

We then mentally freeze the systems in whatever energy level they happen to be in at that moment.
We now have a static collection of 𝒜 systems, each in a definite energy level.

The usual terminology for a definite system energy level is microstate. It corresponds to the system
being in a particular non-degenerate quantum state ψ_i.
This collection of systems with the independent variables N, V , and E is called a microcanonical
ensemble.
2.2 Occupation Numbers
We chose the number 𝒜 of ensemble members to satisfy the condition:

\[ \mathcal{A} \gg \Omega , \qquad (2.2.1) \]

where Ω is the degeneracy of the quantum mechanical degenerate state corresponding to the imposed
external conditions.

This is done to ensure that there is a far larger number of systems than degenerate levels. This way
we can expect that usually there will be at least several systems in any given microstate. Indeed,
what we hope is that (1) all possible microstates of the system are sampled by this process and (2)
that there will be many members of the ensemble in each of those possible microstates.

We let a_j be the number of systems in the ensemble that are in microstate j. These a's are called
the occupation numbers of the microstates of the ensemble. It is immediately clear that:

\[ \sum_{j=1}^{\Omega} a_j = \mathcal{A} . \qquad (2.2.2) \]

¹ So why do we bother? Because we will introduce a mathematical technique here that will be very useful to us in
later work. And it is easier to first see it operate in a situation that is physically simple.
If we knew the a_j's, we could determine the macroscopic properties ⟨X⟩ of the system, because we
assume, as stated before (see Section 1.6 on page 6), that we can associate a property X_j
with each microstate j. And, most importantly, it is clear that the probability of finding any given
ensemble member in state j is just a_j/𝒜. So given the a_j's we can work out the thermodynamics.
However, we dont yet have enough information to nd the a
j
s. There are many dierent sets of
a
j
s that will satisfy Equation (2.2.2). For instance, if there were three systems in an ensemble
that had two possible microstates, then / = 3 and = 2. We can describe this toy ensemble by
three digits, the rst being the microstate of the rst system, the second being the microstate of the
second, etc.
The possible arrangements of the systems among the microstates are then:
111, 112, 121, 122, 211, 212, 221, 222
All of these have a
1
+a
2
= 3.
Note that there is only one way to have a
1
= 3 while there are three ways to have a
1
= 2. So
even though weve placed the systems evenly among the possible microstates, the various sets of
occupation numbers a
j
are not equally likely.
Of course, weve assumed that all states are equally likely to be occupied. But is that even true?
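Before going on, here is a small Python sketch (assuming nothing beyond the toy numbers above) that enumerates all the 𝒜 = 3, Ω = 2 arrangements by brute force and tallies the occupation-number sets. It reproduces the count of one arrangement for {3, 0} and three for {2, 1}:

    from itertools import product
    from collections import Counter

    # Enumerate every arrangement of A = 3 systems over Omega = 2 microstates
    # and count how often each occupation-number set (a_1, a_2) occurs.
    A, Omega = 3, 2
    counts = Counter()
    for arrangement in product(range(1, Omega + 1), repeat=A):
        counts[(arrangement.count(1), arrangement.count(2))] += 1

    print(sum(counts.values()))  # 8 = Omega**A arrangements in all
    print(counts)                # {(2,1): 3, (1,2): 3, (3,0): 1, (0,3): 1}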
2.3 The Principle of Democratic Ignorance
The problem before us is:
Are all states equally likely to be occupied?
We do not know which, if any, of the degenerate microstates the system prefers to be in. Worse,
we have no obvious way of judging if any group of microstates is more or less likely to be occupied
than any other.
So we are at a loss as to how to proceed.
What we need is some sort of rule that will help us. And there is such a general rule in science that covers situations like this. We'll call it the rule of democratic ignorance. It is essentially this:

When there is a situation with a number of possibilities, and when there is absolutely no reason to prefer one possibility over any other, then all possibilities have to be considered to be equally likely.
Of course this rule cannot be proven. But it is instructive to consider possible alternatives:
1. The Rule of Primogeniture: Whatever state is numbered 1 is the most likely; number 2
is second most likely, and so on.
2. The Rule of Authority: The most probable state is the one the author says is the most
probable.
3. The Rule of Mystery: There is a most probable state but we can never know what it is.
The first alternative is clearly silly. Numbering the states is up to the person who does the numbering. It is a pure accident which state is numbered first and cannot reflect any physical reality. Thus we can safely ignore Rule 1.

Alternative 2 is equally silly. Simple moral authority only substitutes someone else's ignorance for yours.

Alternative 3 is an abdication of responsibility. It is throwing up our hands and saying that we cannot solve this problem and should go on to do something else as a career, like perhaps being a movie star.[2]
It is doubtful that anyone can suggest a rule other than the Rule of Democratic Ignorance. Certainly, nobody yet has been able to do that. So in the absence of anything better, we are left with the Rule of Democratic Ignorance.

Applying the Rule of Democratic Ignorance to the situation at hand, we must conclude that: Any particular system has an equal chance of being in any of the Ω different microstates. This is equivalent to saying that all Ω^𝒜 different ways of arranging the 𝒜 systems among the states are equally likely.

This does not mean that each of the sets of a_j's is equally likely. For instance, in the example with three systems and two states above there are eight equally likely arrangements; three of them lead to the set {2, 1} but only one leads to the set {3, 0}.

It turns out that of all possible sets of occupation numbers, one is far more likely than any other. That can't be guessed beforehand, but it is an interesting consequence of the fact that 𝒜 is huge.

Accepting this for now (we will retroactively prove it later), our question "What values can the a_j's be expected to have?" can now be changed to:

What set of a_j's has the greatest number of ways of occurring?
2.4 A Subproblem: The Multinomial Coefficient

Listing the number of ways in which three systems can be arranged so that, for example, there are two of them in the first state and one in the second could be simplified if we had a way to compute how many such ways there are.

In particular we are going to be very interested in knowing how many ways there are to get any particular given set of a_j's. Put more formally:

How many ways are there of arranging 𝒜 systems so that there are a_1 in the first quantum state, a_2 in the second, etc.?

Or, in slightly different language:

How many ways are there of arranging 𝒜 objects into piles such that there are a_1 in the first pile, a_2 in the second pile, etc.?

[2] Actually, one might like that...
If we let the number of ways of making such piles be W, then it can be shown that

W(a_1, a_2, ..., a_Ω) = 𝒜! / (a_1! a_2! ... a_Ω!) = 𝒜! / Π_j a_j! .    (2.4.1)
Here's an example:

Example 2.1: How many arrangements are there of three systems that result in the occupation numbers a_1 = 2 and a_2 = 1?

W(2, 1) = 3! / (2! 1!) = 6/2 = 3 .    (2.4.2)
The quantity W is known as the multinomial coefficient. It is a generalization of the binomial coefficient to multinomials. It arises naturally in algebra in the expansion of multinomials:

(x_1 + x_2 + ... + x_Ω)^𝒜 = Σ_{a_1} Σ_{a_2} ... Σ_{a_Ω} W(a_1, a_2, ..., a_Ω) x_1^{a_1} x_2^{a_2} ... x_Ω^{a_Ω} ,    (2.4.3)

where the sums run over all sets of occupation numbers consistent with a_1 + a_2 + ... + a_Ω = 𝒜.
This gives us a quick and useful side result. If all of the x's are set to 1 we get:

Ω^𝒜 = Σ_{a_1, a_2, ..., a_Ω} W(a_1, a_2, ..., a_Ω) .    (2.4.4)

Thus the total number of distinct ways of arranging 𝒜 objects into Ω piles is Ω^𝒜.
This is a very, very, very large number.

If, for instance, Ω were only 100 and 𝒜 were only 1000 (both much smaller than the numbers we'd expect to run into), there would be 10^2000 different ways to arrange the 1000 objects among the 100 piles.
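As a check on Equations (2.4.1) and (2.4.4), here is a short Python sketch; it computes W from factorials and verifies, for the toy case 𝒜 = 3 and Ω = 2, that the W's summed over all occupation-number sets give Ω^𝒜:

    from math import factorial

    def W(*a):
        # Multinomial coefficient of Equation (2.4.1): A! / (a_1! a_2! ...).
        out = factorial(sum(a))
        for aj in a:
            out //= factorial(aj)
        return out

    print(W(2, 1))   # Example 2.1: 3!/(2! 1!) = 3

    A, Omega = 3, 2
    total = sum(W(a1, A - a1) for a1 in range(A + 1))
    print(total, Omega**A)   # 8 8, confirming Equation (2.4.4)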
2.5 The Maximization of W
We want to find the set of a_j's that has the greatest number of ways of occurring. We can find that by finding the set of a_j's that maximizes W.[3]
2.5.1 Finding the Maximum
If W depended only on one variable, say a, we would simply form dW/da, set the result to zero, and solve to find the value of a that maximizes W. But here we have Ω different a_j's.

The a_j's and W define an (Ω + 1)-dimensional space. The a_j's are the independent variables and W is the dependent variable.

At a local maximum, the surface defining W in this space is flat. That is, any small variation in the a_j's will cause no change in W.

[3] In truth this is not exactly what we are looking for. We should be looking for the most probable set of a_j's. We assume that this is the set with the greatest number of ways of occurring. If W is a symmetric function with a single maximum, the maximum and the most probable will be the same.
Now, if we make small changes da_j in the a_j's, the corresponding change in W is:

dW(a_1, a_2, ..., a_Ω) = (∂W/∂a_1) da_1 + (∂W/∂a_2) da_2 + ... + (∂W/∂a_Ω) da_Ω ,    (2.5.1)

and so it seems that all we have to do is to set dW = 0 and solve for the a_j's. This will maximize W.
Recall that:

W(a_1, a_2, ..., a_Ω) = 𝒜! / Π_j a_j! .    (2.5.2)

Because of the product in the denominator, differentiating this is very messy.[4] In order to neaten up the math, we introduce a trick worth remembering: we take logarithms:

ln W = ln 𝒜! − Σ_{j=1}^{Ω} ln a_j! ,    (2.5.3)

and maximize ln W instead:

d ln W = (∂lnW/∂a_1) da_1 + (∂lnW/∂a_2) da_2 + ... + (∂lnW/∂a_Ω) da_Ω .    (2.5.4)

This works because of a simple mathematical fact: in general, if f(x) has a maximum at x = x*, then ln f(x) also has a maximum at x = x*.

To see this, note that the maximum in ln f(x) occurs where

d ln f(x)/dx = (1/f(x)) (df(x)/dx) = 0 ,    (2.5.5)

and so, as long as f(x) is never zero, we have our desired result. Here, of course, f(x) is W, and W, by definition, is never less than 1 and so can't ever be zero.
2.5.2 Lagrange's Method of Undetermined Multipliers

What we have after setting Equation (2.5.4) to zero is:

d ln W = (∂lnW/∂a_1) da_1 + (∂lnW/∂a_2) da_2 + ... + (∂lnW/∂a_Ω) da_Ω = 0 .    (2.5.6)
We would like to argue as follows: The a_j's are independent quantities. So their small variations, da_j, are also independent. This means that we can pick whatever values we wish for them and the equation above must still be true.

But this can be true if and only if the coefficients of the da_j's are identically zero. Why? Imagine we set each of the da_j's to be 1 × 10^−100, which is certainly small enough to be a da, and by accident the terms all added up to zero. Now we change several of the da_j's to be 2 × 10^−100, still small enough to be a da, and add them up. Can we still expect the sum to be zero? What if we chose another set of values for the da_j's?

The only way the equation could always be true would be if and only if each of the coefficients of the da_j's were separately equal to zero.

[4] Not to mention the factorials, which can't simply be differentiated at all!
Then we could simply set

(∂lnW/∂a_i) = 0   for all i ,    (2.5.7)

and solve the separate equations trivially.
But we cannot do that!

The a_j's are not independent! The a_j must satisfy

Σ_{j=1}^{Ω} a_j = 𝒜 .    (2.5.8)

So if Ω − 1 of the a_j's are known, we can find the remaining one by using Equation (2.5.8).

This is a constraint on the solution. Lagrange dealt with the problem of finding constrained maxima and came up with an interesting answer.[5]
First we note that, because of the constraint Equation (2.5.8), we also have:

Σ_{j=1}^{Ω} da_j = 0 ,    (2.5.9)

because 𝒜 is a constant.
Pay attention now. Since this equation is identically zero, it can be added to the right-hand side of any other equation without changing that equation in any way. In fact, any multiple of this equation can be added to either side[6] of any other equation without changing its value. So we will, following Lagrange, add the multiple

−λ Σ_{j=1}^{Ω} da_j

to Equation (2.5.6), written in summation form as

d ln W = Σ_{j=1}^{Ω} (∂lnW/∂a_j) da_j = 0 ,    ((2.5.6))

to get:

Σ_{j=1}^{Ω} [(∂lnW/∂a_j) − λ] da_j = 0 .    (2.5.10)
Of course λ is a constant that can be set to any value we wish (even zero) without changing the total value of Equation (2.5.10). So we can choose it to be:

λ = (∂lnW/∂a_1) ,    (2.5.11)

which fixes λ to be whatever value that derivative has.

[5] To a non-mathematician it is more than interesting, it is quite amazing. Back when I first saw it I had a momentary flash of how amazing things can be done in mathematics if one opens one's mind.

[6] Or subtracted, for that matter!
Here's the clever part.[7] That choice of λ exactly cancels the first term of Equation (2.5.10), so now that equation reads:

Σ_{j=2}^{Ω} [(∂lnW/∂a_j) − λ] da_j = 0 ,    (2.5.12)

with a_1 no longer present in the equation. Now there are only Ω − 1 a_j's left. And since there are Ω − 1 independent variables, all the remaining a_j's can be taken as independent.

Now we can correctly argue that each of the remaining da_j's is independent, and the only way that Equation (2.5.12) can be true is if each of the coefficients of the da_j's is itself identically zero. Thus we conclude that

(∂lnW/∂a_j) = λ ,   j = 2, ..., Ω ,    (2.5.13)

so that each of the derivatives is not only constant, they are all the same constant, λ!

Further, we notice that Equation (2.5.11) fits the scheme of Equation (2.5.13) as well, so we can regard that equation as holding for any j from 1 to Ω.
All that remains in order to solve our problem of finding what set of a_j's gives the largest value to W is to calculate ∂lnW/∂a_j. To do this we look at the expression for ln W:

ln W = ln 𝒜! − Σ_{j=1}^{Ω} ln a_j! ,    ((2.5.3))

and see that all we have to do is differentiate it with respect to a_j and evaluate the resulting ∂lnW/∂a_j. For j = 1 that will give us λ, and then the values of all the other a_j's.

Doing all this is the subject of the next section.
2.5.3 The Final Details
We already know that the values of the a_j that maximize W are those which satisfy

(∂lnW/∂a_j) = λ ,   j = 1, ..., Ω ,    ((2.5.13))

where W is given by:

ln W = ln 𝒜! − Σ_{j=1}^{Ω} ln a_j! .    ((2.5.3))
We see that we will have to differentiate a factorial. There is no simple way to do this. The factorial function we know is based upon integers and so is not a continuous function and simply cannot be differentiated.

However, there is a generalization of the factorial function known as the gamma function which is continuous. This function is discussed in the Appendix to this chapter.

The gamma function gives rise to a well-known approximation to the factorial function called Stirling's Approximation.[8]

[7] Lagrange was a clever fellow!

[8] Stirling's Approximation is discussed in detail in Section 2.8.

In its simplest form Stirling's Approximation is:

ln n! ≈ n ln n − n .    (2.5.14)
Substituting that transforms Equation (2.5.3) to:

ln W = ln 𝒜! − Σ_{j=1}^{Ω} [a_j ln a_j − a_j] .    (2.5.15)
Differentiating gives:

(∂lnW/∂a_j) = −[ln a_j + 1 − 1] = −ln a_j .    (2.5.16)

Thus

ln a*_j = −λ ,    (2.5.17)

or

a*_j = e^(−λ) ,   j = 1, ..., Ω ,    (2.5.18)

where an asterisk is used on a_j to indicate that this is the value of a_j that maximizes W and not just any old a_j.
Since the a*_j must sum to 𝒜, it is clear that[9]

a*_j = 𝒜/Ω .    (2.5.19)
We can now calculate W_max, given by:

ln W_max = ln 𝒜! − Σ_{j=1}^{Ω} ln a*_j! ,    (2.5.20)

which is:

ln W_max = ln 𝒜! − Σ_{j=1}^{Ω} ln (𝒜/Ω)! .    (2.5.21)

There are Ω equal terms on the right, so this can be written as:

ln W_max = 𝒜 ln 𝒜 − 𝒜 − Ω [(𝒜/Ω) ln(𝒜/Ω) − 𝒜/Ω] ,    (2.5.22)

which can easily be shown to reduce to:

ln W_max = 𝒜 ln Ω ,    (2.5.23)

or, what is the same thing:

W_max = Ω^𝒜 .    (2.5.24)
Those of you with good memories will recall that the total number of ways of arranging our 𝒜 ensemble members in microstates is precisely Ω^𝒜.

What can this mean? Can the most probable set of a_j's account for all the different ways of arranging the ensemble members?

[9] Actually, one could have guessed this result once it was realized that all the a_j's had to have the same value. But the Lagrange multiplier technique is so important in statistical thermodynamics that it was thought appropriate to go through the entire procedure.
Almost. The answer is that the most probable set of a_j's is so overwhelmingly probable that the other arrangements together simply don't add up to very much. Of course our W_max isn't quite right; that error comes from using Stirling's approximation, but it is very close to being right. The right answer would be imperceptibly smaller than the value we've calculated. All the other arrangements only contribute a negligible amount to W_max.

The reason for this is known. The multinomial distribution is very sharply peaked for the huge numbers we are dealing with here. It is, to all practical purposes, a very thin spike differing from essentially zero only at the most probable value. Indeed, we will make use of this fact later to show what we'll call the equivalence of ensembles, but we'll defer discussion of that until then.
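This dominance is easy to see numerically. The sketch below (with a deliberately modest, hypothetical 𝒜 = 10^6 and Ω = 10) computes the exact ln W_max through the ln-gamma function and compares it with 𝒜 ln Ω; the two agree to a few parts in 10^5:

    from math import lgamma, log

    def lnW(a):
        # ln of the multinomial coefficient; lgamma(n + 1) = ln n! exactly.
        return lgamma(sum(a) + 1) - sum(lgamma(aj + 1) for aj in a)

    A, Omega = 10**6, 10
    uniform = [A // Omega] * Omega   # the most probable set, a_j* = A/Omega

    print(A * log(Omega))   # ln of the total number of arrangements
    print(lnW(uniform))     # exact ln W_max: smaller only by terms of order ln A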
2.6 Thermodynamics
In order to compute properties in the microcanonical ensemble, we need to know P_j, the probability that a randomly chosen microcanonical system is in the quantum state j. For this we need Equation (1.8.1).

Since all the a_j are equal, the probability P_j that a random system will be in microstate j must be 1/Ω, because there are Ω of the a_j's.

So

P_j = 1/Ω .    (2.6.1)
Since a property of the system depends on P_j as well as on the property X_j itself, as shown in Equation (1.8.1), it turns out that in a microcanonical ensemble the degeneracy of the energy E determines the properties of the system![10] Of course the independent variables in our microcanonical systems are N, V, and E. Thus Ω here is a function of N, V, and E, and is more properly written as Ω(N, V, E).

This just makes it clearer that Ω is directly related to the thermodynamic properties of the system.
In thermodynamics each thermodynamic quantity has a set of natural independent variables. The thermodynamic quantity goes to a maximum or, more usually, a minimum as a function of those variables. And there is only one such property for each set of independent variables.

As an example, think of the Gibbs free energy as a function of N, p, and T, or the entropy as a function of E, V, and N, which are exactly the independent variables here.

So we suspect that Ω(N, V, E) is somehow connected to the entropy.

Entropy has certain properties, the two most important of which are these: First, it increases as an isolated system goes to equilibrium. Second, the entropy is extensive. That is, its value is directly proportional to the extent of the system.

To be definite let us do a thought experiment. Take a system consisting of an ideal gas at constant N, V, and E. Further, to make sure we start off in a non-equilibrium state, let us mentally confine the molecules of the gas to the lower half of the volume of the system.

We then replicate this system to make our ensemble. Note that the occupation numbers a_j are no longer equal. Many of those a_j's pertain to systems with molecules in the top half of their volumes, and those are now zero. So instead of having a situation where all the a_j's are equal, we have many of them zero.

[10] For folks like me who grew up thinking that the degeneracy of a given energy level was just an annoying complication of no particular significance, this revelation came as a bit of a shock!
We've already shown that this set will not maximize W, because we know what values the a_j's must have to do that.

If we let our ensemble of systems evolve with time, we know that systems will move from microstate to microstate. While the total energy of each system will remain constant, the systems will redistribute themselves over the states until they reach the most probable distribution.[11]

In other words, W increases as the system goes to equilibrium. In fact, once at equilibrium we expect almost never to see anything other than the most probable distribution.[12]
The result of this thought experiment[13] is the realization that W in a non-equilibrium ensemble will increase to a maximum as the system goes to equilibrium. This behavior is just what we'd expect from the entropy.

Since there is only one property indicating the direction of equilibrium for a given set of independent variables, we conclude that W, and hence Ω, must be connected to the entropy.

Let us now look at the extensive aspect of the entropy. Put another way, being extensive means that if we take a system and duplicate it and then put the two duplicates together to form a single new system, the entropy of the new system is exactly the sum of the entropies of the two original systems.[14]
Let us look at two systems, A and B. They have entropies S_A and S_B and a total entropy of

S = S_A + S_B .    (2.6.2)

The degeneracies of the systems are Ω_A and Ω_B. The total degeneracy is

Ω = Ω_A Ω_B .    (2.6.3)

Why is this true? Because if system A is in microstate j, that doesn't affect the microstate of system B at all. So for each microstate of A there are Ω_B microstates of B. Hence Equation (2.6.3).
The Ω's don't behave like entropies at all. Entropies add; Ω's multiply. But logarithms of Ω's add!

We can try to see if

S(N, V, E) = k ln Ω(N, V, E)

works. It certainly works for the maximum property, since S is a maximum exactly where Ω or W is a maximum. And we've just seen that ln Ω has the proper additivity property.

But we quickly realize that there are a couple of minor problems. We first need to establish a zero point for the entropy. We know that S = 0 occurs when there is only one allowable state for the entire system. And sure enough, that means that Ω for such a system must be 1, and ln 1 = 0. So we are all right there.
[11] Let me stress that this is NOT because the molecules know which distribution is the most probable. Indeed, molecules are incredibly stupid. They end up in the most probable distribution because in fact most of the possible arrangements of the occupation numbers of the microstates are in the most probable distribution!

[12] Be careful here. We will discuss this more fully in a later chapter. Suffice it to say that if the number of particles in the system is small, the chance of finding the system in a state other than the most probable increases greatly. And even if the number is very large, there is a very, very small, but non-zero, chance that such a state will occur.

[13] Aren't thought experiments neat? Easy to set up and easy to clean up afterwards. The doing of them is still sometimes hard, though.

[14] Entropy is not the only extensive thermodynamic function. All the direction-of-equilibrium functions are extensive.
And there is a problem with the scale of the entropy. We measure entropies today in Joules per Kelvin. Back not too long ago we measured entropies in calories per Kelvin. All that changed when we changed units was the scale. To set the scale of the entropy we use an appropriate factor k, which also takes care of the units of entropy, since logarithms don't have units.

So, following Boltzmann, who first discovered this relationship, we assume the following rather arbitrary, but properly behaved, definition of entropy:

S(N, V, E) = k ln Ω(N, V, E) ,    (2.6.4)

where k is a constant, known appropriately today as Boltzmann's constant.[15]

[15] And it is carved onto his tombstone, possibly the only equation in human history to be so remembered.
A quick digression. We will study several more ensembles in succeeding chapters. In each of these it will be shown that the entropy is given by the formula:

S = −k Σ_i P_i ln P_i ,    (2.6.5)

where the sum is taken over all allowed states and P_i is the probability that a randomly chosen system in the ensemble will be found in state i. Here P_i = 1/Ω. Using this in Equation (2.6.5) gives:

S = −k Σ_{i=1}^{Ω} P_i ln P_i = −k Σ_{i=1}^{Ω} (1/Ω) ln(1/Ω) = −k ln(1/Ω) = k ln Ω ,    (2.6.6)

which shows that this formula applies to the microcanonical ensemble as well.
When a thermodynamic quantity is expressed in terms of its natural variables, it not only points in the direction of equilibrium, but it has another property as well. If one has an explicit formula for the property in terms of its natural variables, then all other thermodynamic properties can be obtained from that formula.

In classical thermodynamics we know that if we have a formula for the entropy in terms of N, V, and E, we can calculate all other thermodynamic functions from it. We can see how to do that from the total differential:

dS = (∂S/∂E) dE + (∂S/∂V) dV + (∂S/∂N) dN ,    (2.6.7)
which, if we evaluate the partial derivatives, gives us:

dS = (1/T) dE + (p/T) dV − (μ/T) dN ,    (2.6.8)

showing that the temperature, pressure, and chemical potential are also known.

Similarly, if we had an equation for Ω(N, V, E) in terms of N, V, and E as independent variables, we could differentiate its logarithm to get:

d ln Ω = (1/kT) dE + (p/kT) dV − (μ/kT) dN .    (2.6.9)

Of course we have not (yet) developed such a formula. What we have here is only a formalism, a group of mathematical equations that tell us how we could compute things of interest if we actually had a concrete formula for Ω in terms of N, V, and E.
We can complete our formal presentation with formal formulas for computing the temperature, pressure, and chemical potential. From Equation (2.6.9) we see that:

1/kT = (∂lnΩ/∂E)_{V,N} ,    (2.6.10)

p/kT = (∂lnΩ/∂V)_{E,N} ,    (2.6.11)

−μ/kT = (∂lnΩ/∂N)_{E,V} .    (2.6.12)
To get actual, useful formulas for T, p, and μ, one needs to have an actual formula for Ω. To do that we have to examine specific systems.

To be more exact, we have to examine a model of whatever system we are interested in. Why a model? Because any real system is way too complicated to represent mathematically in full detail. The physicist Stephen Wolfram has said that to compute the next state of the universe requires a computer as large and as complex as the universe; no smaller computer can do that.[16] It is easy to see that the same thing is true for a system as simple as a box of real gas. Every molecule affects the entire system.

To overcome this problem, we never attempt to deal with every single detail of the behavior of a system. We can't. Instead we make a model of the system.

We pick what we believe are the important phenomena in a system and ignore the rest. The phenomena we pick constitute the model.

For example, the model for an ideal gas is not too complex. Indeed, we will look at that in a later chapter.

In general, however, models for the microcanonical ensemble are difficult to generate, primarily because the restriction to a constant energy is very limiting. Thus it is not much used for modern theoretical work. However, it is used for computer simulations where one can follow the dynamics of a system as it moves from state to state keeping the energy, volume, and number of particles constant.
2.7 The Most Simple Spin System
Here is a very simple model of a very simple spin system. Let us assume that we have M particles that have two possible spins. We denote these as up and down. At any point we have N up spins and M − N down spins. There is no energy of any kind; the spins are all independent and there is no external field they can interact with.

In this very simple model, M plays the role of the volume V and N the role of the number N. The energy E is zero. We want to find Ω(0, M, N) for this system.

This is simple.[17] There are M! ways of arranging the N up-spins and M − N down-spins. But the order of the up-spins doesn't matter, so we've overcounted by the N! arrangements of the up-spins. Similarly the arrangement of the down-spins also doesn't matter, so we've overcounted by the (M − N)! arrangements there too.

[16] This is easy to see, since if some parts of the universe were not needed to compute the next state of the entire universe, then those parts could have no interaction whatsoever with the rest of the universe. In that case those parts are not there at all!

[17] I said it was a simple model of a simple system!
The result is then:

Ω(0, M, N) = (M choose N) = M! / [N! (M − N)!] .    (2.7.1)
We can significantly simplify this by applying Stirling's approximation:

ln Ω = M ln M − M − N ln N + N − (M − N) ln(M − N) + (M − N) .    (2.7.2)

Then, with η = N/M:

(1/M) ln Ω = ln M − η ln N − (1 − η) ln(M − N)
           = −η ln(N/M) − (1 − η) ln[(M − N)/M]
           = −η ln η − (1 − η) ln(1 − η) ,    (2.7.3)

which, after a bit more manipulation, becomes:

S(0, M, N)/k = ln Ω(0, M, N) = −M [η ln η + (1 − η) ln(1 − η)] ,    (2.7.4)

and, since η is the mole fraction of up spins in a two-component system, Equation (2.7.4) is exactly the thermodynamic entropy of mixing of two non-interacting components, up spins and down spins.
Figure 2.1: Spin entropy per spin as a function of the fraction of up spins
Computation of the chemical potential μ/kT and the pressure p/kT is left as an exercise for the reader.
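A quick numerical check of Equation (2.7.4) is easy. The sketch below (M chosen arbitrarily) compares the Stirling form with the exact logarithm of the binomial coefficient in Equation (2.7.1):

    from math import lgamma, log

    M = 10**6
    for N in (100_000, 250_000, 500_000):
        eta = N / M
        exact = lgamma(M + 1) - lgamma(N + 1) - lgamma(M - N + 1)    # ln Omega
        stirling = -M * (eta * log(eta) + (1 - eta) * log(1 - eta))  # Eq. (2.7.4)
        print(eta, exact / M, stirling / M)   # per-spin entropies agree closely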
2.8 Appendix: The Gamma Function and Stirling's Approximation

Stirling's approximation is a useful equation that allows writing factorials in terms of continuous functions. The approximation is:

ln n! ≈ n ln n − n .    (2.8.1)

These are the first few terms of an asymptotic expansion for ln n!. An asymptotic expansion is a series expansion that does not converge to the actual value of the function, but that has the property that the ratio of its value and the correct value goes to one as the argument of the function increases in size.
The factorial n! is defined only for the non-negative integers; the factorial of zero is defined as 1. Factorials are a special case of the gamma function Γ(z), where z is a complex variable whose real part is greater than 0. The gamma function is given by:

Γ(z) = ∫_0^∞ t^(z−1) e^(−t) dt ,   real part of z > 0 .    (2.8.2)
Figure 2.2: The Gamma Function
It is simple to show that the gamma function obeys the relationship:

Γ(z + 1) = z Γ(z) .    (2.8.3)

To do this we need only integrate

Γ(z + 1) = ∫_0^∞ t^z e^(−t) dt    (2.8.4)

by parts. Using the formula ∫ u dv = uv − ∫ v du we can choose u = t^z and dv = e^(−t) dt. This gives us v = −e^(−t) and du = z t^(z−1) dt. So

Γ(z + 1) = [−t^z e^(−t)]_0^∞ + z ∫_0^∞ t^(z−1) e^(−t) dt .    (2.8.5)

The first term is zero at both limits and the second gives us z Γ(z).
We can use Equation (2.8.3) to show that the gamma function acts like the factorial for integer z by noting that, for example:

Γ(4) = 3 Γ(3) = 3 · 2 Γ(2) = 3 · 2 · 1 Γ(1) ,    (2.8.6)

and knowing that Γ(1) = 1 we have Γ(4) = 3!. Showing that Γ(1) = 1 is easy enough: we need only set z = 1 in Equation (2.8.2) and do the resulting simple integral. By induction one can then show that

Γ(z + 1) = z! ,   integer z ≥ 1 .    (2.8.7)

It is useful to know that

Γ(1/2) = √π ,    (2.8.8)

which can be derived by noting that

Γ(1/2) = ∫_0^∞ t^(−1/2) e^(−t) dt = 2 ∫_0^∞ e^(−u²) du = √π ,    (2.8.9)

where we've used the substitution t = u². The last integral is a standard one equal to √π/2, which leads immediately to Equation (2.8.8).

Indeed, by the use of Equation (2.8.3) the useful general relation

Γ(n + 1/2) = [1 · 3 · 5 ⋯ (2n − 1) / 2^n] √π ,   n = 1, 2, ... ,    (2.8.10)

is easily discovered.
The gamma function can be differentiated. The result is called the psi or digamma function:

ψ(z) = d ln Γ(z) / dz = Γ′(z) / Γ(z) ,    (2.8.11)

where Γ′(z) stands for dΓ(z)/dz.

It can be shown that[18]

ψ(1) = −γ ,    (2.8.12)

where γ is Euler's constant, 0.5772156649..., a number as famous among mathematicians as π and e.

For integer n:

ψ(n) = −γ + Σ_{k=1}^{n−1} 1/k ,    (2.8.13)

which gives us the derivative of a factorial.[19]
To make dealing with factorials easier, the British mathematician Stirling derived the asymptotic series expansion whose leading terms are given in Equation (2.8.1). It is fairly easy to derive.[20]

We start with Equation (2.8.4) and note that the integrand is very sharply peaked. This is because t^z rises rapidly while e^(−t) falls rapidly. Indeed, the integrand is zero at both 0 and ∞. We now write t^z as e^(z ln t) and so:

Γ(z + 1) = ∫_0^∞ e^(z ln t − t) dt .    (2.8.14)
[18] Don't you love it when the author says "it can be shown"...

[19] Equation (2.8.13) is easy enough to evaluate for n = 7, but it can be suspected that you'd not care to evaluate it, term by term, for n = 6 × 10^23.

[20] The derivation given here is mainly from D. A. McQuarrie, Mathematical Methods for Scientists and Engineers, University Science Books, 2003, page 119.
A little numerical exploration shows[21] that this integrand is peaked at t = z, so we expand the exponent z ln t − t around t = z in a Taylor series. For this we will need the derivatives:

f(t) = z ln t − t ,
f′(t) = z/t − 1 ,    (2.8.15)
f″(t) = −z/t² ,
f‴(t) = 2z/t³ ,

so the expansion is:

z ln t − t ≈ z ln z − z − (t − z)²/2z + ⋯ ,    (2.8.16)

where terms in (t − z)³ and higher have been neglected.
We now have, for Equation (2.8.14):

Γ(z + 1) ≈ ∫_0^∞ e^(z ln z − z − (t−z)²/2z) dt    (2.8.17)

≈ e^(z ln z − z) ∫_0^∞ e^(−(t−z)²/2z) dt .    (2.8.18)

If we now let (t − z)²/2z = u², we get, after some manipulation,

Γ(z + 1) ≈ (2z)^(1/2) z^z e^(−z) ∫_{−(z/2)^(1/2)}^{∞} e^(−u²) du .    (2.8.19)
Because z is assumed to be large, and because the integrand falls off rapidly around its peak at u = 0, we can extend the lower limit to −∞ without introducing too much more error. We do this because we can then do the resulting integral,[22] which is:

∫_{−∞}^{∞} e^(−u²) du = 2 ∫_0^∞ e^(−u²) du = π^(1/2) .    (2.8.20)

We then finally get our asymptotic approximation:

Γ(z + 1) = z! ≈ (2πz)^(1/2) z^z e^(−z) .    (2.8.21)
The error in Equation (2.8.21) is too great to allow its use directly.[23] However, its logarithm is very useful:

ln z! ≈ z ln z − z + (1/2) ln(2πz) ,    (2.8.22)

where normally only the first two terms are used.[24]

The more complete Stirling's Approximation is:

Γ(z + 1) = z! ≈ (2πz)^(1/2) z^z e^(−z) [1 + 1/(12z) + 1/(288z²) + ⋯] .    (2.8.23)
[21] You can try to show this analytically, but it isn't easy.

[22] Necessity is often the cause of much inspiration.

[23] Check it yourself: ten factorial is 3628800, yet Equation (2.8.21) gives 3598695.6, an error of about 30,000...

[24] Again, think of z as 10^23. In that case the last term is only about 100, which is quite negligible.
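The accuracy claims above are easy to reproduce. This Python sketch redoes the 10! test of the footnote, comparing the bare two-term form, Equation (2.8.21), and the corrected series of Equation (2.8.23):

    from math import factorial, log, pi, exp

    n = 10
    exact = factorial(n)                                     # 3628800
    two_term = exp(n * log(n) - n)                           # ln n! ~ n ln n - n
    with_sqrt = exp(n * log(n) - n + 0.5 * log(2 * pi * n))  # Eq. (2.8.21)
    corrected = with_sqrt * (1 + 1/(12*n) + 1/(288*n**2))    # Eq. (2.8.23)

    print(exact, two_term, with_sqrt, corrected)
    # 3628800  ~453999  ~3598695.6  ~3628810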
3. The Canonical Ensemble
We now turn our attention to the canonical ensemble. This was first described by Gibbs[1] and turns out, in a practical way, to be much more useful than the microcanonical ensemble.

The reason for this is simple. The microcanonical ensemble assumes that the energy is fixed. That implies that the system can do no mechanical work (and thus has rigid walls) and can neither take up nor give up heat (and thus also has adiabatic walls). Such a system is isolated from its surroundings. This is not a common situation for real systems.[2]

To partially remove the isolation, we take a single-component, single-phase system with walls that are rigid and impermeable (ensuring constant volume and number of particles) but which are also heat conducting. This allows the energy of the system to vary. What we fix instead[3] is the temperature. We do this by putting the system into contact with a constant temperature bath at a temperature T, which then becomes the system temperature.

The energy of this system is not constant; indeed, it can and does fluctuate. What is constant is the temperature of the system, held that way by the external heat bath.[4]
3.1 The Ensemble
We again follow a Gibbsian approach. We take our system defined above and replicate it 𝒜 times, once again taking 𝒜 to be a very large number. In doing this we create another ensemble, this one called canonical. To ensure that these systems all have the same value of T we insert them into the same constant temperature bath at temperature T.[5]

Figure 3.1: Schematic Canonical Ensemble. Black boxes represent systems of fixed N, V, and T; the grey walls represent rigid adiabatic walls; the white space represents a heat bath at temperature T.
At equilibrium, each of these macroscopic systems will be in a particular macroscopic energy state E_j at any instant of time. At another instant it may well be in another energy state, but that does not worry us because, once again, we will freeze the ensemble of systems at some instant.[6] Once we've frozen the systems we will want to compute the number of ensemble members in any particular energy state.

[1] See J. Willard Gibbs, Elementary Principles in Statistical Mechanics, Dover, 1960.

[2] Indeed, the most common systems have fixed temperature and pressure. We will meet an appropriate ensemble for that situation later.

[3] We need to fix three independent variables for a single-component, single-phase system.

[4] Extensive variables such as volume, energy, and number of particles can be set without reference to anything on the outside of the system. Intensive variables such as the temperature need external control. We will see other examples of this soon enough.

[5] Considering the huge size of the ensemble, this will have to be a really huge constant temperature bath. Luckily, we only have to imagine it, not build it.
We will denote the number of ensemble members in energy state E_j by a_j, noting that many different E_j's may have the same value of energy[7] while many others will have totally different energies.

One constraint on our calculations (just as it was for the microcanonical ensemble) is

Σ_j a_j = 𝒜 .    (3.1.1)
What we want to do is what we did with the microcanonical ensemble. First, we want to assume that all microstates are equally likely, subject to the constraint on the number of ensemble members (Equation (3.1.1)) and to the constraint that the temperature is constant. We have not formulated that constraint yet, but we will.

Second, we want to find what distribution of the systems among the various energy states is the most likely. Again, that means finding which set of a_j's will most probably occur.

And once we've done that we want to find the probability that a given ensemble member is in state j. That's actually the easiest part, because if the set of a_j's that has the maximum chance of occurring is

a*_1, a*_2, ..., a*_j, ... ,

then the probability of finding a random system in state j is just

P_j = a*_j / 𝒜 .    (3.1.2)
Unlike the microcanonical ensemble, we also have a heat bath to consider. Probably the best way to deal with it is to consider it as a separate system. We can even think of it as a system with fixed N, V, and T having a large number of possible internal states, just like an ensemble member.

Then we can make use of a fantastically clever idea.[8] That idea is to take the entire ensemble plus the heat bath and wrap it in totally rigid, impermeable, adiabatic walls. (See Figure 3.1.)

When we do that, what we have is a new single system that is microcanonical!

We can now formulate a constraint that takes the constant temperature bath into account.[9] We do this by recognizing that the total energy of all the systems plus the heat bath is constant because of the isolating walls around the entire ensemble and bath. Thus

Σ_j a_j E_j + E_bath = ℰ ,    (3.1.3)

where ℰ is the total energy of systems plus bath. The E_j's are the energies of the systems and E_bath is the energy of the constant temperature bath.

Introducing the heat bath keeps our systems at a constant temperature T and also gives us an energy constraint that we can use. It also introduces a complication. We not only have W_sys, the number of ways of arranging the canonical systems among their energy states, but we also have W_bath, the number of ways of arranging the heat bath among its internal states.

As a result the total number of ways of arranging systems plus bath is:

W_total = W_sys W_bath ,    (3.1.4)

and it is W_total that we need to maximize.

The two W's are multiplied because they are independent. This is so because W_sys depends only on the a_j's and W_bath doesn't depend on them at all.

We don't want to bother to know too much about the bath. All it needs to do is maintain a constant temperature. It is enough for us to know that W_bath must depend on E_bath.

[6] Remember, there is no time variable in a Gibbsian ensemble.

[7] That is, the energy levels are degenerate.

[8] The author wishes that he knew who invented this idea, because that person is the unsung hero of statistical thermodynamics.

[9] This is not trivial. Another approach sometimes taken is to omit the heat bath, allowing the other 𝒜 − 1 systems in the ensemble to form a heat bath for whatever system one is looking at. This works. But the problem is that the temperature then gets introduced in a decidedly circular manner, since the temperature of the bath becomes the temperature of the system, which in turn can be thought of as part of the bath, which sets the temperature of the bath... Here, as we will see, the temperature is determined by the heat bath, as it should be!
3.2 The Most Probable Occupation Numbers
So what we have is

W_total = W_sys W_bath = (𝒜! / Π_j a_j!) W_bath(E_bath) .    (3.2.1)
We want to find the set of a_j's that maximizes this subject to the conditions:

Σ_j a_j = 𝒜   and   Σ_j a_j E_j + E_bath = ℰ .    (3.2.2)
The technique is exactly as for the microcanonical ensemble. First we take logarithms of Equation (3.2.1) and then apply Stirling's approximation:

ln W_total = ln W_bath + ln 𝒜! − Σ_j (a_j ln a_j − a_j) .    (3.2.3)
Then we write the total derivative and set it to zero to find the maximum:[10]

d ln W_total = (∂lnW_bath/∂E_bath) dE_bath − Σ_j ln a_j da_j = 0 .    (3.2.4)
We have two additional equations obtained by differentiating the two auxiliary conditions in Equation (3.2.2):

Σ_j da_j = 0   and   Σ_j E_j da_j + dE_bath = 0 .    (3.2.5)
Because of the auxiliary conditions, two of the a_j's are not independent variables.

We can remove the two extra variables by multiplying the two constraint equations by α and β, which are so far of unknown value, to give:

α Σ_j da_j = 0   and   β ( Σ_j E_j da_j + dE_bath ) = 0 .    (3.2.6)
These are then subtracted[11] from Equation (3.2.4) to get:

d ln W_total = [(∂lnW_bath/∂E_bath) − β] dE_bath − Σ_j (ln a_j + α + βE_j) da_j = 0 .    (3.2.7)
Now we can get rid of the bath conditions by setting

β = (∂lnW_bath/∂E_bath) ,    (3.2.8)

and we can also get rid of one a, say a_1, by picking α so that:

ln a_1 + α + βE_1 = 0 .    (3.2.9)

[10] We know it will be a maximum because the smallest value W can have is 1.

[11] They are subtracted purely for convenience, since β has a standard definition and we'd like to agree with it...
What is left is

Σ_{j=1}^{Ω} (ln a_j + α + βE_j) da_j = 0 ,    (3.2.10)

in which the j = 1 term is zero by our choice of α. But now all of the remaining a_j's are independent of each other. So once again the only way that Equation (3.2.10) can be zero for all possible choices of the remaining da_j's is if their coefficients in parentheses above are identically zero. Thus we have:

ln a*_j + α + βE_j = 0 ,    (3.2.11)

for all j, including j = 1, as can be seen by checking Equation (3.2.9). The asterisk, as usual, indicates that this value for a_j is the one that maximizes W_total.

Note that these occupation numbers are not constant but depend explicitly upon the energies E_j of the system states.
We can simplify things a good bit. For instance:

a*_j = e^(−α − βE_j) ,    (3.2.12)

which can be summed:

Σ_j a*_j = 𝒜 = e^(−α) Σ_j e^(−βE_j) .    (3.2.13)

So α can be found from

e^(−α) = 𝒜 / Σ_j e^(−βE_j) ,    (3.2.14)

once β is known.

So, assuming that we will determine β, which we will,

a*_j = 𝒜 e^(−βE_j) / Σ_j e^(−βE_j) .    (3.2.15)
Two things: First, the fraction a*_j/𝒜 is simply the fraction of the ensemble members in state j, and so it is the probability of finding an ensemble member in that state:

P_j = a*_j / 𝒜 .    ((3.1.2))

And using Equation (3.2.15) for a*_j:

P_j = e^(−βE_j) / Σ_j e^(−βE_j) .    (3.2.16)
The sum in the denominator of Equations (3.2.15) and (3.2.16) is like Ω in the treatment of the microcanonical ensemble. It turns up so often that it is given a special symbol, Q, and a special name, the canonical partition function. It is written:

Q(N, V, β) = Σ_j e^(−βE_j) ,    (3.2.17)

because it is a function of N, V, and β.
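For concreteness, here is a minimal Python sketch of Equations (3.2.16) and (3.2.17) for an invented three-level system; the energies and β are arbitrary illustrative numbers (β carries units of 1/energy, so βE_j is dimensionless):

    import math

    E = [0.0, 1.0, 2.0]   # hypothetical system energies
    beta = 1.0            # beta = 1/kT, in matching units

    Q = sum(math.exp(-beta * Ej) for Ej in E)      # Equation (3.2.17)
    P = [math.exp(-beta * Ej) / Q for Ej in E]     # Equation (3.2.16)

    print(Q)              # ~1.5032
    print(P, sum(P))      # probabilities fall off with E_j and sum to 1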
3.3 The Thermodynamics of the Canonical Ensemble
The canonical partition function Q(N, V, β) is directly connected to a thermodynamic quantity, just as Ω(N, V, E) for the microcanonical ensemble was connected to the entropy.

Indeed, the reader can probably guess which thermodynamic function that is.

Proving it is a bit more difficult. First we will show that we can compute the energy, pressure, and chemical potential from Q(N, V, β). We will then form the total derivative of ln Q(N, V, β) and compare that to the appropriate macroscopic thermodynamic function. From that we hope to identify not only β, but Q as well.
3.3.1 Energy
We calculate the average energy of the systems in the canonical ensemble. The probability that a system is in energy state E_j is P_j. Thus:

⟨E⟩ = Σ_j P_j E_j .    (3.3.1)

Since P_j is given by Equation (3.2.16), we get:

⟨E⟩ = (1/Q) Σ_j E_j e^(−βE_j) .    (3.3.2)
Now consider (∂lnQ/∂β)_{N,V}:

(∂lnQ/∂β)_{N,V} = (∂/∂β) ln Σ_j e^(−βE_j) = − Σ_j E_j e^(−βE_j) / Σ_j e^(−βE_j) ,

so

⟨E⟩ = −(∂lnQ(N, V, β)/∂β)_{N,V} .    (3.3.3)

We now identify the expected value of the energy ⟨E⟩ with the internal energy E.
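Equation (3.3.3) can be checked numerically against the direct average, Equation (3.3.2). The sketch below reuses the hypothetical three-level system of the last section and takes the β derivative by a centered finite difference:

    import math

    E = [0.0, 1.0, 2.0]                      # same invented levels as before

    def lnQ(beta):
        return math.log(sum(math.exp(-beta * Ej) for Ej in E))

    beta, h = 1.0, 1e-6
    Q = math.exp(lnQ(beta))
    direct = sum(Ej * math.exp(-beta * Ej) for Ej in E) / Q   # Eq. (3.3.2)
    from_lnQ = -(lnQ(beta + h) - lnQ(beta - h)) / (2 * h)     # Eq. (3.3.3)

    print(direct, from_lnQ)   # both ~0.4248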
3.3.2 Pressure
Imagine that in a canonical ensemble the volumes of all the systems are changed reversibly and adiabatically[12] by the same small amount. Then, from thermodynamics:

dE = dq − p dV + μ dN ,    (3.3.4)

where q is the heat, p is the pressure, and μ is the chemical potential. With impenetrable walls there can be no change in N, so Equation (3.3.4) reduces to

dE = dq − p dV .    (3.3.5)

When a system is in energy state E_j and the volume is changed adiabatically, dq is zero and Equation (3.3.5) can be written:

dE_j = −p_j dV ,    (3.3.6)
where p_j is the pressure of the system when it is in the jth energy state.[13] Thus we have:

p_j = −(∂E_j/∂V)_{N,T} .    (3.3.7)

[12] The meaning of this is that the energy levels change with volume, as they must. But, more importantly, they do so without any system moving from one energy level to another because of this change.
The average pressure is then:

⟨p⟩ = (1/Q(N, V, β)) Σ_j [−(∂E_j/∂V)] e^(−βE_j) .    (3.3.8)

Note that:

(∂lnQ(N, V, β)/∂V)_{N,β} = (1/Q(N, V, β)) Σ_j (∂/∂V) e^(−βE_j)
                        = (1/Q) Σ_j [−β(∂E_j/∂V)] e^(−βE_j)
                        = (β/Q(N, V, β)) Σ_j [−(∂E_j/∂V)] e^(−βE_j) ,    (3.3.9)

so that

⟨p⟩ = (1/β) (∂lnQ(N, V, β)/∂V)_{N,β} .    (3.3.10)
3.3.3 Chemical Potential
If we consider the possibility of changing the number of particles in a system we can again use Equation (3.3.4):

dE = dq − p dV + μ dN .    ((3.3.4))

We now hold the volume constant but allow N to change slightly. When N changes slightly we can expect the internal energy E to change as well, since the new particles will have some energy of their own. The chemical potential will then be:

μ = (∂E/∂N) .    (3.3.11)

We allow all the systems in the ensemble to change the number of particles they contain by dN so that all the systems remain the same. Since the number of particles in a system has then changed, the number of systems in a given energy level E_j will change. The chemical potential associated with level E_j is given by:

μ_j = (∂E_j/∂N) ,    (3.3.12)

where dN is meant in the sense of a small change in the number of particles when there are a huge number present and μ_j is the chemical potential associated with the change in number of particles in level E_j.
The average value of the chemical potential is then given by:

⟨μ⟩ = (1/Q(N, V, β)) Σ_j (∂E_j/∂N) e^(−βE_j) .    (3.3.13)

Except for a factor of −β, this is exactly what we get if we differentiate ln Q with respect to N, so

⟨μ⟩ = −(1/β) (∂lnQ(N, V, β)/∂N)_{V,β} ,    (3.3.14)

where the details are left to the reader.

[13] There is no reason to suppose that the pressure will remain constant if the energy levels change their numeric values, and in fact it does not.
3.3.4 The Canonical Partition Function and the Identification of Beta

The canonical partition function Q(N, V, β) is an interesting function. Its derivatives with respect to β, V, and N are related to E, p, and μ respectively. In fact the total derivative of its logarithm is:

d lnQ(N, V, β) = (∂lnQ/∂β)_{N,V} dβ + (∂lnQ/∂V)_{N,β} dV + (∂lnQ/∂N)_{β,V} dN ,    (3.3.15)

or, using the results above,

d lnQ(N, V, β) = −⟨E⟩ dβ + β⟨p⟩ dV − β⟨μ⟩ dN ,    (3.3.16)

where we expect that β is somehow a function of the temperature.

Is there a thermodynamic function with that total derivative? Of course the answer is yes. But it isn't a standard function. Those all have dimensions; Equation (3.3.16) is dimensionless.
We can get the desired function by starting with

dE = T dS − p dV + μ dN ,    (3.3.17)

and we make it dimensionless by dividing it by kT, where T is the absolute temperature and k is a constant with the units of energy divided by temperature and number of particles. This gives:

(1/kT) dE = (1/k) dS − (p/kT) dV + (μ/kT) dN ,    (3.3.18)

or

(1/k) dS = (1/kT) dE + (p/kT) dV − (μ/kT) dN .    (3.3.19)
To get this into a form similar to that of Equation (3.3.16) we need to do a transformation[14] by adding −d(E/kT) to both sides of Equation (3.3.19) and then transforming the right-hand side:

(1/k) dS − d(E/kT) = (1/kT) dE + (p/kT) dV − (μ/kT) dN − d(E/kT)
                   = (1/kT) dE + (p/kT) dV − (μ/kT) dN − E d(1/kT) − (1/kT) dE
                   = −E d(1/kT) + (p/kT) dV − (μ/kT) dN ,    (3.3.20)

and then the left-hand side by noting that A = E − TS, so that:

(1/k) dS − d(E/kT) = (1/k) dS − d((A + TS)/kT) = (1/k) dS − d(A/kT) − (1/k) dS
                   = −d(A/kT) = −E d(1/kT) + (p/kT) dV − (μ/kT) dN .    (3.3.21)

There is in fact a systematic way to derive dimensionless equations of this sort. It will be discussed in Section 8.5.

[14] Admittedly this one is unfamiliar, but we will use the same trick as is used to get H from E or G from H in classical thermodynamics. The trick even has a name: it is called a Legendre transformation.
So we now have:

−d(A/kT) = −E d(1/kT) + (p/kT) dV − (μ/kT) dN .    (3.3.22)

We compare this with

d lnQ(N, V, β) = −⟨E⟩ dβ + β⟨p⟩ dV − β⟨μ⟩ dN .    ((3.3.16))

First, we recall that a thermodynamic function can have only one total differential for any particular set of independent variables. Thus we expect that Equations (3.3.16) and (3.3.22) are the same. And we can make it so by identifying β as:

β = 1/kT ,    (3.3.23)

and lnQ(N, V, β) as:

lnQ(N, V, β) = −A/kT ,    (3.3.24)

or

A(N, V, T) = −kT lnQ(N, V, T) ,    (3.3.25)

where A is the Helmholtz free energy.

We have particular satisfaction in identifying β with 1/kT, since β is a property of the heat bath and the heat bath alone. This can be seen by a quick look at Equation (3.2.8).
Further, though we will not explore it here, imagine that one has two different systems: 1 and 2. These have parameters N_1, V_1, E_{1,j} and N_2, V_2, E_{2,j}. We make ensembles out of both, one with 𝒜_1 ensemble members, the other with 𝒜_2 ensemble members. We place all of these ensemble members into the same heat bath. Equation (3.2.3) now has an additional set of terms corresponding to system 2. But there is still only one heat bath term.

The result is obvious. The two ensembles come to equilibrium separately, but at the same temperature. And the only parameter common to both is β = 1/kT, just as we'd expect.
3.4 The Entropy
The entropy can be computed from A(N, V, T) for any particular system if one has an equation for A explicitly in terms of N, V, and T.

But here are six lines leading to another way of looking at the entropy:

P_j = e^(−βE_j) / Q .    (3.4.1)

ln P_j = −βE_j − lnQ .    (3.4.2)

P_j ln P_j = −βP_j E_j − P_j lnQ .    (3.4.3)

Σ_j P_j ln P_j = −β⟨E⟩ − lnQ Σ_j P_j .    (3.4.4)
Σ_j P_j ln P_j = −β⟨E⟩ − lnQ = −β(E − A) = −S/k ,    (3.4.5)

and finally

S = −k Σ_j P_j ln P_j = −k ⟨ln P⟩ ,    (3.4.6)

where the last step follows from the definition of a property average. Thus the entropy can be thought of as the average value of the logarithm of the probability of occurrence of a particular state. Indeed, we can identify (−k ln P_j) as the contribution of state j to the total entropy of the system.[15]

In words, the entropy of a canonical system depends directly on the probabilities that the system will be found in each of the various states j. Of course those probabilities in turn depend on the energies of the states; see Equation (3.4.1).

Since the probabilities are all less than one, the smaller the probability, the larger its logarithm is in absolute value. So large entropies depend on the average occupancy of each state being very small. One or a relative few highly probable states contribute very little to the entropy.

Our insight is that even opening up a very high energy state to occupation will increase the entropy of the system significantly.
One further thought. Equation (3.4.6) also applies to the microcanonical ensemble. To see this simply take Equation (2.6.1):

P_j = 1/Ω ,    ((2.6.1))

take the logarithm of both sides, multiply by P_j, and sum over j to get:

Σ_j P_j ln P_j = Σ_j P_j ln(1/Ω) = −ln Ω ,    (3.4.7)

from which Equation (3.4.6) follows trivially.
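Equation (3.4.6) is as easy to evaluate as Equation (1.8.1). A small sketch (terms with P_j = 0 are dropped, as they contribute nothing; see the footnote below):

    import math

    def entropy_over_k(P):
        # S/k = -sum_j P_j ln P_j, Equation (3.4.6); P_j = 0 terms drop out.
        return -sum(p * math.log(p) for p in P if p > 0)

    # Microcanonical check, Equation (3.4.7): P_j = 1/Omega gives S/k = ln Omega.
    Omega = 1000
    print(entropy_over_k([1 / Omega] * Omega), math.log(Omega))  # both ~6.9078

    # A single certain state gives S = 0.
    print(entropy_over_k([1.0, 0.0, 0.0]))   # 0.0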
3.5 Degeneracy
It is almost always the case that the energy levels available to a system are degenerate. Let's denote the degeneracy of level j as Ω(N, V, E_j). This can be as small as 1 or, much more often, a very large number.

In the microcanonical ensemble there is only one allowed energy E, so the degeneracy was written Ω(N, V, E), with no index j.

With this notation we can then write the canonical partition function Q(N, V, T) as

Q(N, V, T) = Σ_j Ω(N, V, E_j) e^(−βE_j) ,    (3.5.1)

where the sum is now over distinct energy levels instead of counting each quantum state separately.

This form suggests that the canonical partition function Q is some sort of transformation[16] of the partition function for the microcanonical ensemble. This is in fact true and will be explored in a later chapter.
[15] Note that if a given P_j is zero, that term doesn't count in the sum, since P_j ln P_j goes to zero as P_j goes to zero and that state does not enter into the sum in Equation (3.4.6).

[16] It looks like a Laplace transformation, or would if there were an integral instead of a sum.
3.6 A Simple Spin System in an External Field

In Section 2.7 we discussed the most simple spin system possible. Here we are going to add an external magnetic field that interacts with the spins. This adds an energy to the model. Otherwise the model remains the same.

Here's the model: Let us assume that we have particles that have two possible spins, which we shall denote as up and down. There are M spins and N of them are up and the remainder are down. Again M plays the role of a volume since it gives the extent of the system. We use a symmetric energy scheme in which the energy of an up spin is ε and that of a down spin is −ε.

Then for N up spins the total energy E is

E = Nε − (M − N)ε = (2N − M)ε .    (3.6.1)

Given M spins with N of them up, there are M!/[N!(M − N)!] ways of arranging them. Thus the canonical partition function for this system is:

Q(β, M) = Σ_{N=0}^{M} [M!/(N!(M − N)!)] e^(−β(2N−M)ε) ,    (3.6.2)
which looks difficult to evaluate. But it isn't. If we factor e^(βMε) out of the sum we get:

Q(β, M) = e^(βMε) Σ_{N=0}^{M} [M!/(N!(M − N)!)] e^(−2βNε) ,    (3.6.3)

which the experienced will recognize as an instance of the series expansion

(1 + x)^M = Σ_{N=0}^{M} [M!/(N!(M − N)!)] x^N ,    (3.6.4)

with x = e^(−2βε).

The rest is easy:

Q(β, M) = e^(βMε) [1 + e^(−2βε)]^M ,    (3.6.5)

Q(β, M) = [e^(βε) + e^(−βε)]^M , and    (3.6.6)

Q(β, M) = [2 cosh(βε)]^M .    (3.6.7)

The last equation comes from the definition of the hyperbolic cosine.[17]
The average energy of this system can be found from Equation (3.3.3). We note that lnQ(β, M) is given by:

lnQ(β, M) = M ln[2 cosh(βε)] ,    (3.6.8)

and differentiating with respect to β (and changing the sign) gives:

⟨E⟩ = −Mε sinh(βε)/cosh(βε) ,    (3.6.9)

⟨E⟩ = −Mε tanh(βε) .    (3.6.10)

A graph of the average energy per particle (in units of ε, versus T) is given in Figure 3.2.

[17] The hyperbolic functions are usually familiar, and in cases like this they simplify things enormously.
Figure 3.2: Average Energy per Particle vs. Temperature
Note that in the figure the units of energy are ⟨E⟩/Mε and the horizontal axis is drawn with k/ε set equal to 1.

This is a somewhat curious result and can use some explanation. At low temperatures all spins are in the −ε state and hence the average energy per spin is −1 in units of ε. At very high temperatures half the spins are in each of the two states, and thus the average energy per spin is 0. However, it takes a very high temperature to reach this point, perhaps a temperature of 20 or more in these units.
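The closed forms are easy to verify against the defining sums. The sketch below (with small, arbitrary values of M, ε, and β) checks Equation (3.6.7) against Equation (3.6.2), and Equation (3.6.10) against the direct average of Equation (3.3.1):

    import math

    M, eps, beta = 20, 1.0, 0.7    # arbitrary illustrative values

    # Direct sum, Equation (3.6.2), versus the closed form, Equation (3.6.7):
    Q_direct = sum(math.comb(M, N) * math.exp(-beta * (2*N - M) * eps)
                   for N in range(M + 1))
    Q_closed = (2 * math.cosh(beta * eps)) ** M
    print(Q_direct, Q_closed)      # agree

    # Direct average energy versus Equation (3.6.10):
    E_direct = sum((2*N - M) * eps * math.comb(M, N)
                   * math.exp(-beta * (2*N - M) * eps)
                   for N in range(M + 1)) / Q_direct
    E_closed = -M * eps * math.tanh(beta * eps)
    print(E_direct, E_closed)      # both ~ -12.09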
The heat capacity C_M of such systems is also interesting. To obtain it we must differentiate Equation (3.6.10) with respect to T. It is best to do this indirectly: we differentiate with respect to β and then multiply by dβ/dT. This last is easy, since we find that dβ/dT = −1/kT². Differentiating a hyperbolic tangent is a bit more difficult but not impossible.[18] We find that d tanh(ax)/dx = a sech²(ax). So:

C_M = (∂E/∂T)_M = −(1/kT²)(∂E/∂β) = (Mε²/kT²) sech²(ε/kT) .    (3.6.11)

Gaining an understanding of what this function looks like can be helped by looking at Figure 3.3.

Figure 3.3: Reduced Heat Capacity per Particle vs. Reduced Temperature

At low temperatures, where all the spins are in the low energy state, the heat capacity is zero because the slope of the energy against temperature curve is zero (see Figure 3.2). Since the higher energy level can never have more particles than the lower one, the two levels must eventually have equal populations, and the heat capacity must again drop to zero at high temperature.

[18] Especially if you look it up, as I did.
4. Independent Subsystems in the Canonical Ensemble

4.1 Fundamentals

To this point we have focused attention on entire systems. We have assumed that the system energy levels E_j were or could be known. In fact this is almost never the case, since to know the system energy levels we'd have to know the system wave function. And in general we almost never know these, since they are usually too difficult to compute.

The primary reason for this is that the particles in a system generally interact. And those interactions (even if we knew them with great accuracy, which we do not) make Schrödinger's equation impossible to solve.[1]

But there is one class of systems that is simple enough that we can find the system energy levels. This is the case where the system is made up of independent subsystems, that is, when the constituent particles in the system have no interactions with each other at all.[2]

We can assume that the independent subsystems are made up of single particles. This is because it can be shown that if the subsystems are large molecules, the motion of each subsystem can be separated into two parts: the motion of its center of gravity and the motion of the constituent particles around that center of gravity.

Here we are only interested in the motion of the center of gravity (we will deal with motions about the center of gravity later), so we will speak of particles rather than independent subsystems.
4.2 Independent Energies
From quantum mechanics we know that an N-particle system in a volume V will have a Hamiltonian operator H that yields the eigenvalue equation:

H Ψ_j = E_j Ψ_j ,    (4.2.1)

where Ψ_j is the system wave function corresponding to the jth system state, whose energy is E_j.[3]
[1] What one has in that case is a system of 3N coupled second-order partial differential equations to solve simultaneously. Analytic solution is basically out of the question. Numeric solution, for any value of N we'd regard as reasonable for statistical thermodynamics, is just silly, as upwards of several moles of memory would be needed, not to mention the computer power!
[2] Of course all such ideal systems involve a certain amount of magic. If they are to be contained in a volume V, the particles must interact with the walls. But this contradicts the idea that there are no particle-particle interactions. The answer to this lies in the fact that wall interactions increase as the square of the linear dimension of the system while the bulk properties increase as the cube. So we simply assume that the system is large enough (i.e. N is large enough) that the wall interactions are negligible. This is called the thermodynamic limit.
[3] Purists will note that I have slurred over a number of difficulties here, since it is very unlikely that any such system will exist in a pure state. In the end that makes no real difference to the argument.
If the particles in this system are independent, then the system Hamiltonian $\mathcal{H}$ breaks down into a sum of single-particle Hamiltonians $h$:

\mathcal{H} = h_1 + h_2 + \cdots + h_N .    (4.2.2)

This is so because "independent" in the quantum mechanical sense means that there are no terms in the Hamiltonian that couple the behavior of particle $i$ to particle $j$ where $i \neq j$. Since the momentum part of the Hamiltonian consists of independent terms, any coupling must come from the potential energy part of the Hamiltonian. Clearly non-interacting particles will be uncoupled. But there still can be an external potential (such as a gravitational field) that gives each particle a potential energy.[4]
If there is no coupling, the total energy $E_j$ becomes the sum of the energies of the individual particles:

E_j = \varepsilon_{j,1} + \varepsilon_{j,2} + \cdots .    (4.2.3)
We now assume, for simplicity, that the particles are identical.[5] This means that the single-particle Hamiltonians $h$ for each particle will be identical. Then each particle will have the same set of single-particle energy levels.
4.3 Single Particle Canonical Partition Functions

Now let $q_a$ be a single-particle canonical partition function defined by the following equation:

q_a(V,\beta) = \sum_j e^{-\beta\varepsilon_{a,j}} ,    (4.3.1)

where $\varepsilon_{a,j}$ denotes the energy of the $a$th particle when it is in the $j$th single-particle quantum state. Of course $q_a$ is not a function of the number of particles, but is a function of both the volume $V$ and the temperature (through $\beta$).

Since each particle has the same set of energy states, all the $q_a$'s sum to exactly the same value and so are the same, independent of the subscript $a$. So we can denote them all by the simple symbol $q$.
With this definition it will be true that

Q(N,V,\beta) = \sum_j e^{-\beta E_j} = \frac{1}{N!}\,q^N \quad \text{(or sometimes just } q^N\text{)} .    (4.3.2)

To explain the "or sometimes just" in the equation above takes us ahead of the story. But the short answer is that the $N!$ occurs if the particles are assumed to be indistinguishable in the sense of quantum mechanics.[6] If, for some reason, the particles can be distinguished (perhaps by location, as in a crystal), then there is no $N!$ and we have the "or sometimes just" part of the equation.
The truth of Equation (4.3.2) is best seen by working backwards from it. Ignoring the factorial for now, the right-hand side of Equation (4.3.2) can be written

q^N = \sum_j e^{-\beta\varepsilon_{1,j}} \sum_k e^{-\beta\varepsilon_{2,k}} \sum_l e^{-\beta\varepsilon_{3,l}} \cdots \sum_m e^{-\beta\varepsilon_{N,m}} ,    (4.3.3)

where the first index in each subscript tells us which particle is having its energy levels summed and the indices $j$, $k$, etc., denote the actual energy levels.
[4] This situation will be examined in a later chapter.
[5] This restriction is easily removed.
[6] Meaning that they don't just look alike, they are identical. There is a deeply profound difference between the two.
The first factor in Equation (4.3.3) above refers to the first particle, whichever one that is, the second factor to the second particle, etc.

We now multiply Equation (4.3.3) out:
q^N = e^{-\beta\varepsilon_{1,1}} e^{-\beta\varepsilon_{2,1}} e^{-\beta\varepsilon_{3,1}} \cdots + e^{-\beta\varepsilon_{1,1}} e^{-\beta\varepsilon_{2,2}} e^{-\beta\varepsilon_{3,1}} \cdots + e^{-\beta\varepsilon_{1,1}} e^{-\beta\varepsilon_{2,1}} e^{-\beta\varepsilon_{3,2}} \cdots
    + e^{-\beta\varepsilon_{1,2}} e^{-\beta\varepsilon_{2,1}} e^{-\beta\varepsilon_{3,1}} \cdots + e^{-\beta\varepsilon_{1,2}} e^{-\beta\varepsilon_{2,2}} e^{-\beta\varepsilon_{3,1}} \cdots + e^{-\beta\varepsilon_{1,2}} e^{-\beta\varepsilon_{2,1}} e^{-\beta\varepsilon_{3,2}} \cdots
    + e^{-\beta\varepsilon_{1,3}} e^{-\beta\varepsilon_{2,1}} e^{-\beta\varepsilon_{3,1}} \cdots + e^{-\beta\varepsilon_{1,3}} e^{-\beta\varepsilon_{2,2}} e^{-\beta\varepsilon_{3,1}} \cdots + e^{-\beta\varepsilon_{1,3}} e^{-\beta\varepsilon_{2,1}} e^{-\beta\varepsilon_{3,2}} \cdots
    + \cdots .    (4.3.4)
In Equation (4.3.4) above, in each energy subscript the first index is the particle number and the second is the energy level. Each line starts with the first particle in a different energy state, while the rest of the line runs the other particles through all possible energy states.
The exponents in each product in each line can be gathered together:

q^N = e^{-\beta(\varepsilon_{1,1}+\varepsilon_{2,1}+\varepsilon_{3,1}+\cdots)} + e^{-\beta(\varepsilon_{1,1}+\varepsilon_{2,2}+\varepsilon_{3,1}+\cdots)} + e^{-\beta(\varepsilon_{1,1}+\varepsilon_{2,1}+\varepsilon_{3,2}+\cdots)}
    + e^{-\beta(\varepsilon_{1,2}+\varepsilon_{2,1}+\varepsilon_{3,1}+\cdots)} + e^{-\beta(\varepsilon_{1,2}+\varepsilon_{2,2}+\varepsilon_{3,1}+\cdots)} + e^{-\beta(\varepsilon_{1,2}+\varepsilon_{2,1}+\varepsilon_{3,2}+\cdots)}
    + e^{-\beta(\varepsilon_{1,3}+\varepsilon_{2,1}+\varepsilon_{3,1}+\cdots)} + e^{-\beta(\varepsilon_{1,3}+\varepsilon_{2,2}+\varepsilon_{3,1}+\cdots)} + e^{-\beta(\varepsilon_{1,3}+\varepsilon_{2,1}+\varepsilon_{3,2}+\cdots)}
    + \cdots ,    (4.3.5)
where again the first subscript is the particle number and the second the energy level. Clearly these terms include all possible combinations of all the possible energies of all the $N$ particles. So this looks very much like the $\sum_j e^{-\beta E_j}$ of Equation (4.3.2). But it isn't!
In fact, Equation (4.3.3) contains a deception.[7] We've made sure to point out that the order of the particles was preserved by including the particle index number. However, in reality there is no way to know which particle is which. They are identical. And you can't paint numbers on them![8] If we were to calculate how many particles are in system energy level $j$, and we will later on, we would know their number, but not which particles they were.
Look for example at the first line of Equation (4.3.5). The first term has all the particles in energy level 1. The second has one particle in energy level 2 and all the others in energy level 1. And the third term has exactly the same thing! The two differ only as to the identity of the particle in energy level 2.

In reality there should be only one such term, as in Equation (4.3.2), instead of the $N$ such terms that actually occur here. Because we have ignored the indistinguishability of the particles, we have too many terms in Equation (4.3.5)!
Thus this way of making up the system energy levels from the particle energy levels fails. We need a more sophisticated approach.

But there is a quick fix. And that is to realize that we have done the counting by first lining up all the particles in a single line (so that there is a particle 1 and a particle 2) and then produced Equation (4.3.5) from it. But there are $N!$ ways to produce that line of particles, and each and every one of those will produce Equation (4.3.5).
[7] This deception, as I have called it, became apparent to Gibbs (though in a slightly different context). The equations he derived in the ideal gas case for the entropy were not extensive in the number of particles. The cure was the one given below, although Gibbs had no real idea why it was a cure. Gibbs, of course, used classical mechanics. Had he lived just a bit longer (he died in 1903) he may well have concluded that the classical mechanics he used was wrong, that energy had to be quantized, and that particle indistinguishability had to be true.
[8] There are some exceptional cases, the most common of which is in the treatment of crystals. There we can identify which particle is which by their location in the crystal.
The quick fix is to divide Equation (4.3.5) by $N!$. This works, although why it works is a deeper question that we will examine later.

The result of our combinations and permutations is:

Q(N,V,\beta) = \frac{1}{N!}\,q(V,\beta)^N .    (4.3.6)

This is the proper result for indistinguishable non-interacting particles.

The trick of dividing by $N!$ to account for permutations of indistinguishable particles leads to what are often called Boltzmann statistics.
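The overcounting argument can be made concrete by brute-force enumeration. The following sketch (my own illustration; the evenly spaced levels and the values of $N$ and $\beta$ are arbitrary choices, not from these notes) compares $q^N$, which sums over ordered particle labels, with the sum taken over distinct system states only:

    import math
    from itertools import product, combinations_with_replacement

    beta = 0.01
    levels = range(100)      # 100 hypothetical, evenly spaced levels eps_k = k
    N = 3

    q = sum(math.exp(-beta * k) for k in levels)     # single-particle q
    # Sum over ordered label assignments: reproduces q**N exactly.
    ordered = sum(math.exp(-beta * sum(state))
                  for state in product(levels, repeat=N))
    # One term per distinct system state (unordered occupations).
    distinct = sum(math.exp(-beta * sum(state))
                   for state in combinations_with_replacement(levels, N))

    print(q**N, ordered)                       # equal, up to rounding
    print(q**N / math.factorial(N), distinct)  # close, but not equal

The first line confirms that the multiplied-out sum really is $q^N$. The second shows that $q^N/N!$ slightly undercounts the true sum over distinct states: terms in which two or more particles share a level get divided by $N!$ even though they appear fewer than $N!$ times in $q^N$. The discrepancy shrinks as the number of thermally accessible levels grows large compared with $N$, which is exactly the dilute-occupation condition discussed later in Section 5.3.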
4.4 Thermodynamics of Canonical Systems of Independent Subsystems

When dealing with canonical systems made up of independent subsystems there is no need to compute $Q(N,V,T)$ before finding $\langle E\rangle$, $\langle p\rangle$, or $\langle\mu\rangle$. These can be computed from $q(V,T)$ directly. The probability that a single independent subsystem will be found in subsystem state $j$ is clearly

P_j = \frac{e^{-\beta\varepsilon_j}}{q} ,    (4.4.1)

where here $P_j$ is the subsystem probability.[9] Thus it is just as easy to find the appropriate formulas for independent subsystems directly from those given in Section 3.3 above. I repeat them here:
\langle E\rangle = -\left(\frac{\partial \ln Q(N,V,\beta)}{\partial\beta}\right)_{N,V} ,    ((3.3.3))

\langle p\rangle = kT\left(\frac{\partial \ln Q(N,V,T)}{\partial V}\right)_{N,T} ,    ((3.3.10))

and

\beta\langle\mu\rangle = -\left(\frac{\partial \ln Q(N,V,\beta)}{\partial N}\right)_{V,\beta} .    ((3.3.14))
All we need do is substitute $q^N/N!$ (usually) or $q^N$ (for crystals and similar systems) as appropriate. Using the first form we get simply:[10]

\langle E\rangle = -N\left(\frac{\partial \ln q(V,T)}{\partial\beta}\right)_{V,N} ,    (4.4.2)

\langle p\rangle = NkT\left(\frac{\partial \ln q(V,T)}{\partial V}\right)_{T,N} ,    (4.4.3)

and

\beta\langle\mu\rangle = -\ln\!\left(\frac{q(V,T)}{N}\right) .    (4.4.4)
The entropy is given by:

S = -Nk\sum_j P_j \ln P_j ,    (4.4.5)

where the probability $P_j$ is for an independent subsystem and the entropy is per mole of such subsystems.

[9] Don't confuse this with system probabilities. See Equation (4.4.5) for the entropy for an example of the difference.
[10] The $N!$ matters only in the computation of the chemical potential.
For crystals and similar systems Equation (4.4.4) changes and becomes

\beta\langle\mu\rangle = -\ln q(V,T) ,    (4.4.6)

as can be seen by using Equation (3.3.14) on page 31 along with $Q(N,V,T) = q(V,T)^N$, since no $N!$ is needed for such systems.
The Helmholtz free energy is

A = -kT\ln Q(N,V,\beta) .    (4.4.7)

In the case where $Q = q^N/N!$ we have, using Stirling's Approximation:

\beta A = N\ln N - N - N\ln q = -N\ln(qe/N) ,    (4.4.8)

so that

A = -NkT\ln(qe/N) .    (4.4.9)
5. The Ideal Monatomic Gas

5.1 The Partition Function

We now construct a model for a monatomic gas of $N$ independent particles contained in a cubical box of volume $V$. The particles have no internal energies except for the possibility of electronic excitation. An obvious modeling method is to treat the atoms as point particles in the box. We then need to find the energy levels for such a gas. Once we have them we can use them in the canonical partition function to calculate the thermodynamic properties of such a gas.[1]
1
Since the atoms are assumed to be independent, each atom acts as if the others were not there. The
quantum mechanical particle in a box then provides a good model for the translational energy levels
in this system. The one added complication is that the atom also has electronic energy levels.
Luckily, the electronic energy levels are completely independent of the kinetic motion of the atom.
So the total energy for any atom is the sum of a translational part and an electronic part or:
=
tr
+
e
, (5.1.1)
and it can easily be seen that the single particle partition function q can be written as:
q =

e
(tr+e)
=

e
tr
e
e
=

e
tr

e
e
= q
tr
q
e
. (5.1.2)
We will ignore $q_e$ for now and discuss it only at the end of this chapter, in Section 5.5 starting on page 51.

One of the few simple problems of quantum mechanics is the single particle in a cubical box. We chose a cubical box for convenience. A non-cubical box is a simple adaptation of these calculations, and spherical or spheroidal boxes can also be used, with a large increase in mathematical complexity and no increase at all in insight into what is going on.[2] The energy levels depend upon three quantum numbers $n$, $r$, and $s$ according to:

\varepsilon_{n,r,s} = \frac{h^2\,(n^2 + r^2 + s^2)}{8mL^2} ,    (5.1.3)

where $\varepsilon$ is the energy, $h$ is Planck's constant, $m$ the mass of the particle, and $L$ the length of one edge of the cubical box. Since $V$, the volume of the box, is $L^3$, we see that the energy levels depend on the volume to the $2/3$ power.
The single independent-particle partition function $q$ is then:

q(V,\beta) = \sum_{n}\sum_{r}\sum_{s} e^{-\beta h^2 (n^2+r^2+s^2)/8mL^2} ,    (5.1.4)

or

q(V,\beta) = \left(\sum_{k=0}^{\infty} e^{-\beta h^2 k^2/8mL^2}\right)^{3} ,    (5.1.5)
[1] Any partition function could be used; the canonical is chosen because it is simple to use here. Indeed, in a later chapter we will repeat this calculation using classical mechanics and the microcanonical ensemble.
[2] One needs to solve Schrödinger's equation for the appropriate boundary conditions. A non-cubical box is simple; it is just a composition of three one-dimensional particle-in-a-box results for three different lengths, one in each dimension. Solutions for a spherical box will involve Bessel functions, and for a spheroidal box, elliptical functions. The results will, however, show no different general behavior.
where $k$ is an integer.

In almost all textbooks this sum is done by noting that the energy levels are very densely packed[3] and, using that as an excuse, then converting the sum to an integral. But can one simply change a sum to an integral? The answer is: not always! But it works here.

The correct thing to do is to use the Euler-Maclaurin Summation Formula, which is discussed at some length in an appendix to this chapter, Section 5.6 on page 54.[4] That appendix should be consulted if the reader is not familiar with the Euler-Maclaurin summation formula.[5]
5.2 The Evaluation of the Partition Function

The single-particle canonical partition function is:

q(V,\beta) = \left(\sum_{k=0}^{\infty} e^{-\beta h^2 k^2/8mL^2}\right)^{3} ,    ((5.1.5))

so we need to do the sum

\sum_{k=0}^{\infty} e^{-\beta h^2 k^2/8mL^2} .    (5.2.1)

How can we do such a sum?[6]
To use the Euler-Maclaurin summation formula we need some derivatives. To this end it helps to gather all the constants together in one bunch and call them, say, $a$:

a = \frac{\beta h^2}{8mL^2} ,    (5.2.2)

so that Equation (5.2.1) becomes

S = \sum_{k=0}^{\infty} e^{-ak^2} ,    (5.2.3)
and then tabulate some derivatives (see Table 5.1).

Table 5.1: Derivatives for the Evaluation of the Translational Partition Function

  function                                                    at x = 0    at x = ∞
  f(x) = e^{-ax^2}                                                1           0
  f'(x) = -2ax e^{-ax^2}                                          0           0
  f^{(3)}(x) = 4a^2 x (3 - 2ax^2) e^{-ax^2}                       0           0
  f^{(5)}(x) = -8a^3 x (4a^2 x^4 - 20ax^2 + 15) e^{-ax^2}         0           0
[3] Even though this is not true, as astute students note. As $k$ increases the spacing between energy levels increases. Thus for large enough $k$ the spacing will be as large as one wishes.
[4] The author is well aware that almost no textbook ever discusses the Euler-Maclaurin formula. However, he has gotten tired of simply flailing his arms around in class while pretending that the conversion of a sum to an integral is obvious! It isn't.
[5] Which, in practice, means almost everyone.
[6] It clearly can be done or we'd not have brought it up...
All the needed derivatives are zero at both limits. The even-numbered derivatives are not zero, but those do not occur in the Euler-Maclaurin series. That series is then simply:

S = \int_0^{\infty} e^{-ax^2}\,dx + \frac{1}{2} .    (5.2.4)
The integral is easily done:

\int_0^{\infty} e^{-ax^2}\,dx = \frac{1}{2}\left(\frac{\pi}{a}\right)^{1/2} .    (5.2.5)
Putting it all together (and not forgetting to cube the result), the partition function $q$ is then:

q(V,\beta) = \left(\frac{2\pi m}{\beta h^2}\right)^{3/2} V ,    (5.2.6)

where the $V$ comes from the term $L^2$ that occurs inside a square root and is then cubed.
For argon gas at 300 K, $m \approx 6.6\times10^{-26}$ kg, and with the usual values of the natural constants, the $S$ in Equation (5.2.4) is of the order of $10^{10}$ or so. Thus the $1/2$ in the Euler-Maclaurin summation result can be neglected.
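That order-of-magnitude claim is easy to check. Here is a sketch (the box edge $L = 0.1$ m is my assumption, since the text leaves it unspecified):

    import math

    h  = 6.626e-34      # Planck's constant, J s
    kB = 1.381e-23      # Boltzmann's constant, J/K
    m  = 6.6e-26        # mass of an argon atom, kg
    T  = 300.0          # K
    L  = 0.1            # box edge, m (assumed)

    a = h**2 / (8 * m * L**2 * kB * T)
    S = 0.5 * math.sqrt(math.pi / a) + 0.5   # Euler-Maclaurin result, Eq. (5.2.4)

    print(f"a = {a:.3e}   S = {S:.3e}")   # S is of order 1e10, so the 1/2
    print(f"q = S**3 = {S**3:.3e}")       # is utterly negligible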
5.3 The Degeneracy Calculation

There is another way to obtain the single-particle partition function $q$. It is a bit more complex but sometimes more useful. Above, in Section 5.2, we recognized that a cubical box has three identical sets of energy levels, one for each direction. So we summed over all the energy levels, counting each one separately, and cubed the result.

Instead of doing that we could use the degeneracy of the energy levels to our advantage. Given the degeneracies, we need only sum over the different energy levels, each multiplied by its degeneracy.

The translational energy levels can be degenerate in three ways:

1. Permutation degeneracy: it is clear that quantum numbers (5, 3, 1), for example, will give the same energy as quantum numbers (3, 5, 1). This type of degeneracy only happens when two or more sides of the container are the same length or the lengths are rational multiples of each other.

2. Accidental degeneracy: it is also clear that energy levels can have identical values of energy even for differing sets of quantum numbers. For example ε_{2,2,5} is identical in energy to ε_{1,4,4}.

3. Neighborhood degeneracy: if we were to relax the meaning of "identical" slightly to allow for almost identical energies, it will turn out that there are many such degeneracies, depending on what value we use for "almost".[7]

This last point bears close examination.

[7] The author is not trying to be facetious. The word "almost" is to be interpreted as a small relative quantity.
We can simplify the energy level expression, Equation (5.1.3), by introducing a new variable $R$:

R^2 = n^2 + r^2 + s^2 ,    (5.3.1)

(recall that $n$, $r$, and $s$ are quantum numbers) then:

\varepsilon = \frac{h^2 R^2}{8mL^2} ,    (5.3.2)

and conversely

R^2 = \frac{8mL^2\varepsilon}{h^2} .    (5.3.3)
If we denote the degeneracy of levels with energy $\varepsilon$ by $\omega(\varepsilon)$, we can write the single-particle canonical partition function as:

q(\beta,V) = \sum_{\varepsilon} \omega(\varepsilon)\, e^{-\beta\varepsilon} ,    (5.3.4)

where we are summing over the values of $\varepsilon$ generated by Equation (5.3.2). Thus $R^2$ takes on integer values and $\varepsilon$ takes on corresponding values. Often there is no energy level for a given value of $R^2$; in that case $\omega(\varepsilon)$ is zero.
It is very difficult to calculate $\omega(\varepsilon)$ exactly. It is not at all a smooth function of $R^2$. But it can be approximated.

To get an approximation we have to have an important realization: $R^2$ is constructed from $n^2$, $r^2$, and $s^2$. There is a value of $R^2$ for each possible value of the sum $n^2 + r^2 + s^2$. Now $n$, $r$, and $s$ are restricted to being positive integers. Imagine a three-dimensional graph with axes named $n$, $r$, and $s$. Mark out points on this graph where $n$, $r$, and $s$ have positive integer values. Those points all lie in the octant of the graph where all the axes are positive. Each point corresponds to a possible set of quantum numbers.[8]
For example, consider the point at $n = 2$, $r = 2$, and $s = 5$. This corresponds to $R = \sqrt{33}$, and the point is at a distance of 5.74456+ from the origin. Another point, this one at $n = 1$, $r = 4$, and $s = 4$, also has $R = \sqrt{33}$ and is also at a distance of 5.74456+ from the origin. Hence these two states have the same energy, as does any other point whose squared distance from the origin is 33. All points that lie on a spherical surface of radius $R = \sqrt{33}$ will have the same energy.

These points are all degenerate in energy, and the number of such points is the degeneracy $\omega$ of the energy for which $R^2 = 33$.

Now the value of 33 was chosen more or less at random. What was written is true for any value of $R^2$ you might pick. Of course, for some $R^2$'s the degeneracy might be zero, but for almost all others it will not.
None of this has yet required an amazing insight. It does not make $\omega(R^2)$ any easier to calculate. Nor is the insight the mere realization that the points at a distance $R$ from the origin lie on a spherical surface of radius $R$ around the origin.

Here is the insight: if we knew the volume of a sphere with such a radius, we could calculate how many points that sphere would contain.

And once we have the volume, the number of points lying in a region between $R^2$ and $R^2 + d(R^2)$ is given for spheres[9] by the derivative of the volume with respect to $R^2$. Or, better, we can express the volume as a function of the energy and then look at the spherical shell lying between $\varepsilon$ and $\varepsilon + d\varepsilon$.
Of course this is only approximate, since we will include fractional volumes. But the error is not large. And the average value of $R^2$ at normal temperatures is huge, making the error even smaller. Further, the shell has a thickness of $d\varepsilon$, so the points included in it are only approximately degenerate. But $d\varepsilon$ can be chosen to be as small as we like, so the approximation can be as accurate as we like.
[8] If the box is not cubical, the markings on the axes of this graph will be differently spaced in each direction.
[9] This doesn't work for all shapes. Consider a cube, for example. The volume of a cube of side $L$ is $L^3$; the derivative of that is $3L^2$, but the area of the sides is $6L^2$.
Let's now carry out this program. The volume of a sphere of radius $R$ is just $4\pi R^3/3$. We are only concerned with $1/8$th of this, the part where all axes are positive. Calling this volume $\Omega$, we then have:

\Omega = \frac{\pi R^3}{6} = \frac{\pi}{6}\left(\frac{8mL^2\varepsilon}{h^2}\right)^{3/2} .    (5.3.5)
The degeneracy factor is then:

\omega(\varepsilon)\,d\varepsilon = \frac{d\Omega}{d\varepsilon}\,d\varepsilon = \frac{\pi}{4}\left(\frac{8mL^2}{h^2}\right)^{3/2} \varepsilon^{1/2}\,d\varepsilon .    (5.3.6)
It is important to gauge the size of this number. If we take $\varepsilon$ as $3kT/2$, $T = 300$ K, $m \approx 10^{-25}$ kg, $L = 0.1$ meters (10 cm), and $d\varepsilon$ to be about $0.0001\,\varepsilon$ (a 0.01% thickness of the shell), then $\omega(\varepsilon)\,d\varepsilon$ is about $10^{26}$.

Considering that we have much less than a mole of gas under these conditions, it is easy to see that there are far more degenerate levels available to a gas molecule than there are molecules. Thus the average occupation number of a given quantum level is hugely less than 1. This is an important point, because as long as it is true, quantum mechanical effects can be ignored.[10]
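Here is a quick numerical check of that estimate (a sketch using the rounded constants quoted in the text):

    import math

    h, kB = 6.626e-34, 1.381e-23
    m, T, L = 1e-25, 300.0, 0.1          # kg, K, m, as in the text

    eps  = 1.5 * kB * T                  # typical energy, 3kT/2
    deps = 1e-4 * eps                    # 0.01% shell thickness
    omega = (math.pi / 4) * (8 * m * L**2 / h**2) ** 1.5 \
            * math.sqrt(eps) * deps      # Eq. (5.3.6)

    print(f"omega(eps) d(eps) ~ {omega:.2e}")   # about 1e26 states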
With this result for the degeneracy, the single-particle canonical partition function is then the sum

q(\beta,V) = \sum_{\varepsilon} \omega(\varepsilon)\,e^{-\beta\varepsilon} = \frac{\pi}{4}\left(\frac{8mL^2}{h^2}\right)^{3/2} \sum_{\varepsilon} \varepsilon^{1/2}\, e^{-\beta\varepsilon} .    (5.3.7)
It only remains now to do this sum.

Just as before, almost all textbooks do the sum by noting that the energy levels are very densely packed and, using that as a rationale, then converting the sum to an integral which can be readily done. We've already done this more correctly by applying the Euler-Maclaurin summation formula and discovering that the integral really is an excellent approximation to the sum. The result is the same one we've already obtained:

q(V,\beta) = \left(\frac{2\pi m}{\beta h^2}\right)^{3/2} V .    ((5.2.6))
The actual degeneracy bears some examination. It is given by Equation (5.3.6), slightly modified by turning $L$ into the appropriate power of $V$:

\omega(1,V,\varepsilon) = \frac{\pi}{4}\left(\frac{8m}{h^2}\right)^{3/2} V\,\varepsilon^{1/2} ,    (5.3.8)

where the 1 as an argument to $\omega$ is a reminder that this is the degeneracy for a single particle in a box of volume $V$ when the particle has an energy $\varepsilon$. We have already discovered that this number is normally huge.[11]
[10] I don't want to seem mysterious here. Although we will discuss this in more detail later, the point is that all particles are either fermions or bosons. The former are restricted by the Pauli Exclusion Principle to be alone in an energy level. The latter are not. But as long as the average occupation of an energy level is much less than one, the difference between fermions and bosons can be ignored.
[11] It is not huge when we are close to absolute zero. However, in that case we need to take quantum mechanics more explicitly into account.
5.4 The Thermodynamics of a Monatomic Ideal Gas

With the single-particle partition function given:

q(V,\beta) = \left(\frac{2\pi m}{\beta h^2}\right)^{3/2} V ,    ((5.2.6))

it is possible to evaluate the thermodynamics of a monatomic ideal gas.

For convenience (and because they have some physical significance) it is usual to express $q(\beta,V)$ in one of two different ways.

The first is to let $\Lambda$ be defined as:

\Lambda = \left(\frac{\beta h^2}{2\pi m}\right)^{1/2} = \left(\frac{h^2}{2\pi mkT}\right)^{1/2} ,    (5.4.1)

then $q$ is given by:[12]

q(\beta,V) = \frac{V}{\Lambda^3} .    (5.4.2)
In this case $\Lambda$ (which has the dimensions of length) is known as the thermal de Broglie wavelength. The reason is this: quantum mechanically, the de Broglie wavelength of a particle of mass $m$ and speed $\dot{x}$ is given by:

\lambda = \frac{h}{m\dot{x}} = \left(\frac{h^2}{2mE}\right)^{1/2} ,    (5.4.3)

where $E = m\dot{x}^2/2$ is its kinetic energy.
When the temperature is $T$, the typical energy of a monatomic ideal gas molecule is $3kT/2$. In these terms

\lambda = \left(\frac{h^2}{3mkT}\right)^{1/2} ,    (5.4.4)

which (except for a factor of about the square root of two) is the thermal de Broglie wavelength.

Since, quantum mechanically, particles whose de Broglie waves overlap will show quantum effects, then as long as the typical interparticle distance is greater than $\Lambda$, quantum effects can be neglected. As an example, for argon at 300 K the thermal de Broglie wavelength is approximately $1.6\times10^{-11}$ meters. The average separation between argon atoms at 1 atm and 300 K is about $3.4\times10^{-9}$ meters, roughly 200 times greater. Thus we expect quantum effects for argon under these conditions to be negligible.[13]
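The same comparison can be re-run for any gas. A minimal sketch for argon (my numbers, using Equation (5.4.1) and the ideal gas law, $V/N = kT/p$, for the spacing):

    import math

    h, kB = 6.626e-34, 1.381e-23
    m, T, p = 6.6e-26, 300.0, 1.013e5     # argon at 300 K and 1 atm

    Lam = math.sqrt(h**2 / (2 * math.pi * m * kB * T))   # Eq. (5.4.1)
    sep = (kB * T / p) ** (1.0 / 3.0)                    # mean spacing

    print(f"Lambda = {Lam:.2e} m, spacing = {sep:.2e} m, "
          f"ratio = {sep / Lam:.0f}")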
Table 5.2: Λ for Various Monatomic Gases

  Gas    T (K)    Λ (m)
  He      4.2     4.25×10^{-10}
  Ne     27.2     7.46×10^{-11}
  Ar     87.4     2.96×10^{-11}
The second way of expressing $q(\beta,V)$ involves defining a quantity $\Theta_t$:

\Theta_t = \frac{h^2}{2\pi mk} ,    (5.4.5)

so that:

q(\beta,V) = \left(\frac{T}{\Theta_t}\right)^{3/2} V .    (5.4.6)

Here $\Theta_t$ plays the role of a characteristic temperature. As long as $T \gg \Theta_t$, quantum effects can be neglected. Of course $\Theta_t$ is trivially related to $\Lambda$ by:

\Lambda^2 = \frac{h^2}{2\pi mkT} = \frac{\Theta_t}{T} .    (5.4.7)

[12] The author has always called this "the cute formula" because it looks a bit like arrow heads balanced on each other. Your view may differ.
[13] Similar computations quickly reveal the conditions under which a given gas (if it remains a gas) will behave non-classically.
Table 5.3: Θ_t for Various Monatomic Gases

  Gas    Θ_t (K)
  He     7.46×10^{-19}
  Ne     1.48×10^{-19}
  Ar     2.46×10^{-20}

As long as $T \gg \Theta_t$, we can ignore quantum mechanical effects.
Now we evaluate $Q(N,V,\beta)$, which is:

Q(N,V,\beta) = \frac{1}{N!}\,q^N = \frac{1}{N!}\left(\frac{2\pi m}{\beta h^2}\right)^{3N/2} V^N ,    (5.4.8)

and for the first time we see a real partition function expressed in terms of its natural variables $N$, $V$, and $\beta = 1/kT$.

The thermodynamics depends on $\ln Q$:

\ln Q(N,V,\beta) = -N\ln N + N + N\ln q(V,\beta) ,    (5.4.9)

where Stirling's approximation has been used. Substituting Equation (5.4.1) into this leads to:

\ln Q(N,V,\beta) = -N\ln N + N + N\ln V - 3N\ln\Lambda .    (5.4.10)
The expected value of the energy of the system is given by:

\langle E\rangle = -\left(\frac{\partial \ln Q}{\partial\beta}\right)_{N,V} .    (5.4.11)

Since only $\Lambda$ is a function of $\beta$, this is:

\langle E\rangle = \frac{3N}{\Lambda}\left(\frac{\partial\Lambda}{\partial\beta}\right) .    (5.4.12)

Now $\Lambda = (\beta h^2/2\pi m)^{1/2}$, so

\left(\frac{\partial\Lambda}{\partial\beta}\right) = \frac{1}{2}\left(\frac{2\pi m}{\beta h^2}\right)^{1/2}\left(\frac{h^2}{2\pi m}\right) = \frac{\Lambda}{2\beta} .    (5.4.13)
Putting the pieces together and doing some algebra gives:

\langle E\rangle = \frac{3N}{2\beta} = \frac{3}{2}NkT ,    (5.4.14)

which is a very well-known result. From this, the constant-volume heat capacity $C_V$ is $3Nk/2$, another well-known result.
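As a sanity check, one can differentiate $\ln Q$ numerically instead of analytically. This sketch (my own; the values of $N$, $V$, and $T$ are arbitrary) applies a centered difference to Equation (5.4.10) and recovers $\langle E\rangle = 3NkT/2$:

    import math

    h, kB, m = 6.626e-34, 1.381e-23, 6.6e-26
    N, V, T = 1.0e20, 1.0e-3, 300.0        # arbitrary test values
    beta = 1.0 / (kB * T)

    def lnQ(b):
        """ln Q from Eq. (5.4.10), with Lambda(b) from Eq. (5.4.1)."""
        lam = math.sqrt(b * h**2 / (2 * math.pi * m))
        return -N * math.log(N) + N + N * math.log(V) - 3 * N * math.log(lam)

    db = beta * 1e-6
    E = -(lnQ(beta + db) - lnQ(beta - db)) / (2 * db)   # <E> = -d lnQ / d beta

    print(E / (N * kB * T))   # prints 1.5, i.e. <E> = 3NkT/2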
We readily find the pressure, since:

\langle p\rangle = kT\left(\frac{\partial \ln Q}{\partial V}\right)_{N,\beta} = kT\,\frac{N}{V} ,    (5.4.15)

so that

pV = NkT .    (5.4.16)

We would be very unhappy if this were not the case.
The chemical potential is interesting. We have:

\beta\langle\mu\rangle = -\left(\frac{\partial \ln Q}{\partial N}\right)_{V,\beta} ,    (5.4.17)

and since

\ln Q(N,V,\beta) = -N\ln N + N + N\ln V - 3N\ln\Lambda ,    ((5.4.10))

we then have:

\beta\mu = \ln N - 1 + 1 - \ln V + 3\ln\Lambda ,    (5.4.18)

\mu = -kT\,\ln\!\left[\left(\frac{2\pi mkT}{h^2}\right)^{3/2}\frac{V}{N}\right] .    (5.4.19)
Note that the chemical potential is an intensive quantity and has an absolute value. In classical thermodynamics the chemical potential depends on an arbitrary choice of standard state. This does not contradict thermodynamics. It happens because, as is usually the case in statistical thermodynamics, we have implicitly chosen a state of zero energy. We did it here when we picked the zero for the energy levels of the particle in a box.
The main thermodynamic functions are now easy. Since

A = -kT\ln Q(N,V,\beta) ,    (5.4.20)

then

A(N,V,\beta) = -kT\left(-N\ln N + N + N\ln V - 3N\ln\Lambda\right) ,    (5.4.21)

which is in fact an extensive quantity, as can be seen with a slight rearrangement:

A(N,V,\beta) = -NkT\,\ln\!\left(\frac{Ve}{N\Lambda^3}\right) = -NkT\,\ln\!\left[\left(\frac{2\pi m}{\beta h^2}\right)^{3/2}\frac{Ve}{N}\right] ,    (5.4.22)
where it should be remembered that 1 is identical to $\ln e$. Since $A = E - TS$, $S$ works out to be:

S = Nk\,\ln\!\left[\left(\frac{2\pi mkT}{h^2}\right)^{3/2} e^{5/2}\,\frac{V}{N}\right] ,    (5.4.23)

which is known as the Sackur-Tetrode equation. While the entropy given there is extensive, it also has the wrong behavior as $T \to 0$. Of course this is because the classical approximation fails as $T \to 0$. As the temperature goes to zero, the essentially quantum mechanical nature of the system cannot be ignored. This happens when $T$ begins to approach the characteristic temperature $\Theta_t$ given in Equation (5.4.5) on page 48.
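The Sackur-Tetrode equation is easily put to work. A sketch for argon at 300 K and 1 atm, using $V/N = kT/p$:

    import math

    h, kB, R = 6.626e-34, 1.381e-23, 8.314
    m, T, p = 6.6e-26, 300.0, 1.013e5

    lam3 = (2 * math.pi * m * kB * T / h**2) ** 1.5   # = 1/Lambda^3
    S = R * math.log(lam3 * math.exp(2.5) * kB * T / p)   # Eq. (5.4.23), per mole

    print(f"S = {S:.1f} J/(mol K)")

The result, about 155 J/(mol K), is in good agreement with the calorimetric entropy of argon.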
5.5 The Electronic Partition Function

Except for the hydrogen atom, there is no formula for the electronic energy levels of an atom. Thus the evaluation of

q_e = \sum_j \omega_j\, e^{-\beta\varepsilon_j}    (5.5.1)

requires us to have values for the actual energy levels (here $\omega_j$ is the degeneracy of level $j$). Luckily, this information is readily available for the low-lying levels, which are the important ones. Some of this data is reproduced in Table 5.4 on page 52.
From an examination of this data several things are clear. One is that electronic energies are measured from the electronic ground state as zero energy, rather than from the ionized atom as zero energy. The other is that, from the magnitude of the listed energies, only the two or three lowest levels are important at temperatures below 2000 K.

On the other hand, we have an enormous problem. We know that the electronic partition function, Equation (5.5.1), does not converge!

And how do we know this? Because all atomic energy levels exist between the ground state and the ionization limit. And there are an infinite number of them, each having a finite value. And $e^{-\beta\varepsilon}$ is always non-zero for any finite value of $\varepsilon$.

So in Equation (5.5.1) we are adding up an infinite number of finite terms. The sum can only be infinite, and hence the series of terms in that equation does not converge to a finite value.

This is rather a puzzle, since the electronic partition function is often talked about as if it actually exists.
The answer is really simple.[14] We have been misled by our use of quantum mechanics! From quantum mechanics the electronic energy levels of the hydrogen atom are relatively easily found to be:

\varepsilon_n = -\frac{m_e e^4}{8\epsilon_0^2 h^2}\,\frac{1}{n^2} = -\frac{e^2}{8\pi\epsilon_0 a_0}\,\frac{1}{n^2} ,    (5.5.2)

with degeneracy

\omega_n = 2n^2 ,    (5.5.3)

where $m_e$ is the mass of an electron, $e$ the charge on the electron, $\epsilon_0$ the permittivity of free space (also known as the electric constant), $h$ is Planck's constant, and $a_0$ is the Bohr radius of hydrogen, $5.29177249\times10^{-11}$ meters.

Here the ground state $\varepsilon_1$ is below the ionization limit ($n \to \infty$), which is the standard zero of energy for monatomic atoms. But it is not the standard used in statistical thermodynamics.[15] We can convert this to our standard, where the ground state is zero, by adding $|\varepsilon_1|$ (a positive number) to all values. Then the infinite number of energy levels lie between zero and $|\varepsilon_1|$, as we said above. The electronic energy levels of all other atoms behave in a similar manner.
What's wrong with this is that the radial wave function in quantum mechanics has infinity as its outer boundary condition. That means that we are solving for a hydrogen atom (or any other atom) alone in an otherwise empty universe. The electron in a real hydrogen atom is always confined in some way to a finite region of space. Here, in particular, we are dealing with a canonical ensemble of systems. Thus any electrons in any atoms are certainly confined to the system's volume $V$.
[14] Though the author has never seen this problem discussed in any textbook on statistical mechanics or statistical thermodynamics.
[15] This isn't anybody's fault. So many of these fields grew up independently of each other that reconciling all their standards is impossible now.
What is needed is to solve the quantum mechanical problem of a hydrogen atom not in an empty universe but in a box. Since the problem is spherically symmetric, it is easiest to use a spherical box. The appropriate radial boundary condition is then that the wave function must become zero at a radius $R$ from the origin.

When we do this we find that the first large number of energy levels are identical to those we found for the unfettered hydrogen atom. But when the radius of the electron orbitals becomes comparable in magnitude to the radius of the box, the energy levels change to become particle-in-a-box energy levels.[16] And particle-in-a-box energy levels go to infinity as the quantum number increases. This is exactly what we want, since then the exponential terms in Equation (5.5.1) go to zero and the sum converges.

So the overall behavior of the electronic partition function in real life is that the first term is always[17] one, and the next few terms are quite small compared to one, though they never become zero. But after a large but finite number of terms we start getting particle-in-a-box energy levels. These increase in energy, so the exponentials rapidly get smaller and smaller, going to zero in the limit. And the partition function converges.

And its value is almost always simply the value of the first few terms. The number of terms needed depends on the temperature: the higher $T$, the more terms are needed.
In real life we need only the first few terms of Equation (5.5.1). Thus $q_e$ can be written as:

q_e = \omega_1 + \omega_2\, e^{-\beta\varepsilon_2} + \omega_3\, e^{-\beta\varepsilon_3} + \cdots .    (5.5.4)
The table below lists some of the energy levels for selected monatomic gases:

Table 5.4: Monatomic Gas Electronic States. The data in this table is not complete. Lines starting with an asterisk (*) indicate the first of a number of levels with almost exactly the same energy.

  atom   electron config   term symbol   ω    E (cm^{-1})    E (J/mole)
  H      1s^1              ^2S_{1/2}     2    0.             0.
         2p^1              ^2P_{1/2}     2    82258.9206     984035.17
         2s^1              ^2S_{1/2}     2    82258.9559     984035.59
         2p^1              ^2P_{3/2}     4    82259.2865     984039.55
  *      3p^1              ^2P_{1/2}     2    97492.2130     1166265.8
  He     1s^2              ^1S_0         1    0.             0.
         1s^1 2s^1         ^3S_1         3    159855.9726    1912302.0
         1s^1 2s^1         ^1S_0         1    166277.4384    1989120.0
  *      1s^1 2p^1         ^3P_2         5    169086.7647    2022726.8
  Li     1s^2 2s^1         ^2S_{1/2}     2    0.             0.
         1s^2 2p^1         ^2P_{1/2}     2    14903.66       178287.4
         1s^2 2p^1         ^2P_{3/2}     4    14904.00       178291.4
  *      1s^2 3s^1         ^2S_{1/2}     2    27206.12       325457.5
  C      2s^2 2p^2         ^3P_0         1    0.             0.
         2s^2 2p^2         ^3P_1         3    16.40          196.2
         2s^2 2p^2         ^3P_2         5    43.40          519.2
  *      2s^2 2p^2         ^1D_2         5    10192.63       121930.93

[16] The hydrogen atom electronic wave functions in a spherical box involve Bessel functions, and the particle-in-a-box functions are not ordinary sines and cosines either, but the result is basically the same.
[17] Unless the ground state is degenerate.
Monatomic Gas Electronic States (continued)

  atom   electron config   term symbol   ω    E (cm^{-1})    E (J/mole)
  N      2s^2 2p^3         ^4S_{3/2}     4    0.             0.
         2s^2 2p^3         ^2D_{5/2}     6    19224.464      229975.65
         2s^2 2p^3         ^2D_{3/2}     4    19233.306      230081.42
  *      2s^2 2p^3         ^2P_{1/2}     2    28838.920      344990.07
  O      2s^2 2p^4         ^3P_2         5    0.             0.
         2s^2 2p^4         ^3P_1         3    158.265        1893.30
         2s^2 2p^4         ^3P_0         1    226.997        2715.49
  *      2s^2 2p^4         ^1D_2         5    15867.862      189821.77
  F      2s^2 2p^5         ^2P_{3/2}     4    0.             0.
         2s^2 2p^5         ^2P_{1/2}     2    404.10         4834.1
  *      2s^2 2p^4 3s      ^4P_{5/2}     6    102405.71      1226044.3
  Ne     2p^6              ^1S_0         1    0.             0.
         2p^5 3s                         5    134041.8400    1603496.4
         2p^5 3s                         3    134459.2871    1608490.2
  *      2p^5 3s                         5    134818.6405    1612789.0
  Na     2p^6 3s^1         ^2S_{1/2}     2    0.             0.
         2p^6 3p^1         ^2P_{1/2}     2    16956.172      202840.85
         2p^6 3p^1         ^2P_{3/2}     4    16973.368      203046.56
  *      2p^6 4s^1         ^2S_{1/2}     2    25739.991      307918.66
  Cl     3s^2 3p^5         ^2P_{3/2}     4    0.             0.
         3s^2 3p^5         ^2P_{1/2}     2    882.3515       10555.27
The energies in wavenumbers in the table above were taken from data supplied by the US National Institute of Standards and Technology (NIST) at the internet URL:

http://physics.nist.gov/PhysRefData/ASD/levels_form.html

The energies in Joules per mole were computed from the wavenumber data using 11.962656 J mol^{-1} per cm^{-1} as the conversion factor. The conversion value was computed from the data in Appendix 18.3. Some few details were taken from Norman Davidson, Statistical Mechanics, McGraw-Hill, 1962, page 110. Note that the values given in the table on page 110 of that book are now obsolete; the entries in the table above are the latest data available.
As an example, let us look at the electronic partition function for hydrogen atoms. The first two terms of the sum are:

q_e = 2 + 2e^{-984035.17/RT} + \cdots ,    (5.5.5)

where the factors of two come from the degeneracy of the levels (spin) and the energy is taken from Table 5.4 and is in joules per mole. Note that at 300 K the second term is about $10^{-171}$ in size. On the other hand, fluorine gives us:

q_e = 4 + 2e^{-4834.1/RT} + \cdots = 4 + 2e^{-1.9380625} + \cdots = 4 + 0.2880 + \cdots    (5.5.6)

at 300 K. Luckily the next level is so much higher in energy that it simply doesn't count.
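The fluorine estimate takes one line to reproduce (a sketch using the Table 5.4 energy and degeneracies):

    import math

    R, T = 8.314472, 300.0
    q_e = 4 + 2 * math.exp(-4834.1 / (R * T))   # Eq. (5.5.6)
    print(f"q_e(F, 300 K) = {q_e:.4f}")          # exponent -1.938; q_e ~ 4.29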
How does all this affect the thermodynamic functions for a monatomic ideal gas? The answer is simple. One just computes $q_e$ for the monatomic gas involved. For the ordinary monatomic gases $q_e$ is simply 1. In all other cases one defines $Q(N,V,T)$ this way:

Q(N,V,T) = \frac{1}{N!}\left(q_t\, q_e\right)^N ,    (5.5.7)

and then applies the formulas of Section 5.4 to the result.
5.6 Appendix: The Euler-Maclaurin Summation Formula

There is an answer to the problem of converting a sum to an integral (and vice versa). A standard method for doing this is called the Euler-Maclaurin summation formula. It can be shown that:

\sum_{k=a}^{b-1} f_k = \int_a^b f(x)\,dx + \sum_{k=1}^{n} \frac{B_k}{k!}\left[f^{(k-1)}(b) - f^{(k-1)}(a)\right] + R_n ,    (5.6.1)
where $f_k$ is the $k$th term in the target series that one is converting to an integral, $f(x)$ is the corresponding continuous function in the target integral, $B_k$ is the $k$th Bernoulli number, $f^{(k)}(a)$ is the $k$th derivative of $f(x)$ evaluated at $a$, and $R_n$ is the remainder term, an approximation to the terms that have been ignored. It is assumed that $a$ and $b$, as well as $n$ of course, are integers.

If $f^{(m-1)}(x)$ is the last derivative included in the summation and it goes monotonically to zero as $x$ goes to infinity, then $R_n$ is smaller than the first discarded term in the series.
The Bernoulli numbers are given in many sources and occur often in numerical analysis. The first few are given in Table 5.5.

Table 5.5: The First Few Bernoulli Numbers

  B_0 = 1         B_6 = 1/42
  B_1 = -1/2      B_8 = -1/30
  B_2 = 1/6       B_10 = 5/66
  B_4 = -1/30     B_12 = -691/2730

The Bernoulli numbers look almost as if they are getting smaller as we go on. But this is illusory. Indeed, they take on all sorts of values. For example $B_{24}$ is $-86580.25311+$, which isn't small at all. Also note that, except for $B_1$, all odd Bernoulli numbers are zero.
Thus, for our purposes, the Euler-Maclaurin summation formula becomes:

\sum_{k=a}^{b-1} f_k = \int_a^b f(x)\,dx - \frac{1}{2}\left[f(b) - f(a)\right] + \frac{1}{12}\left[f'(b) - f'(a)\right]
    - \frac{1}{720}\left[f^{(3)}(b) - f^{(3)}(a)\right] + \frac{1}{30240}\left[f^{(5)}(b) - f^{(5)}(a)\right] + \cdots .    (5.6.2)
Example 5.1:

Consider the series

S = \sum_{k=0}^{9} e^{-k} ,    (5.6.3)

which is easily summed to give the correct answer $S = 1.58190$. To use the Euler-Maclaurin formula we will need:

  function                at x = 0    at x = 10
  f(x) = e^{-x}               1        e^{-10}
  f'(x) = -e^{-x}            -1       -e^{-10}
  f^{(3)}(x) = -e^{-x}       -1       -e^{-10}
so that

\sum_{k=0}^{9} e^{-k} = \int_0^{10} e^{-x}\,dx - \frac{1}{2}\left[e^{-10} - 1\right] + \frac{1}{12}\left[-e^{-10} + 1\right] - \frac{1}{720}\left[-e^{-10} + 1\right] + R_n ,    (5.6.4)
and since $f^{(5)}(x) = -e^{-x} \to 0$ monotonically as $x \to \infty$, the error is less than the first discarded term:

\frac{1 - e^{-10}}{30240} = 3.31\times10^{-5} .    (5.6.5)
Evaluating this:

\sum_{k=0}^{9} e^{-k} = \left[-e^{-x}\right]_0^{10} + \left(\frac{1}{2} + \frac{1}{12} - \frac{1}{720}\right)\left(1 - e^{-10}\right) = 1.58187 ,    (5.6.6)

which differs from the right answer by about $3\times10^{-5}$, as predicted.
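Example 5.1 is easily verified by machine. A sketch comparing the direct sum with the Euler-Maclaurin estimate of Equation (5.6.6):

    import math

    direct = sum(math.exp(-k) for k in range(10))          # Eq. (5.6.3)
    em = (1 - math.exp(-10)) * (1 + 0.5 + 1/12 - 1/720)    # Eq. (5.6.6)
    print(direct, em, direct - em)   # 1.58190, 1.58187, difference ~3e-5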
Here's another example:[18]
Example 5.2:

Consider the sum:

S = \sum_{k=1}^{\infty} \frac{k}{(1 + \pi k)^3} .    (5.6.7)

Don't even bother trying to sum this analytically. Here's the data for the first terms of the Euler-Maclaurin expansion:

  function                                                at k = 1      at k = ∞
  f(k) = k (1 + \pi k)^{-3}                                0.014077        0
  f'(k) = (1 - 2\pi k)(1 + \pi k)^{-4}                    -0.017957        0
  f^{(3)}(k) = 12\pi^2 (3 - 2\pi k)(1 + \pi k)^{-6}       -0.077050        0
  f^{(5)}(k) = 360\pi^4 (5 - 2\pi k)(1 + \pi k)^{-8}      -0.519819        0
Then:[19]

S = \sum_{k=1}^{\infty} \frac{k}{(1 + \pi k)^3}    (5.6.8)
  = \int_1^{\infty} \frac{x\,dx}{(1 + \pi x)^3} - \frac{B_1}{1!}\,\frac{1}{(1 + \pi)^3} - \frac{B_2}{2!}\,\frac{1 - 2\pi}{(1 + \pi)^4} - \frac{B_4}{4!}\,\frac{12\pi^2 (3 - 2\pi)}{(1 + \pi)^6} + \cdots .
The integral can be done. It is:

\int \frac{x\,dx}{(1 + \pi x)^3} = -\frac{1 + 2\pi x}{2\pi^2 (1 + \pi x)^2} ,    (5.6.9)
[18] One may certainly wonder why so much time is being spent on a curious but practically unknown formula. The answer is that in doing theoretical work one can go for a very long time without needing the Euler-Maclaurin summation formula. But when you do need it, you need it very badly. So the author feels that it is worth a few moments of your time!
[19] The author admits to using a computer calculus program to do the differentiations.
so the sum can then be evaluated:

S = 0.0215108 + 0.0070383 + 0.0014964 - 0.0001070 = 0.0299385 .    (5.6.10)

Since the terms in the sum go to zero as the number of terms increases without limit, we can estimate the error. It will be less than the first term neglected. That term is:

R_n = \frac{1}{30240}\left[f^{(5)}(\infty) - f^{(5)}(1)\right] = -\frac{1}{30240}\,\frac{360\pi^4 (5 - 2\pi)}{(1 + \pi)^8} = 1.72\times10^{-5} .    (5.6.11)

So we would expect the sum to lie somewhere between $0.0299385 \pm 0.0000172$.
Using a computer to do the actual summation, and denoting the partial sum of the first $n$ terms by $S_n$:

  S_10    = 0.027019
  S_20    = 0.028418
  S_40    = 0.029168
  S_100   = 0.029635
  S_200   = 0.029794
  S_400   = 0.029874
  S_1000  = 0.029922
  S_4000  = 0.029947
  S_10000 = 0.029951

Partial Sums of Equation (5.6.7)
It is clear that the integral alone (0.0215108) is not a very good representation of the sum, but that the Euler-Maclaurin summation formula does exceptionally well in estimating the sum using only four terms.
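A sketch that reproduces both sides of the comparison above, brute-force partial sums against the four-term Euler-Maclaurin estimate:

    import math

    def partial(n):
        """Partial sum of Eq. (5.6.7) over the first n terms."""
        return sum(k / (1 + math.pi * k) ** 3 for k in range(1, n + 1))

    for n in (10, 100, 10000, 100000):
        print(n, f"{partial(n):.7f}")
    # The partial sums creep upward toward about 0.02995, inside the window
    # 0.0299385 +/- 0.0000172 estimated from the Euler-Maclaurin result.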
6. Ideal Polyatomic Gases

6.1 Introduction

An atom has three degrees of freedom. That's a shorthand way of saying that it can move in any combination of three independent dimensions. Thus three coordinates are needed to specify its position.

Two such atoms have a total of six degrees of freedom, three for each atom. A molecule containing $n$ atoms has $3n$ degrees of freedom.

The presence of chemical bonds between atoms does not change the number of degrees of freedom. Think first of a diatomic molecule. There are still six degrees of freedom, because it still takes three coordinates to specify the actual position of each particle.
If we make use of a notion from mechanics we can gain an important insight. We (mentally) change coordinates from those that describe the position of each particle in the diatomic molecule to a set of three coordinates describing the position of the center of mass of the molecule and a set of three describing the orientation of the molecule about its center of mass.

It can be shown that this can always be done, no matter how many atoms are present in the molecule or how they are bonded together. So in general we have three coordinates specifying the position of the center of mass and $3n - 3$ specifying the positions of the $n$ atoms about the center of mass.

The motion of the center of mass is independent of any motions about the center of mass. This is easy to see, since the relative positions contain no reference to the center of mass at all, and vice versa.

As a result we can always separate out the motion of the center of mass of any polyatomic molecule. This motion is the translational motion of the molecule as a whole, and we've already seen how to compute the translational energy and thermodynamics of the center of mass. It is identical to that for a single atom, treated in Chapter 5.

Thus a polyatomic molecule has $3n - 3$ internal degrees of freedom. Of these, only three or fewer are rotational degrees of freedom. The remainder are all vibrational degrees of freedom.
A molecular rotation is a rotation about an axis through the center of mass of the molecule. If the axis did not go through the center of mass, rotation would move the center of mass, and that's forbidden since we consider any motion of the center of mass as a translational motion.

A molecule can have no more than three independent rotational axes through the center of mass. In general any arbitrary rotation can be decomposed into rotations about these three axes. These axes are chosen to be the principal axes of the molecule. Those are axes about which the moment of inertia is a minimum. They can always be found, though we will not have to do that here.

However, linear molecules have fewer rotational axes. For example a diatomic linear molecule (like H₂) or a polyatomic linear molecule (like acetylene, HCCH) has only two. The principal axes of a linear molecule are of necessity one along the bond axis of the molecule, and the other two are any two axes through the center of mass at right angles to each other and at right angles to the bond axis.

The rotational axis along the bond axis is, in quantum mechanics, not a physical rotational axis.
Any rotation around that axis results in no change whatsoever to the molecule, its wave function, or its properties. This is because atoms can have no markings, and the simple rotation of an atom can't be distinguished from no rotation at all, even in principle.[1]

Thus linear molecules have only two rotational degrees of freedom; all others have three. And accounting for the three translational degrees of freedom, there are then $3n - 6$ or $3n - 5$ vibrational degrees of freedom, depending on the shape of the molecule.

So polyatomic molecules differ from atoms in that atoms have only translational degrees of freedom,[2] while a polyatomic molecule can have all of:

1. translational energy
2. rotational energy
3. vibrational energy

If we assume that these energies are additive, that is, that the particle energy is made by adding the energies of each of these separate contributions:[3]
\varepsilon = \varepsilon_{\text{translation}} + \varepsilon_{\text{rotation}} + \varepsilon_{\text{vibration}} ,    (6.1.1)

then we can treat each of these forms of energy separately and add the results.
It is instructive to think about a typical potential energy for a diatomic molecule.[4] It is characterized by a steeply rising part as the atoms get close together, a well around the bond length, and a rising energy as the atoms are separated.

The zero of energy for a polyatomic molecule is the state in which the atoms are infinitely separated. The depth of the potential energy well is measured down from this zero and is usually denoted by the symbol $D_e$. The reader needs to be warned that tables typically list $D_e$ as a positive quantity. Nevertheless, it represents an energy less than zero.

The quantity $r_o$ is defined as the equilibrium internuclear bond distance, again always taken as a positive number.

This model makes an implicit assumption that the electrons in a polyatomic molecule are always at equilibrium no matter how the nuclei move. This is based on the idea that the electrons are very light particles and can therefore adjust rapidly enough to stay in their equilibrium states. It is called the Born-Oppenheimer approximation. Assuming that this approximation holds[5] allows us, to a very good approximation, to consider that the electrons are always in equilibrium with whatever positions the nuclei happen to have at any particular moment. If this is true, then in vibration there is no distortion of the electronic energy of the molecule.
A potential energy diagram for a diatomic molecule can then be constructed[6] by fixing the internuclear distance at some separation $r$ and solving the resultant quantum mechanical problem. Another value of $r$ is then fixed and the solution repeated, etc. In each of these cases the electrons are allowed to find their equilibrium positions. A graph of the resultant energy versus internuclear separation gives a diagram of the potential energy of the diatomic molecule.

[Figure 6.1: Typical Potential Energy for a Diatomic Molecule]

[1] In other words, this is a quantum mechanical effect. Classical mechanics always implicitly assumes that any particle can be marked.
[2] Electronic energies are not considered degrees of freedom in this sense.
[3] This does not have to be true. In fact, vibrational and rotational energies interact, since rapid rotation stretches the bond and increases the bond length, and higher vibrational levels change the moment of inertia of the molecule because the vibrations are only approximately harmonic. However, the translational motion is rigorously always separate.
[4] We will use the results of this as a model for the vibrations of a polyatomic molecule, though those get a bit more complicated, as we shall see.
[5] It doesn't always, and it is an approximation in any case. Doing quantum mechanics without it is a formidable task.
[6] This can be considered a good approximation to the potential energy between bonded atoms in a polyatomic molecule. Drawing a full potential energy representation for a polyatomic molecule would require graph paper with far more dimensions than we have available.
With these assumptions, Equation (6.1.1) then holds.

Assuming that the polyatomic molecule does not interact with any other molecules in the system, the intermolecular potential energy is identically zero at all distances. That being assumed, we can then factor the canonical partition function $Q$ into independent molecular parts $q$ such that once again:

Q(N,V,T) = \frac{1}{N!}\,q(V,T)^N ,    (6.1.2)

with now
q(V,T) = \sum_j e^{-\beta\varepsilon_j} = \sum e^{-\beta(\varepsilon_t+\varepsilon_r+\varepsilon_v)} = \sum e^{-\beta\varepsilon_t} e^{-\beta\varepsilon_r} e^{-\beta\varepsilon_v} = \sum e^{-\beta\varepsilon_t} \sum e^{-\beta\varepsilon_r} \sum e^{-\beta\varepsilon_v} = q_t\, q_r\, q_v ,    (6.1.3)

where only $q_t$ is a function of the volume. The others are functions of temperature alone.[7]
This separation has a further effect on the calculation of the thermodynamics of a polyatomic molecule. Given Equation (6.1.2) we then get:

\ln Q(N,V,T) = -\ln N! + N\ln q = -\ln N! + N\ln q_t + N\ln q_r + N\ln q_v .    (6.1.4)

As a result the total energy of a polyatomic molecule also is, not unexpectedly, a sum, such that:

\langle\varepsilon\rangle = \langle\varepsilon_t\rangle + \langle\varepsilon_r\rangle + \langle\varepsilon_v\rangle ,    (6.1.5)
[7] Unless the volume of the container is so small that it interferes with vibration or rotation or both. In that case we are not in a classical situation and quantum statistics must be used.
so that it makes sense to talk about the rotational energy or the vibrational energy of a polyatomic molecule. But this is an idealization, since in reality these energies, especially at higher temperatures, are not really independent.

By the way, it is customary to include the $1/N!$ factor with the translational partition function. Thus we write $q_t^N/N!$ for its contribution to $Q$.

We've already computed $q_t$ in Chapter 5 and so don't have to repeat that. Of course the mass $m$ in Chapter 5 becomes the total mass of the molecule.

It remains now to find $q_r$ and $q_v$.
6.2 Vibration

6.2.1 Vibration in Diatomic Molecules

Here we discuss vibration in the diatomic molecule. This easily generalizes to vibration in polyatomic molecules.

The actual potential energy in a diatomic molecule is not symmetric around the equilibrium bond length. But the bottom of the potential well can be matched to a parabola around $r_o$, the location of the potential minimum. One does this by creating a Taylor series expansion of the potential around $r_o$. If $U(r)$ is the potential energy:

U(r) = U(r_o) + \left.\frac{dU}{dr}\right|_{r_o}(r - r_o) + \frac{1}{2!}\left.\frac{d^2U}{dr^2}\right|_{r_o}(r - r_o)^2 + \cdots .    (6.2.1)
If we assume that the minimum energy is zero, then the first term on the right in Equation (6.2.1) is zero. The first derivative in the expansion is also zero, since the potential has zero slope there. The second derivative is just a constant, the curvature of the potential curve at its minimum. If we truncate the series at this point we have a formula quadratic in $(r - r_o)$, of the same sort as in the harmonic oscillator:

U(r) = \frac{1}{2}\left.\frac{d^2U}{dr^2}\right|_{r_o}(r - r_o)^2 .    (6.2.2)
If there is a potential minimum at all,[8] this approximation can always be made, though to what extent it will be accurate is subject to analysis in each case in which it is applied.

Now that we've assumed that the potential is harmonic, we know the quantum mechanical result. There is an infinite set of non-degenerate vibrational quantum states with energies:

\varepsilon_n = (n + 1/2)\,h\nu ,    (6.2.3)

where $n = 0, 1, 2, \ldots$ and $\nu$ is the classical fundamental vibrational frequency given by

\nu = \frac{1}{2\pi}\left(\frac{k}{\mu}\right)^{1/2} ,    (6.2.4)
where $\mu$, the reduced mass, is:

\frac{1}{\mu} = \frac{1}{m_1} + \frac{1}{m_2} ,    (6.2.5)

with $m_1$ and $m_2$ the masses of the two atoms, and $k$ the classical spring constant.
[8] And there must be, in order to have a bond and to have vibration.
In the formulas above, the vibrational energies are measured with reference to the potential energy minimum taken as zero. This is not suitable for statistical mechanical calculations in polyatomic molecules, because the zero of energy for a molecule is taken as the atoms separated by enough distance to make the interactions between them zero. This matters because in any molecule there will be more than one vibration, and there is no guarantee that all the vibrations will have the same energy minimum.[9] So some other zero point must be chosen. This is all right because, as we know, the zero point for energy is arbitrary. Thus the real potential energy that we want is

U(r) = -D_e + \frac{1}{2}\left.\frac{d^2U}{dr^2}\right|_{r_o}(r - r_o)^2 ,    (6.2.6)

where $D_e$ is the (hypothetical) bottom of the potential well for the vibrational potential energy. Here $D_e$ is invariably given as a positive energy although it is meant to be an energy below the zero mark, which is why it has a negative sign in Equation (6.2.6).

However, $D_e$ is not experimentally detectable.[10] What is detectable is $D_o$, the energy from the ground vibrational state up to the zero of energy. The difference is, of course:

D_e = D_o + \frac{1}{2}h\nu .    (6.2.7)
For now we will simply compute the "pure" vibrational partition function and worry about zero points of energy later. We could include them here, but the energy shift to the proper zero point is just a constant, and the result obtained here is useful later.

[Figure 6.2: Typical Potential Energy for a Diatomic Molecule showing the difference between D_o and D_e.]
The vibrational partition function $q_v$ is:

q_v = \sum_{n=0}^{\infty} e^{-\beta\varepsilon_n} = \sum_{n=0}^{\infty} e^{-\beta(n+1/2)h\nu} ,    (6.2.8)

which can be re-arranged to be

q_v = e^{-\beta h\nu/2} \sum_{n=0}^{\infty} e^{-n\beta h\nu} .    (6.2.9)
[9] Except in diatomics where, as we will see, there is only a single vibrational mode.
[10] All one sees spectroscopically is the differences between energy levels.
Now if we let $x = \exp(-\beta h\nu)$, this becomes

q_v = e^{-\beta h\nu/2} \sum_{n=0}^{\infty} x^n ,    (6.2.10)

where the sum, a geometric series, is easily seen to be $1/(1-x)$ if $|x| < 1$. And $|x| < 1$, since $\nu$, $h$, and $\beta$ are all intrinsically positive numbers.

It then follows that $q_v$ is:

q_v = \frac{e^{-\beta h\nu/2}}{1 - e^{-\beta h\nu}} .    (6.2.11)
This can be put into a somewhat more useful form by grouping the constants. Define $\Theta_v$ by:

\Theta_v = \frac{h\nu}{k} ,    (6.2.12)

which has the units of Kelvins. In these terms we have for Equation (6.2.11):

q_v = \frac{e^{-\Theta_v/2T}}{1 - e^{-\Theta_v/T}} .    (6.2.13)

Of course $\Theta_v$ is another characteristic temperature. (See, for example, Equations (5.4.5) on page 48 and (5.4.6) on page 49.) As long as $T$ is small compared to $\Theta_v$, virtually all molecules remain in the vibrational ground state. So it is useful to look at a small table of $\Theta_v$ for various diatomic molecules, such as Table 6.1.
Table 6.1: Θ_v for Selected Diatomic Molecules

  Species   Θ_v (K)
  H₂         6215
  HCl        4227
  N₂         3374
  CO         3100
  Cl₂         810
  I₂          310
The probability of finding a molecule in vibrational state $n$ is

p_n = \frac{e^{-\beta\varepsilon_n}}{q_v} = \frac{e^{-\beta(n+1/2)h\nu}}{q_v} ,    (6.2.14)

and so the probability of finding a molecule in the $n = 0$ (ground) state is:

p_0 = \frac{e^{-\beta h\nu/2}}{q_v} = \frac{e^{-\Theta_v/2T}}{q_v} .    (6.2.15)

With $q_v$ given by Equation (6.2.13), this becomes

p_0 = 1 - e^{-\Theta_v/T} ,    (6.2.16)
and so the probability of finding a molecule in any vibrational state greater than the ground state is then:

p_{(n>0)} = e^{-\Theta_v/T} ,    (6.2.17)

hence the importance of the characteristic temperature $\Theta_v$. For example, in hydrogen, with $\Theta_v = 6215$ K, only about $1\times10^{-7}$ % of the molecules at 300 K are not in the ground state. Even at 1000 K only about 0.2% of hydrogen molecules are not in the vibrational ground state.

For iodine the situation is different. With $\Theta_v = 310$ K, at 300 K about 36% of the molecules are in elevated vibrational states.
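These percentages follow directly from Equation (6.2.17). A sketch using the $\Theta_v$ values of Table 6.1:

    import math

    theta_v = {"H2": 6215.0, "I2": 310.0}     # K, from Table 6.1
    for gas, th in theta_v.items():
        for T in (300.0, 1000.0):
            excited = math.exp(-th / T)       # p(n>0), Eq. (6.2.17)
            print(f"{gas} at {T:6.0f} K: fraction excited = {excited:.3e}")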
To the extent that the vibrational partition function can be treated separately, a separate vibrational energy can be found from $q_v$:

\langle\varepsilon_v\rangle = k\Theta_v\left(\frac{1}{2} + \frac{1}{e^{\Theta_v/T} - 1}\right) ,    (6.2.18)

and $\langle E_v\rangle$ can be found from Equation (6.2.18) by multiplying it by $N$, the number of diatomic molecules in the system.[11]
With the average vibrational energy per molecule known, the constant-volume vibrational heat capacity per molecule can be found from $C_V = \partial\langle\varepsilon_v\rangle/\partial T$ and is:

C_{v,V} = k\left(\frac{\Theta_v}{T}\right)^2 \frac{e^{\Theta_v/T}}{\left[e^{\Theta_v/T} - 1\right]^2} ,    (6.2.19)

which again has to be multiplied by $N$ for the heat capacity of all $N$ particles.
[Figure 6.3: Graph of C_V/k vs T/Θ_v]

It is left as an exercise for the reader to show that as the temperature increases without limit, $C_{v,V}$ in Equation (6.2.19) approaches $k$ as a limit. The physical explanation for this is that the energy levels become more and more uniformly populated as the temperature increases, so that a further increase in energy moves ever fewer particles to higher vibrational energy levels.[12]

In fact this limit is never reached because (1) the potential energy isn't really a parabola and anharmonicity becomes more and more important as $T$ increases, (2) the bond will break before the temperature becomes too high, and (3) in most cases electrons will move to higher energy levels, changing the entire potential energy surface of the diatomic molecule.

At typical laboratory temperatures most diatomic molecules are in their vibrational ground states (see Table 6.1), and their theoretical behavior in higher states is, well, theoretical.
[11] A reminder as to why this works: the molecules are independent and do not affect each other in any way.
[12] It is again left to the reader to show that in a canonical ensemble no energy level can have a higher occupation number than one with less energy.
6.2.2 Vibration in Polyatomic Molecules

Vibrations in polyatomic molecules are, in principle, not in any way different from vibrations in diatomic molecules. The only practical difference lies in the determination of the normal vibrational modes of the molecule.

These normal vibrational modes are vibrations that allow the movement of atoms within the molecule without allowing the center of mass to move. Discovering the vibrational modes is a tedious exercise in mathematics. Suffice it to say that once these are determined and the fundamental frequencies known, the rest is easy.[13]
If we assume that we have the fundamental frequencies \nu_i for the s vibrational modes, then we have, from Equation (6.2.12) on page 62, for each mode:

    \Theta_{v,i} = \frac{h\nu_i}{k} ,   (6.2.20)

and since the vibrational modes are independent, then

    q_v = \prod_{i=1}^{s} \frac{e^{-\Theta_{v,i}/2T}}{1 - e^{-\Theta_{v,i}/T}} .   (6.2.21)
The vibrational energy per molecule is

    \langle\epsilon_v\rangle = k \sum_{i=1}^{s} \left[ \frac{\Theta_{v,i}}{2} + \frac{\Theta_{v,i}}{e^{\Theta_{v,i}/T} - 1} \right] ,   (6.2.22)
and the vibrational heat capacity per molecule is:

    C_{v,V} = k \sum_{i=1}^{s} \left[ \left(\frac{\Theta_{v,i}}{T}\right)^2 \frac{e^{\Theta_{v,i}/T}}{\left(e^{\Theta_{v,i}/T} - 1\right)^2} \right] .   (6.2.23)
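Because the modes enter Equations (6.2.21)-(6.2.23) independently, the sums are trivial to evaluate numerically. A sketch follows; the three mode temperatures \Theta_{v,i} used here are illustrative placeholders of roughly water-like magnitude, not spectroscopic data:

    import math

    def vib_thermo(thetas, T):
        """Return q_v, <eps_v>/k (in K), and C_{v,V}/k per molecule,
        from Equations (6.2.21)-(6.2.23)."""
        q = 1.0
        energy_over_k = 0.0
        cv_over_k = 0.0
        for th in thetas:
            x = th / T
            q *= math.exp(-x / 2.0) / (1.0 - math.exp(-x))
            energy_over_k += th * (0.5 + 1.0 / (math.exp(x) - 1.0))
            cv_over_k += x * x * math.exp(x) / (math.exp(x) - 1.0) ** 2
        return q, energy_over_k, cv_over_k

    # three hypothetical mode temperatures, in kelvin
    print(vib_thermo([5300.0, 5160.0, 2290.0], 300.0))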
6.3 Rotation
6.3.1 Rotation in Diatomic Molecules
A diatomic molecule has only two rotational degrees of freedom. There are still three axes: two go through the center of mass and are perpendicular to the bond axis. These are real rotational axes. The third rotational axis is the bond axis itself.

However, since the atoms are considered to be spheres (or points, depending on your point of view), there is infinite rotational symmetry for any rotation about the bond axis. Thus any rotation of any magnitude at all is equivalent to no rotation, because the rotation produces no measurable change in the diatomic molecule.

Thus we are left with only two rotational degrees of freedom in a diatomic molecule.
It is easy to see that the two rotational degrees of freedom are degenerate. The two rotational axes are arbitrary in the sense that the first one can be picked to point in any direction that is perpendicular to the bond axis. Call that direction A. The other axis (call it direction B) is then determined, since it must be perpendicular to both already chosen axes. But another viewer can choose her first axis to point in the direction B, and then her second will point in the direction A. Or the choice of the first could have been in a different direction altogether. Clearly the choice of which axis is A and which is B cannot affect the description of the system, so the two rotations must be totally equivalent.

[13] Actually, determining the fundamental frequencies is a rather large problem, difficult in molecules with tens of atoms and essentially impossible in molecules like DNA because of the huge number of them!
So for rotation in a diatomic molecule we have two rotational motions which are degenerate.

If the molecule is heteronuclear, that's the end of the story. If, however, the molecule is homonuclear, the complications continue. Thus, after some general remarks, we will consider the heteronuclear case first and then deal with homonuclear diatomics.
6.3.2 Evaluation of the Rotational Partition Function
Generalities: Diatomic Molecules
There is no potential energy associated with rotation about the center of mass in a diatomic molecule.[14] The rotation is free. If rotation is independent of vibration, then the problem is one of the rotation of two atoms of masses m_1 and m_2 separated by a fixed distance.
This problem is one of the few exactly soluble problems in quantum mechanics. The results are simple: there are an infinite number of rotational quantum states indexed by a quantum number traditionally called J, where J = 0, 1, 2, \ldots. The energy in quantum state J is given by:

    \epsilon_J = \frac{h^2 J(J+1)}{8\pi^2 I} ,   (6.3.1)

where I is the moment of inertia

    I = \mu r_o^2 ,   (6.3.2)

with \mu being the reduced mass

    \frac{1}{\mu} = \frac{1}{m_1} + \frac{1}{m_2} ,   (6.3.3)

and r_o the equilibrium internuclear distance.[15]
Each of these energy levels is degenerate with a degeneracy factor g_J:

    g_J = 2J + 1 ,   (6.3.4)

with the states with odd J being anti-symmetric[16] and those with even J being symmetric.[17]
In almost all cases, rotational energies are relatively small and at any reasonable temperature most diatomic molecules are in a fairly high rotational state. This means that the rotational partition function is normally classical. But hydrogen is an exception.[18] To jump ahead of the story, there is a characteristic rotational temperature \Theta_r for each molecule. This is analogous to the vibrational characteristic temperature \Theta_v. If the actual temperature is higher than \Theta_r, the resultant partition function can be taken to be classical. Otherwise a more detailed analysis is needed. For hydrogen
[14] Indeed, in principle there is no potential energy associated with any pure rotation. However, free rotation is often inhibited due to the presence of other atoms in the same molecule. Rotation of one methyl group in ethane with respect to the other is an example. Thus rotations in polyatomic molecules often do have a potential energy associated with them.
[15] Known to chemists as the bond length.
[16] Anti-symmetric means that the wave function associated with the odd quantum number J will change sign when the molecule is rotated by 180°.
[17] Symmetric is the opposite of anti-symmetric; the wave function does not change sign on a 180° rotation.
[18] Indeed, the only exception.
\Theta_r is 85.4 K. Hydrogen is still a gas at this temperature. The next highest \Theta_r is for HCl, 15.2 K. But HCl has long since stopped being a gas, and by 15.2 K is firmly a solid.
Table 6.2 gives some values of the characteristic rotational temperatures of simple diatomic molecules.
As we should expect, the lighter the molecule, the higher the characteristic rotational temperature.
Species    \Theta_r (K)
H2         85.4
HCl        15.2
HI          9.0
N2          2.86
O2          2.07
Cl2         0.346
I2          0.054

Table 6.2: \Theta_r for Selected Diatomic Molecules
This is primarily due to the smaller moments of inertia of light molecules.
Heteronuclear Diatomic Molecules
The rotational partition function q_r, including the degeneracy of each state J, is:

    q_r = \sum_{J=0}^{\infty} (2J+1)\, e^{-J(J+1)h^2/(8\pi^2 IkT)} .   (6.3.5)
It is best to simplify this by setting \Theta_r, the characteristic rotational temperature, to be

    \Theta_r = \frac{h^2}{8\pi^2 Ik} ,   (6.3.6)
so that Equation (6.3.5) becomes

    q_r = \sum_{J=0}^{\infty} (2J+1)\, e^{-J(J+1)\Theta_r/T} ,   (6.3.7)

which, as we ought now to expect, cannot be summed in any convenient way. Thus we once more need the Euler-Maclaurin summation formula.
Function                                       at J = 0                                     at J = \infty
f(J) = (2J+1)e^{-J(J+1)s}                      1                                            0
f'(J) = [2 - (4J^2+4J+1)s]e^{-J(J+1)s}         2 - s                                        0
f^{(3)}(J)                                     -s(s^2 - 12s + 12)                           0
f^{(5)}(J)                                     -s^2(s^3 - 30s^2 + 180s - 120)               0
f^{(7)}(J)                                     -s^3(s^4 - 56s^3 + 840s^2 - 3360s + 1680)    0

Table 6.3: Derivatives for the Evaluation of the Rotational Partition Function. Note: s has been written for \Theta_r/T, and the actual equations have been omitted for the third and higher derivatives; only their values at J = 0 are shown.

Generating these derivatives is tedious.[19]

[19] I cheated. I used a computer symbolic math program called Derive, which, among other things, makes far fewer differentiating errors than I do.
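Derive is long gone, but the same labor-saving trick works with any modern symbolic package. A sketch using SymPy that reproduces the J = 0 column of Table 6.3:

    import sympy as sp

    J, s = sp.symbols('J s', positive=True)
    f = (2*J + 1)*sp.exp(-J*(J + 1)*s)

    # the odd derivatives at J = 0 needed by the Euler-Maclaurin formula
    for order in (1, 3, 5, 7):
        value = sp.simplify(sp.diff(f, J, order).subs(J, 0))
        print(order, sp.factor(value))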
The integral can be done by substituting x for J(J+1), whence we then obtain:

    \int_0^{\infty} e^{-\Theta_r x/T}\, dx = \frac{T}{\Theta_r} = \frac{8\pi^2 IkT}{h^2} .   (6.3.8)
With s = \Theta_r/T we then have for the rotational partition function:

    q_r = \frac{1}{s} + \left[\frac{1}{2}\right] + \left[-\frac{1}{6} + \frac{s}{12}\right] + \left[-\frac{s^3}{720} + \frac{s^2}{60} - \frac{s}{60}\right]
        + \left[\frac{s^5}{30240} - \frac{s^4}{1008} + \frac{s^3}{168} - \frac{s^2}{252}\right]
        + \left[-\frac{s^7}{1209600} + \frac{s^6}{21600} - \frac{s^5}{1440} + \frac{s^4}{360} - \frac{s^3}{720}\right] + \cdots .   (6.3.9)
The first term above comes from the integral and each of the others (in square brackets) from the corresponding term of the Euler-Maclaurin expansion. Note that we can see which powers of s are involved in each term in square brackets. The next term, not shown in Equation (6.3.9), would involve the ninth derivative (the eighth evaluates to zero) and would contain s to the ninth power and conclude with s to the fourth power. Thus Equation (6.3.9), as shown, contains all the terms involving s to the third power. If we truncate there and gather like powers of s together we get:
    q_r = \frac{1}{s} + \frac{1}{2} - \frac{1}{6} + \frac{s}{12} - \frac{s}{60} + \frac{s^2}{60} - \frac{s^2}{252} - \frac{s^3}{720} + \frac{s^3}{168} - \frac{s^3}{720} + \cdots ,   (6.3.10)
which reduces to:

    q_r = \frac{1}{s}\left[1 + \frac{s}{3} + \frac{s^2}{15} + \frac{4s^3}{315} + \frac{s^4}{315} + \cdots\right] ,   (6.3.11)
or, in terms of \Theta_r/T:

    q_r = \frac{T}{\Theta_r}\left[1 + \frac{1}{3}\frac{\Theta_r}{T} + \frac{1}{15}\left(\frac{\Theta_r}{T}\right)^2 + \frac{4}{315}\left(\frac{\Theta_r}{T}\right)^3 + \frac{1}{315}\left(\frac{\Theta_r}{T}\right)^4 + \cdots\right] ,   (6.3.12)
so that as the temperature increases the terms in the square brackets rapidly become small. Thus the high temperature approximation for the rotational partition function for a heteronuclear diatomic molecule is simply

    q_r = \frac{T}{\Theta_r} .   (6.3.13)
Equation (6.3.12) can be seen to be the high-temperature or classical result T/\Theta_r multiplied by a correction series. This form is very accurate if the temperature is reasonably higher than \Theta_r. In this range convergence is aided both by the powers of \Theta_r/T, which will be less than 1, and by the coefficients in front of each power, which can be shown to converge even if \Theta_r/T is one.
To see this, consider hydrogen chloride, a gas condensing to a liquid at 187.9 K. For this gas \Theta_r is 15.2 K, so at the condensation point s = 0.0809. Then

    q_r = 12.362\,[1 + 0.0270 + 4.4\times10^{-4}] = 12.697 ,

so one never needs more than the first correction term (and often not even that) for a heteronuclear diatomic molecule. Indeed HI, which has the next highest \Theta_r, has s = 0.03 at 300 K, and there the classical result, q_r = 33.33, has an error of one in the fourth significant figure.
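These numbers are easily checked by brute force: sum Equation (6.3.7) directly and compare with the truncated series of Equation (6.3.12). A sketch, with \Theta_r values from Table 6.2:

    import math

    def q_rot_direct(theta_r, T, jmax=200):
        # direct summation of Equation (6.3.7)
        return sum((2*J + 1)*math.exp(-J*(J + 1)*theta_r/T) for J in range(jmax + 1))

    def q_rot_series(theta_r, T):
        # Equation (6.3.12): the classical term times the correction series
        s = theta_r / T
        return (1.0/s)*(1 + s/3 + s*s/15 + 4*s**3/315 + s**4/315)

    print(q_rot_direct(15.2, 187.9), q_rot_series(15.2, 187.9))   # HCl: both ~12.70
    print(q_rot_direct(9.0, 300.0), q_rot_series(9.0, 300.0))     # HI: both ~33.7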
But for completeness, and to show how things are handled: below the temperature at which \Theta_r/T becomes 1, the convergence depends totally on the ratio of the coefficients to the powers of \Theta_r/T.
For that case we can develop another series, one specifically aimed at low temperatures, by simply taking Equation (6.3.7) on page 66 and writing out the first few terms:

    \sum_{J=0}^{\infty} (2J+1)\, e^{-J(J+1)\Theta_r/T} = 1 + 3e^{-2\Theta_r/T} + 5e^{-6\Theta_r/T} + 7e^{-12\Theta_r/T} + \cdots .   (6.3.14)
As T \to 0, rotation becomes frozen out, which is what we'd expect. That is, there is not enough thermal energy to kick systems into higher rotational levels.

The regions of utility of our two formulas, (6.3.12) and (6.3.14), overlap, and in the region where the temperature is near the characteristic temperature both should be used to check each other.
Homonuclear Diatomic Molecules
This case presents difficulties. For quantum mechanical reasons (discussed in boring detail in Section 6.6 on page 75), only half the possible rotational quantum numbers actually occur.[20]

There are two cases here: even rotational quantum numbers only, and odd rotational quantum numbers only. We take the case of even J first. In hydrogen, where these effects can be measured, such hydrogen molecules are called para-hydrogen, or p-hydrogen for short. To treat this we modify the equation for the rotational partition function by substituting 2k for J, thus ensuring that only even numbers can occur as we increment k.
Doing this, and also once again writing s for \Theta_r/T, we get:

    q_{r,p} = \sum_{k=0}^{\infty} (4k+1)\, e^{-2k(2k+1)s} .   (6.3.15)
We still need to use the Euler-Maclaurin summation formula. The details will not be written out here as they closely follow the heteronuclear case. The integral (which is the high-temperature limit) is:

    \int_0^{\infty} (4x+1)\, e^{-2x(2x+1)s}\, dx = \frac{1}{2s} ,   (6.3.16)

or exactly half of the high-temperature limit for the heteronuclear case.
The rotational partition function expression analogous to Equation (6.3.9) on the previous page is

    q_{r,p} = \frac{1}{2s} + \left[\frac{1}{2}\right] + \left[-\frac{1}{3} + \frac{s}{6}\right] + \left[-\frac{s^3}{90} + \frac{2s^2}{15} - \frac{2s}{15}\right]
            + \left[\frac{s^5}{945} - \frac{2s^4}{63} + \frac{4s^3}{21} - \frac{8s^2}{63}\right]
            + \left[-\frac{s^7}{9450} + \frac{4s^6}{675} - \frac{4s^5}{45} + \frac{16s^4}{45} - \frac{8s^3}{45}\right] + \cdots ,   (6.3.17)
which reduces to:

    q_{r,p} = \frac{1}{2s}\left[1 + \frac{s}{3} + \frac{s^2}{15} + \frac{4s^3}{315} + \frac{s^4}{315} + \cdots\right] ,   (6.3.18)
, (6.3.18)
which can be seen to be exactly 1/2 of Equation (6.3.11) on the preceding page. Thus, unsurprisingly
q
r,p
=
1
2
q
rot
. (6.3.19)
[20] The short explanation for this is that the overall wave function of the molecule is either symmetric or antisymmetric with respect to a 180° rotation of the molecule. The rotational wave functions are symmetric for even J and antisymmetric for odd J. Only one set can occur in a given homonuclear molecule in a given electronic state.
A low temperature series can be developed as before. We have, by expansion of Equation (6.3.15) on the previous page:

    q_{r,p} = \sum_{k=0}^{\infty} (4k+1)\, e^{-2k(2k+1)s} = 1 + 5e^{-6\Theta_r/T} + 9e^{-20\Theta_r/T} + \cdots .   (6.3.20)
The other case we have to examine is the one where only odd values of J occur. In hydrogen this is
called ortho-hydrogen or o-hydrogen for short.
We can repeat the analysis above or we can simply write down the answer by inspection. Since the two homonuclear partition functions must add up to the heteronuclear case, we have for the high-temperature partition function:

    q_{r,o} = \frac{1}{2}\, q_r ,   (6.3.21)
with the series expansion

    q_{r,o} = \frac{1}{2s}\left[1 + \frac{s}{3} + \frac{s^2}{15} + \frac{4s^3}{315} + \frac{s^4}{315} + \cdots\right] .   (6.3.22)
The only difference comes in the low-temperature expansion, which is:

    q_{r,o} = \sum_{k=0}^{\infty} (4k+3)\, e^{-(2k+1)(2k+2)\Theta_r/T} = 3e^{-2\Theta_r/T} + 7e^{-12\Theta_r/T} + 11e^{-30\Theta_r/T} + \cdots .   (6.3.23)
And so all is well from the point of view of the theoretician.

But there is one glaring problem. In the real world,[21] the zero of molecular energy in diatomics is taken as that of the lowest energy state the system can reach. And for o-hydrogen, that isn't the same zero as that for p-hydrogen: the lowest o-hydrogen state lies an energy 2k\Theta_r above the p-hydrogen ground state, which is why the leading term carries the factor \exp(-2\Theta_r/T). (Look at Equation (6.3.23)!)
So when doing low temperature calculations involving Equation (6.3.23), a factor of \exp(-2\Theta_r/T) is removed from the partition function, thus making it

    q^*_{r,o} = 3 + 7e^{-10\Theta_r/T} + 11e^{-28\Theta_r/T} + \cdots ,   (6.3.24)

where the * on the partition function indicates that it is an energy corrected partition function.
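At hydrogen-like temperatures these low-temperature series converge after only a few terms, so they are simple to evaluate directly. A sketch comparing Equations (6.3.20) and (6.3.24), with \Theta_r = 85.4 K from Table 6.2:

    import math

    THETA_R = 85.4   # K, hydrogen (Table 6.2)

    def q_para(T, kmax=20):
        # Equation (6.3.20): even-J (para) series
        return sum((4*k + 1)*math.exp(-2*k*(2*k + 1)*THETA_R/T) for k in range(kmax))

    def q_ortho_star(T, kmax=20):
        # Equation (6.3.24): odd-J (ortho) series with exp(-2*Theta_r/T) factored out
        return sum((4*k + 3)*math.exp(-((2*k + 1)*(2*k + 2) - 2)*THETA_R/T)
                   for k in range(kmax))

    for T in (20.0, 85.4, 300.0):
        print(T, q_para(T), q_ortho_star(T))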
6.3.3 Rotation in Polyatomic Molecules

Principal Axes and the Moment of Inertia Tensor

In general a molecule has three axes of rotation. These are always chosen so that they meet at the center of mass of the molecule. This ensures that rotation will not move the center of mass.[22]

The moment of inertia plays a role in rotation analogous to the role played by mass in translation. The smaller the moment of inertia about a given axis, the easier it is to accelerate the rotation about that axis.
The moment of inertia about an axis for a molecule with n atoms is given by:

    I_{xx} = \sum_{i=1}^{n} m_i (y_i^2 + z_i^2) ,   (6.3.25)
[21] The one in which experimentalists live.
[22] Otherwise rotation would move the center of mass and so rotation would not be independent of translation.
where I_{xx} is the moment of inertia about the x-axis, m_i is the mass of the ith atom, and y_i and z_i are the respective distances of the ith atom from the y and z axes.

Similarly we have I_{yy}, involving distances x_i and z_i, and I_{zz}, involving distances x_i and y_i.
However, there are other components to the inertia. These are the products of inertia I_{xy}, I_{yz}, and I_{zx}, given by:

    I_{xy} = -\sum_{i=1}^{n} m_i x_i y_i ,   (6.3.26)

and two similar equations for I_{yz} and I_{zx}. Here again x_i and y_i are the distances to the x and y axes, respectively.
The result is the moment of inertia tensor:

    I = \begin{pmatrix} I_{xx} & I_{xy} & I_{xz} \\ I_{yx} & I_{yy} & I_{yz} \\ I_{zx} & I_{zy} & I_{zz} \end{pmatrix} ,   (6.3.27)

which is symmetric since, as can be seen from Equation (6.3.26), I_{xy} = I_{yx}.
If the coordinates are rotated with respect to the molecule, the values of the elements change. Since the matrix is symmetric and real, an orientation of the coordinates can always be found that makes all the off-diagonal elements zero. The axes are then called the principal axes of the molecule.

There are three such choices, each corresponding to a different (right-handed) permutation of the coordinate names. The standard choice is that

    I_{xx} \le I_{yy} \le I_{zz} .   (6.3.28)
The Rotational Partition Functions
In general, as we've noted before, linear polyatomic molecules have two rotational degrees of freedom, non-linear polyatomic molecules have three.

In the case of non-linear polyatomic molecules, there are three moments of inertia. The standard notation has them arranged in the following order:[23]

    I_A \le I_B \le I_C .   (6.3.29)
These are related to the spectroscopic symbols \tilde{A}, \tilde{B}, and \tilde{C} by:

    \tilde{A} = \frac{h}{8\pi^2 c I_A} , \quad \tilde{B} = \frac{h}{8\pi^2 c I_B} , \quad \tilde{C} = \frac{h}{8\pi^2 c I_C} ,   (6.3.30)

where c is the speed of light.
Thus relationship (6.3.29) can be written:

    \tilde{A} \ge \tilde{B} \ge \tilde{C} .   (6.3.31)

Keeping this in mind, we have the standard definitions given in Table 6.4.

A prolate spheroid is one that is (American) football shaped. An oblate spheroid is one that is a slightly squashed basketball.
[23] Here I've used A, B, and C instead of x, y, and z, to follow the standard notation.
I_A = I_B = I_C    symmetric top             \tilde{A} = \tilde{B} = \tilde{C}
I_A = I_B < I_C    oblate spheroidal top     \tilde{A} = \tilde{B} > \tilde{C}
I_A < I_B = I_C    prolate spheroidal top    \tilde{A} > \tilde{B} = \tilde{C}
I_A < I_B < I_C    asymmetric top            \tilde{A} > \tilde{B} > \tilde{C}

Table 6.4: Names for Rigid Rotators
The energy state structure and degeneracies for the first three types can be found fairly easily from quantum mechanics.[24]
Symmetric Top:

    \epsilon_J = \tilde{B}J(J+1) , \quad g_J = (2J+1)^2 , \quad J = 0, 1, 2, \ldots .   (6.3.32)

Equation (6.3.32) also holds for linear molecules but with degeneracy g_J = 2J + 1.
Oblate:

    \epsilon_{J,K} = \tilde{B}J(J+1) + (\tilde{C} - \tilde{B})K^2 , \quad g_J = 2J+1 ,
    J = 0, 1, 2, \ldots , \quad K = 0, \pm 1, \pm 2, \ldots , \pm J ,   (6.3.33)

and
Prolate:

    \epsilon_{J,K} = \tilde{B}J(J+1) + (\tilde{A} - \tilde{B})K^2 , \quad g_J = 2J+1 ,
    J = 0, 1, 2, \ldots , \quad K = 0, \pm 1, \pm 2, \ldots , \pm J .   (6.3.34)
Note that these have two associated quantum numbers, J and K, both of which affect the actual energy.

The case of the asymmetric top is different. There is no simple formula for the energy levels,[25] and energy level calculations have to be done for each specific molecule.
The canonical rotational partition functions for the four varieties of rotator presented here (the linear molecule has already been discussed) are simple in the high temperature limit. The first three can easily, and the asymmetric top with some difficulty, be shown to give:

    q_r(T) = \frac{\pi^{1/2}}{\sigma} \left(\frac{T}{\Theta_{r,A}}\right)^{1/2} \left(\frac{T}{\Theta_{r,B}}\right)^{1/2} \left(\frac{T}{\Theta_{r,C}}\right)^{1/2} ,   (6.3.35)
where the \Theta_r are the rotational characteristic temperatures defined by

    \Theta_{r,A} = \frac{h^2}{8\pi^2 k I_A} ,   (6.3.36)

and where A could be A, B, or C, with a corresponding change in the moment of inertia around that axis, I_A, I_B, or I_C. The equation contains a \sigma, which is the usual symmetry number.
In using Equation (6.3.35) one must remember to make the corresponding \Theta s equal if the corresponding moments of inertia I are the same. Thus the symmetric top, with all moments of inertia the same, becomes

    q_r(T) = \frac{\pi^{1/2}}{\sigma} \left(\frac{T}{\Theta_{r,A}}\right)^{3/2} .   (6.3.37)
[24] Though most authors are reluctant to discuss the general cases, Kenneth S. Pitzer, in Quantum Chemistry, Prentice-Hall, 1953, does give two references: Reiche, F. and Rademacher, H., Z. Physik, 39, 444 (1926); and Kronig, R. de L. and Rabi, I. I., Phys. Rev., 29, 262 (1927). The latter paper is in English and both use Schrödinger methods.
[25] Kenneth S. Pitzer, Quantum Chemistry, Prentice-Hall, 1953, cites King, G. W., Hainer, R. M., and Cross, P. C., J. Chem. Phys., 11, 27 (1943).
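In practice Equation (6.3.35) is a one-line computation once the three rotational temperatures and the symmetry number are known. A sketch; the numerical values below are illustrative placeholders, not data for any particular molecule:

    import math

    def q_rot_poly(T, theta_a, theta_b, theta_c, sigma):
        # high-temperature rotational partition function, Equation (6.3.35)
        return (math.sqrt(math.pi)/sigma)*math.sqrt((T/theta_a)*(T/theta_b)*(T/theta_c))

    # hypothetical rotational temperatures (K) and symmetry number
    print(q_rot_poly(300.0, 40.1, 20.9, 13.4, 2))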
6.4 The Electronic Partition Function in Polyatomic Molecules
The electronic partition function for polyatomic molecules is, unsurprisingly:

    q_e = \left[\omega_1 + \omega_2 e^{-\beta\epsilon_2} + \omega_3 e^{-\beta\epsilon_3} + \cdots\right] e^{-\beta D_e} ,   (6.4.1)

where \omega_j is the degeneracy of the jth electronic state and \epsilon_j is the corresponding electronic energy. The quantity D_e is the (negative) binding energy of the molecule.[26]
As is well-known, it is not possible to solve the correct quantum mechanical equations for any polyatomic molecule. Of course electronic energy levels for such molecules can be approximately computed in various ways. But it is far simpler (and more accurate) to use energies obtained by spectroscopy.
6.5 The Thermodynamics of Polyatomic Molecules
The following section is probably redundant, but it is useful to have all of these formulas together
in one spot.
The partition function for a polyatomic molecule is not complex, but it is quite long.[27] It is best to consider q for an ideal polyatomic molecule in the rigid-rotor harmonic oscillator approximation as a product of partition functions for each separate type of motion. Thus
    q = q_t\, q_v\, q_r\, q_e ,   (6.5.1)

where the terms are, from left to right, the translational, vibrational, rotational, and electronic molecular partition functions.
In these terms we can write:

    Q = \frac{q_t^N}{N!}\, q_v^N\, q_r^N\, q_e^N .   (6.5.2)
6.5.1 Translation
We have for translation:[28]

    q_t = \left(\frac{2\pi MkT}{h^2}\right)^{3/2} V ,   (6.5.3)
and, using obvious notations:

    A_t(N, V, T) = -NkT \ln\frac{q_t\, e}{N} .   (6.5.4)
Then, using Equation (6.5.3),

    E_t = \frac{3}{2}NkT , \qquad C_{V,t} = \frac{3}{2}Nk ,   (6.5.5)

    p = \frac{NkT}{V} .   (6.5.6)
[26] Remember: as is true for diatomic molecules, the zero of energy for polyatomic molecules is taken to be the energy of the separated atoms. This energy is negative in magnitude.
[27] Indeed, the author used to love arranging his lectures so as to leave the completely written out partition function on the classroom blackboard for the edification of the class next using the room.
[28] This is the classical result. Of course the volume would have to be very small or the temperature microscopic to have to use the appropriate quantum expression.
Note that the pressure does not have a subscript because V occurs only in the translational partition function. Hence neither vibration nor rotation (nor electronic excitation) affects the pressure of an ideal gas.
Also:

    \mu_t = -kT \ln\frac{q_t}{N} ,   (6.5.7)
and[29]

    S_t = Nk \ln\left[\left(\frac{2\pi MkT}{h^2}\right)^{3/2} \frac{V e^{5/2}}{N}\right] .   (6.5.8)
6.5.2 Vibration
The vibrational partition function q_v is given by:

    q_v = \prod_{j=1}^{r} \frac{e^{-\Theta_{v,j}/2T}}{1 - e^{-\Theta_{v,j}/T}} ,   (6.5.9)

where r is either 3n - 5 or 3n - 6, depending on whether the molecule's n atoms are in a line or not, respectively.
Then:

    A_v = -NkT \ln q_v ,   (6.5.10)

    E_v = Nk \sum_{j=1}^{r} \left[\frac{\Theta_{v,j}}{2} + \frac{\Theta_{v,j}\, e^{-\Theta_{v,j}/T}}{1 - e^{-\Theta_{v,j}/T}}\right] ,   (6.5.11)
and

    C_{V,v} = Nk \sum_{j=1}^{r} \left[\left(\frac{\Theta_{v,j}}{T}\right)^2 \frac{e^{-\Theta_{v,j}/T}}{\left(1 - e^{-\Theta_{v,j}/T}\right)^2}\right] ,   (6.5.12)

where \Theta_{v,j} = h\nu_j/k.
There is no pressure due to vibration as vibration does not depend on volume. The vibrational contribution to the chemical potential is:

    \mu_v = -kT \ln q_v ,   (6.5.13)
and the entropy due to vibration is:[30]

    S_v = Nk \sum_{j=1}^{r} \left[\frac{\Theta_{v,j}/T}{e^{\Theta_{v,j}/T} - 1} - \ln\left(1 - e^{-\Theta_{v,j}/T}\right)\right] .   (6.5.14)
6.5.3 Rotation
Since rotation is almost always classical, the partition function is simple:

    q_r = \frac{\pi^{1/2}}{\sigma} \left(\frac{T^3}{\Theta_A \Theta_B \Theta_C}\right)^{1/2} ,   (6.5.15)
[29] Derivation of this formula is left to the reader.
[30] And so is the derivation of this one.
where \sigma is the symmetry number of the molecule and the \Theta s are the rotational temperatures. Two or even all three \Theta s may be the same in a given molecule. In the case of a linear molecule, one \Theta will be missing.

In these terms the thermodynamic functions are simple. The Helmholtz free energy contribution from rotation is A_r = -NkT \ln q_r, and the rotational energy is:

    E_r = \frac{3}{2}NkT .   (6.5.16)
This is E_r = NkT in the case of a linear molecule due to the loss of one rotational degree of freedom.

The heat capacity due to rotation is:

    C_{V,r} = \frac{3}{2}Nk ,   (6.5.17)

or, for a linear molecule, C_{V,r} = Nk.
The chemical potential for rotation is, as usual:

    \mu_r = -kT \ln q_r .   (6.5.18)
The entropy contribution from rotation is:

    S_r = Nk \ln\left[\frac{\pi^{1/2} e^{3/2}}{\sigma} \left(\frac{T^3}{\Theta_A \Theta_B \Theta_C}\right)^{1/2}\right] .   (6.5.19)
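As a sample of how these formulas are used, the sketch below evaluates the translational entropy of Equation (6.5.8) in SI units. The molecular mass is that of an N2-like molecule and the state point (one atmosphere, 300 K) is chosen only for illustration:

    import math

    K_B = 1.380649e-23   # J/K
    H = 6.62607015e-34   # J s

    def s_trans_per_molecule(M, T, V, N):
        """Translational entropy per molecule, in units of k, Equation (6.5.8)."""
        return math.log((2.0*math.pi*M*K_B*T/H**2)**1.5 * V * math.exp(2.5) / N)

    N = 6.02214076e23               # one mole of molecules
    T = 300.0
    V = N*K_B*T/101325.0            # volume from pV = NkT, about 24.6 liters
    M = 28.0e-3/N                   # mass of one N2-like molecule, kg
    print(s_trans_per_molecule(M, T, V, N))   # ~18.1, i.e. S_t of ~150 J/(mol K)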
6.6 Appendix: Homonuclear Rotation
6.6.1 Singlets and Triplets
From quantum mechanics we have the following rules:

1. All nuclei have spin quantum numbers of magnitude J and degeneracy 2J + 1.
2. Allowed values for spin:
   (a) The allowed values of J for bosons are the integers J = 0, 1, \ldots.
   (b) The allowed values of J for fermions are the half-integers J = 1/2, 3/2, \ldots.
3. Interchange of particles:
   (a) In any molecule, interchange of symmetrically placed bosons leaves the wave function unchanged, i.e. \psi \to \psi.
   (b) In any molecule, interchange of symmetrically placed fermions results in the wave function changing sign, i.e. \psi \to -\psi.
If we apply this to nuclei we then have the following rules:

The overall wave function for a homonuclear diatomic molecule whose nuclei are fermions, e.g. H2, will be antisymmetric with respect to a rotation of 180°.

The overall wave function for a homonuclear diatomic molecule whose nuclei are bosons, e.g. D2, will be symmetric with respect to a rotation of 180°.

Note that D2 stands for deuterium.
Now let's think about a homonuclear diatomic molecule where the nuclei are fermions. If we interchange two electrons in H2, the result must change the sign of the overall wave function. So:

    \psi \to -\psi .   (6.6.1)

But the translational, rotational, vibrational, and nuclear spin wave functions are unchanged by this. So the overall electron wave function must change sign:

    \psi_e \to -\psi_e .   (6.6.2)
Now the wave function for the electron is itself composed of two parts, \psi_{orbit} and \psi_{spin}:

    \psi_e = \psi_{orbit}\, \psi_{spin} .   (6.6.3)
Thus, since the interchange must be antisymmetric because electrons are fermions:

    if \psi_{orbit} is antisymmetric then \psi_{spin} must be symmetric;
    if \psi_{orbit} is symmetric then \psi_{spin} must be antisymmetric.
For H2, the ground state \psi_{orbit} is symmetric and so \psi_{spin} must be antisymmetric. The two electrons then have a net spin of zero and the spin degeneracy is \omega_{spin} = 2J + 1 = 1. This state is called a singlet.

On the other hand, the first excited state has \psi_{orbit} antisymmetric, so \psi_{spin} must be symmetric, the electron spins are parallel, and the net spin is 1. Hence the spin degeneracy is \omega_{spin} = 2J + 1 = 3. This state is called a triplet.
6.6.2 Rotational Symmetry
Let us consider a molecule made up of two identical atoms. The rules for the construction of allowed wave functions for this molecule are as given in 6.6.1. Thus the overall wave function for fermions must be antisymmetric while that for bosons must be symmetric.

In the first approximation, the wave function of such a system is the product of translational, vibrational, electronic, rotational, and nuclear wave functions. The exact wave function may have a slightly different form than this product, but no perturbation which may have been neglected in this factoring can change the symmetry character of the overall wave function.[31]
The coordinates of the center of mass of the molecule are unchanged by an interchange of two identical nuclei, so the translational wave function must also be completely unchanged (symmetric) with this interchange.

The vibrational coordinate is also unaffected by the interchange of the nuclei. The vibrational wave function is also then symmetric with regard to the exchange of nuclei.

The electronic wave function is already antisymmetric with regard to the interchange of electrons. It may be either symmetric or antisymmetric with regard to the interchange of nuclei. For most diatomics, the ground state electronic wave function is symmetric with regard to nuclear interchange.

Thus each of these three functions is unaffected by the exchange of nuclei, so their product must also be unaffected by this permutation.

The symmetry character of the wave functions for the entire molecule will be that of the product of the rotational and nuclear spin wave functions.
What happens to the rotational wave function if nuclei are exchanged?

It turns out that this is a fairly easy question to answer, as the rotational wave function contains sines and cosines of the angular coordinates. Simple examination shows that rotational wave functions having quantum number J even are symmetric with respect to nuclear interchange, while rotational wave functions having quantum number J odd are antisymmetric with regard to nuclear interchange.

The next task is to examine the symmetry character of the nuclear spin wave functions.

If the magnitude of the spin (in units of \hbar) is I, then there are 2I + 1 possible spin wave functions. For two particles there are (2I + 1)^2 possible spin functions.

Of the total it can be shown that (I + 1)(2I + 1) are symmetric and I(2I + 1) are antisymmetric.

If one considers the simple case of I = 0, then only one spin function exists for each nucleus and consequently only one for the molecule. This single function is symmetric. Most diatomic molecule ground states are singlets and have I = 0.
If the nuclei are bosons, the total wave function for the molecule must be
symmetric, and with a symmetric electronic wave function, only symmetric
rotational wave functions will be allowed. Thus only even values of J will
be allowed.
[31] Most of the material here is simply presented as fact and no proof is offered. Details can be found in any reasonable quantum mechanics text.
If the nuclei are fermions, the total wave function for the molecule must be antisymmetric, and with an antisymmetric wave function only antisymmetric rotational wave functions will be allowed. Thus only odd values of J will be allowed.
With antisymmetric electronic wave functions, as happens in the case of electronic triplets,[32] the situation would be reversed and only odd J's would appear with bosons and even J's with fermions.
In any case only half of the possible rotational energy levels will exist.

Thus, for determining rotational energies, the nuclear spin wave functions and the rotational wave functions are inextricably linked.

[32] Which are often the first excited states of diatomic molecules.
7. The Grand Canonical Ensemble
7.1 Introduction
We have looked at systems with independent variables N, V, and E (the microcanonical ensemble), and systems with independent variables N, V, and T (the canonical ensemble). These are far from the only possible ensembles.

Indeed, there are six common (but not equally useful) ensembles for single component, single phase systems alone. If more components are added, the number of possible ensembles increases dramatically.
This chapter discusses an ensemble that allows the number of particles in a system to vary. This is
often quite useful, as shall be seen.
7.2 The Equations for the Ensemble
If we pick \mu, V, and T as the independent variables, the resulting ensemble is called the grand canonical ensemble.

Clearly this is an open system since the number of particles is allowed to vary. But it is a feasible ensemble and systems with such independent variables can be constructed experimentally. The walls need to be fixed and rigid as well as diathermal. And the walls also have to be permeable to the particles that make up the system. Perforated steel sheeting would suit admirably, as would a number of other choices.

To control the temperature and the chemical potential we need both a constant temperature bath and a bath at a constant chemical potential. These can even be combined. It isn't always easy to control chemical potential, but it can be done in several ways. The most obvious is to place our system into a huge bath, much larger than the system, containing the same particles that are in the system but at the required chemical potential. Setting that chemical potential in the bath can almost always be done. Electrochemical means is but one technique.[1]
We now (mentally) replicate this system a huge number of times, and once again reduce the problem to that of the microcanonical ensemble by wrapping the ensemble and its assorted baths in an adiabatic impermeable wrapper.

Our problem then is to once again find the set of occupation numbers that maximizes W_{total}, the number of ways of arranging the \mathcal{A} ensemble members among the possible system quantum states, subject to both the constraint of constant temperature and constant chemical potential.

The occupation numbers are a bit more complex here than previously. Since the number of particles is variable, we have a whole series of possible energy levels, one for each possible number of particles in the system, including zero particles in the system!
[1] What can and cannot be controlled depends both on the chemical species in the system and the state of current technology. We will take the view here that it can be done for any system of interest and leave the details to be examined at the time particular systems are treated.
For instance, if we have 14 particles, then we might use a_{14,3} to denote the number of ensemble members with 14 particles which are in the third quantum state for such systems. And a_{254,7} would be the number of ensemble members with 254 particles in the seventh quantum state for such systems.
The set of occupation numbers can be written:

    a_{0,0}
    a_{1,1}, a_{1,2}, \ldots a_{1,j}, \ldots
    a_{2,1}, a_{2,2}, \ldots a_{2,j}, \ldots
    \ldots
    a_{k,1}, a_{k,2}, \ldots a_{k,j}, \ldots
    \ldots

where the term a_{0,0} is not a mistake. It is perfectly possible for a system to contain no particles at all. Of course then its energy is zero.
What we have to maximize is:

    W_{total} = W_{tbath}\; W_{\mu bath}\; W_{sys} .   (7.2.1)
Now W_{tbath}, the ways of internally arranging the constant temperature bath, depends on the energy of the bath, E_{tbath}. And W_{\mu bath}, the ways of internally arranging the constant chemical potential bath, depends on the number of particles N_{\mu bath} in the bath. And W_{sys}, the number of ways of internally arranging the members of the system, depends on the usual multinomial coefficient. These three terms are otherwise independent of each other. So Equation (7.2.1) becomes
    W_{total} = W_{tbath}(E_{tbath})\; W_{\mu bath}(N_{\mu bath})\; \frac{\mathcal{A}!}{a_{0,0}!\, a_{1,1}! \cdots a_{N,j}! \cdots} ,   (7.2.2)

and we have to maximize with respect not only to the a's but to the energy of the temperature bath and to the number of particles in the chemical potential bath. Thus there are \mathcal{A} + 2 variables.

But not all of these are independent. There are three constraints:

    \sum_{N=0}^{\infty} \sum_{j=1}^{\infty} a_{N,j} = \mathcal{A} ,   (7.2.3)

    \sum_{N=0}^{\infty} \sum_{j=1}^{\infty} a_{N,j} E_{N,j} + E_{tbath} = \mathcal{E} ,   (7.2.4)

    \sum_{N=0}^{\infty} \sum_{j=1}^{\infty} a_{N,j} N + N_{\mu bath} = \mathcal{N} ,   (7.2.5)

where \mathcal{A}, \mathcal{E}, and \mathcal{N} are constants.
This is fine if each system in the ensemble sits in two separate baths (chemical potential and heat). However, in the real world such a system would be in only one bath, which would serve the functions of the two separate baths. In that case E_{tbath} is the energy of both baths and N_{\mu bath} is the number of particles in both baths.

We denote them separately above for convenience. As will be seen, each separate bath contributes its own special properties to the systems.
7.3 Solution of the Equations
The solution takes place essentially by the same steps that were used for the canonical ensemble, except now there is one more constraint.

The first step is to convert Equation (7.2.2) on the preceding page to logarithmic form. This gives (using Stirling's Approximation):
    \ln W_{total} = \ln W_{tbath}(E_{tbath}) + \ln W_{\mu bath}(N_{\mu bath}) + \ln\mathcal{A}! - \sum_{N,j}\left[a_{N,j}\ln a_{N,j} - a_{N,j}\right] ,   (7.3.1)

where Stirling's approximation has been used on the last term.
To maximize this we take the total derivative of \ln W_{total} and set it to zero:

    d\ln W_{total} = \left(\frac{\partial \ln W_{tbath}(E_{tbath})}{\partial E_{tbath}}\right) dE_{tbath} + \left(\frac{\partial \ln W_{\mu bath}(N_{\mu bath})}{\partial N_{\mu bath}}\right) dN_{\mu bath} - \sum_{N,j} \ln a_{N,j}\, da_{N,j} = 0 .   (7.3.2)
We now take the total derivative of each of Equations (7.2.3) to (7.2.5) on the previous page and multiply them by \alpha, \beta, and \gamma, respectively:

    \alpha \sum_{N=0}^{\infty}\sum_{j=1}^{\infty} da_{N,j} = 0 ,   (7.3.3)

    \beta \left[\sum_{N=0}^{\infty}\sum_{j=1}^{\infty} E_{N,j}\, da_{N,j} + dE_{tbath}\right] = 0 ,   (7.3.4)

    \gamma \left[\sum_{N=0}^{\infty}\sum_{j=1}^{\infty} N\, da_{N,j} + dN_{\mu bath}\right] = 0 .   (7.3.5)
These are then subtracted[2] from Equation (7.3.2) to give:

    d\ln W_{total} = \left[\left(\frac{\partial \ln W_{tbath}(E_{tbath})}{\partial E_{tbath}}\right) - \beta\right] dE_{tbath} + \left[\left(\frac{\partial \ln W_{\mu bath}(N_{\mu bath})}{\partial N_{\mu bath}}\right) - \gamma\right] dN_{\mu bath} - \sum_{N,j}\left[\ln a_{N,j} + \alpha + \beta E_{N,j} + \gamma N\right] da_{N,j} .   (7.3.6)
We can make the tbath term disappear by setting

    \beta = \left(\frac{\partial \ln W_{tbath}(E_{tbath})}{\partial E_{tbath}}\right) ,   (7.3.7)

and the \mu bath term do the same by setting

    \gamma = \left(\frac{\partial \ln W_{\mu bath}(N_{\mu bath})}{\partial N_{\mu bath}}\right) ,   (7.3.8)
[2] Again, this is done for convenience. To add instead just change the signs of \alpha, \beta, and \gamma.
and make one of the a's vanish by taking, for example,

    \ln a_{3,1} + \alpha + \beta E_{3,1} + 3\gamma = 0 .   (7.3.9)
Equation (7.3.6) on the previous page then reduces to:

    \sum_{N,j} \left[\ln a_{N,j} + \alpha + \beta E_{N,j} + \gamma N\right] da_{N,j} = 0 ,   (7.3.10)
for all N and j except for (3,1). But since Equation (7.3.9) also fits Equation (7.3.10) we can include (3,1) here.

Now we make our usual argument in which we claim that since all of the da's are independent, the only way Equation (7.3.10) can be satisfied for all values of the da_{N,j} will be if the quantity in the square brackets is identically zero. This gives us:

    \ln a_{N,j} + \alpha + \beta E_{N,j} + \gamma N = 0 ,   (7.3.11)
for all N and j, so that

    a_{N,j} = e^{-\alpha}\, e^{-\beta E_{N,j}}\, e^{-\gamma N} .   (7.3.12)
As before, \alpha can be found from the first constraint equation, Equation (7.2.3):

    \sum_{N,j} a_{N,j} = e^{-\alpha} \sum_{N,j} e^{-\beta E_{N,j}}\, e^{-\gamma N} = \mathcal{A} ,   (7.3.13)

so

    e^{-\alpha} = \frac{\mathcal{A}}{\sum_{N,j} e^{-\beta E_{N,j}}\, e^{-\gamma N}} .   (7.3.14)
It is convenient to define

    \Xi(\beta, V, \gamma) = \sum_{N,j} e^{-\beta E_{N,j}}\, e^{-\gamma N} .   (7.3.15)

This quantity is known as the grand canonical partition function. The symbol for it, \Xi, is pronounced "ksi" with a long i. It plays the same role in the grand canonical ensemble that Q(N, V, \beta) does for the canonical ensemble.

With this definition Equation (7.3.14) becomes

    e^{-\alpha} = \frac{\mathcal{A}}{\Xi(\beta, V, \gamma)} .   (7.3.16)
The probability of finding a system in the ensemble with N particles and in energy level j is then:

    P_{N,j} = \frac{a_{N,j}}{\mathcal{A}} = \frac{e^{-\beta E_{N,j}}\, e^{-\gamma N}}{\Xi(\beta, V, \gamma)} .   (7.3.17)
7.4 Thermodynamics of the Grand Canonical Ensemble
Given Equation (7.3.17) it is simple to work out the thermodynamics of the system and, at the same time, identify \beta and \gamma. The method follows that for the canonical ensemble.

The average value of the energy \langle E\rangle is

    \langle E\rangle = \sum_{N,j} P_{N,j} E_{N,j} = \frac{1}{\Xi(\beta, V, \gamma)} \sum_{N,j} E_{N,j}\, e^{-\beta E_{N,j}}\, e^{-\gamma N} ,   (7.4.1)
which can be seen to be the same as:

    \langle E\rangle = -\left(\frac{\partial \ln\Xi}{\partial \beta}\right)_{V,\gamma} .   (7.4.2)
We can immediately identify this energy with the internal energy of the system.

In the same manner it can be shown that

    \langle p\rangle = \frac{1}{\beta}\left(\frac{\partial \ln\Xi}{\partial V}\right)_{\beta,\gamma} ,   (7.4.3)

where we identify \langle p\rangle with the thermodynamic pressure p, and

    \langle N\rangle = -\left(\frac{\partial \ln\Xi}{\partial \gamma}\right)_{\beta,V} ,   (7.4.4)

where we identify \langle N\rangle with the thermodynamic number of particles in the system.
where we identify N) with the thermodynamic number of particles in the system.
We now have what we need to write down the total derivative of the grand canonical partition
function (, V, ):
d ln(, V, T) =
_
ln

_
V,
d +
_
ln
V
_
,
dV +
_
ln

_
,V
d . (7.4.5)
or, in terms of Equations (7.4.2)-(7.4.4):

    d\ln\Xi(\beta, V, \gamma) = -E\, d\beta + \beta p\, dV - N\, d\gamma .   (7.4.6)

The equivalent thermodynamic equation is[3]

    d\left(\frac{pV}{kT}\right) = -E\, d\left(\frac{1}{kT}\right) + \frac{p}{kT}\, dV - N\, d\left(-\frac{\mu}{kT}\right) ,   (7.4.7)
from which we can make the identifications:

    \beta = \frac{1}{kT} \quad\text{and}\quad \gamma = -\frac{\mu}{kT} ,   (7.4.8)

and, most importantly:

    pV = kT \ln\Xi(T, V, \mu) .   (7.4.9)
Last, we find a formula for the entropy. This is most easily done from the thermodynamic relationship:

    E = TS - pV + \mu N ,   (7.4.10)

which can be rewritten as:

    S = \frac{E}{T} - \frac{\mu N}{T} + \frac{pV}{T} ,   (7.4.11)

and so S can be found from

    S = -\frac{1}{T}\left(\frac{\partial \ln\Xi}{\partial \beta}\right) + \frac{\mu}{T}\left(\frac{\partial \ln\Xi}{\partial \gamma}\right) + k \ln\Xi ,   (7.4.12)

although it is almost always easier to compute E, N, and p and then compute S from Equation (7.4.11).

[3] This may not be familiar to the reader. See the Appendix in Section 8.5 on page 97 for details.
7.5 Entropy and Probability
Equation (7.3.17) on page 81 is:

    P_{N,j} = \frac{a_{N,j}}{\mathcal{A}} = \frac{e^{-\beta E_{N,j}}\, e^{-\gamma N}}{\Xi(\beta, V, \gamma)} .   ((7.3.17))
We can manipulate this to get:

    \ln P_{N,j} = -\beta E_{N,j} - \gamma N - \ln\Xi ,   (7.5.1)

    P_{N,j}\ln P_{N,j} = -\beta E_{N,j}P_{N,j} - \gamma N P_{N,j} - P_{N,j}\ln\Xi ,

    \sum_{N,j} P_{N,j}\ln P_{N,j} = -\beta\sum_{N,j} E_{N,j}P_{N,j} - \gamma\sum_{N,j} N P_{N,j} - \ln\Xi \sum_{N,j} P_{N,j} ,

    \sum_{N,j} P_{N,j}\ln P_{N,j} = -\beta\langle E\rangle - \gamma\langle N\rangle - \ln\Xi .
Given Equation (7.4.7) on the preceding page and our other identifications we have:

    \sum_{N,j} P_{N,j}\ln P_{N,j} = -\frac{E}{kT} + \frac{\mu N}{kT} - \frac{pV}{kT} .   (7.5.2)
We had previously gotten Equation (1.2.3) on page 2:

    \frac{S}{k} = \frac{1}{kT}U + \frac{p}{kT}V - \frac{\mu}{kT}N ,   ((1.2.3))
so that we have, once again,[4]

    S = -k \sum_{N,j} P_{N,j} \ln P_{N,j} .   (7.5.3)
7.6 The Relationship to the Canonical Ensemble
There are some interesting relationships among our ensembles.

The definition of the grand canonical partition function is:

    \Xi(\beta, V, \gamma) = \sum_{N,j} e^{-\beta E_{N,j}}\, e^{-\gamma N} .   ((7.3.15))
If we do the sums separately we can write:

    \Xi(T, V, \mu) = \sum_N e^{\mu N/kT} \sum_j e^{-E_{j,N}/kT} ,   (7.6.1)

which simplifies to:

    \Xi(T, V, \mu) = \sum_N Q(N, V, T)\, e^{\mu N/kT} .   (7.6.2)
Since Q itself can be written in terms of \Omega, we have

    \Xi(T, V, \mu) = \sum_N \left[\sum_j \Omega(N, V, E_j)\, e^{-E_j/kT}\right] e^{\mu N/kT} .   (7.6.3)

This sort of summation of one quantity (for example \Omega) to get another (for example Q) is called a Laplace Transformation.
[4] Indeed, it is always true for all ensembles.
7.7 A Direct Consequence of Having Independent Subsystems

We have seen earlier (Chapter 4) that having independent subsystems leads to expressing the canonical partition function in terms of single subsystem partition functions.

Using this in the grand canonical partition function leads directly to an interesting and very general result.
The grand canonical partition function can be written as:

    \Xi(\beta, V, \gamma) = \sum_N Q(N, V, \beta)\, e^{-\gamma N} ,   (7.7.1)

where \gamma = -\mu/kT. Now for convenience let

    \lambda = e^{-\gamma} = e^{\mu/kT} ,   (7.7.2)

where \lambda is known as the absolute activity because

    \ln\lambda = \mu/kT   (7.7.3)
= kT ln. (7.7.4)
Since a chemical potential is normally written in terms of the thermodynamic activity a as:
=

+kT lna , (7.7.5)


(where the activity is given per molecule instead of per mole), it is clear that is an activity per
particle in a system in which

is identically zero in the standard state. Thus the name absolute


activity.
With this definition of \lambda in mind (and, by the way, we shall use it often in the rest of this work), we have:

    \Xi(\beta, V, \lambda) = \sum_N Q(\beta, V, N)\, \lambda^N = \sum_N \frac{1}{N!}\, q(V, \beta)^N \lambda^N = \sum_N \frac{1}{N!} (q\lambda)^N = e^{q\lambda} ,   (7.7.6)

where the last line comes from comparison of the line just above it to the series expansion of \exp x.
So we have evaluated the grand partition function on the assumption that (1) the particles are independent and (2) that Boltzmann statistics apply.

But we are not done.

    \ln\Xi(\beta, V, \lambda) = q\lambda = q e^{-\gamma} .   (7.7.7)
The expected number of particles in a member of a grand canonical ensemble is:

    \langle N\rangle = -\left(\frac{\partial \ln\Xi}{\partial \gamma}\right) ,   (7.7.8)

so we get in this case:

    \langle N\rangle = q e^{-\gamma} .   (7.7.9)
So, comparing to Equation (7.7.7) on the preceding page, we have

    \langle N\rangle = \ln\Xi(\beta, V, \gamma) .   (7.7.10)

But \ln\Xi = \beta pV, so we arrive at

    \langle N\rangle = \beta pV ,   (7.7.11)

which is:

    pV = \langle N\rangle kT ,   (7.7.12)

the ideal gas equation!
Now this isn't just a cute result. It is in fact profound. What we've shown[5] is that if a system is made up of independent particles and if the system obeys Boltzmann statistics, the particles obey the ideal gas law. Nothing is said about the state or condition of the particles, their energy levels, or whatever. We come away with the conclusion that it is independence that makes a system ideal and nothing else. This is a good thing to remember.
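The whole chain \Xi = e^{q\lambda}, \langle N\rangle = \ln\Xi, pV = \langle N\rangle kT can be checked numerically by building \Xi term by term from Equation (7.7.6). A sketch with arbitrary illustrative values of q and \lambda:

    import math

    def xi_sum(q, lam, nmax=200):
        # grand partition function from Equation (7.7.6): Xi = sum_N (q*lam)^N / N!
        total, term = 0.0, 1.0
        for N in range(nmax + 1):
            total += term
            term *= q*lam/(N + 1)
        return total

    q, lam = 50.0, 0.3              # illustrative values
    xi = xi_sum(q, lam)
    print(math.log(xi), q*lam)      # ln(Xi) equals q*lam = <N>, Equation (7.7.10)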
7.8 A General Result for Independent Sites
We have already derived a relationship between q, the single independent subsystem canonical partition function, and Q, the canonical partition function for an entire ensemble. It would be useful to see if there is a similar decomposition for the grand canonical ensemble.

It turns out that there is. It isn't quite the same as q, but it is useful nevertheless. Instead of independent subsystems we have to think in terms of independent sites or independent energy levels.

That requires a bit of explanation. An independent site might be a site on the surface of a solid onto which molecules (or other entities) can be adsorbed. There could be any number of particles adsorbed onto a given site, from zero on up.

We could also think in terms of independent energy levels where there could be any number of particles in a given energy level, as long as quantum mechanical constraints are obeyed. Particles in one energy level don't change the energy of another level in any way. In other words, the energy levels are independent.

In the discussion below we will further confuse the issue by using the term "box" instead of "site" or "energy level". Box k contains n_k particles. This number could be anything from zero up to a maximum of n_{k,max} particles. Of course n_{k,max} could be infinite.[6] The energy of the kth box is \epsilon_k.
Let us return to the grand canonical ensemble:

    \Xi(\beta, V, \lambda) = \sum_N Q(\beta, V, N)\, \lambda^N ,   (7.8.1)

where

    Q(\beta, V, N) = \sum_j e^{-\beta E_j} .   (7.8.2)
If the system is composed of independent boxes, the total energy of the system with a particular set of n_k's is then E_j:

    E_j = \sum_k n_k \epsilon_k ,   (7.8.3)

and

    N = \sum_k n_k .   (7.8.4)
[5] After what must be the longest derivation of pV = NkT in history.
[6] We will need to make n_{k,max} finite when we talk about quantum statistics in Chapter 9 and again in Chapter 11.
Then:

    Q(\beta, V, N) = \sum_{\{n_k\}}^{*} e^{-\beta\sum_k \epsilon_k n_k} ,   (7.8.5)

where the star on the sum and the funny subscript \{n_k\} are a reminder that the summation is only over the sets of n's such that \sum_k n_k = N. This restriction is necessary because Q(\beta, V, N) requires a fixed N.
But the grand canonical ensemble does not require a fixed N. So we have:

    \Xi(\beta, V, \lambda) = \sum_N \lambda^N \sum_{\{n_k\}}^{*} e^{-\beta\sum_k n_k\epsilon_k}
                           = \sum_N \sum_{\{n_k\}}^{*} \lambda^{\sum_k n_k}\, e^{-\beta\sum_k n_k\epsilon_k}
                           = \sum_N \sum_{\{n_k\}}^{*} \prod_k \left(\lambda e^{-\beta\epsilon_k}\right)^{n_k} .   (7.8.6)
The last equation here comes from a rearrangement of \lambda^{\sum_k n_k}\, e^{-\beta\sum_k n_k\epsilon_k} that goes like this:

    \lambda^{\sum_k n_k}\, e^{-\beta\sum_k n_k\epsilon_k} = \lambda^{n_1+n_2+\cdots}\, e^{-\beta(n_1\epsilon_1 + n_2\epsilon_2 + \cdots)}
        = \left(\lambda^{n_1} e^{-\beta n_1\epsilon_1}\right)\left(\lambda^{n_2} e^{-\beta n_2\epsilon_2}\right)\cdots
        = \left(\lambda e^{-\beta\epsilon_1}\right)^{n_1}\left(\lambda e^{-\beta\epsilon_2}\right)^{n_2}\cdots
        = \prod_k \left(\lambda e^{-\beta\epsilon_k}\right)^{n_k} ,

which may not be obvious at first glance.[7]
Since \sum_N \sum_{\{n_k\}}^{*} sums over all values of n_k, every possible value of n_k occurs in \sum_k n_k for some value of N. We can then write:
    \Xi(\beta, V, \lambda) = \sum_{n_1=0}^{n_{1,max}} \sum_{n_2=0}^{n_{2,max}} \cdots \sum_{n_k=0}^{n_{k,max}} \cdots \prod_k \left(\lambda e^{-\beta\epsilon_k}\right)^{n_k}
        = \left[\sum_{n_1=0}^{n_{1,max}} \left(\lambda e^{-\beta\epsilon_1}\right)^{n_1}\right] \left[\sum_{n_2=0}^{n_{2,max}} \left(\lambda e^{-\beta\epsilon_2}\right)^{n_2}\right] \cdots \left[\sum_{n_k=0}^{n_{k,max}} \left(\lambda e^{-\beta\epsilon_k}\right)^{n_k}\right] \cdots
        = \prod_k \sum_{n_k=0}^{n_{k,max}} \left(\lambda e^{-\beta\epsilon_k}\right)^{n_k} .   (7.8.7)
What we have done here is, for independent boxes, transformed from a sum over N, the number
of particles in a system, to a product over sums of single box energy states.
If we let \xi be defined by:

    \xi_k = \sum_{n_k=0}^{n_{k,max}} \left(\lambda e^{-\beta\epsilon_k}\right)^{n_k} ,   (7.8.8)

we end up with

    \Xi(\beta, V, \lambda) = \prod_k \xi_k ,   (7.8.9)
[7] It certainly wasn't to me when I first saw it!
where \xi is a grand partition function per box.

The math here is admittedly hairy[8] but can be understood by a concrete example followed by rereading the material above.
Example 7.1:

As an example let us work out a simple situation in which there are only two boxes, each of which can have 0, 1, or 2 particles in it. In this case Equation (7.8.6) becomes:

    \Xi(\beta, V, \lambda) = \sum_{\{n_k\}}^{*} \prod_k \left(\lambda e^{-\beta\epsilon_k}\right)^{n_k} .   (7.8.10)
To make things easier, let's denote \lambda e^{-\beta\epsilon_k} by x_k. That way Equation (7.8.10) reads:

    \Xi(\beta, V, \lambda) = \sum_{\{n_k\}}^{*} \prod_k x_k^{n_k} ,   (7.8.11)
which certainly looks simpler even if it is just the same thing. Now in our example there can only be two different x's since there are only two different boxes. So we can multiply out the product to get:

    \Xi(\beta, V, \lambda) = \sum_{\{n_k\}}^{*} x_1^{n_1}\, x_2^{n_2} .   (7.8.12)
The condition on the n's requires that each of them not exceed 2 and that their sum be at most 4, since that is the largest number of particles we can have here (two in each of two boxes). So I can change the notation slightly[9] to give:
    \Xi = \sum_{N=0}^{4} \sum_{n_1+n_2=N}^{*} x_1^{n_1}\, x_2^{n_2} = \sum_{N=0}^{4} \sum_{n_1} x_1^{n_1}\, x_2^{N-n_1} ,   (7.8.13)

where the inner sum on the right runs over the values of n_1 allowed by the restrictions.
This is still Equation (7.8.6). All I've done is simplify the notation.

Now I'm going to convert Equation (7.8.13) to the same form as Equation (7.8.7). The big step is in expanding Equation (7.8.13). I do it first for N = 0, then N = 1, and so on. The result (which readers should verify for themselves) is:
    \Xi = 1 + (x_1 + x_2) + (x_1^2 + x_1 x_2 + x_2^2) + (x_1 x_2^2 + x_1^2 x_2) + (x_1^2 x_2^2) .   (7.8.14)
The first term corresponds to N = 0. This is the term for no particles in any box at all. The first set of terms in parentheses corresponds to N = 1. This corresponds to a single particle that might be in box 1 or it might be in box 2.

The next set of parentheses corresponds to two particles and denotes two particles in box 1, or one in each box, or both in box 2. The other sets of parentheses follow the same pattern, subject to the condition that no x can have an exponent greater than 2.

Study this expansion because it is the secret to the entire process.
Now I factor the terms. Here's the result:

    \Xi = (1 + x_1 + x_1^2)(1 + x_2 + x_2^2) .   (7.8.15)
[8] An ancient American slang term meaning "rather difficult".
[9] I'm a great believer in the idea that simple notation makes problems simpler to solve.
This is easy to see if you work back from Equation (7.8.15) to Equation (7.8.14).

We are about done. Equation (7.8.15) can be written as a product of two quadratics:

    \Xi = \prod_{k=1}^{2} (1 + x_k + x_k^2) ,   (7.8.16)
and each quadratic as a sum:

    \Xi = \prod_{k=1}^{2} \sum_{r=0}^{2} x_k^r .   (7.8.17)
Translating back into our original terms, we have:

    \Xi(\beta, V, \lambda) = \prod_k \sum_{n_k=0}^{n_{k,max}} \left(\lambda e^{-\beta\epsilon_k}\right)^{n_k} .   (7.8.18)
That, basically, is what everything in this section was about. What has been done is to convert a messy sum over a product in Equation (7.8.6) to a rather more useful product over a sum as in Equation (7.8.7). In fact, the result is more useful in general. In this example the number of particles per box was artificially restricted to two. If we let it run to infinity, then the restriction on the sum disappears, because the n_k can add up to anything and we are guaranteed that whatever value they add up to will correspond to some value of the total number of particles.
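The bookkeeping in Example 7.1 can also be checked symbolically: expand the restricted double sum of Equation (7.8.13) and compare it with the factored product of Equation (7.8.15). A sketch using SymPy:

    import sympy as sp

    x1, x2 = sp.symbols('x1 x2')

    # the restricted sum of Equation (7.8.13): n1 and n2 each run over 0, 1, 2
    xi_sum = sum(x1**n1 * x2**n2 for n1 in range(3) for n2 in range(3))

    # the factored form of Equation (7.8.15)
    xi_prod = (1 + x1 + x1**2)*(1 + x2 + x2**2)

    print(sp.expand(xi_prod - xi_sum))   # prints 0: the two forms are identical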
8. The Equivalence of Ensembles
It is useful to stop for a moment amidst our mad generation of partition function after partition
function and take a more general look.
If we have a system at equilibrium, then all of its thermodynamic variables have fixed values. We know this experimentally. It should not matter what partition function we use to compute those values; we must always get the same answers.

So thermodynamically if we take a one-component single phase system and consider it to have fixed independent variables N, V, and T, it also has fixed dependent variables \mu, p, and E. The values we calculate statistically for these three should agree with the measured values.

We could also consider that same system as having fixed \mu, V, and T, in which case the values of N, p, and E we calculate should agree with the experimental values.

As a practical matter then, it should not matter which ensemble we use for computations. And indeed this is true. One gets the same results no matter what ensemble is used. In practice we use the ensemble that makes the computations the easiest.

However, there is another implication here. That is that all ensembles should be equivalent in some sense. Yet they certainly look different.

In this chapter we will investigate the relationship among ensembles, old and new.
8.1 Expansion of the Grand Canonical Partition Function
We've already seen that the grand canonical partition function can be written as:

    \Xi(\beta, V, \gamma) = \sum_N Q(N, V, \beta)\, e^{-\gamma N} ,   (8.1.1)

so that the grand canonical partition function can be considered as being built up of canonical partition functions for systems of every possible particle number.

To make this explicit, \Xi can be written:

    \Xi(\beta, V, \gamma) = Q(0, V, \beta) + Q(1, V, \beta)e^{-\gamma} + Q(2, V, \beta)e^{-2\gamma} + \cdots ,   (8.1.2)
which points up the close relationship between the two ensembles. But in fact we know that there is a single most probable value of N, so only one of the Q's is really important. The others just don't matter! How can this be? To understand this we have to make several detours.
8.2 Generalized Laplace Transform
A type of transformation that is often very useful in statistical thermodynamics is the generalized
Laplace transform.
An ordinary Laplace transform is an integral transform in which one function is transformed into another (with different independent variables) by integration.
Thus if f(x) is a function, then

    F(s) = \int_0^{\infty} f(x)\, e^{-sx}\, dx ,   (8.2.1)

is its Laplace transform. The function F(s) can be transformed back into f(x) by means of the inverse Laplace transform

    f(x) = \frac{1}{2\pi i} \int_{c-i\infty}^{c+i\infty} F(s)\, e^{sx}\, ds ,   (8.2.2)

where the integration is carried out in the right-hand half of the complex plane.[1]
As a trivial example, if f(x) = 1, then its Laplace transform F(s) is:

    F(s) = \int_0^{\infty} e^{-sx}\, dx = \frac{1}{s} .   (8.2.3)
A generalized Laplace transform is a Laplace transform in which one integrates over x if x is a
continuous variable and sums over x if it is a discrete variable.
The case in which x is discrete is of great interest to us.
Consider f(n) = n where n is an integer. Then the generalized transform of f(n) is

    F(s) = \sum_{n=0}^{\infty} n\, e^{-sn} .   (8.2.4)
This sum can be found in closed form.

The first step in solving Equation (8.2.4) is to recognize that e^{-s} is just a constant as far as the summation is concerned. Let us call that constant a. Then Equation (8.2.4) becomes:

    F(a) = \sum_n n\, a^n .   (8.2.5)
The second step is to realize that the related series

    G(a) = \sum_n a^n ,   (8.2.6)

is a geometric series and converges to a known sum as long as a satisfies -1 < a < 1. Here a can't be less than 0 as long as s is positive. And since s is arbitrary, all we need do is ensure that we never use a negative s.

The sum in Equation (8.2.6) is:

    G(a) = \sum_n a^n = \frac{1}{1-a} , \quad -1 < a < 1 .   (8.2.7)
The two series, Equations (8.2.5) and (8.2.7), are related by differentiation. If we differentiate Equation (8.2.7) with respect to a we get

    \frac{dG(a)}{da} = \sum_n n\, a^{n-1} ,   (8.2.8)

which, when multiplied by a, gives

    a\,\frac{dG(a)}{da} = \sum_n n\, a^n = F(a) .   (8.2.9)
[1] Students reading this should not worry overmuch about it. I include the inverse transformation only for the sake of being complete. We are not going into the general theory of Laplace transforms.
Since G(a) = 1/(1 - a), it is simple to determine that F(a) is

    F(a) = \frac{a}{(1-a)^2} .   (8.2.10)
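The closed form (8.2.10) is easy to verify against the direct sum of Equation (8.2.5). A sketch:

    def F_by_sum(a, nmax=10000):
        # generalized Laplace transform of f(n) = n, summed directly (Equation (8.2.5))
        return sum(n * a**n for n in range(nmax + 1))

    a = 0.7
    print(F_by_sum(a), a/(1.0 - a)**2)   # both ~7.7778, Equation (8.2.10)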
An example of a generalized Laplace transform as part of statistical thermodynamics is:

    \Xi(\beta, V, \gamma) = \sum_N Q(N, V, \beta)\, e^{-\gamma N} ,   (8.2.11)

where \gamma plays the role of s and N plays the role of x.
8.3 Transformations Among Ensembles
We start with an example. We will transform the canonical partition function Q to the grand canonical partition function \Xi.

The appropriate starting point is:

    -\beta A(N, V, \beta) = \ln Q(N, V, \beta) ,   (8.3.1)

or, what is the same thing:

    e^{-\beta A(N,V,\beta)} = Q(N, V, \beta) .   (8.3.2)
Let's now do a generalized Laplace transform on this, replacing the variable N with the variable \gamma. To do this we multiply both sides of Equation (8.3.2) by e^{-\gamma N} and sum over all N:

    \sum_N e^{-\beta A(N,V,\beta)}\, e^{-\gamma N} = \sum_N Q(N, V, \beta)\, e^{-\gamma N} .   (8.3.3)
The right-hand side is the grand canonical partition function \Xi(\gamma, V, \beta):

    \sum_N e^{-\beta A(N,V,\beta) - \gamma N} = \Xi(\gamma, V, \beta) .   (8.3.4)
On the left-hand side we recognize that the exponent is \beta pV (which is a function of N),[2] so the result of the Laplace transform is:

    \sum_N e^{\beta pV} = \Xi(\gamma, V, \beta) .   (8.3.5)

If the left-hand side were simply e^{\beta pV} without the summation we'd have the result we expected, for then we'd have

    \beta pV = \ln\Xi .
But we have a summation sign. That's not quite what we expected. How can

    \sum_N e^{\beta pV} = e^{\beta pV} ?

When written down it even looks silly.
[2] This may not be obvious. In that case refer to Section 8.5.3 on page 99 for details. Also, pV is a function of N because p is a function of N. And p depends on \partial E_j/\partial V, and E_j is a function of N.
8.3.1 The Maximum Term Method
It turns out that the summation sign that was a problem in the previous section can be dropped under special circumstances. Or, put another way, sometimes a single term in a summation is as large as the entire sum! Seeing this requires another digression.[3]

We are dealing with very large numbers here. And very large numbers sometimes have interesting properties.
Let's first see how large e^{\beta pV} really is. Take a typical system with a pressure p of 101325 pascals,[4] a volume V of 0.020 cubic meters,[5] and a temperature T of 300 K. The Boltzmann constant k is 1.38\times10^{-23} J/K in these units. Thus \beta pV is about 4.9\times10^{23}.

What we are dealing with is not 4.9\times10^{23}; it is

    e^{4.9\times10^{23}} ,

which is a number that is a power of ten with an exponent of about 2\times10^{23}! And that's the value of just one term in the sum in Equation (8.3.4) on the previous page.
Now we need to digress a bit further and consider the following theorem. Let

    S = \sum_{N=1}^{M} T_N ,   (8.3.6)

where we assume that the terms T_N > 0 for all N. Note that our sum of e^{\beta pV} fits this condition. All the terms are positive. Hence S is surely greater than or equal to the largest term T_{max} in the sum. And it is surely smaller than or equal to M times T_{max}. So

    T_{max} \le S \le M\,T_{max} .   (8.3.7)
We now take logs to get

    \ln T_{max} \le \ln S \le \ln T_{max} + \ln M .   (8.3.8)

Now assume that T_{max} is roughly as large as e^M (or even larger), so that

    T_{max} \approx e^M ,   (8.3.9)

and since clearly for large M

    M \gg \ln M ,   (8.3.10)

we have

    \ln T_{max} \approx M \gg \ln M ,   (8.3.11)

and the factor of \ln M on the right-hand side of Equation (8.3.8) is then negligible compared to \ln T_{max}. Compare, for example, 10^{23} and \ln 10^{23} = 52.96. Surely 52.96 is enormously smaller than 10^{23}.
Equation (8.3.8) now becomes the remarkable

    \ln T_{max} \le \ln S \le \ln T_{max} ,   (8.3.12)
[3] Yes, the author knows that these frequent digressions interrupt the flow of the text. But he knows that the digressions involve new material. And he chooses not to simply lump these mathematical details together in a separate chapter that nobody would read (at least without falling asleep). What he's doing is called motivation...
[4] One atmosphere to you chemists.
[5] That's 20 liters.
so that we inescapably have the conclusion that to a fantastically good approximation

    \ln S = \ln T_{max} ,   (8.3.13)

and so

    S = T_{max} .   (8.3.14)
This is a rather astounding result. It does not hold in general. It only holds for series of all positive terms where the largest term is huge. But that is exactly the case in Equation (8.3.5) on page 91! Thus we can simply replace the sum on the left-hand side of that equation with its largest term to get:

    e^{\beta p^* V} = \Xi(\gamma, V, \beta) ,   (8.3.15)
where p^* is the value of p that causes e^{\beta pV} to be a maximum. So now we have:

    \beta p^* V = \ln\Xi(\gamma, V, \beta) .   (8.3.16)
But what value does p^* have? First, on a microscopic level p^* must be defined here by

    \beta p^* = \left(\frac{\partial \ln Q}{\partial V}\right)_{N,\beta} .   (8.3.17)
Now that pressure would have the numerical value appropriate for a system with the given values
of N, V , and . That is, it would be the equilibrium value of p. Which is exactly what we want.
So we come out of this with a simple rule: When the terms of a sum (such as in a partition function)
are all positive and at least some of the terms are huge in numerical value, we can replace the sum
with the largest term if we also replace the dependent variable by its equilibrium value as given by
the independent variables.
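To make the rule concrete, here is a minimal numerical sketch (my own illustration, not from the text; the peaked sequence and all names are invented). It builds a sum of all-positive terms whose largest term is astronomically big and checks, working entirely in logarithms, that \ln S and \ln T_{max} agree to within the \ln M bound of Equation (8.3.8):

    import math

    M = 10**6                      # number of terms in the sum

    def log_term(N):
        # ln T_N for a sharply peaked, all-positive sequence;
        # the largest term is e^M, far too big to hold as a float.
        return M - (N - M / 2) ** 2 / M

    log_T = [log_term(N) for N in range(1, M + 1)]
    log_T_max = max(log_T)

    # log-sum-exp: ln S = ln T_max + ln( sum of exp(ln T_N - ln T_max) )
    log_S = log_T_max + math.log(sum(math.exp(x - log_T_max) for x in log_T))

    print(log_T_max)               # 1000000.0
    print(log_S - log_T_max)       # about 7.5, tiny next to 10^6
    print(math.log(M))             # 13.8..., the guaranteed upper bound

The difference \ln S - \ln T_{max} is a handful of units while \ln T_{max} is a million; for a real thermodynamic sum, where \ln T_{max} is of order 10^{23}, the difference is utterly negligible.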
8.3.2 A Cautionary Example: The Isobaric-Isothermal Ensemble

In general it is not necessary to go through the entire Lagrangian multiplier derivation in order to find new ensembles. It is quicker simply to apply the transforms we've discussed above.

But this does not always work smoothly. To illustrate this let's find a partition function whose independent variables are N, p, and T. Actually, we need the variables N, p, and \beta, because we need dimensionless variables.[6]

However, there is a difficulty. As was first pointed out by W. Byers Brown,[7] while temperature and chemical potential are true ensemble variables having no existence for individual molecules, pressure is in fact a mechanical variable and can be defined quite well both classically and quantum mechanically for single systems.[8]

All this is in a way quite beside the point, though it is a warning flag. The problem in the isobaric-isothermal ensemble lies in the volume. Volume is a continuous variable. This leads to some serious problems with Gibbsian ensembles[9] even if we artificially force the volume to be quantized.
[6] The idea of using ensembles having constant pressure rather than constant volume was first advanced by E. A. Guggenheim, J. Chem. Phys. 7, 103 (1939).
[7] W. Byers Brown, Mol. Phys. 1, 68 (1958).
[8] Note that p_j = -(\partial E_j/\partial V), so that quantum mechanically even a single particle in a box of volume V can have a pressure.
[9] The problem is more tractable when using the Boltzmann approach as will be seen later when we discuss classical statistical thermodynamics.
Nevertheless, it is instructive to try to carry out the procedure developed in this chapter and to try to generate the appropriate partition function for this ensemble.

We can start our example with any partition function we know, but it is best to start with one that is close to what we want. Here we start with the canonical partition function:

    -\beta A(N, V, \beta) = \ln Q(N, V, \beta)  or  e^{-\beta A} = Q(N, V, \beta)    (8.3.18)

and now transform it to our desired variables.

To replace V with p, we multiply both sides by e^{-\beta pV} and apply a generalized Laplace transform. Since V is a continuous variable, we will integrate instead of sum:

    \int e^{-\beta(A + pV)} \, dV = \int Q(N, V, \beta) \, e^{-\beta pV} \, dV    (8.3.19)

But look at what has happened! Equation (8.3.19) now has the units of volume! Partition functions need to be dimensionless if for no other reason than we know that we will be taking logarithms of them.

We replace the integral by its largest term:[10]

    e^{-\beta(A + pV)} = \Delta = \int Q(N, V, \beta) \, e^{-\beta pV} \, dV    (8.3.20)

where, as usual, we call this partition function \Delta because that's what it is called in the literature.

And now we are in terrible trouble. Multiplying by dV and integrating has left the right-hand side of Equation (8.3.20) with units. And partition functions are dimensionless.[11]
Hill[12] has argued that this can be fixed by multiplying not by dV but by d(V/V_0), where V_0 is a suitably small reference volume. In fact he suggests using

    V_0 = \Lambda^3  where  \Lambda = \frac{h}{(2\pi m k T)^{1/2}}    (8.3.21)

If we do this then Equation (8.3.20) becomes instead:

    e^{-\beta(A + pV)} = \Delta = \frac{1}{V_0} \int Q(N, V, \beta) \, e^{-\beta pV} \, dV    (8.3.22)

Now taking logs we get:

    -\beta(A + pV) = \ln \Delta = \ln \int Q(N, V, \beta) \, e^{-\beta pV} \, dV - \ln V_0    (8.3.23)

where we assume that we've left the units of V_0 inside the integral (making it dimensionless) while now having V_0 represent only a numerical value.

We now dispose of V_0 by noting that its logarithm is of the order of N times smaller than the logarithm of the integral, and hence negligible.

This actually works. We'll postpone a discussion of why for a few moments.
[10] Wait a minute! How can we do that? What we just proved was for sums, not integrals! That's true. But if we consider an integral as a sum of thin strips of integrand, then our result applies for such integrals as well.
[11] Note that taking the largest term in the integral on the right-hand side of Equation (8.3.20) has removed the problem there.
[12] T. L. Hill, Statistical Mechanics, McGraw-Hill, 1956, page 62.
It remains to see what thermodynamic function Equation (8.3.23) on the previous page really is (if any).[13] We have, using the Euler result E = TS - pV + \mu N of Equation (8.5.7):

    -\beta(A + pV) = -\beta(E - TS + pV)
                   = -\beta(TS - pV + \mu N - TS + pV)
                   = -\beta \mu N
                   = -\beta G    (8.3.24)

where G is the Gibbs free energy. The result is:

    -\beta G = \ln \Delta = \ln \int Q(N, V, \beta) \, e^{-\beta pV} \, dV    (8.3.25)

Note that in Equation (8.3.25) we are treating dV as dimensionless.
Why does this work? Because what we've done is essentially to make the volume discrete and then to sum over the discrete volumes rather than integrate. We've introduced a basic "quantum" of volume and measured volumes in terms of it.

This is not a satisfactory solution and the reader should not think that it is.
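Whatever one thinks of the fix, the size of the V_0 term is easy to check directly. Here is a sketch using the ideal gas, for which the integral in Equation (8.3.25) can be done in closed form (this worked example is mine, not the text's; Q = V^N / (N! \Lambda^{3N}) is the standard ideal-gas canonical partition function, and all the numbers below are assumed order-of-magnitude values):

    import math

    N = 6.0e23                # number of particles
    beta_p = 2.4e25           # beta*p in m^-3 for p = 1 atm, T = 300 K
    Lam = 1.0e-10             # thermal wavelength Lambda in meters
    V0 = Lam**3               # Hill's reference volume

    # For Q = V^N/(N! Lam^{3N}) the Laplace transform integrates exactly:
    #   Delta = (1/V0) * N! / (N! Lam^{3N} (beta p)^{N+1})
    # so ln Delta = -3N ln Lam - (N+1) ln(beta p) - ln V0.
    ln_Delta = -3*N*math.log(Lam) - (N + 1)*math.log(beta_p) - math.log(V0)

    # The thermodynamic answer: -beta*G = -N ln(beta p Lam^3) for the ideal gas
    ln_Delta_thermo = -N*math.log(beta_p * Lam**3)

    print(ln_Delta - ln_Delta_thermo)   # about +11, while ln Delta is ~ 10^24

The \ln V_0 term (and the stray +1 in the exponent) shift \ln \Delta by numbers of order ten, while \ln \Delta itself is of order N. That is the precise sense in which V_0 can be "disposed of."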
What is interesting is that the author can find no proper treatment of the isobaric-isothermal system in any detail[14] in the available literature today. This is, among other things, rather unconscionable.

It is his personal opinion that the proper way to deal with any ensemble involving integrating over the volume is to assume that volume is quantized. That is, that there exists a smallest volume element v_0 and that all other volumes are simply multiples of this. Then instead of integrating over the volume, we sum and have no problems at all![15]
8.4 Summary: The Relationship Among Ensembles

One way to think about this is to realize that we can start with any ensemble whose thermodynamic connection we know and transform it to any other set of thermodynamic variables. We can find the partition function for a two-component system with variables p, T, N_1, and \mu_2, where \mu_2 is the chemical potential associated with component 2.

Or we can show that the partition function for the ensemble with the independent variables p, T, and \mu is zero.

So we are done with undetermined multipliers and the like. If we need an ensemble, we'll just transform one of our regular ones to fit the job.

There is a good thermodynamic reason as to why this works. If one has a (say) single-component, single-phase system one wants to describe, three variables will do it. Which three are chosen does not matter.[16] One person can choose N, V, and T. Another can choose N, p, and T. If the pressure
[13] It is not hard to come up with partition functions that do not correspond to any thermodynamic quantity known in the literature. That's not because there is no such thermodynamic function; it is because nobody's found a major use for such a function yet.
[14] Many authors mention it and give (without derivation) Equation (8.3.25) as the definition of \Delta without realizing that Equation (8.3.25) is incomplete as it stands.
[15] This is, in effect, what happens when we quantize energy and sum over energies instead of integrating. If we did integrate, we'd have a dE in our integrals and the resulting partition function would have the units of energy.
[16] Unless all three are intensive, in which case the volume of the system is not defined and we are in some trouble. Indeed, the partition function for such a system is zero.
experimentally found by the first person is the pressure used by the second, the two systems will be identical in all thermodynamic respects.

And so it is mathematically. A system itself has no particular preference for any set of independent variables. Those are a choice imposed on the system by the human studying it. Two different humans can choose two different sets of independent variables. And we'd be shocked if the results they calculated for other variables (say a heat capacity, for instance) were different.
8.5 Appendix: Legendre and Massieu Transforms

Most descriptions of classical thermodynamics begin with a consideration of the internal energy E. After some preliminary discussion the internal energy is written as:

    dE = T \, dS - p \, dV + \mu \, dN    (8.5.1)

where the entropy S, the volume V, and the number of particles N are the natural variables for the internal energy.[17] Each thermodynamic function has a set of natural variables. When expressed in terms of those variables, the thermodynamic function is an extremum at equilibrium. In the case of the internal energy, it is a minimum at constant S, V, and N.
8.5.1 Euler's Theorem on Homogeneous Functions

Euler's Theorem on homogeneous functions is discussed in many thermodynamics textbooks. It is fairly simply stated, but depends on the definition of a homogeneous function. If one has a function of several variables, say f(x, y, z), and if that function obeys the equation

    f(\lambda x, \lambda y, \lambda z) = \lambda^n f(x, y, z)    (8.5.2)

then the function f(x, y, z) is said to be homogeneous of degree n.

Operationally homogeneity is easy to determine. One takes the function, replaces x by \lambda x, etc., and then sees if one can factor all the \lambda's out. If one can, then the function is homogeneous.

Most functions are not homogeneous.
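A quick symbolic check of this definition may help. The sketch below (my illustration; the two test functions are arbitrary) performs exactly the operational test just described: substitute \lambda x and \lambda y and see whether every \lambda factors out:

    import sympy as sp

    x, y, lam = sp.symbols('x y lam', positive=True)

    def degree_if_homogeneous(f):
        # Replace x -> lam*x, y -> lam*y and see if f(lam x, lam y)/f(x, y)
        # simplifies to a pure power of lam.
        ratio = sp.simplify(f.subs({x: lam*x, y: lam*y}) / f)
        if ratio.free_symbols <= {lam}:
            return sp.degree(ratio, lam)
        return None

    print(degree_if_homogeneous(x**2*y + y**3))   # 3: homogeneous of degree 3
    print(degree_if_homogeneous(x**2 + y))        # None: not homogeneous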
Euler's Theorem on homogeneous functions is simple to state:

If f(x_1, x_2, \ldots) is a homogeneous function of degree n, then

    x_1 \left( \frac{\partial f}{\partial x_1} \right) + x_2 \left( \frac{\partial f}{\partial x_2} \right) + \cdots = n f(x_1, x_2, \ldots)    (8.5.3)

The importance of this here is that most of the functions of thermodynamics are homogeneous. In fact all extensive thermodynamic functions are homogeneous of degree 1, while all intensive thermodynamic functions are homogeneous of degree zero.[18]

In particular the internal energy is homogeneous of degree 1, so that:

    E(\lambda S, \lambda V, \lambda N) = \lambda E(S, V, N)    (8.5.4)
From this we can apply the method of Euler's Theorem.[19] We differentiate Equation (8.5.4) with respect to \lambda using the chain rule:

    \left( \frac{\partial E}{\partial (\lambda S)} \right) \left( \frac{\partial (\lambda S)}{\partial \lambda} \right)
  + \left( \frac{\partial E}{\partial (\lambda V)} \right) \left( \frac{\partial (\lambda V)}{\partial \lambda} \right)
  + \left( \frac{\partial E}{\partial (\lambda N)} \right) \left( \frac{\partial (\lambda N)}{\partial \lambda} \right) = E    (8.5.5)
[17] I am cheating slightly here since classical thermodynamics knows nothing of actual particles but instead deals with macroscopic quantities of matter, usually moles. Since we are concerned with particles and the transformation between moles and particles is trivial, my preference is to work with particles and, as a result, the energy per particle, the entropy per particle, etc.
[18] When we mentally test to see if a function is extensive, we mentally perform the operations of Equation (8.5.2) on it. That's what, for example, doubling the size of a system does; it uses two for \lambda.
[19] What we do amounts to a proof of that theorem for the case of n = 1 in Equation (8.5.2). It is trivial to extend this to any n, including n = 0.
we then first do the derivatives with respect to \lambda and then set \lambda = 1. This gives the result:

    \left( \frac{\partial E}{\partial S} \right) S + \left( \frac{\partial E}{\partial V} \right) V + \left( \frac{\partial E}{\partial N} \right) N = E    (8.5.6)

The derivatives are known (see Equation (8.5.1) on the preceding page), so

    E = TS - pV + \mu N    (8.5.7)

This is not only useful on its own, but leads directly to the Gibbs-Duhem equation. To derive that all we need do is form the derivative of Equation (8.5.7):

    dE = \left( T \, dS - p \, dV + \mu \, dN \right) + \left( S \, dT - V \, dp + N \, d\mu \right)    (8.5.8)

Comparison of the first term on the right to Equation (8.5.1) on the preceding page shows that the second term on the right must be zero. Thus:

    S \, dT - V \, dp + N \, d\mu = 0    (8.5.9)

which is the Gibbs-Duhem equation.[20]
8.5.2 The Legendre Transformations

Legendre transformations are common in classical thermodynamics; they are simply not called that in most texts.

As an example, the second thermodynamic function usually introduced in thermodynamics courses is the enthalpy, H. This is done to have a thermodynamic function that depends on pressure and not volume as the internal energy does. The enthalpy is formally defined by:[21]

    H = E + pV    (8.5.10)

Now E is a function of S, V, and N. What we want is a function of S, p, and N. The Legendre transformation accomplishes this in a way that preserves all the information contained in E.

To do the transformation we simply take the derivative of Equation (8.5.10) and write:

    dH = dE + p \, dV + V \, dp    (8.5.11)

and insert Equation (8.5.1) on the preceding page for dE. The result is

    dH = T \, dS + V \, dp + \mu \, dN    (8.5.12)

where H is now a function of S, p, and N.

Information is preserved in the following two senses. First, given a formula for E in terms of S, V, and N, all the other thermodynamic quantities, i.e. T, p, and \mu, can be determined from it by simple differentiation.

Exactly the same information is now contained in H. Only now one needs a formula for H in terms of S, p, and N. Now T, V, and \mu can be obtained from that by simple differentiation.[22]
[20] This equation is useful in showing that, for example, in a three variable system, changes in the intensive variables are not independent.
[21] A formal definition is one that cannot in itself be used to actually do numerical computations, but leads to forms that can be so used.
[22] In fact, given the formula for E in terms of S, V, and N, one can obtain the appropriate formula for H by doing the exact same Legendre transformation on the formula for E that we did to derive H.
We can proceed systematically to generate other thermodynamic functions. The first term in Equation (8.5.1) on page 97 is T dS and we can swap those to form the Helmholtz free energy A:

    A = E - TS        dA = -S \, dT - p \, dV + \mu \, dN    (8.5.13)

The third term (we've already done the second term to get H) is \mu \, dN, so we can swap \mu and N to get:

    J = E - \mu N     dJ = T \, dS - p \, dV - N \, d\mu    (8.5.14)

which isn't often used in thermodynamics and has neither a name nor a standard symbol.[23]

And we can swap things in pairs. One can generate a huge number of such functions, especially when dealing with systems having more than one component. Then you can swap some of the N's for \mu's and leave others alone.

One important thermodynamic function is obtained by swapping T and S and \mu and N. We'll call this one I, again for no very good reason:

    I = E - TS - \mu N        dI = -S \, dT - p \, dV - N \, d\mu    (8.5.15)

The importance of this is that its independent variables correspond to those of the grand canonical ensemble. The symbol usually used for it is just -pV. This arises from a combination of Equation (8.5.15) and Equation (8.5.7) on the previous page:

    I = E - TS - \mu N = -pV    (8.5.16)

One last function, the Gibbs free energy G, which is obtained from E via:

    G = E - TS + pV        dG = -S \, dT + V \, dp + \mu \, dN    (8.5.17)

Of course no discussion of this sort would be complete without mention of the entropy itself. This is obtained from Equation (8.5.1) on page 97 by a simple rearrangement:

    T \, dS = dE + p \, dV - \mu \, dN    (8.5.18)
8.5.3 The Massieu Transformations and Dimensionless Equations

Unfortunately, while the thermodynamic functions of Section 8.5.2 are quite useful in classical thermodynamics, they are in the wrong form for statistical mechanics.[24]

All the partition functions we have developed (and all those we will develop) are dimensionless. Each corresponds to a thermodynamic function that is also dimensionless. Thus we need to develop an entire set of unfamiliar thermodynamic functions that are dimensionless.

These are manipulated in the same way that the standard thermodynamic functions are, but the transformations in this case are called Massieu Transformations.[25] These are really Legendre transformations, introduced by Massieu in 1869 for just this purpose.

We begin with Equation (8.5.18) and rewrite it in dimensionless form:

    d\left( \frac{S}{k} \right) = \frac{1}{kT} \, dE + \frac{p}{kT} \, dV - \frac{\mu}{kT} \, dN    (8.5.19)
[23] The author simply assigned it the symbol J, for no very good reason.
[24] Which does not stop people from using them anyway, of course after the appropriate rearrangements.
[25] Massieu, M. F., Sur les fonctions des divers fluides, Comptes Rendus Acad. Sci., 69, 858-862 (1869).
and note that[26] the entropy is a function of E, V, and N. Applying the Euler Theorem on Homogeneous Functions to this we quickly find that:

    \frac{S}{k} = \frac{1}{kT} E + \frac{p}{kT} V - \frac{\mu}{kT} N    (8.5.20)

and that the Gibbs-Duhem equation for dimensionless equations is:

    E \, d\left( \frac{1}{kT} \right) + V \, d\left( \frac{p}{kT} \right) - N \, d\left( \frac{\mu}{kT} \right) = 0    (8.5.21)

There is no standard notation for the functions derived from Equation (8.5.19) on the preceding page. We here will adopt, with small changes, the notation recently given by Planes and Vives.[27] To that end we introduce three definitions:[28]

    \beta = 1/kT ,        (8.5.22)
    \gamma = p/kT , and   (8.5.23)
    \delta = \mu/kT .     (8.5.24)
In these terms Equation (8.5.19) on the preceding page becomes

    d\sigma = \beta \, dE + \gamma \, dV - \delta \, dN    (8.5.25)

which looks far simpler. The entropy here is denoted \sigma instead of S because this is a dimensionless quantity, not the ordinary entropy of classical thermodynamics.[29]

The result corresponding to (8.5.20) is

    \sigma = \beta E + \gamma V - \delta N    (8.5.26)

It is now possible to define seven new dimensionless thermodynamic functions starting from the definition of the dimensionless entropy, (8.5.26):

    \psi(\beta, V, N) = \sigma - \beta E ,                        (8.5.27)
    \chi(E, \gamma, N) = \sigma - \gamma V ,                      (8.5.28)
    \omega(E, V, \delta) = \sigma + \delta N ,                    (8.5.29)
    \Phi(\beta, \gamma, N) = \sigma - \beta E - \gamma V ,        (8.5.30)
    \rho(E, \gamma, \delta) = \sigma - \gamma V + \delta N ,      (8.5.31)
    \Psi(\beta, V, \delta) = \sigma - \beta E + \delta N , and    (8.5.32)
    \zeta(\beta, \gamma, \delta) = \sigma - \beta E - \gamma V + \delta N .    (8.5.33)

Here \Psi has been used instead of \Xi, the symbol used by Planes and Vives, because Xi is used for the partition function in the grand canonical ensemble.

Equations (8.5.27)-(8.5.33) are once again formal definitions. Indeed, Equation (8.5.33) is identically zero, as can be seen by inserting Equation (8.5.26) into it.

Useful relations can be obtained from the formal ones by a very simple procedure. This will be illustrated for \psi(\beta, V, N). First we take the total derivative of Equation (8.5.27):

    d\psi = d\sigma - \beta \, dE - E \, d\beta    (8.5.34)
[26] Surprise! Surprise!
[27] Planes, Antoni and Vives, Eduard, Entropic Formulation of Statistical Mechanics, J. Stat. Phys. 106, Nos. 3/4, 827 (2002).
[28] Planes and Vives use a different symbol in Equation (8.5.24) in place of \delta. I have made this change because their symbol is most often used in the derivation of the grand partition function.
[29] There is much less to this than meets the eye. The only difference between the two is that S has been divided by k to get \sigma.
and then insert Eq. (8.5.25) on the preceding page into it. The result is

    d\psi = -E \, d\beta + \gamma \, dV - \delta \, dN    (8.5.35)

This is, in fact, the proper equation to use for identifying the canonical ensemble, and indeed we used it in a disguised manner in Equation (3.3.22) on page 33 where we used d(-A/kT) in place of d\psi.

To see that these are in fact equal we must go back to the formal definition of \psi in Equation (8.5.27) on the preceding page and reconvert that to the usual dimensioned form by multiplying through by kT. This gives:

    kT \psi = TS - E    (8.5.36)

and since A = E - TS we see that \psi is indeed -A/kT.

All of the dimensionless potentials can be converted to differentials in the same way that \psi was converted. Here is a listing of them:

    d\psi = -E \, d\beta + \gamma \, dV - \delta \, dN ,       (8.5.37)
    d\chi = \beta \, dE - V \, d\gamma - \delta \, dN ,        (8.5.38)
    d\omega = \beta \, dE + \gamma \, dV + N \, d\delta ,      (8.5.39)
    d\Phi = -E \, d\beta - V \, d\gamma - \delta \, dN ,       (8.5.40)
    d\rho = \beta \, dE - V \, d\gamma + N \, d\delta , and    (8.5.41)
    d\Psi = -E \, d\beta + \gamma \, dV + N \, d\delta         (8.5.42)

Some, but not all, of these are recognizable thermodynamic functions. The manipulations needed to prove this are much like the one just done for the Helmholtz free energy A. When they are done the following connections are found:

    \psi = -\beta A      \chi = \beta(TS - pV)      \omega = \beta H    (8.5.43)
    \Phi = -\beta G      \rho = \beta E             \Psi = \beta pV     (8.5.44)
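Since the sign conventions here are easy to get wrong, a short symbolic check is reassuring. This sketch (mine, not the text's) verifies a sample of the identifications in Equations (8.5.43)-(8.5.44), using the Euler result E = TS - pV + \mu N where it is needed:

    import sympy as sp

    T, S, p, V, mu, N, k = sp.symbols('T S p V mu N k', positive=True)
    E = T*S - p*V + mu*N                  # Euler relation, Eq. (8.5.7)

    beta, gamma, delta = 1/(k*T), p/(k*T), mu/(k*T)
    sigma = beta*E + gamma*V - delta*N    # Eq. (8.5.26); equals S/k

    checks = {
        'psi = -beta*A':  sigma - beta*E + beta*(E - T*S),
        'omega = beta*H': sigma + delta*N - beta*(E + p*V),
        'Phi = -beta*G':  sigma - beta*E - gamma*V + beta*(E - T*S + p*V),
        'Psi = beta*p*V': sigma - beta*E + delta*N - beta*p*V,
    }
    for name, expr in checks.items():
        print(name, sp.simplify(expr) == 0)   # all True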
Other Notations

We have used the notation of Planes and Vives (slightly modified). Other notations for the Massieu functions exist.

The most important of these is the notation of Callen,[30] in which he uses S[1/T] to stand for the Massieu transformation in which 1/T replaces E as the independent variable:

    S[1/T] = \frac{S}{k} - \frac{1}{kT} E    (8.5.45)

We've used \psi for this function.

The reader should note that S[1/T] is not the entropy. S[1/T] is a symbol for the entropy transformed as:

    dS[1/T] = \frac{dS}{k} - \frac{1}{kT} \, dE - E \, d\left( \frac{1}{kT} \right)    (8.5.46)

and substituting Equation (8.5.19) on page 99 for dS/k leads to:

    dS[1/T] = -E \, d\left( \frac{1}{kT} \right) + \frac{p}{kT} \, dV - \frac{\mu}{kT} \, dN    (8.5.47)

[30] H. B. Callen, Thermodynamics, First Edition, Wiley, 1960, page 101.
We've not adopted the Callen notation here,[31] because even though it shows what is being used in the transformation, it is still too easy to confuse it with the entropy. The Planes and Vives notation has the advantage of forcing the reader[32] to actually look the symbol up, thus reducing the possibility of error.

[31] Though it was used in earlier versions of this work.
[32] The reader may not agree with this...
9. Simple Quantum Statistics
9.1 Quantum Statistics
It is a fact of quantum mechanics that a wave function must be either symmetric or antisymmetric to exchanges in the positions of two identical particles.

This happens because although the square of the wave function cannot change when identical particles are interchanged, the wave function itself could change sign.

Both cases are known. If the sign changes, the wave function is antisymmetric and the particles are said to obey Fermi-Dirac statistics and are called fermions.

If the sign does not change, the wave function is symmetric and the particles are said to obey Bose-Einstein statistics and are called bosons.

Fermions turn out to have half-odd-integer spins (1/2, 3/2, ...). Bosons have spins that are whole integers (0, 1, 2, ...).

Fermions are common in chemistry. Electrons, protons, and neutrons are fermions. Fermions have the property that no more than one fermion may be in a given state at a given time.[1]

Bosons are less common in chemistry. The most common fundamental boson is the photon. Bosons have the property that any number of them may be in the same state at the same time.
Quantum mechanically, the identity of particles results in a situation unknown in classical mechanics. Two particles trading places does NOT result in two states that are formally identical. Trading places results in no change whatsoever! There is no sign that any change in the universe has taken place at all.

Put in a more familiar context, we are used to a circle being made up of 360°. That is, a rotation of 360° is needed to return a macroscopic object to its starting position. To see this imagine a mark placed on the macroscopic object. There is no other way to return a macroscopic object to its starting position than to rotate it 360°.

Now think about isotopically pure benzene. Rotation of a benzene molecule around an axis perpendicular to its plane through 60° results in returning it to its starting position. There is no way to mark an isotopically pure benzene molecule. Thus a benzene "circle" contains only 60°.
To illustrate the fundamental difference between fermions, bosons, and classical particles obeying Boltzmann statistics,[2] consider a system having only two particles and five possible states:

[1] This is the source of the Pauli exclusion principle, among other phenomena.
[2] I reject out of hand the term "Boltzons". It does not roll readily off the English-adjusted tongue...
    state   1   2   3   4   5
      1     +   .   .   .   .
      2         +   .   .   .
      3             +   .   .
      4                 +   .
      5                     +

where + indicates a legal boson state and a . a legal state for both fermions and bosons. All 25 states are legal classical states. But there are only 15 Bose-Einstein states and just 10 Fermi-Dirac states.

There are two things going on. Take, for instance, the state "1,1". It would have both particles in state 1. That's legal for classical and boson particles, but not for fermions. Then look at states such as "1,2" and "2,1". Those are both legal classical states, but they are identical fermion and boson states. So while each has only one particle per state, there is only one such state.
We have dealt with this situation in Section 7.8. There we spoke of boxes. Here we have energy levels. One result in that section was Equation (7.8.7):

    \Xi(\beta, V, \lambda) = \prod_k \sum_{n_k=0}^{n_{k,max}} \left( \lambda e^{-\beta \epsilon_k} \right)^{n_k}    ((7.8.7))

This contains a sum over n_k, running from 0 to n_{k,max} for each box. In the case of fermions each box can hold at most 1 particle, so n_{k,max} is 1. And Equation (7.8.7) becomes simply

    \Xi_{FD}(\beta, V, \lambda) = \prod_k \left( 1 + \lambda e^{-\beta \epsilon_k} \right)    (9.1.1)

But for bosons any number of particles can go into a box. So n_{k,max} is infinity. We then get:

    \Xi_{BE}(\beta, V, \lambda) = \prod_k \sum_{n_k=0}^{\infty} \left( \lambda e^{-\beta \epsilon_k} \right)^{n_k}    (9.1.2)
This last equation can be simplified. If x_k is substituted for \lambda e^{-\beta \epsilon_k}, we get:

    \Xi_{BE}(\beta, V, \lambda) = \prod_k \sum_{n_k=0}^{\infty} x_k^{n_k}    (9.1.3)

and we see that the sum is a geometric series whose sum is:

    \sum_{n=0}^{\infty} x^n = \frac{1}{1 - x}      -1 < x < 1

as long as x_k = \lambda e^{-\beta \epsilon_k} is less than 1. This is clearly the case for \exp(-\beta \epsilon_k), since both \beta and \epsilon_k are positive. However, what about \lambda?

The largest exponential term is the ground state (k = 0), so \lambda must be less than \exp(\beta \epsilon_0). Since \lambda = \exp(\beta \mu), this is equivalent to saying that \mu must be less than the ground state energy \epsilon_0. If, as is often the case, \epsilon_0 = 0, then \mu < 0 or \lambda < 1.

With this restriction on \lambda, Equation (9.1.3) reduces to:

    \Xi_{BE} = \prod_k \left( 1 - \lambda e^{-\beta \epsilon_k} \right)^{-1}    (9.1.4)
These two equations can be written together as:

    \Xi(\beta, V, \lambda) = \prod_k \left( 1 \pm \lambda e^{-\beta \epsilon_k} \right)^{\pm 1}    (9.1.5)

We will use this equation with the convention that the Fermi-Dirac sign is on the top and the Bose-Einstein sign is on the bottom.
9.2 Simple Results

Some simple results stem from Equation (9.1.5). If we take logs we get

    \ln \Xi = \pm \sum_k \ln \left( 1 \pm \lambda e^{-\beta \epsilon_k} \right)    (9.2.1)

Now since:

    \langle N \rangle = \lambda \left( \frac{\partial \ln \Xi}{\partial \lambda} \right)_{\beta, V} = \left( \frac{\partial \ln \Xi}{\partial (\beta\mu)} \right)_{\beta, V}    (9.2.2)

it follows that

    \langle N \rangle = \sum_k \left( \frac{\lambda e^{-\beta \epsilon_k}}{1 \pm \lambda e^{-\beta \epsilon_k}} \right)    (9.2.3)
Note the form of this. The average number of particles in the system is a sum over the quantum states k of the system. Each term in the sum pertains to a different quantum state k. Now each quantum state also has an average population. In fact it must be true that:

    \langle N \rangle = \sum_k \langle n_k \rangle    (9.2.4)

where \langle n_k \rangle is the expected number (average) of particles in quantum state k. Then from Equation (9.2.3) we can infer that \langle n_k \rangle is given by:

    \langle n_k \rangle = \frac{\lambda e^{-\beta \epsilon_k}}{1 \pm \lambda e^{-\beta \epsilon_k}}    (9.2.5)

It can be seen that for Fermi-Dirac statistics (top sign) the denominator is always larger than the numerator, and the average number of particles in a state is one or less. For Bose-Einstein statistics (bottom sign), the denominator is less than one, and so the average number of particles in a state can be greater than one. This is exactly what we'd expect.
Another similar result is a simple way to calculate the average energy \langle E \rangle:

    \langle E \rangle = \sum_k \langle n_k \rangle \epsilon_k = \sum_k \left( \frac{\epsilon_k \lambda e^{-\beta \epsilon_k}}{1 \pm \lambda e^{-\beta \epsilon_k}} \right)    (9.2.6)

The simplest result is saved for last. Since \beta pV = \ln \Xi,

    \beta \langle p \rangle V = \pm \sum_k \ln \left( 1 \pm \lambda e^{-\beta \epsilon_k} \right)    (9.2.7)

This is not the Ideal Gas law!
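Equations (9.2.3), (9.2.5), and (9.2.7) are easy to put on a computer. Here is a minimal sketch (my own, with an invented three-level spectrum) that evaluates the average occupations and \beta \langle p \rangle V for both statistics:

    import math

    def occupations(levels, beta, lam, stat):
        # Eq. (9.2.5): <n_k> = lam*exp(-beta*eps)/(1 +/- lam*exp(-beta*eps));
        # stat = +1 for Fermi-Dirac, -1 for Bose-Einstein.
        return [lam*math.exp(-beta*e) / (1 + stat*lam*math.exp(-beta*e))
                for e in levels]

    def beta_pV(levels, beta, lam, stat):
        # Eq. (9.2.7): beta<p>V = +/- sum_k ln(1 +/- lam*exp(-beta*eps))
        return stat*sum(math.log(1 + stat*lam*math.exp(-beta*e)) for e in levels)

    levels = [0.0, 1.0, 2.0]      # energies in units of kT (assumed spectrum)
    beta, lam = 1.0, 0.1          # lam < 1, as Bose-Einstein requires

    for stat, name in [(+1, 'FD'), (-1, 'BE')]:
        n = occupations(levels, beta, lam, stat)
        print(name, [round(x, 4) for x in n], round(sum(n), 4),
              round(beta_pV(levels, beta, lam, stat), 4))

Every Fermi-Dirac occupation stays at or below one, while the Bose-Einstein ground state is the more heavily populated of the two, exactly as the signs in Equation (9.2.5) dictate.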
9.3 The Ideal Gas Limit

The equations derived up to this point are exact for systems made up of independent subsystems. Comparison to the corresponding expressions for classical statistics (the Boltzmann expressions) shows that these are different. Thus under conditions that favor the display of quantum effects (low temperatures or high densities) it must be expected that the classical equations will not be obeyed.

But in the other limit, high temperatures and low densities, we can expect that the classical results will be obtained. In particular, it is to be hoped that both the Fermi-Dirac and the Bose-Einstein equations, different though they may look, will reduce to Boltzmann statistics. And we hope that Equation (9.2.7) on the previous page will turn into the ideal gas law.

In fact, it does. Any situation that results in the average occupation of a quantum energy level being small will do it. Consider Equation (9.2.5) on the preceding page:

    \langle n_k \rangle = \frac{\lambda e^{-\beta \epsilon_k}}{1 \pm \lambda e^{-\beta \epsilon_k}}    ((9.2.5))

The only way \langle n_k \rangle will be much less than 1 for any energy level \epsilon_k will be for \lambda to be much less than 1. This works for both Fermi-Dirac and Bose-Einstein statistics. And when it is true that \lambda is much less than 1, the denominator in both cases becomes essentially 1. Then Equation (9.2.5) on the previous page becomes

    \langle n_k \rangle = \lambda e^{-\beta \epsilon_k}    (9.3.1)

which has to be less than 1 too since \lambda is small. Note that the right-hand side of Equation (9.3.1) will be small for k large since our energies are numbered in ascending order. So if \epsilon_k is large, \langle n_k \rangle is small in any case. It is only the situations in which k = 1 or some other very small integer that we have to worry about.
If Equation (9.3.1) is now summed over all energy levels we get:

    \langle N \rangle = \sum_k \langle n_k \rangle = \lambda \sum_k e^{-\beta \epsilon_k} = \lambda q    (9.3.2)

where q is the classical single particle partition function. Thus this equation is identical to Equation (7.7.9) on page 84, the classical result.

We can solve Equation (9.3.2) for \lambda:

    \lambda = \frac{\langle N \rangle}{q}    (9.3.3)

Using this value of \lambda in Equation (9.3.1) then yields:

    \frac{\langle n_k \rangle}{\langle N \rangle} = \frac{e^{-\beta \epsilon_k}}{q}    (9.3.4)

the right-hand side of which will be recognized as the Boltzmann expression for the probability of finding a given particle in quantum energy level k. This is, of course, a classical result.
Once Equation (9.3.4) is obtained, all other results obtained at small \lambda must also be classical. Just to confirm this, when \lambda is small, Equation (9.2.7) on the preceding page can be expanded in a power series. The series needed is:

    \ln(1 \pm x) = \pm x - \frac{x^2}{2} \pm \frac{x^3}{3} - \frac{x^4}{4} \pm \cdots    (9.3.5)

which is valid for -1 < x < 1. The result of the expansion, carrying only the first term in \lambda, is:

    \beta \langle p \rangle V = \pm \sum_k \ln \left( 1 \pm \lambda e^{-\beta \epsilon_k} \right) \approx \lambda \sum_k e^{-\beta \epsilon_k} = \lambda q    (9.3.6)

which, by Equation (9.3.2), is equal to \langle N \rangle. So the result is:

    \beta \langle p \rangle V = \langle N \rangle    (9.3.7)

which is clearly the ideal gas law.
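The approach to the classical limit is easy to watch numerically. The sketch below (mine; the same invented three-level spectrum as before) compares \beta \langle p \rangle V with \langle N \rangle as \lambda shrinks:

    import math

    levels = [0.0, 1.0, 2.0]     # assumed spectrum, in units of kT
    beta = 1.0

    for lam in (0.5, 0.1, 0.01, 0.001):
        for stat, name in [(+1, 'FD'), (-1, 'BE')]:
            N = sum(lam*math.exp(-beta*e) / (1 + stat*lam*math.exp(-beta*e))
                    for e in levels)                       # Eq. (9.2.3)
            bpV = stat*sum(math.log(1 + stat*lam*math.exp(-beta*e))
                           for e in levels)                # Eq. (9.2.7)
            print(lam, name, round(bpV/N, 6))              # ratio -> 1

The ratio \beta \langle p \rangle V / \langle N \rangle approaches 1 from above for fermions and from below for bosons, so both statistics collapse onto the one classical ideal gas law as \lambda \to 0.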
So the quantum statistical expressions, by some colossal coincidence, both give rise to the same high temperature, low density result: the classical statistical thermodynamics of Boltzmann and Gibbs, at least for systems composed of independent subsystems.

Is there really magic in this? No, at least not if it is looked at in the right way. It is an experimental fact that systems of independent subsystems, under classical conditions, do obey the same laws. There were not two or more different sets of ideal laws discovered; there was only one. Thus, had one put the question to Gibbs or Boltzmann, they would have replied that, if low temperature, high density laws were different from the laws they knew, they would nevertheless have to come down to the classical laws in the classical region. Well, there are, and they do.
10. The Ideal Crystal
10.1 Introduction
The ideal crystal is an example of a system that is not made up of independent subsystems but which can be converted to independent subsystems if we take a different view of the system.

The mechanics of doing this are a bit difficult and so we will approach this topic in a somewhat unconventional way.[1]

The first thing to say is that the ideal crystal is much more of a hypothetical system than an ideal gas. But we shall ignore this and pretend that real crystals are ideal enough for this model to work.[2]

We will assume that we have a complete crystal with no imperfections and no missing atoms or molecules. And we will assume that all the atoms or molecules are identical. We will also assume that any motion of the atoms or molecules is insufficient to cause them to leave their positions within the crystal and wander around inside of it. And we will assume that the forces between particles in the crystal are harmonic. Last, we shall ignore any internal structure of any of the constituents.

The last assumption isn't necessary, but it does allow us to focus on the main problem, that of vibrations in a crystal. If one wants also to consider the internal structure, one simply puts it in with the assumption that the internal structure (internal rotations and vibrations) does not influence the vibrations of the particles making up the crystal itself.
At any finite temperature (and, considering zero point energy, even at absolute zero) all the particles of the system are vibrating around their equilibrium positions in the crystal. And we have assumed that the magnitude of the motion is such that a particle never is very far from its equilibrium position.

The forces felt by any one of these particles depend not only on the position of the particle itself, but on the positions of its nearest-neighbor particles and, perhaps, even on particles further away. It is not possible to consider any one of the basic atoms or molecules of the crystal as independent of any of the others.

Thus the crystal has to be considered as a whole. And if we do this and treat the system classically, we will end up with simultaneous equations of motion involving all the particles of the system. This huge number of equations will have to be solved in order to know the properties of the system.

This is an example of a strongly interacting system. And in general such systems cannot be treated exactly.

However, in this case we can simplify things greatly by focusing not on the particles, but on their vibrations.

It seems clear that the vibrations are concerted. That is, the motion of one particle is communicated to its neighbors and causes them to move in reaction. And that motion is communicated to their neighbors, and so on until the entire crystal is involved.
If the forces acting on the particles are harmonic, as we have assumed, then it is always possible to mathematically decompose the motions of the particles into a number of independent vibrations existing within the crystal called normal vibrational modes. Each particle is affected by all the normal mode vibrations, but the normal mode vibrations are independent of each other.

[1] But one, I hope, that will provide motivation for the difficulties that will come.
[2] Which seems to be true, at least in a rough way.

What that means is that if the amplitude of one normal mode is changed (or perhaps the mode is eliminated in some way), the remainder of the normal vibrational modes is not affected in any way.

Thus we change our viewpoint and focus on the normal modes as independent subsystems, paying no attention to the actual particles making up the crystal, without which no vibration could take place.

This change in viewpoint leads to some very simple and very useful models of crystals.
10.2 The Models

10.2.1 Common Features

So we shall focus on the vibrations themselves and not on the particles making up the crystal. The first question that then comes up is: how many normal mode vibrations are there?

As we've discussed before (see Chapter 6), classical mechanics tells us that there are three degrees of freedom for every particle. That means that each particle is free to move in three independent directions. Thus a diatomic molecule has a total of six degrees of freedom. Three are manifested as motions of the center of mass in the x-, y-, and z-directions. Two of the remaining three are rotations about the center of mass and the last is a vibration about the center of mass.

If we have N particles in our crystal, we then expect to have 3N degrees of freedom. Three of these are again the motion of the center of mass of the crystal and three more are rotations of the crystal about the center of mass. The remaining 3N - 6 are vibrations.

Since N is often of the order of Avogadro's number, there is no real difference between 3N and 3N - 6, and we shall often simply speak of the 3N normal vibrational modes.[3]

To allow for the possibility that not all of these normal vibrational modes have the same frequency, we shall define g(\nu) to be the degeneracy factor for the vibrations. Thus g(\nu) \, d\nu will be the number of normal mode vibrations that occur between frequency \nu and frequency \nu + d\nu.

There are some restrictions on g(\nu), the most important of which is that there must be a finite number of vibrations, 3N in particular. So we can write:

    \int_0^{\infty} g(\nu) \, d\nu = 3N    (10.2.1)

We must also understand that g(\nu) is not really a continuous function. It is in fact highly discontinuous since there are only a finite number of vibrations between 0 and infinity. So we might better have written Equation (10.2.1) as a sum, but as long as we understand what that equation means, there should be no confusion.

The rest is (almost) easy. We already know the single particle partition function for a vibration:

    q(V, T) = \frac{e^{-\Theta/2T}}{1 - e^{-\Theta/T}}    (10.2.2)

where \Theta is given by:

    \Theta = \frac{h\nu}{k}    (10.2.3)

[3] As shall be seen later on when we discuss how we get from particles to vibrations, 3N is actually correct!
with h Planck's constant, k Boltzmann's constant, and \nu the fundamental frequency of the harmonic vibrator.

The canonical partition function is

    Q(N, V, T) = e^{-N U_0 / 2kT} \prod_{j=1}^{3N} q_j(V, T)    (10.2.4)

where the term \exp(-N U_0 / 2kT) accounts for the binding energy of the crystal. The binding energy of any particular particle[4] is due to its interactions with all the other particles. Counting these up leads to each particle being counted twice, hence the factor of two in the denominator of Equation (10.2.4).
From Equation (10.2.4) we can get all of the thermodynamic properties. In fact, for historical reasons, we are most interested in the heat capacity.

A long while back Dulong and Petit formulated a rule that elements made up of single atoms[5] had constant pressure molar heat capacities of 3R, where R is the gas constant. That is still a useful rule.

However, when it became possible to make low temperature measurements, it became clear that this law was disobeyed in that region. Heat capacities fell off toward zero, becoming proportional to T^3 at very low temperatures.

Classical statistical mechanics was unable to account for this. In fact the problem is similar to that of black body radiation. And indeed, the same solution works for both. If one assumes that the vibrations are quantized, the fall off to zero in the heat capacity becomes simple to demonstrate.[6]

There is one catch. We can't do anything unless we know g(\nu). It is in the value of g(\nu) that most of the simple theories vary.
10.2.2 The Einstein Theory

Einstein, with characteristic ability to cut to the heart of a problem, assumed that all of the vibrational frequencies were identical. That is, he took g(\nu) to be a single spike: every one of the 3N modes has the same frequency \nu.

With that, things are fairly simple. We have only one \Theta and Equation (10.2.4) is then:

    Q(N, V, T) = e^{-N U_0 / 2kT} \, q(V, T)^{3N}    (10.2.5)

where the factor 3N is because that's the number of independent vibrations we have.[7] With q(V, T) given by Equation (10.2.2) on the preceding page we get

    Q(N, V, T) = e^{-N U_0 / 2kT} \left( \frac{e^{-\Theta/2T}}{1 - e^{-\Theta/T}} \right)^{3N}    (10.2.6)
And now the thermodynamics of the Einstein model are very simple.

    A = -kT \ln Q = \frac{N U_0}{2} - 3NkT \ln \left( \frac{e^{-\Theta/2T}}{1 - e^{-\Theta/T}} \right)    (10.2.7)
[4] We must not forget that underneath the vibrations we are still dealing with particles.
[5] Which leaves out elements such as sulfur but includes most metals.
[6] And was demonstrated by A. Einstein, Ann. Physik, 22, 180 (1907).
[7] Actually we have 3N - 6 if we take three translational and three rotational degrees of freedom for the crystal into account. However, with N of the order of Avogadro's number, six is quite negligible.
    \langle E \rangle = -\left( \frac{\partial \ln Q}{\partial \beta} \right) = \frac{N U_0}{2} + \frac{3Nh\nu}{2} + 3NkT \left( \frac{\Theta/T}{e^{\Theta/T} - 1} \right)    (10.2.8)

Only the last term in the expression for the energy depends on temperature, so the heat capacity is then:

    C_V = \left( \frac{\partial \langle E \rangle}{\partial T} \right) = 3Nk \left( \frac{\Theta}{T} \right)^2 \frac{e^{\Theta/T}}{\left( e^{\Theta/T} - 1 \right)^2}    (10.2.9)
[Figure 10.1: Graph of C_V vs T/\Theta for an Einstein crystal]
Equation (10.2.9) does contain the law of Dulong and Petit. One has to take the limit as T \to \infty. In that limit \Theta/T gets very small, so that the exponentials can be expanded in a power series:

    e^{\Theta/T} = 1 + \frac{\Theta}{T} + \cdots

So the numerator in Equation (10.2.9) can be replaced by 1, but since one is subtracted from the exponential in the denominator, the denominator must be replaced by (\Theta/T)^2. The result is then immediate:

    C_V \to 3Nk  as  T \to \infty    (10.2.10)

This is a Good Thing; had not the result agreed with the known experimental results we'd have heard no more of the theory.
Encouraged, let us look at the low temperature behavior of the heat capacity. Here \Theta/T becomes very large, the one in the denominator can be neglected as small, and we end up with

    C_V \to 3Nk \left( \frac{\Theta}{T} \right)^2 e^{-\Theta/T}  as  T \to 0    (10.2.11)

This clearly goes to zero as the temperature goes to zero. So we have done the primary job, the overall behavior of the heat capacity is found, and we can be happy.
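Equation (10.2.9) is simple enough to tabulate directly. A minimal sketch (mine, not the author's) that reproduces the curve of Figure 10.1 and both limits:

    import math

    def einstein_cv(t):
        # C_V / 3Nk as a function of the reduced temperature t = T / Theta,
        # from Eq. (10.2.9): (Theta/T)^2 e^{Theta/T} / (e^{Theta/T} - 1)^2
        x = 1.0 / t
        return x*x * math.exp(x) / (math.exp(x) - 1.0)**2

    for t in (0.1, 0.25, 0.5, 1.0, 2.0, 5.0):
        print(t, round(einstein_cv(t), 5))
    # tends to 1 (the Dulong-Petit value) as t grows,
    # and falls off exponentially as t -> 0, per Eq. (10.2.11)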
There is another test. Does the entropy obey the Third Law? It should go to zero as the temperature goes to zero.

The entropy is easy to find:

    S = \frac{\langle E \rangle}{T} - \frac{A}{T} = 3Nk \left( \frac{\Theta/T}{e^{\Theta/T} - 1} - \ln \left( 1 - e^{-\Theta/T} \right) \right)    (10.2.12)
which indeed[8] does go to zero as T goes to zero.

Much happiness reigns.

But wait. There is a dark cloud.

Experimentally the heat capacity goes to zero as T^3, and not as given in Equation (10.2.11) on the previous page, which has an exponential fall-off to zero.

Of course, this is not surprising. The Einstein theory is crude and certainly not all the vibrational modes have the same frequency. And so we look for a better theory.[9]
10.2.3 The Debye Theory

We find a better theory in the Debye model.[10] Debye approached the problem from the standpoint of the continuum mechanics of solids. He knew that for a continuous medium g(\nu) was proportional to \nu^2 for low frequencies. And those are precisely the frequencies that govern the low temperature behavior, because they contain the least energy.

And he realized that the classical crystal becomes continuous in the limit of very long wavelength vibrations.

A short wavelength vibration, say one with a wavelength of three or four interparticle distances, can't really be treated as a wave in a continuous medium. At some point in time it might have one particle at a node, another at the wave maximum, a third at a node again, and the fourth at the wave minimum, with nothing in between.

Yet a long wavelength vibration might span 10^{19} particles, which would be a very different story. Particles would be almost continuously placed along the wave.

Of course the long wavelength waves have the lowest energy, and those would be the important ones at low temperatures.
With these insights Debye essentially set g(\nu) equal to A\nu^2, where A is a constant. We can determine A from Equation (10.2.1):

    \int_0^{\nu_{max}} g(\nu) \, d\nu = A \int_0^{\nu_{max}} \nu^2 \, d\nu = 3N    (10.2.13)

where \nu_{max} is used instead of infinity since with a finite number of normal modes there must be an upper limit to the vibrational frequencies allowed.

Equation (10.2.13) gives us

    \frac{A}{3} \nu_{max}^3 = 3N

from which

    A = \frac{9N}{\nu_{max}^3}    (10.2.14)

so that we end up with

    g(\nu) \, d\nu = 9N (\nu^2 / \nu_{max}^3) \, d\nu  for  0 \le \nu \le \nu_{max},  and  0  otherwise    (10.2.15)
[8] Though it is left as an exercise for the reader.
[9] Einstein was well aware of this deficiency. He wrote in the early days of quantum theory (1907 in this case) and was interested in showing that quantization would solve the heat capacity problem. To do this he used the crudest model that contained the basic features needed.
[10] P. Debye, Ann. Physik, 39, 789 (1912).
The energy is then:

    \langle E \rangle = \frac{N U_0}{2} + \frac{9NkT}{\nu_{max}^3} \int_0^{\nu_{max}} \left( \frac{h\nu}{2kT} + \frac{h\nu/kT}{e^{h\nu/kT} - 1} \right) \nu^2 \, d\nu    (10.2.16)

Letting x = h\nu/kT and u = h\nu_{max}/kT,

    \langle E \rangle = \frac{N U_0}{2} + \frac{9NkT}{u^3} \int_0^u \left( \frac{x}{2} + \frac{x}{e^x - 1} \right) x^2 \, dx    (10.2.17)

The first integral in Equation (10.2.17) is easy but the second is not. Because of this the energy is usually written as:

    \langle E \rangle = \frac{N U_0}{2} + \frac{9Nh\nu_{max}}{8} + 3NkT \, D(u)    (10.2.18)

where D(u) is called the Debye function.[11] The Debye function is defined as:

    D(u) = \frac{3}{u^3} \int_0^u \frac{x^3 \, dx}{e^x - 1}    (10.2.19)

and is tabulated in many places.[12]
Some properties of the Debye function are easy to establish.[13] Although we will not show it here, it is not hard to prove that

    D(u) = \frac{3}{u^3} \int_0^u \frac{x^3 \, dx}{e^x - 1} \to \frac{\pi^4}{5u^3}  as  T \to 0  and  u \to \infty    (10.2.20)

and

    D(u) = \frac{3}{u^3} \int_0^u \frac{x^3 \, dx}{(1 + x + \cdots) - 1} = 1  as  T \to \infty    (10.2.21)

Thus as T \to 0

    \langle E \rangle \to \frac{N U_0}{2} + \frac{9Nh\nu_{max}}{8} + \frac{3N\pi^4 h\nu_{max}}{5} \left( \frac{kT}{h\nu_{max}} \right)^4    (10.2.22)

from which it is easy to see that the heat capacity is going to go as T^3, as expected. At high temperatures we have:

    \langle E \rangle \to \frac{N U_0}{2} + \frac{9Nh\nu_{max}}{8} + 3NkT  as  T \to \infty    (10.2.23)

which will give us, as it should, the law of Dulong and Petit once again.
The heat capacity can be computed more completely from the Debye function. We need to differentiate Equation (10.2.18) with respect to T. Since the first two terms of that equation are not functions of T we are left with:

    C_V = \left( \frac{\partial \langle E \rangle}{\partial T} \right)_{V,N} = \frac{\partial}{\partial T} \left( 3NkT \, D(u) \right) = 3Nk \, D(u) + 3NkT \left( \frac{\partial D(u)}{\partial T} \right)    (10.2.24)

Using the method explained in Appendix 10.5 on page 121 we get:

    \left( \frac{\partial D(u)}{\partial T} \right) = \frac{3}{T} \left( D(u) - \frac{u}{e^u - 1} \right)    (10.2.25)
[11] It isn't too shabby to have a function named after you...
[12] One needs to be careful with such tables. There are several things called Debye functions, all rather simply related to each other and to Equation (10.2.19). Check the definitions of the function in any table that you might use.
[13] It needs to be said that the Debye function is as much a function as the sine or cosine or exponential, for that matter. Those have to be tabulated too.
so that the result is:

    C_V = 3Nk \left( 4D(u) - \frac{3u}{e^u - 1} \right)    (10.2.26)

Following convention we define a Debye theta temperature \Theta_D by

    u = \frac{h\nu_{max}}{kT} = \frac{\Theta_D}{T}  so that  \Theta_D = \frac{h\nu_{max}}{k}    (10.2.27)
[Figure 10.2: Graph of C_V vs T/\Theta_D for a Debye crystal]
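Both D(u) and the Debye heat capacity are easy to evaluate numerically. Here is a minimal sketch (my own; Simpson's rule is just one reasonable quadrature choice) implementing Equations (10.2.19) and (10.2.26):

    import math

    def debye_D(u, n=1000):
        # Eq. (10.2.19) by composite Simpson's rule; n must be even.
        # The integrand x^3/(e^x - 1) vanishes like x^2 at the origin.
        f = lambda x: x**3 / math.expm1(x) if x > 0 else 0.0
        h = u / n
        s = (f(0.0) + f(u)
             + 4*sum(f((2*i - 1)*h) for i in range(1, n//2 + 1))
             + 2*sum(f(2*i*h) for i in range(1, n//2)))
        return (3.0 / u**3) * (h / 3.0) * s

    def debye_cv(t):
        # C_V / 3Nk from Eq. (10.2.26), with u = Theta_D / T = 1/t
        u = 1.0 / t
        return 4.0*debye_D(u) - 3.0*u / math.expm1(u)

    for t in (0.05, 0.1, 0.25, 0.5, 1.0, 2.0):
        print(t, round(debye_cv(t), 5))
    # goes as (12 pi^4/5)(T/Theta_D)^3 at the low end, per Eq. (10.2.28),
    # and approaches the Dulong-Petit value 1 at the high end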
In these terms we have

    C_V \to \frac{12Nk\pi^4}{5} \left( \frac{T}{\Theta_D} \right)^3  as  T \to 0    (10.2.28)

This reproduces the experimental results quite well, so well in fact that it is used to compute Third Law entropies. It is not easy to get heat capacities below about 10 K, so all one needs to do is to assume that C_V goes as CT^3 between 0 and 10 K. The constant C can be found from the measured heat capacity at 10 K.
One final note: the entropy is given by

    S = \frac{9Nk}{u^3} \int_0^u \left( \frac{x}{e^x - 1} - \ln \left( 1 - e^{-x} \right) \right) x^2 \, dx    (10.2.29)

At low temperatures u gets very large and nothing is lost by setting it to infinity. In that limit S becomes

    S \to \frac{4\pi^4 Nk}{5} \left( \frac{T}{\Theta_D} \right)^3    (10.2.30)

which can be seen to go to zero with temperature, as it should.
10.3 The One-Dimensional Crystal

Imagine a collection of N atoms arranged such that each has two nearest neighbors, except possibly for atoms 1 and N. This is a one-dimensional crystal. A line through all the atoms doesn't have to be straight; what is important is the interactions.

In fact here we will assume that the atoms are arranged on a circle so that the nearest neighbors of atom 1 are atom 2 and atom N.

And of course the particles don't have to be atoms. They can be molecules just as easily. However, we don't want to have to worry about the details of the internal motions of molecules, or electronic excitations of either molecules or atoms, for that matter. If one really needs to incorporate internal motions, that's easy enough to do. The math gets no harder, but the equations get significantly longer.
Let x_i denote the position of the ith atom. And let the potential energy of interaction be u(x_{i+1} - x_i). Then the total potential energy is:

    U = \sum_{i=1}^{N} u(x_{i+1} - x_i)    (10.3.1)

where it is understood that the subscripts wrap at atom N so that atom N + 1 is atom 1.

For simplicity right now let x_{i+1} - x_i = r, so that r is the distance between neighboring atoms; then the potential energy between pairs of atoms is u(r).

Now for a key assumption. We assume that u(r) has a single minimum at r = a. We also assume that u(r) rises rapidly enough from that minimum so that the atoms can never move past each other.

From the first assumption we have that:

    \frac{du(r)}{dr} = 0  at  r = a    (10.3.2)

where a is the equilibrium distance between any pair of atoms.
For simplicity we'll let the first particle be at x_1 = 0. Thus the equilibrium positions are x_i = (i-1)a. These particles are moving, so at any time their actual positions are:

    x_i = (i-1)a + \xi_i    (10.3.3)

where \xi_i is the deviation of particle i from its equilibrium position. With this we have:

    x_{i+1} - x_i = ia + \xi_{i+1} - (i-1)a - \xi_i = \xi_{i+1} - \xi_i + a    (10.3.4)

and

    u(x_{i+1} - x_i) = u(\xi_{i+1} - \xi_i + a)    (10.3.5)

so that

    U = \sum_{i=1}^{N} u(\xi_{i+1} - \xi_i + a)    (10.3.6)

To simplify things let

    y_i = \xi_{i+1} - \xi_i    (10.3.7)
We now expand U in Equation (10.3.6) in a power series around y_i = 0:

    U = U_0 + \sum_{i=1}^{N} \left( \frac{\partial U}{\partial y_i} \right)_0 y_i + \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \left( \frac{\partial^2 U}{\partial y_i \partial y_j} \right)_0 y_i y_j + \ldots    (10.3.8)

The constant term is easy to evaluate since all the y's are zero there. So

    U_0 = \sum_{i=1}^{N} u(a) = N u(a)    (10.3.9)
The first derivative is zero by virtue of Equation (10.3.2).

We evaluate the second derivative by noting that:

    \frac{\partial^2 U}{\partial y_i \partial y_j} = \frac{\partial^2}{\partial y_i \partial y_j} \sum_{m=1}^{N} u(y_m + a)    (10.3.10)

This is zero unless i = j, so only the pure second degree derivatives survive.

Thus to terms of the second order[14] we have:

    U = N u(a) + \frac{k}{2} \sum_{i=1}^{N} y_i^2    (10.3.11)
[14] What we've done is to take an arbitrary potential with a single minimum and derive the harmonic approximation to it.
where k is the value of the second derivative at equilibrium.

Reverting now to our \xi notation, we have:

    U = N u(a) + \frac{k}{2} \sum_{i=1}^{N} (\xi_{i+1} - \xi_i)^2    (10.3.12)

To see how the atoms move, we find the force F_i on the ith atom:

    F_i = -\left( \frac{\partial U}{\partial \xi_i} \right) = m \frac{d^2 \xi_i}{dt^2}    (10.3.13)

Differentiation of Equation (10.3.12) gives us several terms in \xi:

    m \frac{d^2 \xi_i}{dt^2} = k (\xi_{i+1} - 2\xi_i + \xi_{i-1})    (10.3.14)

which is true for all atoms. If we had not put atom N next to atom 1, we'd have had to take the end atoms into account separately. This is no problem, but it just adds complication without adding any insight.
We assume that the solution to Equation (10.3.14) is:

    \xi_i = A \sin \left( \frac{2\pi x_i}{\lambda} + \phi \right) \sin(2\pi\nu t) = A \sin \left( \frac{2\pi a(i-1)}{\lambda} + \phi \right) \sin(2\pi\nu t)    (10.3.15)

where A is the amplitude of the vibration, \phi is the phase of the vibration, \lambda is the wavelength of the wave, and \nu is its frequency. The first factor on the right-hand side of Equation (10.3.15) is the vibrational amplitude at point x_i = (i-1)a in the lattice and the second factor is the time variation of that amplitude.
We now differentiate Equation (10.3.15) twice with respect to time to get:

    \frac{d^2 \xi_i}{dt^2} = -4\pi^2 \nu^2 A \sin \left( \frac{2\pi a(i-1)}{\lambda} + \phi \right) \sin(2\pi\nu t)    (10.3.16)

Inserting Equation (10.3.16) into the left-hand side of Equation (10.3.14), and using Equation (10.3.15) with suitable values of i in the right-hand side of Equation (10.3.14), we get, after some manipulation,

    -4\pi^2 \nu^2 m \, \xi_i = 2k \left( \cos \frac{2\pi a}{\lambda} - 1 \right) \xi_i = -4k \, \xi_i \sin^2 \left( \frac{\pi a}{\lambda} \right)    (10.3.17)

from which we see that the condition under which Equation (10.3.15) is a solution to Equation (10.3.14) is

    \nu^2 = \frac{1}{\pi^2} \frac{k}{m} \sin^2 \left( \frac{\pi a}{\lambda} \right)    (10.3.18)
Now the sinusoidal solution in Equation (10.3.15) has meaning only at the actual atoms themselves. Or, put more exactly, only at the positions x_i. If we had other waves with wavelength \lambda' satisfying:

    \frac{a}{\lambda'} = 1 + \frac{a}{\lambda}  or  \frac{a}{\lambda'} = 1 - \frac{a}{\lambda}    (10.3.19)

these would result in exactly the same solutions as before, except that in the second case in Equation (10.3.19) the displacement would have the opposite sign.

Thus we can restrict our wavelengths to satisfy a/\lambda \le 1/2 or:

    \lambda \ge 2a    (10.3.20)

We see that the waves in our crystal can be infinitely long[15] but they cannot be shorter than 2a. Put another way, it does not matter if the wave makes one oscillation between adjacent atoms or ten oscillations. The effect on the atoms is the same and hence the waves are the same.
We can use this restriction in Equation (10.3.18) on the preceding page to get:

    \nu^2 = \frac{1}{\pi^2} \frac{k}{m} \sin^2 \left( \frac{\pi}{2} \right)  or  \nu = \frac{1}{\pi} \left( \frac{k}{m} \right)^{1/2} = \nu_{max}    (10.3.21)

In the long wavelength limit, \lambda becomes large and hence \sin(\pi a/\lambda) becomes very small. In that limit \sin(\pi a/\lambda) \approx \pi a/\lambda and Equation (10.3.18) on the previous page becomes:

    \nu^2 = \frac{1}{\pi^2} \frac{k}{m} \left( \frac{\pi a}{\lambda} \right)^2    (10.3.22)

and

    \nu \lambda = \left( \frac{k}{m} \right)^{1/2} a = c    (10.3.23)

where here c is the speed of the waves.
What we've shown just above is that the speed of the waves[16] is constant at the long wavelength end of the spectrum. This is the region of low temperature excitations of the crystal. At shorter wavelengths \nu\lambda is not constant; indeed their relationship is contained in Equation (10.3.18) on the preceding page. If one expands the sine function (after taking the square root of each side) in that equation as a function of its argument, one can see that the product \nu\lambda will depend on \lambda, and the smaller \lambda the more rapidly c will change.[17]

We can derive the degeneracy function g(\nu) for the one-dimensional crystal from Equation (10.3.18) on the previous page, or rather its square root:

    \nu = \frac{1}{\pi} \left( \frac{k}{m} \right)^{1/2} \sin \frac{\pi a}{\lambda}    (10.3.24)

by first noting that the wavelengths in our crystal are aN/1 units long (a is the interatom spacing), aN/2 units long, aN/3 units long, etc., or in general \lambda = aN/n units long. This rearranges to a/\lambda = n/N and so we can write Equation (10.3.24) as:

    \nu = \frac{1}{\pi} \left( \frac{k}{m} \right)^{1/2} \sin \frac{\pi n}{N}    (10.3.25)

If we treat n as a continuous variable we have that:

    d\nu = \frac{1}{N} \left( \frac{k}{m} \right)^{1/2} \cos \left( \frac{\pi n}{N} \right) dn    (10.3.26)

Noting that \sin\theta = \sin(\pi - \theta), there are two identical values of \nu, one for a given value of n and the other for n' = N - n. So we can (as we have) limit n to be half of the allowed range and multiply by two to compensate.

[15] Assuming an infinitely long one-dimensional crystal.
[16] These are, of course, sound waves, not electromagnetic ones.
[17] This phenomenon, called dispersion, can be seen (or rather heard) during thunderstorms when the sharp high pitched sound of a thunderclap reaches your ears faster than the low rumbles that come later.
We now identify dn with g(\nu) \, d\nu, the number of frequencies in the range from \nu to \nu + d\nu. We then get:

    g(\nu) \, d\nu = \frac{2N (m/k)^{1/2}}{\cos(\pi n/N)} \, d\nu    (10.3.27)

where it should be remembered that n \le N/2. The notation here can be simplified a good deal. First we let

    \nu_0 = \left( \frac{k}{m\pi^2} \right)^{1/2}    (10.3.28)

where \nu_0 is the maximum allowed frequency. Then we rewrite Equation (10.3.25) on the previous page as

    \nu = \nu_0 \sin \frac{\pi n}{N}    (10.3.29)

then

    \cos \frac{\pi n}{N} = \frac{1}{\nu_0} \left( \nu_0^2 - \nu^2 \right)^{1/2}

and the final form for the degeneracy function is:

    g(\nu) \, d\nu = \frac{2N}{\pi \left( \nu_0^2 - \nu^2 \right)^{1/2}} \, d\nu    (10.3.30)
As the frequency increases, the denominator gets smaller and smaller until at the limit there is an infinite discontinuity. But that doesn't matter. This result could have been deduced from classical mechanics. What matters is the low frequency, long wavelength limit, where \nu becomes negligible compared to \nu_0. When that approximation can be made, the right-hand side of Equation (10.3.30) becomes a constant. The Debye assumption for g(\nu) under those conditions is also a constant.

A very similar thing happens in three dimensions. The Debye assumption, Equation (10.2.15) on page 112, is correct at low frequencies, which are the important ones at low temperatures, but not correct at high frequencies. But that doesn't matter, because at temperatures where high frequencies become important, the behavior is classical and the details of g(\nu) are not so important.
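The dispersion relation (10.3.25) can be checked against a direct normal-mode calculation. The sketch below (mine, not the author's) diagonalizes the harmonic equations of motion for a ring of N identical masses and compares the resulting frequencies with Equation (10.3.25):

    import math
    import numpy as np

    N, k, m = 24, 1.0, 1.0          # ring size and force constant (arbitrary units)

    # Dynamical matrix of Eq. (10.3.14): m d2xi/dt2 = k(xi_{i+1} - 2 xi_i + xi_{i-1})
    D = np.zeros((N, N))
    for i in range(N):
        D[i, i] = 2.0 * k / m
        D[i, (i + 1) % N] = D[i, (i - 1) % N] = -k / m   # periodic wrap

    omega2 = np.linalg.eigvalsh(D)                # eigenvalues are omega^2
    nu_numeric = np.sort(np.sqrt(np.abs(omega2)) / (2.0 * math.pi))

    nu_analytic = np.sort([abs(math.sin(math.pi * n / N)) / math.pi
                           * math.sqrt(k / m) for n in range(N)])

    print(np.max(np.abs(nu_numeric - nu_analytic)))   # ~1e-15: they agree

Every nonzero frequency below the maximum comes out doubly degenerate (the n and N - n waves of the text), and the largest is \nu_{max} = (1/\pi)(k/m)^{1/2}, exactly as in Equation (10.3.21).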
10.4 Appendix: Behavior of the Debye Function

Since the behavior of the Debye function with temperature is not obvious from its definition except at the extremes of temperature discussed above, it is useful to see how series expansions of the Debye function can be developed.

As a reminder, the Debye function is:

    D(u) = \frac{3}{u^3} \int_0^u \frac{x^3 \, dx}{e^x - 1}    ((10.2.19))
At high temperatures u is small, and since 0 \le x \le u, x must be small too. The exponential in the denominator can then be expanded in a power series to get:

    \frac{x^3}{e^x - 1} = \frac{x^3}{x + x^2/2! + x^3/3! + \cdots} = \frac{x^2}{1 + x/2! + x^2/3! + \cdots}    (10.4.1)
where numerator and denominator were divided by x to get the denominator into standard form. The reciprocal of the series in the denominator is easily taken. The general formula for it is

    \frac{1}{1 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots} = 1 + b_1 x + b_2 x^2 + b_3 x^3 + \cdots    (10.4.2)

where the a_n on the left are assumed known and the b_n are to be determined. All one needs to do is multiply it out to get:

    1 + b_1 x + b_2 x^2 + b_3 x^3 + \cdots
      + a_1 x + a_1 b_1 x^2 + a_1 b_2 x^3 + \cdots
      + a_2 x^2 + a_2 b_1 x^3 + \cdots
      + a_3 x^3 + \cdots = 1    (10.4.3)
Grouping these gives:

    1 + (a_1 + b_1)x + (a_2 + a_1 b_1 + b_2)x^2 + (a_3 + a_2 b_1 + a_1 b_2 + b_3)x^3 + \cdots = 1    (10.4.4)

and the only way this equality can hold is if each of the coefficients of x vanishes. Thus we have the b's in terms of the a's:

    b_1 = -a_1
    b_2 = a_1^2 - a_2
    b_3 = 2 a_1 a_2 - a_3 - a_1^3
    b_4 = 2 a_1 a_3 - 3 a_1^2 a_2 - a_4 + a_2^2 + a_1^4    (10.4.5)
and

    \frac{x^2}{1 + x/2! + x^2/3! + \cdots} = x^2 \left( 1 - \frac{1}{2}x + \frac{1}{12}x^2 - \frac{1}{720}x^4 + \cdots \right) = x^2 - \frac{1}{2}x^3 + \frac{1}{12}x^4 - \frac{1}{720}x^6 + \cdots    (10.4.6)

where there is no typographical error: one term (the x^3 term of the series in parentheses) has a zero coefficient.
So the Debye function at high temperatures can be written as:

    D(u) = \frac{3}{u^3} \int_0^u \frac{x^3 \, dx}{e^x - 1} = \frac{3}{u^3} \int_0^u \left( x^2 - \frac{1}{2}x^3 + \frac{1}{12}x^4 - \frac{1}{720}x^6 + \cdots \right) dx    (10.4.7)

which can now be integrated term by term to give:

    D(u) = 1 - \frac{3u}{8} + \frac{u^2}{20} - \frac{u^4}{1680}    (u \le 1)    (10.4.8)

which in fact has an error of only 1 in the fifth decimal place at u = 1.
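The claimed accuracy is easy to confirm. A minimal sketch (mine; the quadrature step count is arbitrary):

    import math

    def debye_D(u, n=2000):
        # Eq. (10.2.19) by the midpoint rule; the integrand x^3/(e^x - 1)
        # is smooth and vanishes like x^2 at the origin.
        h = u / n
        return (3.0 / u**3) * h * sum(
            ((i + 0.5) * h)**3 / math.expm1((i + 0.5) * h) for i in range(n))

    def debye_D_series(u):
        # High-temperature series, Eq. (10.4.8)
        return 1.0 - 3.0*u/8.0 + u*u/20.0 - u**4/1680.0

    for u in (0.25, 0.5, 1.0):
        print(u, debye_D(u), debye_D_series(u))
    # at u = 1 the two values differ by about 1e-5, as the text says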
At low temperatures u is large and another trick can be used. We write the Debye function as:

    D(u) = \frac{3}{u^3} \int_0^{\infty} \frac{x^3 \, dx}{e^x - 1} - \frac{3}{u^3} \int_u^{\infty} \frac{x^3 \, dx}{e^x - 1}    (10.4.9)

The first integral can be done and has the value \pi^4/15. In the second term we can multiply top and bottom of the integrand by \exp(-x) to get

    \frac{x^3 \, dx}{e^x - 1} = \frac{x^3 e^{-x} \, dx}{1 - e^{-x}}    (10.4.10)
The trick is now to recognize that the denominator is a geometric series

    1 + y + y^2 + y^3 + \cdots = \frac{1}{1 - y}      y < 1

and since y is our \exp(-x) it is always less than one, so we can write the second integral in Equation (10.4.9) as:

    \frac{x^3 e^{-x}}{1 - e^{-x}} = x^3 e^{-x} \left( 1 + e^{-x} + e^{-2x} + \cdots \right)    (10.4.11)

which is always valid since x is always positive.
So D(u) is then:

    D(u) = \frac{\pi^4}{5u^3} - \frac{3}{u^3} \int_u^{\infty} x^3 e^{-x} \left( 1 + e^{-x} + e^{-2x} + \cdots \right) dx    (10.4.12)

This can be re-arranged to

    D(u) = \frac{\pi^4}{5u^3} - \frac{3}{u^3} \sum_{n=1}^{\infty} \int_u^{\infty} x^3 e^{-nx} \, dx    (10.4.13)
The integral is

    \int_u^{\infty} x^3 e^{-nx} \, dx = -\left( \frac{x^3}{n} + \frac{3x^2}{n^2} + \frac{6x}{n^3} + \frac{6}{n^4} \right) e^{-nx} \Big|_u^{\infty}
                                      = \left( \frac{u^3}{n} + \frac{3u^2}{n^2} + \frac{6u}{n^3} + \frac{6}{n^4} \right) e^{-nu}    (10.4.14)

When all the pieces are put together the Debye function is then:

    D(u) = \frac{\pi^4}{5u^3} - \sum_{n=1}^{\infty} \left( \frac{3}{n} + \frac{9}{n^2 u} + \frac{18}{n^3 u^2} + \frac{18}{n^4 u^3} \right) e^{-nu}    (10.4.15)
This series is in principle exact. Keeping only one or two correction terms suffices for values of u > 5. For smaller values of u more and more correction terms are needed.[18] The difficulty in using this result for moderate u lies primarily in the fact that as u decreases, the first term increases greatly in value. And so does the sum of the correction terms. The result then is the small difference between two large quantities. Numerically this can be a recipe for disaster unless great care is taken.[19]

[18] I have followed the derivation of Mayer and Mayer, Statistical Mechanics, First Edition, John Wiley and Sons, 1940. The only real difference is that Equation (10.4.15) is more general, Mayer and Mayer having stopped with n = 2.
[19] If one needs to do this, one should consult a text on numerical analysis.
Chapter 10: The Ideal Crystal 121
10.5 Appendix: Dierentiating Functions Dened by Inte-
grals
It sometimes happens that one needs to nd the derivative of a function dened by an integral. For
example, consider the integral
I(r) =
_
r
0
xdx (10.5.1)
and let us assume that one wants to know dI/dr. Here it is easy. One rst does the integral:
I(r) = r
2
/2 (10.5.2)
and then dierentiates with respect to r to get:
dI(r)
dr
= r (10.5.3)
But what if the integral cannot be done?
The general case of this is an integral that is a function of r that could be given by:
I(r) =
_
b(r)
a(r)
f(r, x)dx (10.5.4)
where f(r, x) is some function that cannot be readily integrated and a and b as well as f(r, x) depend
on r.
The solution is based on Leibnitz rule and is given by
20
d
dr
_
b(r)
a(r)
f(r, x) dx = f
_
r, b(r)
_
b

(r) f
_
r, a(r)
_
a

(r) +
_
b(r)
a(r)
f(r, x)
r
dx (10.5.5)
where a

(r) and b

(r) are the rst derivatives of the limits a and b with respect to r.
Here are some examples:
Example 10.1:
Consider the integral
I =
d
dr
_
2r
2
1
r
x
dx (10.5.6)
The integral can be done easily enough and the result then dierentiated. Doing so we get:
I =
d
dr
_
2r
2
1
r
x
dx =
d
dr
r lnx

2r
2
1
=
d
dr
(r ln2r
2
) = 2 + +ln2 + 2 lnr (10.5.7)
Or we could use Equation (10.5.5):
I =
d
dr
_
2r
2
1
r
x
dx =
r
2r
2
4r
r
1
0 +
_
2r
2
1
_
d
dr
r
x
_
dx
= 2 +
_
2r
2
1
dx
x
= 2 + lnx

2r
2
1
= 2 + ln2 + 2 lnr (10.5.8)
20
Im not going to prove this. The proof is not at all hard but would take us too far aeld. My source for this is
W. Kaplans Advanced Calculus, Addison-Wesley, 1952, a book published over 50 years ago but still in print as the
5
th
edition published in 2002. The page reference is 220 in the original edition. The proof can likely be found in any
reasonable calculus book.
Chapter 10: The Ideal Crystal 122
which gives the same result.
Heres another example using the Debye Function:
Example 10.2:
Consider
I =
3
u
3
_
u
0
x
3
dx
e
x
1
(10.5.9)
where u = h/kT and x = h
max
/kT which is the Debye function. This example is a bit more
complex than the example above rst because the T in x is not the T that is being dierentiated.
It is a dummy variable of integration and could be changed to another letter entirely. And second
it is more complex because of the terms out in front of the integral.
All these complications do is require us to be more careful. I rewrite Equation (10.5.9) to show the
explicit dependence on T using u = b/T with b = h/k:
I =
3T
3
b
3
_
b/T
0
x
3
dx
e
x
1
(10.5.10)
Now
dI
dT
=
d
dT
_
3T
3
b
3
_
b/T
0
x
3
dx
e
x
1
_
=
3
T
3
u
3
_
b/T
0
x
3
dx
e
x
1
+
3
u
3
d
dT
_
b/T
0
x
3
dx
e
x
1
(10.5.11)
which is:
dI
dT
=
3
T
D(u) +
3
u
3
d
dT
_
b/T
0
x
3
dx
e
x
1
(10.5.12)
Now we can concentrate on dierentiating the integral itself. We have:
d
dT
_
b/T
0
x
3
dx
e
x
1
=
u
3
e
u
1
_
du
dT
_
+
_
b/T
0
d
dT
x
3
dx
e
x
1
(10.5.13)
where since the integral is not a function of T the last term is zero. The derivative du/dT is simply
u/T so the result is
d
dT
D(u) =
3
T
_
D(u)
u
e
u
1
_
(10.5.14)
11. Simple Lattice Statistics
11.1 Introduction
Now that ideal gases have been dealt with, it might be instructive to look at other simple ideal models.
One set of such models arise from the consideration of independent uid molecules sticking to, or
being adsorbed
1
onto a site without that aecting any other side.
Quotes were used around site because the nature of the site can vary enormously. It could be a
position on a crystal surface that can cause another molecule to stick to it, or it could be a similar
position along a polymer such as DNA. The sites could be organized in a regular manner or they
could be random. Each model is dierent, but (almost) all can be treated by similar means.
11.2 Langmuir Adsorption
One should start any problem in statistical thermodynamics by specifying the model one is going to
use. And that specication should be as precise as one can make it so as to avoid later confusion.
The model we consider here is often called a model of Langmuir Adsorption.
Here we will consider the surface of a crystal that contains M independent adsorption sites. The
sites do not inuence each other, and while we do not have to specify exactly how that happens,
it often pays to have something specic to think about. So we will stipulate that the sites are far
enough apart so that they are independent.
Each site has a potential energy U(x, y, z) associated with it. The x and y directions lie in the
surface of the crystal while the z direction is perpendicular to the surface.
We also assume that the potential U(x, y, z) has a single minimum at the place we call the site and
rises in all directions from that position. The zero of energy will be taken as the energy of a gas
molecule at rest innitely far from the surface. The energy of the minimum of U(x, y, z) will be
denoted as U
0
, which is thus a negative number.
A particle bound to the site is free to vibrate in all three directions, quite possibly with dierent
vibrational frequencies in each direction.
Thus the single particle partition function for a particle bound to the site is q(T) given by:
q(T) = q
x
q
y
q
z
e
U0/kT
(11.2.1)
where q
x
, q
y
, and q
z
are the partition functions associated with particle vibration in the correspond-
ing directions.
The heat of adsorption is the dierence between the energy a particle has at innity (zero) and the
energy it has at the potential minimum, which is
U
adsorption
= U
0
+h(
x
+
y
+
z
)/2 (11.2.2)
1
Note that the word adsorbed means stuck onto while the similar sounding word absorbed means to be taken into.
123
Chapter 11: Simple Lattice Statistics 124
Here the sites are distinguishable since they are xed on the surface of a macroscopic object, the
crystal. Thus wed expect that Q would be q
N
, but wed be wrong! We are wrong because not all
of the M independent adsorption sites need be occupied. If we let the number of occupied sites be
N, then we have what is known as a positional degeneracy due to the fact that for N < M, there is
more than one way to arrange the N particles on the M sites.
However, the problem is easy to solve. There are M! arrangements of occupied and unoccupied sites.
The order of the N occupied sites does not matter, nor does the order of the (M N) unoccupied
sites. So the positional degeneracy
pos
is:

pos
=
M!
N!(M N)!
(11.2.3)
and thus Q(M, N, T) should be written as:
Q(N, M, T) =
M!
N!(M N)!
q
N
(11.2.4)
where M occurs in Q(M, N, T) because here M plays the role of a volume.
The thermodynamics of this system depends on the logarithm of Q:
lnQ = M lnM N lnN (M N) ln(M N) +N lnq (11.2.5)
and instead of
dE = TdS pdV +dN
we have
dE = TdS dM +dN (11.2.6)
Here M plays the role of a surface area, and indeed if we make the reasonable assumption that M
is proportional to the surface area / with proportionality factor a, then aM = /. We can then
interpret /a as a spreading pressure
s
.
Using the abbreviation:
=
N
M
where 0 1 (11.2.7)
it is then easy to show that:

kT
=
_
lnQ
M
_
N,T
= ln(1 ) = +
1
2

2
+
1
3

3
+ (11.2.8)
and

kT
=
_
lnQ
N
_
M,T
= ln

(1 )q
(11.2.9)
The rst of these two equations has the property that as 0, kT. If we restrict ourselves
temporarily to small we then have = kT or a
s
= aNkT//, or

s
/ = NkT , N/M very small. (11.2.10)
which is a two-dimensional form of the ideal gas law.
The second of the two equations is very interesting. Recalling that in thermodynamics
=
o
+RT lnp/p
o
where is the chemical potential of a substance at the current pressure p, p
o
is the standard pressure,
and
o
is the chemical potential at the standard pressure. Then if we have an ideal gas at a pressure
Chapter 11: Simple Lattice Statistics 125
p in equilibrium with the gas adsorbed on the surface, then the chemical potential of the adsorbed
gas and the chemical potential
gas
of the ideal gas must be equal. So:

kT
= ln

(1 )q
=

gas
kT
=

o
kT
+ lnp
and
ln

(1 )q
=

o
kT
+ lnp (11.2.11)
If we let:
= qe

o
/kT
(11.2.12)
and solve Equation (11.2.11) for we nd that:
=
p
1 +p
(11.2.13)
Graph of vs. p for Langmuir Adsorption
Figure 11.1: Graph of vs. p for Langmuir Adsorption
This is the Langmuir Adsorption Isotherm which gives the fraction of sites covered with adsorbed
gas as a function of the gas pressure p in the system. In reality, many adsorbing systems follow this
equation, at least approximately.
11.3 The Grand Canonical Site Partition Function
We have dealt with the grand canonical partition function for sites in the general case in Section
7.8. But now we worry about the interactions between the particles attached to the same site.
Here we start with (, M, ) where M is the number of sites. It plays the role of the volume V .
Now we let a
n
be the number of sites having n particles attached. How they are attached is of
no importance. Then we can write q(n)
an
as the canonical partition function for a single site with
exactly n particles attached to it.
Then the canonical partition function for the entire system is:
Q(N, M, ) =

M!q(0)
a0
q(1)
a1
q(m)
am
a
0
!a
1
! a
m
!
(11.3.1)
where m is the maximum number of particles that can be absorbed at a site.
2
The sum is over all
sets of as that satisfy the following conditions:
m

n=0
a
n
= M and
m

n=0
na
n
= N (11.3.2)
Why does Equation (11.3.1) have this form? Because we have M boxes (sites), a
n
of which contain
n particles. The particles themselves dont matter. They only serve to distinguish one box from
another. So an a
10
box is distinguishable from an a
17
box, but the a
10
boxes are all the same.
2
Of course m could be innite.
Chapter 11: Simple Lattice Statistics 126
There are M! ways of arraning the M boxes, but rearranging the a
17
boxes, for example, changes
nothing. So the total number of distinct ways of arranging the M boxes is:
M!
a
0
!a
1
! a
m
!
(11.3.3)
Expression (11.3.3) can be thought of as the degeneracy factor associated with the product of qs in
Equation (11.3.1) on the previous page.
We cannot directly evaluate Equation (11.3.1) on the preceding page. But we can if we form the
grand canonical partition function (, M, ):
(, M, ) =
mM

N=0
Q(N, M, )e
N
=
mM

N=0
Q(N, M, )
N
(11.3.4)
where we have made the substitution of for e

in the last equation.


Substituting for Q(N, M, ) in Equation (11.3.1) on the previous page:
(, M, ) =

M!q(0)
a0
q(1)
a1
q(m)
am

N
a
0
!a
1
! a
m
!
(11.3.5)
which can be quickly rewritten as:
(, M, ) =

M![q(0)
0
]
a0
[q(1)
1
]
a1
[q(2)
2
]
a2
[q(m)
m
]
am
a
0
!a
1
! a
m
!
(11.3.6)
If you are familiar with the multinomial theorem, which is like the binomial theorem but for more
variables, the right-hand side of Equation (11.3.6) can be recognized as the Mth power of some
polynomial sum. Here we have:
= q(0)
0
+q(1)
1
+q(2)
2
+ +q(m)
m
(11.3.7)
and
(, M, ) =
M
(11.3.8)
and, lastly:
=
m

n=0
q(n)
n
(11.3.9)
The average number of molecules N) on all the sites (adsorbed) is
N) =
_
ln

_
= M
_
ln

_
(11.3.10)
and s), the average number of molecules per site is then
s) =
N)
M
(11.3.11)
11.4 Examples
Here are three examples of the use of the grand site partition function.
Chapter 11: Simple Lattice Statistics 127
11.4.1 The Langmuir Adsorption Isotherm Again
If we set the maximum number of molecules per site to be 1, q(0) to be 1 and q(1) to be simply q,
we get:
= 1 +q, and =
M
(11.4.1)
so that, with Equation (11.3.11) on the previous page
N) = M
_
ln

_
=
Mq
1 +q
(11.4.2)
and since
=
N)
M
=
q
M
(11.4.3)
we have recovered Equation (11.2.13) on page 125.
Recognizing that
M = ln (11.4.4)
we then get
= kT ln(, T) (11.4.5)
11.4.2 Independent Pairs of Sites
Let us consider a more complex model. We will have M pairs of sites, each pair being made up of
two dierent sites which we will call 1 and 2. A molecule bound to a site of type 1 will have partition
function q
1
. A molecule bound to a site of type two; q
2
. If both sites of a pair are occupied, there
will be an interaction energy w between them.
This is our rst example of an interaction energy.
Our system then is made up of M independent pairs of sites with s = 0, 1, or 2. If we consider all
possible arrangements we see that:
q(0) = 1 , q(1) = q
1
+q
2
, q(2) = q
1
q
2
e
w
(11.4.6)
and so
= 1 + (q
1
+q
2
) +q
1
q
2

2
e
w/kT
(11.4.7)
We nd N) in the normal way and then
s =
N)
M
=
(q
1
+q
2
) + 2q
1
q
2

2
e
w/kT
1 + (q
1
+q
2
) +q
1
q
2

2
e
w/kT
(11.4.8)
Plot of the Equation Above
Figure 11.2: Isotherm for Pairs of Sites
11.4.3 Brunauer-Emmett-Teller Adsorption
We can also deal with cases where more than one molecule is adsorbed at a site. The model is of
a surface with M independent adsorbing sites, each of which can adsorb an indenite number of
molecules in a stack. We let q
1
be the partition function for the rst molecule in the stack, q
2
for
Chapter 11: Simple Lattice Statistics 128
the second, q
3
for the third, and so on. Of course this isnt too realistic a model because adsorption
does not take place in deep stacks this way.
We then have:
= 1 +q
1
+q
1
q
2

2
+q
1
q
2
q
3

3
+ (11.4.9)
and then
s =
N)
M
=
q
1
+ 2q
1
q
2

2
+ 3q
1
q
2
q
3

3
+cdots
1 +q
1
+q
1
q
2

2
+q
1
q
2
q
3

3
+
(11.4.10)
Lets now specialize the model a bit. The rst molecule in the stack will still have partition function
q
1
, but let us assume that all the other molecules in the stack have partition function q
2
. Then we
have:
s =
N)
M
=
q
1
(1 + 2q
2
+ 3q
2
2

2
+ )
1 +q
1
(1 +q
2
+q
2
2

2
+ )
(11.4.11)
Some manipulation gives:
s =
q
1

(1 q
2
+q
1
)(1 q
2
)
(11.4.12)
If now we let
r = q
1
/q
2
and x = q
2
= q
2

gas
= q
2
p exp(
o
/kT) (11.4.13)
we end up with
s =
rx
(1 x +rx)(1 x)
(11.4.14)
which is the well-known Brunauer-Emmett-Teller adsorption isotherm, a name usually abbreviated
as BET.
3
Plot of the BET Isotherm for r = 200
Figure 11.3: Plot of the BET Isotherm for r = 200
11.5 Lattice Gas in One Dimension
Here we study a one-dimensional system of M sites on a line. Each site may be empty or occupied
by a particle. These particles do not move. If it seems unrealistic to call such a system a gas,
think of it this way: the particles really do move, but every time we look at the system, the particles
are found in discrete boxes (sites) on the line. We look often and average our results.
Each site or box on the line may be either empty or contain a single particle.
This model is a particular instance of an Ising Model, more normally set up as a system of spins,
spin up corresponding to our occupied site and spin down to an empty site.
Each of the particles is assumed to have an internal partition function q(T) accounting for any
internal vibration or rotation. We will generally ignore this.
The one complication in our system is that we assume that two nearest neighbor occupied sites have
a binding energy w between them. Thus for positive w, the occupation of one site encourages
the occupation of its nearest neighbors, while negative w discourages the occupation of its nearest
neighbor sites.
3
Brunauer, Emmett, and Teller, Adsorption of Gases in Multiple Layers, JACS, 60, 309, (1938).
Chapter 11: Simple Lattice Statistics 129
We assume that there are N molecules distributed among the M sites. Since we assume a nearest
neighbor interaction, we need to know how many nearest neighbor pairs there are in the system. We
will call this quantity N
11
. There are two other quantities related to this. One is N
00
the number
of completely unoccupied pairs of sites, and N
01
, the number of pairs of sites with one site occupied
and the other empty.
With this notation, the total interaction energy will be N
11
w.
There seem to be a lot of variables here. Of course M (which plays the role of a volume) and N are
given quantities. But we still have three other terms, N
00
, N
01
, and N
11
, to deal with. But it seems
obvious that these must be related to M and N somehow.
Some insight into the situation can be obtained by considering a (small) example. Let X mark an
occupied site and O an unoccupied one. Then one possible conguration of a lattice gas could be:
O X X O O X X X O (11.5.1)
In this example M = 9 and N = 5.
Now starting from the left, draw a line from each occupied site to its two nearest neighbors.
4
Thus
nothing is drawn from the rst site (O) but a line is drawn to the left and to the right from the
second (X) site. When we are done, the example looks like this:
O X = X O O X = X = X O (11.5.2)
The rst thing to notice is that there are 2N lines in total. This is true because there are two lines
drawn from each occupied site and there are N occupied sites. This will always be true, no matter
what the conguration. Looking at this another way, there will always be two lines drawn between
an X X pair and one line between an O X pair. Thus it must always be true that:
2N = 2N
11
+N
01
(11.5.3)
If we now erase our lines and draw instead one line to each neighbor from each empty (O) site, we
will have 2(M N) lines total, because M N is the number of empty sites. Similarly there will
always be two lines drawn between a O O pair and one for each O X pair. This leads to:
2(M N) = 2N
00
+N
01
(11.5.4)
One should pay attention to the symmetry of this situation.
There are now two relationships among our three variables N
00
, N
01
, and N
11
, So only one is
independent. It doesnt matter much which one we choose, but it is convenient to pick N
01
, since it
occurs on both Equation (11.5.3) and Equation (11.5.4).
So we can replace N
11
with
N
11
=
_
N
N
01
2
_
(11.5.5)
and similarly for N
00
:
N
00
=
_
M N
N
01
2
_
(11.5.6)
The total interaction energy N
11
w is then
N
11
w =
_
N
N
01
2
_
w (11.5.7)
4
I have no idea who rst thought this scheme up. But it is a very clever device and should not be forgotten!
Chapter 11: Simple Lattice Statistics 130
Weve been looking at one particular conguration. There are not only many dierent congurations,
there are many ways to obtain even the one we are looking at. There are, all together, g(N, M, N
01
)
congurations with any given values of N, M, and N
01
. And for any given value of N
01
there is a
term:
g(N, M, N
01
)e
N11w
= g(N, M, N
01
)e
[(NN01/2)w]
(11.5.8)
in the partition function for this system.
To include all congurations with given values of N, M, and T we have for the canonical partition
function Q(N, M, T):
Q(N, M, T) = q
N

N01
g(N, M, N
01
)e
[(NN01/2)w]
(11.5.9)
where q is the partition function for any internal motions of the particles such as vibration or
rotation.
The partition function in Equation (11.5.9) can be simplied slightly by factoring out the term in
exp(w), which is not aected by the summation:
Q(N, M, T) =
_
qe
w
_
N

N01
g(N, M, N
01
)
_
e
w/2
_
N01
(11.5.10)
Of course to do anything with this we need an actual expression for g(N, M, N
01
). First, we must
have:

N01
g(N, M, N
01
) =
M!
N!(M N)!
(11.5.11)
But that doesnt get us very far. But there is another graphical calculation that helps.
We now assume, for the time being, that N
01
is odd and that the left-hand site is occupied. Then,
for each conguration, we draw a vertical line between every O X junction. Heres an example:
XXX [ OO [ X [ O [ X [ OO [ XX [ O (11.5.12)
In this example we have M = 13 and N = 7. The vertical lines separate the sites into groups of all
Xs or all Os.
Given that the left-hand group is always Xs, each line marks the position of the start of another
group. Half the time that group will be Xs. In fact, since we assumed that N
01
is odd and the
left-hand group is always of Xs, there will always be (N
01
+ 1)/2 groups of Xs and (N
01
+ 1)/2
groups of Os and the right-hand group will always be Os.
One can see what is happening by adding another group to the right. It must be a group of Xs since
the last group was of Os. Now N
01
is no longer odd, so to compensate for that we add a second
group, this time of Xs. Thus we see that the two groups either occur in equal numbers or the Os
exceed the Xs by 1. If we had assumed that N
01
was even, then the formulas would be N
01
/2
instead and the groups would again occur either in equal numbers or with the Xs in the majority
by one.
So the quest to evaluate g(M, N, N
01
) now comes down to this:
How many ways are there to arrange N Xs in (N
01
+ 1)/2 groups?
Now each X group must contain at least one X, This uses up (N
01
+ 1)/2 of the N Xs. There are
then N (N
01
+ 1)/2 Xs left and there are (N
01
+ 1)/2 places to put them.
Chapter 11: Simple Lattice Statistics 131
This is a bit of a tricky problem. To see how it works out we have to digress a bit. Let us divide n
particles into b groups. One way to look at this is to take n Xs and b 1 lines and arrange them in
a row. Heres one arrangement of 10 Xs and 4 lines:
XXX[XX[[XXXXX[ (11.5.13)
There are 4 lines, but they divide the Xs into ve bins: 3 in the rst bin, 2 in the second bin, 0 in
the third bin, 5 in the fourth bin, and 0 in the fth bin (at the end).
So we can solve the problem of having n Xs and in b bins by taking n + (b 1) objects and asking
how many permutations there are of the entire list. There are clearly (n+b 1)! such permutations.
This number must be divided by n!, since the order of the Xs doesnt matter and by (b 1)! since
the order of the lines doesnt matter. The result is:
(n +b 1)!
n!(b 1)!
(11.5.14)
With this in mind we can solve our original problem. We had N (N
01
+ 1)/2 = x Xs and
(N
01
+ 1)/2 = b places to put them. Plugging these numbers into Expression (11.5.14) gives:
[N (N
01
+ 1)/2 + (N
01
+ 1)/2 1]!
[N (N
01
+ 1)/2]![(N
01
+ 1)/2 1]!
=
[N 1]!
[N (N
01
+ 1)/2]![(N
01
+ 1)/2 1]!
(11.5.15)
Or, if we recognize that the numbers are large and drop the 1s, we get the more simple expression:
N!
[N (N
01
/2]![N
01
/2]!
(11.5.16)
But we are not done! We have to consider the possible ways of arranging the MN Os among the
(N
01
+1)/2i groups of Os, taking into account the fact that each O group must contain at least one
O. This is entirely symmetric with what was done with the Xs, so only the result will be given. It
is:
[M N]!
[M N N
01
/2]![N
01
/2]!
(11.5.17)
after once again dropping 1s in comparison to large numbers.
The total number of congurations is the product of Expressions (11.5.16) and (11.5.17). Why?
Because for each conguration of the Os, there are Expression (11.5.16) congurations of the Xs.
The product of these is then g(N, M, N
01
):
g(N, M, N
01
) = 2
N![M N]!
[N (N
01
/2)]![M N N
01
/2]![[N
01
/2]!
2
(11.5.18)
The two multiplying the formula above comes from the fact that we insisted that the rst group be
Xs. Of course it could just as well have been Os, hence the multiplication by 2.
We now look back at the Equation (11.5.10) on the preceding page for Q. This can be evaluated
using matrix techniques
5
but here we will use the maximum term method (see Subsection 8.3.1 in
Chapter 8) to evaluate Equation (11.5.18).
Each term of the sum for Q(N, M, T) will be called t
n
(N
01
, N, M, T):
t
n
(N
01
, N, M, T) = g(N, M, N
01
)(e
w/2
)
N01
(11.5.19)
5
See Hill, Terrell L., Statistical Mechanics, McGraw-Hill, 1956, pp 312-314 and 323-324.
Chapter 11: Simple Lattice Statistics 132
We now nd the value of N
01
that maximizes t
n
by dierentiating Equation (11.5.19) on the preceding
page, or, what is the same thing, dierentiating its logarithm:
_
lnt
n
N
01
_
=
_
lng
N
01
_
+
w
2
= 0 (11.5.20)
After some serious algebraic pain caused by the fact that g is such a messy function of N
01
, the
results can be put into the form:
( a)(1 a)
a
2
= e
w/kT
(11.5.21)
where
=
N
M
and a =
N

01
2M
(11.5.22)
where N

01
indicates the value of N
01
that maximizes the term.
We can solve the maximization equation for N

01
as a function of N, M, and T to get:
a =
N

01
2M
=
2(1 )
b + 1
(11.5.23)
where
b = [1 4(1 )(1 e
w/kT
)]
1/2
(11.5.24)
where the sign ambiguity in the quadratic can be resolved by looking at the case w = 0.
The thermodynamics of this system are a bit messy, even with the abbreviations weve introduced.
First, we nd the chemical potential. Since
=
_
lnQ
N
_
M,T
(11.5.25)
and
lnQ = N lnqe
w
+ lnt(N, M, T, N

01
) (11.5.26)
we get
= lnqe
w
+
_
lnt
N
_
M,T,N01
+
_
lnt
N

01
_
N,M,T
_
N

01
N
_
M,T
(11.5.27)
The last term is zero because of Equation (11.5.20), leaving only
= lnqe
w
+
_
lnt
N
_
M,T,N01
(11.5.28)
Now using Equation (11.5.19) on the preceding page for t
n
and Equation (11.5.18) on the previous
page for g(N, M, N
01
) and adding another dose of tedious algebra
6
we end up with
qe
w
=
1

_
a
1 a
_
(11.5.29)
where = exp as usual.
We can get rid of a in Equation (11.5.29) by using its denition in Equation (11.5.23) to obtain:
y = e
w
=
b 1 + 2
b + 1 2
(11.5.30)
6
I suspect that some readers are getting the idea that there is more perspiration than inspiration in statistical
thermodynamics. They would be right.
Chapter 11: Simple Lattice Statistics 133
which gives as a function of and T.
A glance at some of the equations above shows that they are symmetrical in a way. That should
be expected given the symmetry between occupied and unoccupied sites. The symmetry can be
expressed in another way. It turns out that:
7
y()y(1 ) = 1 (11.5.31)
where y is given in Equation (11.5.30) on the preceding page.
It remains to nd the equation of state. This is just as tedious as nding the chemical potential.
From thermodynamics we have:
M = N A (11.5.32)
and so:
M = N + lnQ (11.5.33)
Using previous equations we can convert this to the form:
= ln
b 1 + 2
b + 1 2
+
1
M
lng(N, M, N

01
) +
2(1 )
b 1
w (11.5.34)
When N

01
is eliminated from this equation, the result simplies enormously, leaving only
= ln
b + 1
b + 1 2
(11.5.35)
Pressure-Area Isotherm for the One-Dimensional Lattice Gas
Figure 11.4: Pressure-Area Isotherm for the One-Dimensional Lattice Gas
7
More algebra.
12. Ideal Quantum Gases
12.1 Introduction
We took a brief look at ideal quantum gases in Section 9.2 on page 105 of Chapter 9. Those results
were, with the Fermi-Dirac case as the upper sign and the Bose-Einstein case as the lower sign:
ln = ()

k
ln
_
1 e

(12.1.1)
N) =

k
_
e

k
1 e

k
_
(12.1.2)
n
k
) =
e

k
1 e

k
(12.1.3)
E) =

k
n
k

k
=

k
_

k
e

k
1 e

k
_
(12.1.4)
and
p)V =

k
ln
_
1 e

(12.1.5)
Here = exp(/kT).
What wed like to do is nd expressions for the various thermodynamic quantities that are free of
. For example wed like an expression for p in terms of V and N.
The general scheme is to manipulate the equations above into some form and then eliminate by
solving one of the above equations for it and then using that solution to remove it from another.
12.2 Weakly Degenerate Ideal Fermi-Dirac Gas
The needed formulas from Section 12.1 are:
N) =

k
_
e

k
1 +e

k
_
(12.2.1)
and
p)V =

k
ln
_
1 +e

(12.2.2)
with
= e
/kT
(12.2.3)
where is the absolute activity.
134
Chapter 12: Ideal Quantum Gases 135
None of the sums in the rst two equations can be done exactly. We can, however, convert them
to integrals over the allowed energy levels by introducing the degeneracy of states () discussed in
Section 5.3 of Chapter 5. The result obtained there was:
()d =

4
_
8mL
2
h
2
_
3/2

1/2
d ((5.3.6))
Using this Equations (12.2.1) on the previous page and (12.2.2) on the preceding page can be recast
to:
N = 2
_
2m
h
2
_
3/2
V
_

0

1/2
e

1 +e

d (12.2.4)
and
pV = 2
_
2m
h
2
_
3/2
V
_

0

1/2
ln(1 +e

)d (12.2.5)
where the ) brackets have been removed for convenience.
Now these equations cant be integrated either.
1
What can be done with these equations is that they can be expanded in a power series and then
integrated term by term.
2
There are a number of assumptions involved in this and we shall have to
be a bit careful in doing this.
First we handle the expansion of Equation (12.2.4): We have
1
1 +y
= 1 y +y
2
y
3
+. . . [y[ < 1 (12.2.6)
so that with the substitution
y = e

(12.2.7)
the N equation becomes
N = 2
_
2m
h
2
_
3/2
V
_

0

1/2
e

_
1 e

+
2
e
2

3
e
3
+. . .

d (12.2.8)
An examination of Equations (12.2.1) on the preceding page and (12.2.2) on the previous page shows
that can take on any value in 0 since is always positive (or, in the case of the ground
state, zero). Thus for any xed , exp will eventually dominate and the sums will converge.
However, the expansion in Equation (12.2.6) holds only for [y[ < 1, unlike that for an exponential
which is valid for any value of its argument. So Equation (12.2.7) is limited to small values of . In
particular, as we can see from the substitution (12.2.7),
[[ < 1 (weakly degenerate Fermi gas) (12.2.9)
since then the maximum value of e

is then 1.
There is another consideration. We are taking an integral, Equation (12.2.4), expanding it in a
series, Equation (12.2.8) and then we will integrate it term by term. This does not always work.
3
unless the series is absolutely convergent. In this case, being an alternating series, it is absolutely
convergent. So we are in good shape.
4
1
Then why did I bother to change from a sum to an integral? Thats a good question isnt it?
2
And thats a good answer, dont you think?
3
Consult any advanced calculus text for proof. The series being integrated must be uniformly convergent. See, for
example, D. V. Widder, Advanced Calculus, 1947, Chapter IX.
4
So why did I bring it up? Education. I brought it up to educate!
Chapter 12: Ideal Quantum Gases 136
With the constants segregated out, Equation (12.2.8) on the previous page becomes
N = 2
_
2m
h
2
_
3/2
V

n=1
(1)
n+1
I
n
, n = 1, 2, . . . (12.2.10)
where
I
n
=
_
1
n
_
3/2

n
_

0
x
1/2
e
x
dx (12.2.11)
Here weve substituted x for n. The integral is just a gamma function. See Section 2.8 starting
on page 23. The result is:
I
n
=
_
1

_
3/2
_

n
n
3/2
_
(3/2) (12.2.12)
Making use of the fact that (3/2) is
1/2
/2 and that the thermal wavelength dened by Equa-
tion (5.4.1) on page 48 on page 48 is (h
2
/2m)
1/2
, we get our nal and surprisingly simple-looking
result:
N =
V

n=1
(1)
n+1

n
n
3/2
(12.2.13)
Now the expansion of Equation (12.2.5) on the previous page. This turns out to be surprisingly
similar to the N expansion weve just done in spite of the fact that the functions look so dierent.
Here we need to expand the logarithm. The relevant series is:
ln(1 +y) = y
1
2
y
2
+
1
3
y
3
. . . [y[ < 1 (12.2.14)
We expand the logarithm in Equation (12.2.5) on the preceding page to yield:
ln(1 +e

) = e

1
2
e
2
+
1
3
e
3
. . . (12.2.15)
which then gives us
pV = 2
_
2m
h
2
_
3/2
V
_

0

1/2
_
e

1
2

2
e
2
+
1
3

3
e
3
+. . .
_
d (12.2.16)
we rewrite this:
pV = 2
_
2m
h
2
_
3/2
V

n=1
(1)
n+1
J
n
n = 1, 2, . . . , (12.2.17)
where
J
n
=
1

3/2

n
n
5/2
_

0
x
1/2
e
x
dx (12.2.18)
where x = n. The integral is once again a gamma function and
J
n
=
1

3/2

n
n
5/2
(3/2) (12.2.19)
Plugging this into Equation (12.2.17) along with the value for (3/2) gives the nal result:
pV =
V

n=1
(1)
n+1

n
n
5/2
(12.2.20)
Chapter 12: Ideal Quantum Gases 137
The plan now is to take Equation (12.2.13) on the preceding page and revert it. That is, we will
solve it for as a power series in N. Reversion of series is discussed in some detail in Appendix 12.7
on page 152.
Then that will be used to eliminate from Equation (12.2.20) on the previous page.
The actual reversion operation is tedious.
5
The result is:
= a
1

3
V
N +a
2
_

3
V
_
2
N
2
+ +a
n
_

3
V
_
n
N
n
+ (12.2.21)
and the rst seven coecients a
1
. . . a
7
are given below to 10 signicant gures:
a
1
1.
a
2
3.535533906 10
1
a
3
5.754991027 10
2
a
4
5.763960401 10
3
a
5
4.019494152 10
4
a
6
2.098189887 10
5
a
7
8.063131085 10
7
It can be seen that the coecients get rapidly smaller, falling o eventually by a factor of about ten
for each coecient.
The coecients above were obtained by evaluating the exact but highly messy expressions:
a
1
= 1 (12.2.22)
a
2
=
1
2
3/2
(12.2.23)
a
3
=
1
4

1
3
3/2
(12.2.24)
a
4
=
1
8
+
5

2
32

5

6
36
(12.2.25)
a
5
=

5
25

7

3
24
+
3

2
16
+
95
288
(12.2.26)
a
6
=
7

10
100

19

6
72

7

3
72
+
1463

2
3456
+
7
16
(12.2.27)
a
7
=

7
49
+
8

15
225

9

5
50

6
4

1325

3
2592
+
15

2
32
+
443
384
(12.2.28)
We now plug Equation (12.2.21) into Equation (12.2.20) on the previous page. The result is:
6
pV = b
1
N +b
2
_

3
V
_
N
2
+. . . +b
n
_

3
V
_
n1
N
n
+. . . (12.2.29)
5
the computer program Derive was used to generate these results.
6
Again, this is dicult but basically trivial algebra.
Chapter 12: Ideal Quantum Gases 138
The rst seven coecients are:
b
1
= 1 (12.2.30)
b
2
=

2
8
(12.2.31)
b
3
=
1
8

2

3
27
(12.2.32)
b
4
=

6
12
+
5

2
64
+
3
32
(12.2.33)
b
5
=
4

5
125

3
6
+

2
8
+
317
1728
(12.2.34)
b
6
=

10
20

5

6
36

5

3
72
+
1687

2
6912
+
35
128
(12.2.35)
b
7
=
6

7
343
+
2

15
75

3

5
25

6
6

1019

3
3888
+
9

2
32
+
173
256
(12.2.36)
It should be again noted that these results are exact. The coecients are, rounded to 10 signicant
gures:
b
1
1.
b
2
1.767766953 10
1
b
3
3.300059820 10
3
b
4
1.112893285 10
4
b
5
3.540504095 10
6
b
6
8.386347040 10
8
b
7
3.662061883 10
10
The numbers in the displays above should be accurate to within one digit in the last place given.
They were computed to 15 digits and rounded from there. However, extreme care is needed in
computing these numbers as they are the result of huge cancellations between positive and negative
terms, most much larger than the result. For example, b
3
is 1/82

3/27. The rst term is 0.125000,


the second 0.128300 (both to six decimal places. The result is 0.003300, to six places and it can be
seen that the result has lost two signicant gures. The situation is even worse in the computation
of b
7
where the individual terms are of the order of 0.1 and the result of the order of 10
10
. Clearly
nine signicant digits have been lost in that case.
Thus the numbers in the display above should be used with great caution. And the reason for
inclusion of the exact results is now clear.
7
If we introduce the number density = N/V Equation (12.2.29) on the preceding page can be
written as a virial equation of the form:
p
kT
= +B
2
(T)
2
+B
3
(T)
3
+ (12.2.37)
which here is:
p
kT
= +

3
2
5/2

2
+
_
1
8

2
3
5/2
_

3
+ (12.2.38)
from which it can be seen that the second virial coecient of a Fermi-Dirac gas is always positive.
The remainder of the series converges (for appropriately small values of ) and is an alternating
7
At some point in the future I will reevaluate the numerical values of the bs to ensure greater precision.
Chapter 12: Ideal Quantum Gases 139
series. Thus the error is less than the rst neglected term, which is less than the terms included.
Thus the pressure in a weakly degenerate Fermi-Dirac gas is always higher than in the corresponding
classical ideal gas.
This can be understood from the Pauli Exclusion principle. There are, of course, many more quantum
states than particles. But still, the fact that no two particles can be in the same state forces some
to be in states higher than they otherwise would increasing the energy and hence the pressure.
All thermodynamic quantities can be obtained by dierentiation of the grand canonical partition
function (, V, ), which is:
pV = ln (, V, ) (12.2.39)
We already know pV from Equation (12.2.20) on page 136. Equation (12.2.29) on page 137 cannot
be used to determine thermodynamic properties since the variables there are , V , and N, which
are not the natural variables for the grand partition function. So we must use Equation (12.2.20)
on page 136 instead.
This is reasonable since in determining the other thermodynamic functions it is assumed that we
know , V , and = (1/) ln.
We can, however, obtain the energy E from Equation (12.2.4) on page 135. The derivation is much
like what we have done above. The energy is given by:
E =

k
e

k
1 +e

k
(12.2.40)
Using the degeneracy () from Equation (5.3.6) on page 47 and expanding the denominator as in
Equation (12.2.8) on page 135 we get:
E = 2
_
2m
h
2
_
3/2
V
_

0

3/2
e

_
1 e

+
2
e
2

3
e
3
+. . .

d (12.2.41)
Note the presence of the factor
3/2
instead of
1/2
as in Equation (12.2.8) on page 135.
As before we write this as:
E = 2
_
2m
h
2
_
3/2
V

n=0
(1)
n+1
I

n
(12.2.42)
where
I

n
=
n
_

0

3/2
e
n
d (12.2.43)
Making the substitution x = n, we get
I

n
=
_
1
n
_
5/3

n
_

0
x
3/2
e
x
dx =
_
1
n
_
5/3

n
(5/2) (12.2.44)
Since (5/2) = 3

/4, when we plug this into Equation (12.2.42) we get:


E = 2
_
2m
h
2
_
3/2
3

4
V

n=1
(1)
n+1

n
n
5/2
(12.2.45)
Putting the bits together and introducing once again,
8
we get
E =
3
2
V

n=1
(1)
n+1

n
n
5/2
(12.2.46)
8
See Equation (5.4.1) on page 48 on page 48 for the denition of .
Chapter 12: Ideal Quantum Gases 140
We could now go eliminate using Equation (12.2.21) on page 137, but we dont have to! If
we compare Equation (12.2.46) on the previous page to Equation (12.2.20) on page 136 we see
immediately that:
E =
3
2
pV (12.2.47)
so that once p is known, so also E is known. Further, remember that p is always greater than the
Boltzmann ideal gas p, so the energy of the weakly degenerate Fermi-Dirac gas is also always larger
than that of the classical ideal gas.
12.3 Strongly Degenerate Ideal Fermi-Dirac Gas
12.3.1 Absolute Zero
The strongly degenerate Fermi-Dirac gas isnt ideal at all. Here strong degeneracy means condi-
tions under which quantum phenomenon are important, which in turn means low temperatures (for
suitable values of low.)
We begin with Equation (12.1.3) on page 134 for Fermi particles:
n
k
) =
e

k
1 +e

k
((12.1.3))
and write as e

, where is the chemical potential.


9
Then Equation (12.1.3) on page 134 becomes
n
k
) =
1
1 +e
(
k
)
(12.3.1)
Since there can be only a single Fermi-Dirac particle in a given energy state,
10
n
k
) is, in eect, the
probability that there is a particle in state k.
At high temperatures this goes over to the weakly degenerate Fermi-Dirac gas discussed in Section
12.2. But as T 0 something strange happens. First, note that itself is a function of temperature.
We will denote at absolute zero as
0
.
Second, the sign of the exponent on the exponential in the denominator at absolute zero depends
on whether
k
>
0
or not. If it is, then the exponent is a positive number that goes to innity as
T 0. Otherwise it goes to zero as T 0.
As a result, at absolute zero n
k
) obeys
n
k
) =
_
1 if
k
<
0
0 otherwise
(12.3.2)
and we see that all levels up to
0
in energy are occupied and all levels above
0
are empty. We
are used to exactly this behavior for electrons in atoms and molecules as well as electrons in metals,
even at normal temperatures. This implies that, for example, 300 K, is cold to an electron gas.
Using the degeneracy () from Equation (5.3.6) on page 47
()d =

4
_
8mL
2
h
2
_
3/2

1/2
d ((5.3.6))
9
This is, of course, the denition of .
10
Actually, there can be two particles in an energy state if we consider spin, which we must. This will be taken
care of later.
Chapter 12: Ideal Quantum Gases 141
allows us to write, at absolute zero:
N = 4
_
2m
h
2
_
3/2
V
_
0
0

1/2
d (12.3.3)
where a factor of two has been added to take spin into account. The integral is trivial and the result
is:
N =
8
3
_
2m
h
2
_
3/2
V
3/2
0
(12.3.4)
which can be rearranged to give:

0
=
h
2
2m
_
3
8
_
2/3
_
N
V
_
2/3
(12.3.5)
where (N/V ) is the number density of the material.
In the study of metals where to a rst approximation the electrons are treated as uncharged,
11

0
is called the Fermi energy and
0
/k the Fermi temperature. The Fermi temperature denotes the
temperature below which the system is eectively at absolute zero. For most metals the Fermi
temperature T
F
is a few thousand degrees.
At absolute zero the energy can be written as:
E
0
= 4
_
2m
h
2
_
3/2
V
_
0
0

3/2
d =
3
5
N
0
(12.3.6)
This was written as E
0
to emphasize that this is a result strictly true only at absolute zero. It
represents a zero point energy just like the energy h/2 for a quantum oscillator. It is the energy
an electron gas would retain even if cooled to absolute zero.
The pressure of a Fermi-Dirac gas at absolute zero can be expected to be non-zero because there is
a zero point energy and hence a zero point pressure.
We had previously obtained Equation (12.2.5) on page 135 reproduced here with the additional spin
factor of two:
p)V = 4
_
2m
h
2
_
3/2
V
_
0
0

1/2
ln(1 +e

)d (12.3.7)
which is
pV = 4
_
2m
h
2
_
3/2
V
_
0
0

1/2
ln(1 +e
(0)
)d (12.3.8)
In the range of the integral (
0
) is positive since
0
> . And since is large, the exponential is
much greater than 1. Thus the 1 can be dropped and the logarithm taken to give:
pV = 4
_
2m
h
2
_
3/2
V
_
0
0

1/2
(
0
)d (12.3.9)
The integral is simple. The result is:
p
0
=
2
5
_
N
V
_
1/3
h
2
2m
_
3
8
_
2/3
=
2
5
N
0
V
(12.3.10)
which is usually of the order of about a million atmospheres...
11
Because there is a positive charge on a metal atom for each negative charge on the electron. Thus while the
electrons move in a periodic rather than zero potential, the movement is still free enough to be considered ideal.
Chapter 12: Ideal Quantum Gases 142
12.3.2 The Merely Cold Ideal Fermi-Dirac Gas
The development here follows that of McQuarrie. The reader should also compare Huang.
12
We
will take the zero temperature results and expand them in a series involving a small parameter.
First we reproduce the equations for N and E:
N = 4
_
2m
h
2
_
3/2
V
_
0
0

1/2
d ((12.3.3))
and
E
0
= 4
_
2m
h
2
_
3/2
V
_
0
0

3/2
d ((12.3.6))
We obtained these by assuming that the temperature was absolute zero. That worked because the
probability of occupation of a state, n
k
was basically a step function, one that is one from 0 to some
point, say
0
and then zero above that. Call that function f() where
f() = n
k
) =
1
1 +e
(
k
)
((12.3.1))
which is still basically Equation (12.3.1) on page 140.
With this in mind we can rewrite Equations (12.3.3) on the preceding page and (12.3.6) on the
previous page as:
N = 4
_
2m
h
2
_
3/2
V
_

0
f()
1/2
d (12.3.11)
and
E = 4
_
2m
h
2
_
3/2
V
_

0
f()
3/2
d (12.3.12)
Both of these are of the form:
I = 4
_
2m
h
2
_
3/2
V
_

0
f()h()d (12.3.13)
where f() is given by Equation (12.3.1) on page 140 and h() is either
1/2
for I = N or
3/2
for
I = E.
So far this is nothing unusual,
13
but this is where the trick comes in. We integrate Equation (12.3.13)
by parts to get:
I = 4
_
2m
h
2
_
3/2
V
_
f()H()

_

0
f

()H()d
_
(12.3.14)
where H() is the integral of h():
H() =
_

0
h()d (12.3.15)
and f

() is the derivative of f(). The rst term in square brackets in Equation (12.3.14) is zero
because H() is zero when is zero and f() is zero when is innity.
The result is then:
I = 4
_
2m
h
2
_
3/2
V
_

0
f

()H()d (12.3.16)
12
D.A. McQuarrie, Statistical Mechanics, 1973, Chapter 10; K. Huang, Statistical Mechanics, Second Edition, 1987,
Chapter 11.
13
If you consider my taking simple formulas and making them more complex usual.
Chapter 12: Ideal Quantum Gases 143
What weve gained from this can be seen by looking at a graph of f(). At absolute zero it is at
and equal to 1 up to =
0
where it suddenly falls to 0 and stays there. Its derivative is then zero
from 0 to
0
and zero from
0
to innity. It is not zero at
0
. There it is a point at innity.
14
But when we are not at absolute zero some of the fermions can move to higher energy levels. They
cant move too high because we still assume that the temperature is very low. But the step function
behavior of f() becomes modied. The sharp upper corner at =
0
becomes somewhat rounded
and the foot of the graph at that same point also becomes rounded. The derivative is no longer
non-zero at a single point, but now looks something like a Gaussian, a sharply peaked function with
the peak at = . Note that this is and not
0
because is also a function of temperature and
will have a slightly dierent value once T is no longer zero.
As a result the integrand in Equation (12.3.16) on the previous page can be seen to be zero everywhere
except in the small region around = . So we can expand the integrand in an innite series around
that point, which is now what we are going to do.
We expand H() in a Taylor series around = . This gives us:
H() = H() +
_
H

_
( ) +
1
2
_

2
H

2
_
( )
2
+ (12.3.17)
Note that the derivatives are evaluated at = and are no longer functions of .
The integral in Equation (12.3.16) on the preceding page is now:
I = 4
_
2m
h
2
_
3/2
V
_
H()L
0
+
_
H

_
L
1
+
1
2
_

2
H

2
_
L
2
+
_
(12.3.18)
where
L
n
=
_

0
( )
n
f

()d (12.3.19)
The rst term, L
0
is 1 since f

() is, as noted above, a delta function. For the others we can set
the lower limit in Equation (12.3.19) to since the contribution of the part from to 0 is
negligible. Thus:
L
n
=
_

( )
n
f

()d n = 1, 2, . . . , (12.3.20)
If we now let x = ( ) we have
L
n
=
1

n
_

x
n
e
x
(1 +e
x
)
2
dx (12.3.21)
Note that except for the factor of x
n
, the integrand is symmetric around x = 0. Thus for odd n
L
n
= 0 because the contribution from x negative equals the contribution from x positive. So L
n
exists only for even n.
These integrals can be represented in simple form. For instance
_

x
2
e
x
(1 +e
x
)
2
=

2
3
(12.3.22)
14
This is called a delta function, (x), which is zero everywhere except at x = 0 where it is innite. It also has the
property that
R

0
(x)dx = 1, a property we will use later.
Chapter 12: Ideal Quantum Gases 144
Heres a listing of the rst few values of L
n
:
L
0
= 1 (12.3.23)
L
2
=

2
3
2
(12.3.24)
L
4
=
7
4
15
4
(12.3.25)
L
6
=
31
6
21
6
(12.3.26)
L
8
=
127
8
15
8
(12.3.27)
L
10
=
2555
10
33
10
(12.3.28)
Now H() is dened by Equation (12.3.15) on page 142. That integral is:
H() =
2
a + 2

1+a/2
(12.3.29)
where a is either 1 if we are dealing with equations for N or 3 if we are dealing with equations for
E. The general equation for H
(n)
, the nth derivative of H is:
H
(n)
() =
_
a
2
__
a
2
1
__
a
2
2
_

_
a
2
n + 2)
_

(a/2)n+1
n 2 (12.3.30)
With all these bits and pieces in hand we substitute our results into Equation (12.3.18) on the
previous page, not forgetting that we are evaluating the derivatives H
(n)
() at = to obtain:
N =
8
3
_
2m
h
2
_
3/2
V
3/2
_
1 +

2
8
()
2
+
_
(12.3.31)
A similar equation applies to E.
Using Equation (12.3.5) on page 141 for
0
allows us to write:

0
=
_
1 +

2
8
()
2
+
_
2/3
=
_
1 +

2
12
()
2
+
_
(12.3.32)
where the last expression above comes from raising the middle expression to the 2/3 power. (See
Appendix 12.7 for details.)
This is really an expression giving
0
/ as a function of ()
2
. If we revert this series we get /
0
as a power series in = 1/:

0
= 1

2
12

2
+ (12.3.33)
Since for many metals is of the order of 0.01 we see that changes very slowly with temperature.
As a result little error is made by using
0
instead of throughout the range of temperatures at
which a metal is solid.
The energy can be found in the same way, except that the parameter a is now 3/2. The result is:
E = E
0
_
1 +
5
2
12

2
+
_
(12.3.34)
Chapter 12: Ideal Quantum Gases 145
The heat capacity of the conduction electrons in a metal is then:
C
V
=

2
NkT
2(
0
/k)
=

2
2
Nk
_
T
T
F
_
(12.3.35)
where T
F
is the Fermi Temperature. Evaluation of this shows that the heat capacity is about 10
4
T
Joules/K for metals.
12.4 The Weakly Degenerate Bose-Einstein Gas
The equations involved are once again given in Section 12.1 on page 134. Specialized to the case at
hand they are:
p)V = ln =

k
ln
_
1 e

(12.4.1)
N) =

k
_
e

k
1 e

k
_
(12.4.2)
and for the average occupation of a particular state k:
n
k
) =
e

k
1 e

k
(12.4.3)
The only dierence between these equations and those for the Fermi-Dirac weak degeneracy case
in Section 12.2 is the occurrence of a minus sign instead of a plus sign in certain places. That will
make no dierence here, but it will when we discuss the strongly degenerate case.
The reason is that with a minus sign each of the equations above contain the term 1 e

k
which
can become zero under certain circumstances. One does not like taking the logarithm of zero or
having a zero in the denominator of an expression.
15
Beyond this the development follows exactly the same path as in Section 12.2 except that the series
which there all have positive terms, here have terms with alternating signs.
As a result, only the outline of the development will be given.
The degeneracy of states with energy is:
16
()d =

4
_
8mL
2
h
2
_
3/2

1/2
d (12.4.4)
and the equations above can be converted to integrals:
N = 2
_
2m
h
2
_
3/2
V
_

0

1/2
e

1 e

d (12.4.5)
and
pV = 2
_
2m
h
2
_
3/2
V
_

0

1/2
ln(1 e

)d (12.4.6)
where the ) brackets have been removed for convenience.
These equations are then expanded in a power series, except now we have 1/(1y) = 1+y+y
2
+
and ln(1 y) = y + (1/2)y
2
+ (1/3)y
3
+ with y = exp. After integration and some
substitutions we have:
N =
V

n=1

n
n
3/2
(12.4.7)
15
Or perhaps one does, if you like to see things blow up in ugly ways.
16
I will not repeat as before each time. Just assume that the phrase is there.
Chapter 12: Ideal Quantum Gases 146
pV =
V

n=1

n
n
5/2
(12.4.8)
The series Equation (12.4.7) on the preceding page is now reverted to give:
= a
1

3
V
N a
2
_

3
V
_
2
N
2
+ + (1)
n+1
a
n
_

3
V
_
n
N
n
+ (12.4.9)
where the a
n
have the values given for Equation (12.2.21) on page 137.
Equation (12.4.9) is then plugged into Equation (12.4.8) giving:
pV = b
1
N b
2
_

3
V
_
N
2
+ + (1)
n+1
b
n
_

3
V
_
n1
N
n
+ (12.4.10)
where the coecients b
n
have the values given for Equation (12.2.29) on page 137.
If we now introduce the number density = N/V and write Equation (12.4.10) as a virial equation:
p
kT
= +B
2
(T)
2
+B
3
(T)
3
+ (12.4.11)
we then have:
p
kT
=

3
2
5/2

2
+
_
1
8

2
3
5/2
_

3
+ (12.4.12)
from which it can be seen that an ideal Bose-Einstein gas, below the region in which it acts like
a regular ideal gas, has a pressure that is less than the ideal gas. This is the opposite to the
Fermi-Dirac case.
It is interesting that if you average the Fermi-Dirac case, Equation (12.2.38) on page 138 and the
Bose-Einstein case, Equation (12.4.12) what you get is:
p
kT
= +
_
1
8

2
3
5/2
_

3
+ (12.4.13)
where all the even numbered virial coecients cancel, but the odd ones do not.
The other thermodynamic properties, as in the Fermi-Dirac case, follow in a similar way.
12.5 The Strongly Degenerate Bose-Einstein Gas
It is here that the problem with the rst term in Equations (12.4.1) on the preceding page (12.4.3)
on the previous page occur. If we assume that the ground state energy
0
is zero, then exp (
0
)
is 1 and exp(
0
) = . Thus we have, for the rst term in the sums, either ln(1 ) in
Equation (12.4.1) on the preceding page or /(1 ) in Equation (12.4.2) on the previous page.
And it is clear that a problem can occur as 1, since the rst term will no longer be small.
This is best seen in Equation (12.4.3) on the preceding page where if 1, n
0
) goes to innity!
In the weakly degenerate cases (and the Boltzmann case) we expect n
k
) to be less than 1 always,
this tells us that something strange is going on.
We deal with this by splitting o the rst term in Equations (12.4.7) on the previous page and (12.4.8)
and rewriting the series slightly:
=
N
V
=
1

n=1

n
n
3/2
+

V (1 )
(12.5.1)
Chapter 12: Ideal Quantum Gases 147
p
kT
=
1

n=1

n
n
5/2

1
V
ln(1 ) (12.5.2)
The average number of particles in the ground state is
n
0
) =

1
(12.5.3)
so to have sensible results we must have 0 < 1.
Since we already know that we can have more than one particle in an energy level if the particles are
bosons, we can estimate the worst case by assuming that all the particles are in the ground state
n
0
). Then n
0
) = N and
N =

1
(12.5.4)
and so
=
N
N + 1
=
1
1 + 1/N
1
1
N
(12.5.5)
Thus, with N of the order of 10
23
, the maximum value for is something like 1 10
23
.
The sum in Equations (12.5.1) on the previous page and (12.5.2) are of some interest. They are ex-
amples of a polylogarithm. There is probably more than you want to know about them in Section 12.8
on page 155. We will refer to them using a sort of obvious notation:
g
s
() =

n=1

n
n
s
(12.5.6)
where here s = 3/2 or 5/2.
As the denition of g
3/2
() shows, for very small values of g
3/2
() is linear in with a slope of
1. As increases, the function begins to increase in value faster than linear and requires that more
and more terms be kept to estimate it to any desired accuracy.
When = 1 the function is given by:
(s) =

n=1
1
n
s
(12.5.7)
which is called the zeta function or, sometimes, the Riemannian zeta function. This is also discussed
in some detail (the function is important and comes up in various places) in Section 12.8 on page 155.
What is important to us here is that at = 1 we have (3/2) = 2.6123753507 , so that this is the
value taken by g
3/2
(1).
In principle,to nd the equation of state, we rst solve Equation (12.5.1) on the previous page for
as a function of rho and then substitute this into Equation (12.5.2) to eliminate .
Sadly, this cant be done analytically and must be approached graphically, numerically, or by more
subtle arguments.
We choose the latter approach. We rewrite Equation (12.5.1) on the previous page slightly:

3
= g
3/2
() +
_

3
V
_

1
(12.5.8)
and consider the magnitude of the terms involved. For helium-4 at 4.2 K, = 4.3 10
10
m.
Then
3
= 7.7 10
29
m
3
. The volume will be of the order, at worst, of a milliliter or so. Thus
V 10
6
m
3
. So
/
V is then approximately 7 10
23
or smaller.
Chapter 12: Ideal Quantum Gases 148
Knowing that the maximum value of g
3/2
() is about 2.6, then the rst term on the right in Equa-
tion (12.5.8) on the preceding page is about 2.6 while the second is 7 10
23
/(1 ).
So most of the time the second term is totally ignorable. When that is true, we have the weakly
degenerate case considered in Section 12.4 on page 145 above.
We need consider the second term when /(1 ) gets to be roughly 10
20
or so. This will happen
when = 1 10
20
.
As far as g
3/2
() is concerned, this value of is the same as g
3/2
(1), the dierence being in the 20th
decimal place.
So we have the interesting situation that, as far as g
3/2
() is concerned, = 1 when anything
interesting starts to happen in the ground state.
We now rewrite Equation (12.5.1) on page 146 as:
N =
_
V

3
_
g
3/2
(1) +n
0
(12.5.9)
where n
0
, the number of bosons in the ground state, has been written for /(1 ). This can be
rearranged to be:
n
0
N
= 1
g
3/2
(1)

3
(12.5.10)
We extract the T dependence of :

3
=
_
h
2
2mkT
_
3
/2 =
A
T
3/2
(12.5.11)
which denes the constant A. Then Equation (12.5.10) can be written:
n
0
N
= 1
_
g
3/2
(1)
A
_
T
3/2
(12.5.12)
If we now dene T
0
by:
T
3/2
=
A
g
3/2
(1)
=
_
h
2
2mk)
_
3/2

g
3/2
(1)
(12.5.13)
or, more simply, T
0
is the temperature at which
3
= g
3/2
(1).
The results for the Bose-Einstein ideal gas at xed density can then be written:
n
0
N
=
_
0 T > T
0
1 (T/T
0
)
3/2
T < T
0
(12.5.14)
On the other hand, if the temperature is xed and is varied, it can be shown by an almost identical
argument starting with Equation (12.5.10) that
n
0
N
=
_
0 <
0
1
0
/ >
0
(12.5.15)
We now turn to a slightly rewritten Equation (12.5.2) on the previous page:
pV
kT
=
_
V

3
_
g
5/2
() ln(1 ) (12.5.16)
Chapter 12: Ideal Quantum Gases 149
Once again we look at the relative sizes of the terms. The left-hand side, pV/kT, is, except for k, of
the order of 1 or, more exactly nR, where R is the molar gas constant and n the number of moles.
This is very roughly of the order of 1. The presence of Boltzmanns constant k makes the left-hand
side of the order of 10
23
.
As discussed above,
3
/V is of the order of 7 10
23
, so V/
3
is of the order of 10
22
. This is quite
compatible with the left-hand side.
What of ln(1 ). As discussed above, we do not get into the highly degenerate range until (1 )
is of the order of 10
20
. However, we are now taking a logarithm. So with that value for 1 , the
logarithm is only about 46. This is totally negligible compared to 10
2
2.
Thus in the case of Equation (12.5.16) on the preceding page, the last term can be ignored!
Does this mean that pV/kT is the same as for the weakly degenerate case? No, not at all.
It is true that the weakly degenerate case applies to the point where is essentially 1. This happens
when T = T
0
. But then g
5/2
() becomes g
5/2
(1) and is no longer a function of at all.
So we have:
p
kT
=
_
g
5/2
()/
3
T > T
0
q
5/2
(1)/
3
T < T
0
(12.5.17)
The rst thing to note is that the volume dependence seems to have vanished. It hasnt really. It
can be seen in Equation (12.2.21) on page 137 on page 137. So it is contained in and exists in the
case T > T
0
.
But in the degenerate case, there is no longer any real dependence. Thus, while the pressure still
depends on the temperature (through ), it no longer depends on the volume V . Put another way,
below
0
, the pressure remains constant as the volume continues to shrink.
17
12.6 The Photon Gas

Photons are interesting. Among their interesting properties are the following:

1. Photons have one unit of spin and are hence bosons.
2. Photons have a spin degeneracy of two, not three. That is because the vibration represented by a photon has only two vibrational modes, not three. The translational mode does not exist.[18]
3. Photons do not interact with each other.[19]
4. Photons, unlike normal particles, have zero rest mass.
5. The number of photons present in a system is not conserved.
6. Photons of one frequency can be re-emitted at another frequency.

[17] Indeed, as the volume decreases to zero at constant N, all the molecules go into the ground state and take up no volume at all! This amazing result is an artifact of our model, where we assumed that the molecules themselves have no volume.
[18] This is a relativistic effect. One can think of a photon as moving so fast that no translational mode can propagate and only the two lateral modes exist.
[19] This is true under normal conditions. Under conditions of high photon density and with a material object present, non-linear behavior can be found.
We imagine a gas of photons inside a rigid container with adiabatic walls which are perfect mirrors on the inside. Since photons do not interact with each other, a perfect black body of negligible volume is assumed to be inside the container.[20] Thus the total energy contained inside the system is constant.

We imagine an ensemble of such systems, each of fixed $E$ and $V$. We cannot specify $N$ except as an average.

A consideration of the number of standing waves inside a cubical box of length $L$ shows that the number having energies between $\varepsilon$ and $\varepsilon + d\varepsilon$ is:
\[
\omega(\varepsilon)\,d\varepsilon = \frac{V \varepsilon^2\, d\varepsilon}{\pi^2 c^3 \hbar^3} \tag{12.6.1}
\]
The total energy of the system is:
\[
E = \sum_k n_k \varepsilon_k \tag{12.6.2}
\]
The partition function $Q(V, T)$ is then:
\[
Q(V,T) = \sum_{\{n_k\}} e^{-\beta E(\{n_k\})} = \sum_{\{n_k\}} e^{-\beta \sum_k n_k \varepsilon_k} \tag{12.6.3}
\]
where $\{n_k\}$ stands for a set of $n_k$'s making up the total energy. There are, of course, many sets that satisfy that constraint. We've seen this before in Chapter 7 in Section 7.8. The problem there was that the number of particles was fixed. In addition to the constraint Equation (12.6.2) we had the constraint that
\[
N = \sum_k n_k
\]
had to be satisfied. There is no such constraint here. So Equation (12.6.3) becomes simply:
\[
Q(V,T) = \prod_k \left( \sum_{n=0}^{\infty} e^{-\beta n \varepsilon_k} \right) = \prod_k \frac{1}{1 - e^{-\beta \varepsilon_k}} \tag{12.6.4}
\]
and then
\[
\ln Q = -\sum_k \ln\!\left(1 - e^{-\beta \varepsilon_k}\right) = -\sum_{\varepsilon} \ln\!\left(1 - e^{-\beta \varepsilon}\right) \tag{12.6.5}
\]
We can't evaluate the sum directly, but we can do the equivalent integral with the aid of Equation (12.6.1). We get:
\[
\ln Q = -\frac{V}{\pi^2 c^3 \hbar^3} \int_0^{\infty} \varepsilon^2 \ln\!\left(1 - e^{-\beta \varepsilon}\right) d\varepsilon \tag{12.6.6}
\]
which, like its brother, the ideal Bose-Einstein gas, turns out to involve a zeta function:
\[
\ln Q = \frac{V}{\pi^2 c^3 \hbar^3 \beta^3}\; 2 \sum_{n=1}^{\infty} \frac{1}{n^4} = \frac{2V}{\pi^2 (\beta \hbar c)^3}\, \zeta(4) \tag{12.6.7}
\]
And $\zeta(4) = \pi^4/90$.

We are now basically done. The energy, which is fixed, is:
\[
E = \frac{\pi^2 V (kT)^4}{15 (\hbar c)^3} \tag{12.6.8}
\]
which verifies the well-known dependence of the energy on the fourth power of the temperature.

[20] Alternatively, one can consider the walls to be perfect black bodies.
The heat capacity of the photon gas is then simply:
\[
C_V = \frac{4\pi^2 V k^4 T^3}{15 (\hbar c)^3} \tag{12.6.9}
\]
The pressure of the photon gas is:
\[
p = kT \left( \frac{\partial \ln Q}{\partial V} \right)_T = \frac{2 (kT)^4}{\pi^2 (\hbar c)^3}\, \zeta(4) = \frac{\pi^2 (kT)^4}{45 (\hbar c)^3} \tag{12.6.10}
\]
which, curiously, is independent of the volume. The pressure is very small, but easily measurable.

The entropy can be computed from the usual formulas. The result is:
\[
S = \frac{4\pi^2 V k (kT)^3}{45 (\hbar c)^3} \tag{12.6.11}
\]
which goes to zero as $T \to 0$, as it should for a good quantum gas.

Curiously, if we calculate $G = \mu\langle N \rangle$ from
\[
\mu \langle N \rangle = E - TS + pV
\]
we find that $\mu\langle N \rangle = 0$, and, since $\langle N \rangle$ certainly isn't zero, we then have that:
\[
\mu = 0 \tag{12.6.12}
\]
which is probably the simplest equation in this entire work!
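As a quick numerical check on Equations (12.6.8) and (12.6.10), here is a small Python sketch (ours; the SI constants are supplied by us) that verifies the $T^4$ law and shows how small the photon-gas pressure is at ordinary temperatures:

```python
from math import pi

kB   = 1.381e-23   # Boltzmann constant, J/K
hbar = 1.055e-34   # reduced Planck constant, J s
c    = 2.998e8     # speed of light, m/s

def energy_density(T):
    """E/V of the photon gas, Equation (12.6.8)."""
    return pi**2 * (kB * T)**4 / (15.0 * (hbar * c)**3)

def pressure(T):
    """Photon-gas pressure, Equation (12.6.10); note that p = (E/V)/3."""
    return energy_density(T) / 3.0

print(energy_density(600.0) / energy_density(300.0))  # 16.0: the T^4 law
print(pressure(300.0))     # ~2e-6 Pa at room temperature: small but real
```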
12.7 Appendix: Operations with Series

Infinite series occur in statistical thermodynamics with some frequency. At times one needs to manipulate such series in various ways. Here are some collected formulas with hints on how they were derived.

12.7.1 Powers of Series

Let us take[21]
\[
y = 1 + a_1 x + a_2 x^2 + a_3 x^3 + \ldots \tag{12.7.1}
\]
as our standard series. Any power series can be put into this form by simply dividing through by the constant term and renaming the variables.
The square of this series is obtained by simply writing it as
\[
y^2 = (1 + a_1 x + a_2 x^2 + a_3 x^3 + \ldots)(1 + a_1 x + a_2 x^2 + a_3 x^3 + \ldots)
\]
and then multiplying to get:
\[
y^2 = 1 + 2a_1 x + [a_1^2 + 2a_2] x^2 + [2a_1 a_2 + 2a_3] x^3 + \ldots
\]
so that if we write the result as:
\[
y^2 = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \ldots \tag{12.7.2}
\]
we have the following results:

    Coefficient   Value
    c_0           1
    c_1           2 a_1
    c_2           a_1^2 + 2 a_2
    c_3           2 a_1 a_2 + 2 a_3
    c_4           2 a_1 a_3 + a_2^2 + 2 a_4
If we cube Equation (12.7.1) we get:
\[
y^3 = 1 + 3a_1 x + (3a_1^2 + 3a_2) x^2 + (a_1^3 + 6a_1 a_2 + 3a_3) x^3 + 3(a_1^2 a_2 + 2a_1 a_3 + a_2^2 + a_4) x^4 + \ldots
\]
or, writing the result in the form:
\[
y^3 = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \ldots \tag{12.7.3}
\]
we have:

    Coefficient   Value
    c_0           1
    c_1           3 a_1
    c_2           3 a_1^2 + 3 a_2
    c_3           a_1^3 + 6 a_1 a_2 + 3 a_3
    c_4           3 a_1^2 a_2 + 6 a_1 a_3 + 3 a_2^2 + 3 a_4

[21] The material in this section was taken from Abramowitz and Stegun, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, National Bureau of Standards, 1964, Section 3.6. Later versions of this exceptionally valuable book are available from Dover Publications.
In general the nth power of Equation (12.7.1) on the preceding page is
\[
y^n = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \ldots \tag{12.7.4}
\]
and

    Coefficient   Value
    c_0           1
    c_1           n a_1
    c_2           n(n-1) a_1^2/2 + n a_2
    c_3           n(n-1)(n-2) a_1^3/6 + n(n-1) a_1 a_2 + n a_3
    c_4           n(n-1)(n-2)(n-3) a_1^4/24 + n(n-1)(n-2) a_1^2 a_2/2
                    + n(n-1) a_2^2/2 + n(n-1) a_1 a_3 + n a_4
Also of use is the product of two infinite series:
\[
c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \ldots = (1 + a_1 x + a_2 x^2 + a_3 x^3 + \ldots)\,(1 + b_1 x + b_2 x^2 + b_3 x^3 + \ldots) \tag{12.7.5}
\]
which produces:

    Coefficient   Value
    c_0           1
    c_1           b_1 + a_1
    c_2           b_2 + a_1 b_1 + a_2
    c_3           b_3 + a_1 b_2 + a_2 b_1 + a_3
    c_4           b_4 + a_1 b_3 + a_2 b_2 + a_3 b_1 + a_4

obtained by simply multiplying out the two series in Equation (12.7.5).
The quotient of two series
\[
c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \ldots = \frac{1 + a_1 x + a_2 x^2 + a_3 x^3 + \ldots}{1 + b_1 x + b_2 x^2 + b_3 x^3 + \ldots} \tag{12.7.6}
\]
is obtained by multiplying through by the denominator series of Equation (12.7.6) and then using the product result above. This gives:

    Coefficient   Value
    c_0           1
    c_1           a_1 - b_1
    c_2           a_2 - (b_1 c_1 + b_2)
    c_3           a_3 - (b_1 c_2 + b_2 c_1 + b_3)
    c_4           a_4 - (b_1 c_3 + b_2 c_2 + b_3 c_1 + b_4)
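The tables above translate directly into code. The following Python sketch (our own; the function names are made up for the illustration) forms truncated products, powers, and quotients of series given as coefficient lists:

```python
def series_mul(a, b, nmax):
    """Product of two series a[0] + a[1] x + ..., truncated at order nmax;
    this is Equation (12.7.5) spelled out term by term."""
    get = lambda s, i: s[i] if i < len(s) else 0.0
    return [sum(get(a, i) * get(b, n - i) for i in range(n + 1))
            for n in range(nmax + 1)]

def series_pow(a, n, nmax):
    """nth power of a series by repeated multiplication, Equation (12.7.4)."""
    out = [1.0] + [0.0] * nmax
    for _ in range(n):
        out = series_mul(out, a, nmax)
    return out

def series_div(a, b, nmax):
    """Quotient of two series with b[0] = 1, Equation (12.7.6):
    c_n = a_n - (b_1 c_{n-1} + ... + b_n c_0), exactly as in the table."""
    get = lambda s, i: s[i] if i < len(s) else 0.0
    c = [get(a, 0)]
    for n in range(1, nmax + 1):
        c.append(get(a, n) - sum(get(b, j) * c[n - j] for j in range(1, n + 1)))
    return c

y = [1.0, 2.0, 3.0, 4.0]      # 1 + 2x + 3x^2 + 4x^3
print(series_pow(y, 2, 3))    # [1.0, 4.0, 10.0, 20.0]: c_1 = 2 a_1, etc.
print(series_div(y, y, 3))    # [1.0, 0.0, 0.0, 0.0], as it must be
```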
12.7.2 Reversion of Series

At times one needs to revert a series. That is, given the series:
\[
y = ax + bx^2 + cx^3 + dx^4 + ex^5 + fx^6 + gx^7 + \ldots \tag{12.7.7}
\]
one wants to find how x depends on y. That is, given Equation (12.7.7) on the previous page, one wants to find
\[
x = Ay + By^2 + Cy^3 + Dy^4 + Ey^5 + Fy^6 + Gy^7 + \ldots \tag{12.7.8}
\]
where A, B, etc., are known functions of a, b, etc. Note the absence of a constant term in both Equations (12.7.7) on the preceding page and (12.7.8).

Surprisingly, this can be done in many cases. Each of the coefficients A, B, etc., turns out to be a function of only a finite number of a, b, etc.

Working this out is a bit complex if more than two or perhaps three terms are wanted. The method is simple enough. One simply takes Equation (12.7.7) on the preceding page and plugs it in for y wherever y occurs in Equation (12.7.8). Thus a version of this process would look like:
\[
x = A(ax + bx^2 + cx^3 + \ldots) + B(ax + bx^2 + cx^3 + \ldots)^2 + C(ax + bx^2 + cx^3 + \ldots)^3 + \ldots
\]
One then multiplies this out and collects like powers of x.

The result is an identity. The left-hand side, x, equals some coefficient times x, some other coefficient times $x^2$, and so on. The coefficient of x on the right must therefore be 1, since the left-hand side is exactly x; the coefficients of all other powers must vanish.

It can easily be seen that the first coefficient is aA, hence A = 1/a. The second coefficient (that of $x^2$) is $Ab + Ba^2$, which must be zero, so $B = -b/a^3$. To solve this we must already know A, but we do know it from the first step.

At each step along the way it turns out that the new coefficient that we are looking for is given in terms of coefficients already known. The results are lengthy:
\[
A = 1/a \tag{12.7.9}
\]
\[
B = -b/a^3 \tag{12.7.10}
\]
\[
C = (2b^2 - ac)/a^5 \tag{12.7.11}
\]
\[
D = (5abc - a^2 d - 5b^3)/a^7 \tag{12.7.12}
\]
\[
E = (6a^2 bd + 3a^2 c^2 + 14b^4 - a^3 e - 21ab^2 c)/a^9 \tag{12.7.13}
\]
\[
F = (7a^3 be + 7a^3 cd + 84ab^3 c - a^4 f - 28a^2 bc^2 - 42b^5 - 28a^2 b^2 d)/a^{11} \tag{12.7.14}
\]
\[
G = (8a^4 bf + 8a^4 ce + 4a^4 d^2 + 120a^2 b^3 d + 180a^2 b^2 c^2 + 132b^6 - a^5 g - 36a^3 b^2 e - 72a^3 bcd - 12a^3 c^3 - 330ab^4 c)/a^{13} \tag{12.7.15}
\]
As one can see, these are fairly horrible relationships, getting worse as one goes up in order. For serious work the reader is strongly recommended to gain access to software that will do this automatically. The two best are Mathematica and Maple, with Derive a close third for most work.[22]

For the reader who wishes to check a program to revert series, the series:
\[
y = x + 2x^2 + 3x^3 + 4x^4 + 5x^5 + 6x^6 + 7x^7 + \ldots
\]
reverts to:
\[
x = y - 2y^2 + 5y^3 - 14y^4 + 42y^5 - 132y^6 + 429y^7 - \ldots
\]
[22] The first two of these are inordinately expensive. If one can purchase them through a school program, so much the better. Derive is much cheaper, but not cheap, and will do the job.
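For readers without access to such software, reversion is also easy to program directly. Here is a minimal Python sketch of our own that reverts a series order by order, exactly as described above, and reproduces the check series:

```python
def revert(a, nmax):
    """Revert y = a[0] x + a[1] x^2 + ... (a[0] != 0), returning the list
    [A, B, C, ...] of Equation (12.7.8) through order nmax.  Works order by
    order: the coefficient of x in the composite series must be 1 and the
    coefficients of all higher powers of x must vanish."""
    y = [0.0] * (nmax + 1)                   # y[k] = coefficient of x^k
    for k in range(1, min(nmax, len(a)) + 1):
        y[k] = float(a[k - 1])

    def mul(p, q):                           # truncated series product
        r = [0.0] * (nmax + 1)
        for i, pi in enumerate(p):
            if pi:
                for j, qj in enumerate(q[:nmax + 1 - i]):
                    r[i + j] += pi * qj
        return r

    ypow = [None, y[:]]                      # ypow[m] holds y(x)^m
    for m in range(2, nmax + 1):
        ypow.append(mul(ypow[-1], y))

    A = [1.0 / y[1]]                         # A = 1/a, Equation (12.7.9)
    for n in range(2, nmax + 1):
        s = sum(A[m - 1] * ypow[m][n] for m in range(1, n))
        A.append(-s / ypow[n][n])            # ypow[n][n] is a**n
    return A

print(revert([1, 2, 3, 4, 5, 6, 7], 7))
# [1.0, -2.0, 5.0, -14.0, 42.0, -132.0, 429.0], matching the check above
```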
12.8 Appendix: The Zeta Function and Generalizations

12.8.1 The Zeta Function

In 1737, Leonhard Euler wrote a paper in which he discussed several interesting series, one being the sum of the reciprocals of the positive integers, another the sum of the squares of such integers. However, it was not until Riemann's paper in 1859 that series of the sort studied by Euler were shown to be part of a class of functions we today call the Riemann zeta function.

Riemann's zeta function is one of those amazing mathematical entities which are exceptionally useful, beautiful, and slightly mysterious all at once.

The zeta function can be defined in a number of ways, all equivalent, though that may not be obvious at first glance. These include the form we are most interested in, the summation formula:
\[
\zeta(s) = 1 + \frac{1}{2^s} + \frac{1}{3^s} + \cdots = \sum_{k=1}^{\infty} \frac{1}{k^s}, \tag{12.8.1}
\]
the integral formula:
\[
\zeta(s) = \frac{1}{\Gamma(s)} \int_0^{\infty} \frac{u^{s-1}}{e^u - 1}\, du, \tag{12.8.2}
\]
and the prime product formula
\[
\zeta(s) = \prod_{k=1}^{\infty} \frac{1}{1 - p_k^{-s}}, \qquad \text{where } p_k \text{ is the } k\text{th prime, starting with 2} \tag{12.8.3}
\]
Basic properties of the zeta function are perhaps most easily seen from Equation (12.8.1). Clearly it does not converge for s = 1, since Equation (12.8.1) then becomes the harmonic series:
\[
\zeta(1) = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} + \cdots \tag{12.8.4}
\]
which diverges,[23] but it does converge for any value of s > 1. In fact, the zeta function exists for complex values of s as well, as long as the real part of s is greater than 1.

Further, examination of the values of $\zeta(s)$ shows that it rapidly approaches 1 from above, being already about 1.08 at s = 4 and becoming much closer as s increases.
Values of the zeta function can be computed by various methods. The best method so far developed is by Peter Borwein:[24]
\[
\zeta(s) = \frac{-1}{d_n \left(1 - 2^{1-s}\right)} \sum_{k=0}^{n-1} \frac{(-1)^k (d_k - d_n)}{(k+1)^s} + \gamma_n(s) \tag{12.8.5}
\]
\[
d_k = n \sum_{i=0}^{k} \frac{(n+i-1)!\, 4^i}{(n-i)!\, (2i)!} \tag{12.8.5a}
\]
\[
|\gamma_n(s)| \le \frac{3}{(3 + \sqrt{8})^n}\, \frac{1}{1 - 2^{1-s}} \tag{12.8.5b}
\]
[23] A fact proven in almost all calculus books.
[24] P. Borwein, An Efficient Algorithm for the Riemann Zeta Function, Canadian Mathematical Conference Proceedings, preprint.
This is for s real, which is the case we need. Here $\gamma_n(s)$ is the error estimate, which implies that for p decimal digits in the result, n ought to be taken as roughly 1.3p. With n in hand and s already known, Equation (12.8.5a) is then used to compute $d_k$ for k = 0 up to k = n. The resulting values are then used in Equation (12.8.5) on the preceding page to produce values of $\zeta(s)$.

Zeta is also tabulated in Abramowitz and Stegun,[25] which, if you have access to this valuable reference, might be the easiest way to obtain various values.
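Borwein's method is short enough to program directly. Here is our own Python reading of Equations (12.8.5) through (12.8.5b), using exact rational arithmetic for the $d_k$:

```python
from fractions import Fraction
from math import factorial

def zeta(s, n=30):
    """Approximate zeta(s) for real s > 1 by P. Borwein's algorithm,
    Equations (12.8.5)-(12.8.5b); take n ~ 1.3 times the digits wanted."""
    d, acc = [], Fraction(0)
    for i in range(n + 1):
        acc += Fraction(factorial(n + i - 1) * 4**i,
                        factorial(n - i) * factorial(2 * i))
        d.append(n * acc)              # d[k] = n * (partial sum up to i = k)
    total = sum((-1)**k * float(d[k] - d[n]) / (k + 1)**s for k in range(n))
    return -total / (float(d[n]) * (1.0 - 2.0**(1 - s)))

print(zeta(2))   # 1.6449340668... = pi^2/6
print(zeta(3))   # 1.2020569031..., Apery's constant
```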
For comparison, some values of $\zeta$ are given to 10 decimals below.

    ζ(1) = ∞ (diverges)        ζ(2) = π²/6
    ζ(3) = 1.2020569032        ζ(4) = π⁴/90
    ζ(5) = 1.0369277551        ζ(6) = π⁶/945
    ζ(7) = 1.0083492774        ζ(8) = π⁸/9450
    ζ(9) = 1.0020083928        ζ(10) = π¹⁰/93555

The even values are exact because in these cases the value of $\zeta(s)$ can be obtained by doing the appropriate integrals. The other values are approximate[26] values obtained by computations using equations like (12.8.5) on the previous page. It is not known whether these values are transcendental; even $\zeta(3)$, also known as Apéry's constant, has so far only been proven to be irrational.
It may be of use to see how Equations (12.8.1) on the preceding page through (12.8.3) on the previous page can be derived from each other. The integrand in Equation (12.8.2) on the preceding page can be written as:
\[
\frac{u^{s-1}}{e^u - 1} = \frac{e^{-u} u^{s-1}}{1 - e^{-u}} = e^{-u} u^{s-1} \sum_{k=0}^{\infty} e^{-ku} = \sum_{k=1}^{\infty} u^{s-1} e^{-ku} \tag{12.8.6}
\]
where the geometric series expansion of $1/(1 - e^{-u})$ has been used in the denominator. Using this in Equation (12.8.2) on the previous page:
\[
\zeta(s) = \frac{1}{\Gamma(s)} \int_0^{\infty} \frac{u^{s-1}}{e^u - 1}\, du = \frac{1}{\Gamma(s)} \sum_{k=1}^{\infty} \int_0^{\infty} u^{s-1} e^{-ku}\, du \tag{12.8.7}
\]
Letting x = ku and doing some small manipulation leads to:
\[
\zeta(s) = \frac{1}{\Gamma(s)} \sum_{k=1}^{\infty} \frac{1}{k^s} \int_0^{\infty} x^{s-1} e^{-x}\, dx \tag{12.8.8}
\]
The integral is the Gamma function of s and what is left is Equation (12.8.1) on the preceding page. So the secret lay in the geometric series expansion, which, in retrospect, seems quite natural.

The seemingly mysterious equivalence of Equation (12.8.1) on the previous page and Equation (12.8.3) on the preceding page turns out to be just as simple. The $p_k$ in Equation (12.8.3) on the previous page are the successive prime numbers, the first 10 of which are:

    2, 3, 5, 7, 11, 13, 17, 19, 23, 29

The fundamental theorem of arithmetic says that every whole number r can be uniquely factored into primes to various powers. Thus, for instance, 12 is $2^2 \cdot 3^1$, meaning that it is the product of two 2s and one 3.

[25] Abramowitz and Stegun, Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, National Bureau of Standards, 1964, Chapter 23.
[26] Given here to ten decimal places.
Equation (12.8.3) on page 155 is the product of terms like:
\[
\frac{1}{1 - p_k^{-s}}
\]
where $p_k$ is the kth prime. The secret, once again, is to expand this as a power series:
\[
\frac{1}{1 - p^{-s}} = 1 + \frac{1}{(p^1)^s} + \frac{1}{(p^2)^s} + \frac{1}{(p^3)^s} + \cdots \tag{12.8.9}
\]
and to do this for every prime $p_k$ in Equation (12.8.3) on page 155. Thus we have for k = 1, $p_k$ = 2:
\[
\frac{1}{1 - 2^{-s}} = 1 + \frac{1}{(2^1)^s} + \frac{1}{(2^2)^s} + \frac{1}{(2^3)^s} + \cdots
\]
So the first few products are:
\[
\frac{1}{(1 - 2^{-s})}\, \frac{1}{(1 - 3^{-s})}\, \frac{1}{(1 - 5^{-s})} \cdots
= \left( 1 + \frac{1}{(2^1)^s} + \frac{1}{(2^2)^s} + \frac{1}{(2^3)^s} + \cdots \right)
\times \left( 1 + \frac{1}{(3^1)^s} + \frac{1}{(3^2)^s} + \frac{1}{(3^3)^s} + \cdots \right)
\times \left( 1 + \frac{1}{(5^1)^s} + \frac{1}{(5^2)^s} + \frac{1}{(5^3)^s} + \cdots \right) \times \cdots
\]
Now when these are multiplied out, the first term is 1. The second term is $1/2^s$ times a bunch of 1s. The next is $1/3^s$, again times a bunch of 1s. The fourth is $1/(2^2)^s$, the fifth $1/5^s$, and so on.

What we have then is:
\[
\zeta(s) = \prod_{k=1}^{\infty} \frac{1}{1 - p_k^{-s}} = 1 + \frac{1}{2^s} + \frac{1}{3^s} + \frac{1}{4^s} + \frac{1}{5^s} + \cdots \tag{12.8.10}
\]
which is what we set out to show.

Truly, the zeta function is interesting!
12.8.2 The Dirichlet Eta Function

The Dirichlet eta function is just like the zeta function except that the terms alternate in sign:
\[
\eta(s) = 1 - \frac{1}{2^s} + \frac{1}{3^s} - \cdots = \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k^s} \tag{12.8.11}
\]
The eta function does exist at s = 1; the alternating series converges rather nicely there. Examination of tabulated values for $\eta(s)$ shows that it rapidly approaches 1 from below, being always less than 1 for all s.
Some values of $\eta(s)$ are given below:

    η(0) = 1/2                 η(1) = ln 2
    η(2) = π²/12               η(3) = (3/4) ζ(3)
    η(4) = 7π⁴/720             η(5) = (15/16) ζ(5)
    η(6) = 31π⁶/30240          η(7) = (63/64) ζ(7)
    η(8) = 127π⁸/1209600       η(9) = (255/256) ζ(9)
    η(10) = 73π¹⁰/6842880      η(11) = (1023/1024) ζ(11)
There is a close connection between $\eta(s)$ and $\zeta(s)$. This can be seen by subtracting $\eta(s)$ from $\zeta(s)$:
\[
\zeta(s) - \eta(s) = \sum_{k=1}^{\infty} \frac{1}{k^s} - \sum_{k=1}^{\infty} \frac{(-1)^{k+1}}{k^s} = \sum_{k=1}^{\infty} \frac{1 - (-1)^{k+1}}{k^s} = \sum_{k=2,4,\ldots} \frac{2}{k^s} \tag{12.8.12}
\]
\[
= 2 \sum_{k=1}^{\infty} \frac{1}{(2k)^s} = \frac{2}{2^s} \sum_{k=1}^{\infty} \frac{1}{k^s} = \frac{1}{2^{s-1}}\, \zeta(s) \tag{12.8.13}
\]
from which we easily find that
\[
\eta(s) = \left(1 - 2^{1-s}\right) \zeta(s) \tag{12.8.14}
\]
which in turn explains some of the entries in the list of values for $\eta(s)$ above.
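Equation (12.8.14) is easy to confirm numerically with plain partial sums (a quick Python check of our own):

```python
def eta(s, terms=100000):
    """Partial sum of the alternating series, Equation (12.8.11)."""
    return sum((-1)**(k + 1) / k**s for k in range(1, terms + 1))

def zeta(s, terms=100000):
    """Partial sum of Equation (12.8.1)."""
    return sum(1.0 / k**s for k in range(1, terms + 1))

s = 4.0
print(eta(s))                           # 0.9470328... ( = 7 pi^4 / 720 )
print((1.0 - 2.0**(1 - s)) * zeta(s))   # the same, per Equation (12.8.14)
```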
12.8.3 The Polylogarithm

We are interested in a function very similar to the zeta function, Equation (12.8.1) on page 155, except that this one has an additional parameter. It could be denoted as $\zeta(s, \lambda)$, but for various reasons the notation $g_s(\lambda)$ will be used here. The definition is:
\[
g_s(\lambda) = \lambda + \frac{\lambda^2}{2^s} + \frac{\lambda^3}{3^s} + \cdots = \sum_{k=1}^{\infty} \frac{\lambda^k}{k^s} \tag{12.8.15}
\]
In the literature this function is known as the polylogarithm. It is a slight generalization of the zeta function and, indeed, it might be called the (slightly) generalized zeta function. But there are many generalizations of the zeta function, so it is perhaps better to call it by its best-known name.[27] This function is usually denoted by $\mathrm{Li}_s(\lambda)$. We will not use that name here.[28]

It can be trivially seen that for $\lambda = 1$ the polylogarithm becomes the zeta function.

Equation (12.8.15) converges for all s > 0, 0 ≤ λ ≤ 1, except that if λ = 1, s must be larger than 1. For any λ > 1, we have
\[
\lambda^k > k^s \tag{12.8.16}
\]
for sufficiently large k. This can be seen by taking logarithms:
\[
k \ln \lambda > s \ln k
\]
[27] The polylogarithm is also sometimes called the de Jonquières function.
[28] Primarily because we will also have a second closely related function, generalized from the eta function, which we will call $f_s(\lambda)$.
since, for fixed λ and s, k increases faster than ln k. Thus eventually the later terms of Equation (12.8.15) on the preceding page get larger and larger and the series diverges.

Some values of $g_s(\lambda)$ can be seen by inspection. We have, for example:
\[
g_s(0) = 0 \qquad g_{\infty}(\lambda) = \lambda \tag{12.8.17}
\]
Other values can be deduced from Equation (12.8.15):
\[
g_0(\lambda) = \sum_{k=1}^{\infty} \lambda^k = \frac{\lambda}{1 - \lambda} \tag{12.8.18}
\]
and
\[
g_1(\lambda) = -\ln(1 - \lambda) \tag{12.8.19}
\]
d
d
g
s
() =
d
d

k=1

k
k
s
=

k=1
k
k1
k
s
=

k=1

k1
k
s1
=
1

k=1

k
k
s1
=
1

g
s1
() (12.8.20)
from which we have

dg
s
()
d
= g
s1
() (12.8.21)
A similar function, f
s
() can be dened:
f
s
() =

2
2
s
+

3
3
s
+ =

k=1
(1)
k+1

k
k
s
(12.8.22)
which is g
s
() with alternating signs.
Again there are some special values:
f
s
(0) = 0 f

() = (12.8.23)
and
f
0
() =

k=0
=

1 +
(12.8.24)
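All of these properties can be checked with a few lines of Python; the partial-sum function below is our own sketch:

```python
def g(s, lam, terms=500):
    """Partial sum of the polylogarithm g_s(lambda), Equation (12.8.15);
    it converges rapidly for 0 <= lambda < 1."""
    return sum(lam**k / k**s for k in range(1, terms + 1))

lam = 0.5
print(g(0, lam), lam / (1.0 - lam))   # both 1.0: Equation (12.8.18)
print(g(1, lam))                      # 0.693147... = -ln(1 - 0.5), (12.8.19)

# the recurrence lambda * dg_s/dlambda = g_{s-1}, Equation (12.8.21),
# checked with a central finite difference:
h = 1e-6
print(lam * (g(1.5, lam + h) - g(1.5, lam - h)) / (2 * h), g(0.5, lam))
```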
13. Classical Mechanics

13.1 Introduction

This chapter is a minireview of parts of classical mechanics. Often this is not familiar material for graduate students in chemistry, the major audience for which this book is intended. It is likely that such readers have not had a formal course in classical mechanics and that in fact all that such readers have seen of mechanics was what was contained in a first-year physics course taken some years ago.

The reason for discussing classical mechanics is simple: most paper and pencil computations in statistical thermodynamics are done from the Gibbsian (ensemble) point of view. However, most computer computations are done from the Boltzmann (trajectory) point of view, and those computations are done using classical mechanics for the most part.[1]
13.2 Definitions and Notations

There is much in mechanics that will not concern us. What will mostly be of interest is the physics of particles, be they macroscopic particles or molecules. So we will begin by thinking about a single particle in space. We will take as our starting point a set of cartesian coordinates x, y, and z with a fixed origin. Of course in any given situation there could be fewer than three coordinates or, for that matter, more than three spatial coordinates. But three seems to be a convenient number, allowing the reader to generalize or specialize as needed.

The coordinates of a particle can be represented as a vector $\mathbf{r}$ with n components, where n is the number of dimensions in which the particle finds itself. In ordinary space there are three of these components, which can be called x, y, and z. In general we will represent vectors in bold face and components in ordinary type.

A Cartesian coordinate system is one in which the coordinate axes are straight lines and are at right angles to each other.

A coordinate system whose coordinate lines cross each other at right angles is said to be an orthogonal coordinate system. This is a larger class of coordinate systems than Cartesian coordinates. For example, spherical coordinates are also orthogonal.[2] Thus if a particle moves parallel to the x-axis, the y and z components of its position will not change.

We use orthogonal coordinates because experience shows that motion in the real world obeys the statement above.

We will also assume that the space spanned by our cartesian coordinates is isotropic. That is, the space has the same properties in all directions.[3] Of course, as the reader knows, other orthogonal

[1] What are known as Monte Carlo calculations are ensemble calculations, though they too very often use classical computations of potential energies, etc.
[2] Orthogonal coordinates have the property that the dot product of the unit vectors along two different coordinate axes is zero for all such pairs of coordinates.
[3] This is, of course, not true of the world in which we live here on earth. There is a special direction, called vertical, which is different from left, right, forward, or back. We can always find out which direction is vertical by dropping a test mass. It moves in the vertical direction.
coordinate systems are not only possible but are frequently used. Among these are spherical coordinates, cylindrical coordinates, and polar coordinates. In every case, though, these are defined in terms of an underlying cartesian coordinate system.

If there is more than one particle present we usually subscript the position vectors for the individual particles. Thus the position of particle i is given by $\mathbf{r}_i$.

One position vector can contain the positions of all the particles present. This vector usually has no subscript and is made up of the components of each particle. Thus for an N-particle system in n dimensions we would have:
\[
\mathbf{r} = \mathbf{r}(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N) \tag{13.2.1}
\]
which would be a vector of nN components.

The rate at which positions change with time is called the velocity. The velocity is denoted either by the symbol $\mathbf{v}$ or by $\dot{\mathbf{r}}$, using Newton's notation for a time derivative. Thus we have:
\[
\mathbf{v} = \frac{d\mathbf{r}}{dt} = \dot{\mathbf{r}} \tag{13.2.2}
\]
In component notation the x-component of velocity, $v_x$, is given by:
\[
v_x = \frac{dx}{dt} = \dot{x} \tag{13.2.3}
\]
and similarly for the other components. Again, if there is more than one particle present we usually subscript the velocity vectors for each particle. The velocity of the ith particle is then given by $\mathbf{v}_i$.

The velocity vector for a system of N particles is usually not subscripted, but is made up of the velocity vectors of all the particles in the system:
\[
\mathbf{v} = \mathbf{v}(\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_N) \tag{13.2.4}
\]
The acceleration is denoted by the symbol $\mathbf{a}$ and given by the equivalent representations:
\[
\mathbf{a} = \frac{d\mathbf{v}}{dt} = \dot{\mathbf{v}} = \frac{d^2\mathbf{r}}{dt^2} = \ddot{\mathbf{r}} \tag{13.2.5}
\]
In component notation the x-component of acceleration, $a_x$, is given by:
\[
a_x = \frac{dv_x}{dt} = \dot{v}_x = \frac{d^2x}{dt^2} = \ddot{x} \tag{13.2.6}
\]
Again, if there is more than one particle present we subscript the acceleration vectors. The acceleration of the ith particle is then given by $\mathbf{a}_i$.

The acceleration vector for a system of N particles is not subscripted, but is made up of the acceleration vectors of all the particles in the system:
\[
\mathbf{a} = \mathbf{a}(\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_N) \tag{13.2.7}
\]
Again, these vectors are usually time-dependent though there will not always be an explicit t present.

The momentum is also a vector. It is usually denoted by $\mathbf{p}$ and defined by:
\[
\mathbf{p} = \frac{d(m\mathbf{r})}{dt} \tag{13.2.8}
\]
Note that in general the mass could vary and that a varying mass will change the momentum.[4] But in our discussions the mass m will always be constant unless otherwise noted. In that case the definition simplifies considerably:
\[
\mathbf{p} = m\dot{\mathbf{r}} \tag{13.2.9}
\]
or, in component notation:[5]
\[
p_x = m v_x = m\dot{x} \tag{13.2.10}
\]
The force, denoted $\mathbf{F}$, is the vector time rate of change of momentum:
\[
\mathbf{F} = \dot{\mathbf{p}} \tag{13.2.11}
\]
or, if, as will be normal here, the mass is constant,
\[
\mathbf{F} = m\mathbf{a} = m\dot{\mathbf{v}} = m\ddot{\mathbf{r}} \tag{13.2.12}
\]
And the x-component of the force is given by any of the following equations:
\[
F_x = m a_x = m\dot{v}_x = m\ddot{x} \tag{13.2.13}
\]
This definition of force is really Newton's Second Law. We will talk about it again when we discuss all three of Newton's laws.

Work, on the other hand, is a scalar, denoted here by w. By definition the work done in moving a mass m from position $s_1$ to position $s_2$ is:
\[
w_{12} = \int_1^2 \mathbf{F} \cdot d\mathbf{s} \tag{13.2.14}
\]
This is the same definition used in thermodynamics except for the sign. In thermodynamics the force and the distance moved are always in the same direction. In that case we don't need the explicit vector dot product and the work is just the force times the distance moved.
13.3 Energy

With the mass constant we can use the definition of force to introduce the kinetic energy:[6]
\[
\int \mathbf{F} \cdot d\mathbf{s} = m \int \frac{d\mathbf{v}}{dt} \cdot \mathbf{v}\, dt = \frac{m}{2} \int \frac{d}{dt}\!\left(v^2\right) dt \tag{13.3.1}
\]
which, when integrated from position $s_1$ to position $s_2$, is
\[
w_{12} = \frac{m}{2}\left( v_2^2 - v_1^2 \right) \tag{13.3.2}
\]
The quantity
\[
K = \frac{1}{2}\, m v^2 \tag{13.3.3}
\]
is known as the kinetic energy and is often denoted by the symbol K. It is a scalar quantity.

[4] This is an important consideration in dealing with rockets. But what we are doing here is not rocket science...
[5] Again, only the x-component is given.
[6] The material in this and the next few sections is drawn primarily from H. Goldstein, Classical Mechanics, Addison-Wesley, 1957.
Thus the work done in moving the mass m from $s_1$ to $s_2$ is
\[
w_{12} = K_2 - K_1 \tag{13.3.4}
\]
or the difference between the final and initial kinetic energies.

Let us do a mental experiment.[7] Let us take a mass m at a position $s_1$ and move it around in space along any path, finally ending up right back at position $s_1$.

Now if the net work done in this process is zero, then the forces present and the system are both said to be conservative.[8] This is denoted by:
\[
\oint \mathbf{F} \cdot d\mathbf{s} = 0 \tag{13.3.5}
\]
where the circle on the integral indicates integration around any closed path.

The forces in a conservative system have a simple, but very important property. They can be obtained from a scalar function called the potential energy, U, by differentiation.[9]

In symbols this means that a conservative force can be written as:
\[
\mathbf{F} = -\left( \hat{\imath}\, \frac{\partial U}{\partial x} + \hat{\jmath}\, \frac{\partial U}{\partial y} + \hat{k}\, \frac{\partial U}{\partial z} \right) \tag{13.3.6}
\]
where $\hat{\imath}$, $\hat{\jmath}$, and $\hat{k}$ are unit vectors[10] pointing in the x, y, and z directions respectively. The minus sign is due to the definition of force in physics.

The x-component of the force would be given by:
\[
F_x = -\left( \frac{\partial U}{\partial x} \right)_{y,z} \tag{13.3.7}
\]
In a conservative system, the work done in moving a particle from position $s_1$ to position $s_2$ is independent of path and hence is:
\[
w_{12} = -(U_2 - U_1) \tag{13.3.8}
\]
where the minus sign follows from the definition of the force.

We now have two different expressions for the work. One is Equation (13.3.4), which involves the kinetic energy, and the other is Equation (13.3.8), which involves the potential energy.

If we equate these two expressions we get:
\[
w_{12} = K_2 - K_1 = -U_2 + U_1 \tag{13.3.9}
\]
which gives:
\[
K_2 + U_2 = K_1 + U_1 \tag{13.3.10}
\]
or, in words, the statement that in a conservative system the total energy is conserved.

[7] These are very handy, often revealing relationships without the need either to set up an experiment or to clean up afterwards.
[8] Not all systems and force fields are conservative. The most common examples to the contrary are systems containing friction.
[9] I'm not going to prove that directly. But it follows from the fact that the curl of a gradient must vanish.
[10] That is, vectors of length 1 pointing parallel to the named axes.
13.4 Newton's Laws of Motion

In 1687 Isaac Newton wrote what is arguably the most important scientific book produced to date, the Principia Mathematica. In it he finally[11] not only published his invention of calculus, but also applied it to the problem of the motion of the earth and the planets.

In doing this he formulated three laws of mechanics, today known as Newton's Laws. These laws are, in actuality, postulates. They are not proven from more fundamental laws. They are simply accepted as true based upon experience with the real world.

These laws are, in modern notation and terminology:

1. A moving body with no force acting on it will continue to move in whatever direction it was moving with constant velocity.
\[
\mathbf{v} = \text{constant} \quad \text{(no force present)} \tag{13.4.1}
\]
2. An external force $\mathbf{F}$ acts on a body of mass m to produce an acceleration $\mathbf{a}$ in the same direction as the force such that the acceleration is given by
\[
\mathbf{F} = m\mathbf{a} \tag{13.4.2}
\]
3. If a body A exerts a force $\mathbf{F}_{AB}$ on body B, then body B exerts an equal and opposite force $\mathbf{F}_{BA}$ on body A, or:
\[
\mathbf{F}_{AB} = -\mathbf{F}_{BA} \tag{13.4.3}
\]
The most important of Newton's Laws of Motion for our purposes is Equation (13.4.2), $\mathbf{F} = m\mathbf{a}$. Because it is a vector statement it is really a set of second order differential equations that can, in principle, be solved to give the motion of all the particles in a system.[12]

To see this let's do a most elementary example.

Example 13.1:
Let us assume that we have a particle of mass m moving in an empty three dimensional space where there are no forces acting on the particle at all. The particle is originally located at $\mathbf{r} = (x_o, y_o, z_o)$ and is moving with velocity $\mathbf{v} = (v_{x,o}, v_{y,o}, v_{z,o})$.

The equations of motion written in component notation are given by:
\[
m\frac{d^2x}{dt^2} = 0, \qquad m\frac{d^2y}{dt^2} = 0, \qquad m\frac{d^2z}{dt^2} = 0 \tag{13.4.4}
\]
Integrating these equations is best done by changing $d^2x/dt^2$ to $dv_x/dt$ and similarly for the other two derivatives. This gives, after dividing through by m:
\[
\frac{dv_x}{dt} = 0, \qquad \frac{dv_y}{dt} = 0, \qquad \frac{dv_z}{dt} = 0 \tag{13.4.5}
\]
So clearly the integrals show that the speeds are constant. (Compare this to Newton's First Law (13.4.1).)

[11] Newton had delayed publication for years until convinced by his friend Edmund Halley that he would lose his priority for the invention of calculus if he did not publish.
[12] It is also true that Newton's First Law of Motion is a subpart of the Second Law of Motion since if there is no force, then there is no acceleration and the velocity of the body can not change. This is obvious from the calculus, but it was not obvious to Newton's contemporaries.
There is no trouble in taking the constants of integration to be the initial speeds, the speeds at time t = 0: $v_{x,o}$, etc. We then have another set of integrals to do:
\[
\frac{dx}{dt} = v_x = v_{x,o}, \qquad \frac{dy}{dt} = v_y = v_{y,o}, \qquad \frac{dz}{dt} = v_z = v_{z,o} \tag{13.4.6}
\]
This integrates trivially. And we will have another constant of integration, this time the initial positions, $x_o$, etc. We get:
\[
x = v_{x,o}\, t + x_o, \qquad y = v_{y,o}\, t + y_o, \qquad z = v_{z,o}\, t + z_o \tag{13.4.7}
\]
as our answer. Note that once the six initial conditions are given, we know the motion of the particle forever. Also note that six initial conditions are needed per particle. This does not change if forces are present. We always have three second-order differential equations per particle and they require two integration constants per equation.
Here's a well-known example. The result establishes what we call simple harmonic motion. But the solution of the problem is a bit complex:

Example 13.2:
A particle of mass m moves in one dimension. It is initially at $x_0$ and moving with an initial velocity of $v_0$ in a potential given by:
\[
U(x) = \frac{1}{2}\, a x^2 \tag{13.4.8}
\]
where a is a positive constant. Find the equations of motion.

First, to use Newton's Second Law we need to find the force. By differentiation we have:
\[
F_x = -a x \tag{13.4.9}
\]
The equation of motion is then:
\[
m\, \frac{d^2x}{dt^2} = m\, \frac{dv}{dt} = -a x \tag{13.4.10}
\]
where the last form is the best for our purposes. There is a trick for integrating this: multiply both sides by the velocity:
\[
m v\, \frac{dv}{dt} = -a v x = -a x\, \frac{dx}{dt} \tag{13.4.11}
\]
which gives:
\[
\frac{1}{2}\, m\, d(v^2) = -\frac{a}{2}\, d(x^2) \tag{13.4.12}
\]
This easily integrates into:
\[
m v^2 = -a x^2 + a c_1^2 \tag{13.4.13}
\]
where $c_1$ is a constant, taken as squared here for later convenience. Solving for v:
\[
v^2 = \frac{a}{m}\left( c_1^2 - x^2 \right) \tag{13.4.14}
\]
and then:
\[
v = \frac{dx}{dt} = \pm\left( \frac{a}{m} \right)^{1/2} \left( c_1^2 - x^2 \right)^{1/2} \tag{13.4.15}
\]
where the double sign comes from the square root. The integral we need is:
\[
\pm\left( \frac{a}{m} \right)^{1/2} dt = \frac{dx}{\left[ c_1^2 - x^2 \right]^{1/2}} \tag{13.4.16}
\]
The integral on the right is a standard one. We get
\[
\pm\left( \frac{a}{m} \right)^{1/2} t + c_2 = \sin^{-1}(x/c_1) \tag{13.4.17}
\]
where $c_2$ is another constant. Solving for x:
\[
x = c_1 \sin\!\left( \pm\left( \frac{a}{m} \right)^{1/2} t + c_2 \right) \tag{13.4.18}
\]
We choose the plus sign in the argument of the sine function[13] and write the result:
\[
x = c_1 \sin\!\left( \left( \frac{a}{m} \right)^{1/2} t + c_2 \right) \tag{13.4.19}
\]
It remains to evaluate the two constants of integration. Plugging into the solution the value of x when t = 0, we get $x_0 = c_1 \sin(c_2)$, and solving for $c_2$ gives:
\[
c_2 = \sin^{-1}(x_0/c_1) \tag{13.4.20}
\]
The constant $c_1$ can be found from the velocity equation above:
\[
c_1 = \left( x_0^2 + \frac{m}{a}\, v_0^2 \right)^{1/2} \tag{13.4.21}
\]
and the problem is solved. The particle swings back and forth past the origin in simple harmonic motion with a frequency that depends both on the mass m and the constant a in the potential.

The solution above was a good deal of work. If the potential energy had been at all complicated it is quite likely that the resultant integrals could not have been evaluated analytically.

In that case the only recourse would be to numerical solutions. Indeed, that is most often the case with problems that come up in real life. We'll see some examples of that below.
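To give the flavor of such a numerical solution, here is a minimal Python sketch of our own. It integrates the equation of motion of Example 13.2 with the velocity Verlet scheme (a standard method, though not one discussed in the text) and compares the result with the closed-form solution (13.4.19):

```python
from math import sin, sqrt

def verlet(x0, v0, a_over_m, dt, nsteps):
    """Integrate m x'' = -a x (the force of Equation (13.4.9)) by velocity
    Verlet; returns the final time, position, and velocity."""
    x, v = x0, v0
    acc = -a_over_m * x
    for _ in range(nsteps):
        x += v * dt + 0.5 * acc * dt * dt
        new_acc = -a_over_m * x
        v += 0.5 * (acc + new_acc) * dt
        acc = new_acc
    return nsteps * dt, x, v

# take a/m = 2, x0 = 0, v0 = 1; then Equations (13.4.19)-(13.4.21) give
# x(t) = (1/w) sin(w t) with w = (a/m)^(1/2) = sqrt(2):
w = sqrt(2.0)
t, x, v = verlet(0.0, 1.0, 2.0, 1.0e-3, 5000)
print(x)               # numerical position at t = 5
print(sin(w * t) / w)  # analytic position: agreement to about six decimals
```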
13.5 Making Mechanics More Simple

Problems can often be drastically simplified by a change in coordinates. For example, think of two particles bonded together by some potential but otherwise free to move in space.

We can let $\mathbf{r}_1$ and $\mathbf{r}_2$ be two coordinate vectors of three components each, representing the six coordinates in the problem.

To completely specify the motion of these two particles we'd need 12 initial conditions: six initial positions (three for each particle) and six velocities (three for each particle).

That would give us six differential equations of the second order to solve. And if the particles are bonded by a potential that depends on the distance between them, it would depend on all six coordinates. That means that the six differential equations are coupled and must be solved simultaneously.

This is a dreary task in cartesian coordinates.

[13] Had we chosen the minus sign we would only have affected the arbitrary constant, since $-\sin(\theta) = \sin(\theta + \pi)$, and the π can be incorporated into the arbitrary constant $c_2$.
But you've doubtless already thought of a better way. We know from elementary physics that the motion of the center of mass of a group of particles is independent of the motion of the particles about the center of mass.[14]

So we do one of those awful transformations of coordinates so that we have our six new position coordinates, three representing the motion of the center of mass (X, Y, Z) and three representing the motion of the particles about their center of mass. And a convenient choice there would be spherical coordinates r, θ, and φ.

Since there is no force acting on the center of mass, it is moving with constant velocity (which may be zero) in some fixed direction. We worked this out in the first example in the last section.

And we know that the potential energy depends only on r and not on the angles θ and φ. Now the potential energy occurs in only one differential equation, the one for r. So instead of six coupled second order differential equations, we have three that we've already integrated (the center of mass coordinates), two which are trivial since there is no potential involved, and only one uncoupled equation that could cause us any difficulty.

That's a big gain.

But can we simply apply Newton's laws to these new coordinates?

The three coordinates of the center of mass are cartesian, so it would seem that there's no problem there. They should (and do) obey Newton's laws in the form we'd talked about above.

But what about the spherical coordinates? What equations of motion do they obey? I suspect that you already know that spherical coordinates are not quite as simple as cartesian ones.

Let me bring up another difficulty. What if the two particles were stuck together with an iron rod so that their distance apart could not change? Now we have what is known as a constraint, an unchanging relationship among the variables.

Here the constraint is best expressed as r = constant. But obviously there are situations where constraints are not so simple, such as a bead frictionlessly sliding on a rod of complex shape.

What do we do now?

The answer is that we need a reformulation of the laws of mechanics that will do two things: The first is that it will let us use generalized coordinates. The second is a corresponding law that works with generalized coordinates to give us the proper differential equations to solve.

Generalized coordinates are coordinates that are set up specifically for a particular problem. They need not be cartesian or even orthogonal. But there must be enough of them to totally specify the problem. In the sample problem we've been discussing we know that we will need six coordinates for our two particles. How we choose them is up to us.

The freedom to make this choice is the major advantage of generalized coordinates.

13.5.1 Coordinate Transformations

Before we present the uniform law that gives us the proper differential equations to go with our generalized coordinates, we need to study such coordinates a bit more.

The uniform law uses two concepts, the kinetic energy denoted by K and the potential energy denoted by U. The choice of symbol for the potential energy sometimes leads to confusion with the

[14] We'll take that as axiomatic right now since we've not discussed this yet. But we shortly will. See Section 13.6 on page 173.
volume of a system. Context should tell you which is meant. But we will try to keep confusion to a minimum.

We know what the kinetic energy is in cartesian coordinates. It is made up of components, one for each coordinate, of the form $m\dot{x}^2/2$, where x here stands for any cartesian coordinate.

The potential energy is usually a function of the coordinates themselves. However, unlike the kinetic energy, which has a term for each coordinate, the potential energy is often one single expression possibly involving all the coordinates. An example might be:
\[
U(x_1, y_1, z_1, x_2, y_2, z_2) = \frac{k}{2}\left[ (x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2 \right]^{1/2} \tag{13.5.1}
\]
When using generalized coordinates we must be sure that we have enough coordinates to specify the problem and that we know how to express both the kinetic and potential energies in the new coordinate system.

The first part is easy enough. We need no more than six quantities per particle. The six have to be chosen so that the six cartesian coordinates can be expressed in terms of the new coordinates.

The second part is really the motivation here. The kinetic energy rarely poses any problems. But it is extraordinarily helpful if we can choose coordinates that simplify the potential energy.

For instance, in Equation (13.5.1) above, if we set things up so that we used the distance between particles one and two as our coordinate (let's call it q) then we can write the potential energy as
\[
U(q) = \frac{kq}{2} \tag{13.5.2}
\]
which is certainly an improvement!

To make this concrete let us work out, in both abstract symbols and via a real example, the conversion to polar coordinates.
We first have to define our generalized coordinates:
\[
q_1 = r = (x^2 + y^2)^{1/2} \qquad q_2 = \theta = \tan^{-1}(y/x) \tag{13.5.3}
\]
where the q's are our notation for generalized coordinates and r and θ are the radius and angle respectively in polar coordinates.

The general form of these equations is:
\[
q_1 = q_1(x, y, t) \qquad q_2 = q_2(x, y, t) \tag{13.5.4}
\]
where the time t is included because it is sometimes convenient to define a moving coordinate system. In most of the work we do here the coordinate transformations will be independent of time.

There is a second set of equations describing the reverse transformation:
\[
x = x(q_1, q_2, t) \qquad y = y(q_1, q_2, t) \tag{13.5.5}
\]
The question is: under what conditions can we go back and forth between these representations?

The necessary and sufficient conditions for going back and forth between Equations (13.5.4) and Equations (13.5.5) can be obtained by making a small change in the q's and seeing what the corresponding change in the cartesian coordinates would be.
We end up with two equations:
\[
dx = \left( \frac{\partial x}{\partial q_1} \right) dq_1 + \left( \frac{\partial x}{\partial q_2} \right) dq_2 \qquad
dy = \left( \frac{\partial y}{\partial q_1} \right) dq_1 + \left( \frac{\partial y}{\partial q_2} \right) dq_2 \tag{13.5.6}
\]
Of course, if there were n dimensions, we'd have n terms in each equation and n equations.

The Equations (13.5.6) can be solved for the dq's if and only if the determinant of the coefficients does not vanish. In symbols:
\[
\begin{vmatrix}
(\partial x/\partial q_1) & (\partial x/\partial q_2) \\
(\partial y/\partial q_1) & (\partial y/\partial q_2)
\end{vmatrix} \ne 0 \tag{13.5.7}
\]
This determinant is called the Jacobian determinant of the x's with respect to the q's. It is usually denoted by the following curious symbol
\[
\frac{\partial(x, y)}{\partial(q_1, q_2)} \tag{13.5.8}
\]
a notation that many students find confusing.[15] When you see Equation (13.5.8), think of Equation (13.5.7).
As an example let's see how this works out for polar coordinates.
\[
\left( \frac{\partial x}{\partial r} \right) = \cos\theta \qquad \left( \frac{\partial x}{\partial \theta} \right) = -r\sin\theta \tag{13.5.9}
\]
\[
\left( \frac{\partial y}{\partial r} \right) = \sin\theta \qquad \left( \frac{\partial y}{\partial \theta} \right) = r\cos\theta
\]
This is the Jacobian for the transformation from polar to cartesian coordinates. The determinant of this is
\[
r\cos^2\theta + r\sin^2\theta = r \tag{13.5.10}
\]
which is not in general zero. So this transformation is permitted. If we actually do the transformation we get
\[
x = r\cos\theta \qquad y = r\sin\theta \tag{13.5.11}
\]
Furthermore, the scale factor for this transformation is r. That is,
\[
dx\,dy = r\,dr\,d\theta \tag{13.5.12}
\]
This is the volume element in polar coordinates.
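The Jacobian test of Equation (13.5.7) can also be carried out numerically. The Python sketch below (ours) differentiates the transformation (13.5.11) by central finite differences and recovers the determinant r of Equation (13.5.10):

```python
from math import cos, sin

def jacobian_det(r, theta, h=1e-6):
    """Numerical Jacobian determinant d(x,y)/d(r,theta) for the polar-to-
    cartesian map x = r cos(theta), y = r sin(theta)."""
    x = lambda r, t: r * cos(t)
    y = lambda r, t: r * sin(t)
    dx_dr = (x(r + h, theta) - x(r - h, theta)) / (2 * h)
    dx_dt = (x(r, theta + h) - x(r, theta - h)) / (2 * h)
    dy_dr = (y(r + h, theta) - y(r - h, theta)) / (2 * h)
    dy_dt = (y(r, theta + h) - y(r, theta - h)) / (2 * h)
    return dx_dr * dy_dt - dx_dt * dy_dr

print(jacobian_det(2.0, 0.7))   # ~2.0, i.e. r, as Equation (13.5.10) says
```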
The velocities in generalized coordinates are obtained from Equations (13.5.11) by differentiation. For polar coordinates this is:
\[
\dot{x} = \dot{r}\cos\theta - r\dot{\theta}\sin\theta \qquad
\dot{y} = \dot{r}\sin\theta + r\dot{\theta}\cos\theta \tag{13.5.13}
\]
Acceleration is found by a further differentiation:
\[
\ddot{x} = (\ddot{r} - r\dot{\theta}^2)\cos\theta - (r\ddot{\theta} + 2\dot{r}\dot{\theta})\sin\theta \qquad
\ddot{y} = (\ddot{r} - r\dot{\theta}^2)\sin\theta + (r\ddot{\theta} + 2\dot{r}\dot{\theta})\cos\theta \tag{13.5.14}
\]
[15] Including me when I was a student...
but we generally will not need this.

With this we can now construct the kinetic energy in polar coordinates. We start with Equations (13.5.13) on the preceding page and form $\dot{x}^2$ and $\dot{y}^2$ and add them together. The result is:
\[
\dot{x}^2 + \dot{y}^2 = \dot{r}^2 + r^2\dot{\theta}^2 \tag{13.5.15}
\]
Multiplication by m/2, where m is the mass of the particle, gives the kinetic energy.

Transformations to other more or less standard coordinate systems are contained in an appendix to this chapter. But it would be wrong to think that generalized coordinates are useful only for converting from cartesian coordinates to some other well-known coordinate system. They have far more uses than that.

One must not lose sight of the fact that there is huge latitude in picking generalized coordinates for any given problem. Thus in general one has to be prepared to work out the transformation.
13.5.2 The Lagrangian

If the kinetic energy of a system is denoted by K and written as a function of the generalized coordinates and velocities of the system, and if U denotes the potential energy of the system written as a function of the generalized coordinates and velocities, then
\[
L(q, \dot{q}) = K - U \tag{13.5.16}
\]
is called the Lagrangian of the system. Note that here vectors are being used, one per particle, so that the indices i index the particles in the system. The Lagrangian is written as $L(q, \dot{q})$ to remind us that the Lagrangian must be written in terms of the generalized positions and velocities.

The Lagrangian fits into a certain equation called, cleverly enough, Lagrange's Equation. This equation is:
\[
\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{q}_i} \right) - \left( \frac{\partial L}{\partial q_i} \right) = 0 \tag{13.5.17}
\]
There is one such equation for each coordinate.[16]

This equation keeps this form no matter what generalized coordinates are chosen. That's a huge advantage. Lagrange's Equation gives a second order partial differential equation for any mechanical system whose solution can then be found by integrating the differential equations.[17]

By proper choice of the generalized coordinates one can often make constraints go away. This happens automatically when one chooses suitable coordinates for the problem. Instead of worrying about how to represent a constraint, one makes it go away and in the process one loses one degree of freedom.

In using the Lagrangian it is useful to think of there being a Lagrange's Equation for each coordinate. Since coordinates are generalized and are no longer necessarily associated simply with a particular particle mass, this is a necessary mode of thought.

[16] Note how I have artfully kept secret how one would ever obtain the Lagrangian and its corresponding equation or how one would prove that it has the properties I discuss below. The reader is referred to any standard text on classical mechanics for the answers to these questions. Answering them here would triple the size of this chapter.
[17] At least, that's the hope. Often the differential equations cannot be solved in terms of known functions and numerical methods must be resorted to. But in principle, Equation (13.5.17) and the initial conditions together contain the solution.
In addition, in the Lagrangian viewpoint there is associated with each generalized coordinate a generalized momentum $p_i$ given by
\[
p_i = \left( \frac{\partial L}{\partial \dot{q}_i} \right) \tag{13.5.18}
\]
The downside of using the Lagrangian is that one needs to transform between generalized coordinates and cartesian coordinates. The gain is independence from any particular coordinate system.

Examples will make these ideas (including the downside) clear.
Example 13.3:
Let us assume that we once again have a particle of mass m moving in a three-dimensional space with no potential energy present. We will resolve this problem using the Lagrangian formulation. The reader should understand that it is cumbersome in this case because we are using a sledgehammer to crack a peanut.

Here the kinetic energy K is simply:
\[
K = \frac{1}{2}\, m\left( \dot{x}^2 + \dot{y}^2 + \dot{z}^2 \right) \tag{13.5.19}
\]
and the potential energy is zero:
\[
U(x, y, z) = 0 \tag{13.5.20}
\]
So the Lagrangian is identical to the kinetic energy:
\[
L = \frac{1}{2}\, m\left( \dot{x}^2 + \dot{y}^2 + \dot{z}^2 \right) \tag{13.5.21}
\]
We have three Lagrange's equations, one for each of x, y, and z. I'll use x as an example. The other two equations are identical in form:
\[
\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{x}} \right) - \left( \frac{\partial L}{\partial x} \right) = 0 \tag{13.5.22}
\]
We have
\[
\left( \frac{\partial L}{\partial \dot{x}} \right) = m\dot{x} \tag{13.5.23}
\]
and so
\[
\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{x}} \right) = m\ddot{x} = 0 \tag{13.5.24}
\]
since the derivative of L with respect to x yields zero, there being no x's in L.

The result is what we expect: Newton's Second Law! The force (the mass times the acceleration) is zero since there is no potential. A first integration of the acceleration will yield only a constant, the initial speed in the x-direction. A second will yield another constant, the initial x-coordinate. Our result will be simply:
\[
x = v_{x,0}\, t + x_0 \tag{13.5.25}
\]
with identical results for y and z.

Here's a more complicated example:
Example 13.4:
Consider a pendulum swinging from the origin and hanging down in the direction of the negative y-axis. It swings back and forth in the x direction. The pendulum rod is of length R and is, of course, weightless. The pendulum bob is of mass m. The potential energy is
\[
U(x, y) = -mgR\cos\theta \tag{13.5.26}
\]
where g is a constant (the acceleration due to gravity) and θ is the angle between the pendulum rod and the y-axis.

If we did this problem in cartesian coordinates (and we can) we'd have two coupled differential equations to solve. But if we switch to polar coordinates r and θ, r will be constant and we will have only one coordinate, θ.

The appropriate equations for x and y are simpler than the general ones given in Equation (13.5.11) on page 169. They are:
\[
x = R\sin\theta \qquad y = -R\cos\theta \tag{13.5.27}
\]
The velocity is:
\[
\dot{x} = R\dot{\theta}\cos\theta \qquad \dot{y} = R\dot{\theta}\sin\theta \tag{13.5.28}
\]
Squaring and adding gives:
\[
\dot{x}^2 + \dot{y}^2 = R^2\dot{\theta}^2 \tag{13.5.29}
\]
and the Lagrangian is then:
\[
L(\theta, \dot{\theta}) = \frac{1}{2}\, mR^2\dot{\theta}^2 + mgR\cos\theta \tag{13.5.30}
\]
Then
\[
\left( \frac{\partial L}{\partial \dot{\theta}} \right) = mR^2\dot{\theta} \tag{13.5.31}
\]
\[
\frac{d}{dt}\left( \frac{\partial L}{\partial \dot{\theta}} \right) = mR^2\ddot{\theta} \tag{13.5.32}
\]
Since
\[
\left( \frac{\partial L}{\partial \theta} \right) = -mgR\sin\theta \tag{13.5.33}
\]
the equation of motion is:
\[
mR^2\ddot{\theta} + mgR\sin\theta = 0 \tag{13.5.34}
\]
or, simplified slightly,
\[
\ddot{\theta} = -\frac{g}{R}\sin\theta \tag{13.5.35}
\]
where it can be seen that the mass has canceled out and the motion depends only on g and the length of the pendulum R.

Going further than this is difficult. As it stands the equation cannot be solved in terms of elementary functions.[18] It can be solved in terms of elliptic integrals of the first kind.[19] But to do that here would take us down complicated paths that we do not need to enter. The result is that the period of the real pendulum does depend on the amplitude of its motion. Thus a real pendulum with a finite swing is not simple harmonic.

There is a standard approximation: set sin θ = θ. This is valid for small θ, where sin θ ≈ θ:
\[
\ddot{\theta} = -\frac{g}{R}\,\theta \tag{13.5.36}
\]
[18] Elementary functions are usually taken to mean trig functions, exponentials and algebraic expressions, and their inverses.
[19] Which we will not discuss here. But it is frightening, isn't it, to think that there are not only things called elliptic integrals, but that there is more than one kind...
This is a type of equation that we've already seen. The solution is harmonic motion, a periodic function, so we can write a trial solution as:
\[
\theta = A\sin(\omega t + B) \tag{13.5.37}
\]
where A, B and ω are constants.

We differentiate this trial solution twice:
\[
\dot{\theta} = \omega A\cos(\omega t + B) \tag{13.5.38}
\]
\[
\ddot{\theta} = -\omega^2 A\sin(\omega t + B) \tag{13.5.39}
\]
Plugging this into our differential equation gives:
\[
-\omega^2 A\sin(\omega t + B) = -\frac{g}{R}\, A\sin(\omega t + B) \tag{13.5.40}
\]
from which we see that
\[
\omega = \left( \frac{g}{R} \right)^{1/2} \tag{13.5.41}
\]
With ω known, the initial conditions can be plugged into our solution for both θ and $\dot{\theta}$ to determine A and B.
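Since the full equation of motion (13.5.35) must in any case be handled numerically, here is a Python sketch of our own (using a simple semi-implicit Euler integration) that exhibits the amplitude dependence of the period mentioned above:

```python
from math import sin, sqrt, pi

def period(theta0, g=9.81, R=1.0, dt=1e-6):
    """Period of the full pendulum, Equation (13.5.35): integrate from rest
    at theta0 and time the first pass through theta = 0 (a quarter period)."""
    th, om, t = theta0, 0.0, 0.0
    while th > 0.0:
        om += -(g / R) * sin(th) * dt   # update the angular velocity first
        th += om * dt                   # then the angle (semi-implicit Euler)
        t += dt
    return 4.0 * t

print(2 * pi * sqrt(1.0 / 9.81))   # small-angle result (13.5.41): ~2.006 s
print(period(0.1))                 # ~2.007 s: nearly harmonic
print(period(2.0))                 # ~2.6 s: the period grows with amplitude
```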
13.6 The Center of Mass

With the Lagrangian in hand we can prove that any system of particles, subject not only to internal forces but to external ones as well, can be broken into two types of motion: the motion of the center of mass and the motion of the particles about the center of mass.

Let us consider a group of n particles,[20] each of mass $m_i$, (i = 1, n) and vector coordinate $\mathbf{r}_i$, (i = 1, n).

Each of these particles is acted on by two types of force. One, caused by some external source, is denoted by $\mathbf{f}_i$. The other is caused by pairwise interactions between the particles themselves. The force due to the interaction between particles i and j is denoted by $\mathbf{f}_{ij}$. In all cases here and below the indices run from 1 to n.

The total force on particle i is $\mathbf{F}_i$. Newton's Second Law, (13.4.2) on page 164, gives us:
\[
m_i\ddot{\mathbf{r}}_i = \mathbf{F}_i = \mathbf{f}_i + \sum_{j \ne i} \mathbf{f}_{ij} \tag{13.6.1}
\]
If we sum up Equation (13.6.1) for all values of i, we get:
\[
\sum_i m_i\ddot{\mathbf{r}}_i = \sum_i \mathbf{f}_i + \sum_i \sum_{j \ne i} \mathbf{f}_{ij} \tag{13.6.2}
\]
Now Newton's Third Law, Equation (13.4.3) on page 164, causes the double sum to vanish. This is because for every term $\mathbf{f}_{ij}$ in the last summation in Equation (13.6.2) there is also a term $\mathbf{f}_{ji}$ which is equal and opposite to $\mathbf{f}_{ij}$. Thus we have:
\[
\sum_i m_i\ddot{\mathbf{r}}_i = \frac{d^2}{dt^2} \sum_i m_i\mathbf{r}_i = \sum_i \mathbf{f}_i = \mathbf{F} \tag{13.6.3}
\]
[20] The material in this section follows closely the development of Slater and Frank, Mechanics, McGraw-Hill, 1947, p. 90, and material taken from Whittaker, Analytical Dynamics, Dover, 1944. This is a reprint of the Fourth Edition of 1937 published by the Cambridge University Press. Yes, sometimes the oldest books are the clearest.
If we now denote the total mass of the system by M, then the coordinates of the center of mass of the system, $\mathbf{R}$, are given by:
\[
\mathbf{R} = \frac{\sum_i m_i\mathbf{r}_i}{M} \tag{13.6.4}
\]
and if we then use this in Equation (13.6.3) on the previous page we get:
\[
\frac{d^2}{dt^2}\, M\mathbf{R} = M\ddot{\mathbf{R}} = \mathbf{F} \tag{13.6.5}
\]
The meaning of this is simple. A system of particles acted upon by an external force behaves as if the force acts upon a single particle at the center of mass, where that particle has a mass equal to the total mass of the system.

There is more. Let us define new coordinates for the system of particles with the origin at the center of gravity of the system and the axes parallel to the old axes. The new coordinates will be denoted by $\mathbf{r}'$:
\[
\mathbf{r}'_i = \mathbf{r}_i - \mathbf{R} \tag{13.6.6}
\]
The kinetic energy K for such a system is given by:
\[
K = \frac{1}{2} \sum_i m_i \dot{r}_i^2 \tag{13.6.7}
\]
We will reformulate this in terms of the center of mass coordinates $\mathbf{R}$ and the coordinates of the particles $\mathbf{r}'_i$ relative to the center of mass.

We rearrange Equation (13.6.6) slightly and differentiate it once with respect to time to get:
\[
\dot{\mathbf{r}}_i = \dot{\mathbf{r}}'_i + \dot{\mathbf{R}} \tag{13.6.8}
\]
and place that into Equation (13.6.7). The result after squaring Equation (13.6.8) is:
\[
K = \frac{1}{2} \sum_i m_i \left( \dot{r}'^2_i + \dot{R}^2 \right) + \dot{\mathbf{R}} \cdot \sum_i m_i \dot{\mathbf{r}}'_i \tag{13.6.9}
\]
The last term is zero, as can be seen by substituting $\dot{\mathbf{r}}_i - \dot{\mathbf{R}}$ for $\dot{\mathbf{r}}'_i$ in Equation (13.6.9):
\[
\dot{\mathbf{R}} \cdot \sum_i m_i \dot{\mathbf{r}}'_i = \dot{\mathbf{R}} \cdot \sum_i m_i \left( \dot{\mathbf{r}}_i - \dot{\mathbf{R}} \right)
= \dot{\mathbf{R}} \cdot \sum_i m_i \dot{\mathbf{r}}_i - \dot{R}^2 \sum_i m_i = M\dot{R}^2 - M\dot{R}^2 = 0 \tag{13.6.10}
\]
Thus we have an important result:
\[
K = \frac{1}{2} \sum_i m_i \left( \dot{R}^2 + \dot{r}'^2_i \right) \tag{13.6.11}
\]
If the external forces $\mathbf{f}_i$ and the internal forces $\mathbf{f}_{ij}$ are both conservative,[21] then the potential energy U of the system is given by:
\[
U(\mathbf{R}, \mathbf{r}'_1 \ldots \mathbf{r}'_n) = U(\mathbf{R}) + U(\mathbf{r}'_1 \ldots \mathbf{r}'_n) \tag{13.6.12}
\]
The Lagrangian for the system, K − U, then splits into two Lagrangians, one for the motion of the center of mass and the other for the relative motion about the center of mass:
\[
L(\mathbf{R}, \mathbf{r}'_1 \ldots \mathbf{r}'_n) = L(\mathbf{R}) + L(\mathbf{r}'_1 \ldots \mathbf{r}'_n) \tag{13.6.13}
\]
each of which can be solved separately.

[21] And hence derivable from a potential U by differentiation.
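Equation (13.6.11) can be verified directly for an arbitrary set of particles. A small Python check of our own, with random masses and velocities:

```python
import random

random.seed(1)
n = 5
m = [random.uniform(1.0, 3.0) for _ in range(n)]                 # masses
v = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(n)]

M = sum(m)
# center-of-mass velocity, the time derivative of Equation (13.6.4):
V = [sum(m[i] * v[i][k] for i in range(n)) / M for k in range(3)]

K_direct = 0.5 * sum(m[i] * sum(c * c for c in v[i]) for i in range(n))
K_cm     = 0.5 * M * sum(c * c for c in V)
K_rel    = 0.5 * sum(m[i] * sum((v[i][k] - V[k])**2 for k in range(3))
                     for i in range(n))

print(K_direct - (K_cm + K_rel))   # ~1e-16: Equation (13.6.11) checks out
```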
13.7 The Hamiltonian

The Lagrangian is an exceptionally useful concept. Indeed, in many branches of physics one starts with the Lagrangian and proceeds onward from there.

However, there is another formulation of mechanics that is more natural for our purposes. Indeed you have already run into it in many different guises, particularly in quantum mechanics.

This formulation uses what is called the Hamiltonian in a set of equations called Hamilton's Equations. The next several subsections explore this.

13.7.1 Hamilton's Equations

We have previously talked about the Lagrangian for a system of N particles. It can be written symbolically as:
\[
L = L(q_j, \dot{q}_j, t) \tag{13.7.1}
\]
where the $q_j$ and $\dot{q}_j$ stand for the entire set of generalized coordinates and velocities for the problem.[22] We've also talked (briefly) about the generalized momentum $p_j$ that goes with a particular velocity $\dot{q}_j$:
\[
p_j = \frac{\partial L}{\partial \dot{q}_j} \tag{13.7.2}
\]
Using this we can take Lagrange's equation
\[
\frac{d}{dt}\, \frac{\partial L}{\partial \dot{q}_j} - \frac{\partial L}{\partial q_j} = 0 \tag{13.7.3}
\]
and rewrite it as:
\[
\dot{p}_j = \frac{\partial L}{\partial q_j} \tag{13.7.4}
\]
Employing the Lagrangian in Lagrange's equation leads to a set of second order partial differential equations that describe the motion of the system in time. The variables are the positions and the velocities.

However, sometimes it is inconvenient to use the velocities $\dot{q}_j$ and it is more useful to use momenta instead. In fact this leads to a very symmetric formulation of classical mechanics.

Doing the switch from velocities to momenta as independent variables isn't just a matter of multiplying all the velocities by m and changing the resulting mv's to momenta. We are using generalized coordinates and the momentum that goes with a coordinate q is not necessarily $m\dot{q}$. The appropriate momentum is that given by (13.7.2).

To do this more complex change in variable we need to use what is known as a Legendre transformation.

To see how this works let's take the total derivative of the Lagrangian:
\[
dL = \sum_j \frac{\partial L}{\partial q_j}\, dq_j + \sum_j \frac{\partial L}{\partial \dot{q}_j}\, d\dot{q}_j + \frac{\partial L}{\partial t}\, dt \tag{13.7.5}
\]
Using our definitions above we can write this as:
\[
dL = \sum_j \dot{p}_j\, dq_j + \sum_j p_j\, d\dot{q}_j + \frac{\partial L}{\partial t}\, dt \tag{13.7.6}
\]
[22] There can be fewer than 6N total variables because there may well be constraints between some of them.
and this can be rewritten as:
\[
dL = \sum_j d(p_j \dot{q}_j) + \sum_j \dot{p}_j\, dq_j - \sum_j \dot{q}_j\, dp_j + \frac{\partial L}{\partial t}\, dt \tag{13.7.7}
\]
which is best seen by taking Equation (13.7.7) and working backwards.

What we have now can be written in the form:
\[
d\left( \sum_j p_j \dot{q}_j - L \right) = -\sum_j \dot{p}_j\, dq_j + \sum_j \dot{q}_j\, dp_j - \frac{\partial L}{\partial t}\, dt \tag{13.7.8}
\]
which, to the annoyance of the reader, will now simply be tucked away until needed later.

Let's invent a new function called the Hamiltonian after Sir W. R. Hamilton, who actually invented it back in the first part of the 19th century. In his honor we'll call it H:
\[
H = \sum_j p_j \dot{q}_j - L \tag{13.7.9}
\]
Curiously, this is exactly what is on the left hand side of Equation (13.7.8).
There's one catch. While the Lagrangian is written in terms of the generalized coordinates, velocities, and time, we want to write the Hamiltonian as a function of the generalized coordinates, momenta, and time. If we take the Hamiltonian to be a function of those variables we can then take the total derivative of Equation (13.7.9). This is:
\[
dH = \sum_j \frac{\partial H}{\partial q_j}\, dq_j + \sum_j \frac{\partial H}{\partial p_j}\, dp_j + \frac{\partial H}{\partial t}\, dt \tag{13.7.10}
\]
If we now compare this with Equation (13.7.8) we can identify the derivatives term by term:[23]
\[
\dot{q}_j = \frac{\partial H}{\partial p_j} \tag{13.7.11}
\]
\[
\dot{p}_j = -\frac{\partial H}{\partial q_j} \tag{13.7.12}
\]
\[
\left( \frac{\partial H}{\partial t} \right) = -\left( \frac{\partial L}{\partial t} \right) \tag{13.7.13}
\]
These are called Hamilton's Equations and make up a set of 2N first order differential equations for the motion of the system. Further, the generalized coordinate q and the generalized momentum p are said to be canonical conjugates of each other. They individually may have any units, but their product must have the units of action (which are the same as those of angular momentum).

[23] This is because Equation (13.7.8) is also the total derivative of the Hamiltonian, and a function can have only one total derivative with the same independent variables.
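Because they are first order, Hamilton's Equations are particularly convenient for numerical work. As a sketch (ours, not from the text), here is a symplectic-Euler integration of the harmonic oscillator with $H = p^2/2m + aq^2/2$:

```python
def step(q, p, dHdq, dHdp, dt):
    """One symplectic-Euler step of Equations (13.7.11)-(13.7.12):
    first pdot = -dH/dq, then qdot = +dH/dp using the updated p."""
    p = p - dHdq(q) * dt
    q = q + dHdp(p) * dt
    return q, p

m, a = 1.0, 1.0
q, p = 1.0, 0.0
for _ in range(100000):                     # integrate out to t = 10
    q, p = step(q, p, lambda q: a * q, lambda p: p / m, 1.0e-4)

print(q, p)                                # a point on the circle q^2 + p^2 = 1
print(0.5 * p * p / m + 0.5 * a * q * q)   # H stays ~0.5 for very long times
```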
13.7.2 More on Legendre Transformations
Legendre transformations are a way to construct a new function from an old one by a change in
variable. This is not the same as simply doing a change in variable that leaves a function unchanged.
Legendre transformations are often used in thermodynamics and will, in a slightly disguised form,
play a role in statistical mechanics as well. Since it also is the method by which we take an old
function, the Lagrangian L(q_j, q̇_j, t), and convert it to a new function, the Hamiltonian H(p_j, q_j, t), with different variables, it pays to take a somewhat longer look at the method.
Let's consider a function of three variables f = f(x, y, z) and write its total differential:

    df = u dx + v dy + w dz    (13.7.14)
²³ This is because Equation (13.7.8) is also the total derivative of the Hamiltonian and a function can have only one total derivative with the same independent variables.
where clearly

    u = ∂f/∂x,    v = ∂f/∂y,    and    w = ∂f/∂z    (13.7.15)
We wish now to change to a new function that will be a function not of x, y, and z, but of u, y, and z. We'll call the new function g and define it this way:

    g(u, y, z) = f(x, y, z) − ux    (13.7.16)

The total differential of g is then:

    dg = df − u dx − x du    (13.7.17)

and substituting in the total differential for df we've already written above, we get:

    dg = −x du + v dy + w dz    (13.7.18)

which is clearly a function of u, y, and z.
Example 13.5:
Given the thermodynamic function H(S, p, n) and the variables T, V, and μ defined by:

    dH = T dS + V dp + μ dn,    (13.7.19)

convert H to a new function G(T, p, n).
To do this we have to eliminate S in favor of T. A moment's thought tells us that we need to define the new function G as:

    G = H − TS    (13.7.20)

whose derivative is:

    dG = dH − d(TS)    (13.7.21)

giving us

    dG = T dS + V dp + μ dn − T dS − S dT    (13.7.22)

or more simply:

    dG = −S dT + V dp + μ dn    (13.7.23)
What we have done is transformed the enthalpy into the Gibbs free energy.
What we have done in creating the Hamiltonian in Equation (13.7.9) is exactly the same. We've removed the velocity dependence in the Lagrangian and substituted instead the momentum dependence. In the process we get a new function, the Hamiltonian.
13.7.3 Phase Space
We are used to coordinate space which for one particle is a three dimensional space in which the
particle is located. If we have two particles, we have six coordinate dimensions, three each for the
two particles.
We can represent the positions of the two particles either as two points on a three-dimensional graph
or as one single point on a six-dimensional graph.
For N particles we can have N points on a three dimensional graph, or one point on a 3N-dimensional
graph.
Each representation is sometimes useful. For example if our N particles are contained in a box, the cloud of points representing those particles will be confined to a single region of coordinate space. Or if we use a 3N-dimensional graph, our single point is likewise contained in a region of coordinate space.
We could, if we wanted, similarly graph the momentum of a particle in a three-dimensional momentum space.
And again, if we wished, we could graph the momenta of N particles either by 3N points on a
three-dimensional graph or a single point in a 3N-dimensional graph.
If we look back at Hamilton's Equations (13.7.11) and (13.7.12) we see that p and q are treated almost symmetrically (the difference is the minus sign in Equation (13.7.12)). That tempts us to consider graphing both the position and the momentum of our N particles on a single graph. It turns out that this is a good thing to do.
The space defined by such a graph is known as phase space. And again, it can be either a 6-dimensional graph with a cloud of N points in it or a 6N-dimensional graph with a single point on it.
The first sort of graph plots the individual particles in a system. The second plots the system itself.
As time goes on the points move because the positions and the momenta of the particles change in
time. So the system point moves around in its 6N-dimensional space, tracing out a path as it does
so. This path is called its trajectory in time.
A moment's thought should show that through any point in system phase space there can be one and only one trajectory.
Why? Because given the 6N coordinates and momenta of all the particles in the system, Hamilton's equations (or Lagrange's equations, if you wish) give us the next point that the system will occupy. And with the same initial conditions, they will always give us the same next point. If trajectories crossed, there would be two next points at the point of intersection. We've just seen that there is only one. Thus trajectories in phase space cannot cross themselves.
13.7.4 Properties of the Hamiltonian
The Hamiltonian may change along a trajectory in phase space. To see how, let us look at its time derivative. Since the Hamiltonian is most generally a function of coordinates, momenta, and the time, i.e., H(p, q, t), we can get the time derivative by first writing the total derivative as:
    dH = Σ_j [(∂H/∂q_j) dq_j + (∂H/∂p_j) dp_j] + (∂H/∂t) dt    (13.7.24)
(which is the same as Equation (13.7.10)) and then dividing by dt to get:
    dH/dt = Ḣ = Σ_j [(∂H/∂q_j) q̇_j + (∂H/∂p_j) ṗ_j] + (∂H/∂t)    (13.7.25)
Looking back at Hamilton's Equations (13.7.11) and (13.7.12), Equation (13.7.25) becomes:

    dH/dt = Ḣ = Σ_j (−ṗ_j q̇_j + q̇_j ṗ_j) + (∂H/∂t)    (13.7.26)
or

    Ḣ = (∂H/∂t) = −(∂L/∂t)    (13.7.27)

where the last is from Equation (13.7.13).
Now if the Lagrangian L is not a function of the time, then ∂L/∂t = 0 and

    Ḣ = 0    (13.7.28)

so that the Hamiltonian is everywhere constant along a trajectory.
It can be shown²⁴ that if the Hamiltonian is independent of the time then:

    H = K + U    (13.7.29)

or, in words, the Hamiltonian is the total energy of the system.
Then in the case that the Hamiltonian is not a function of time and that the force can be derived
from a potential energy, the Hamiltonian is the total energy of the system and that energy does not
change as the system evolves in time.
13.7.5 Examples
The purpose of these examples is to show how the Hamiltonian formulation of mechanics works. To
that end we will repeat several of the previous examples.
Example 13.6:
Again let us consider a free particle of mass m moving without any forces acting on it whatsoever. The particle is initially at x_o, y_o, and z_o just as before. But this time instead of initial velocities we specify initial momenta p_{x,o}, p_{y,o}, and p_{z,o}.
The Hamiltonian for this problem is easily written:

    H = (p_x² + p_y² + p_z²)/2m    (13.7.30)
Hamilton's equations are:

    ẋ = (∂H/∂p_x) = p_x/m        ṗ_x = −(∂H/∂x) = 0    (13.7.31)

    ẏ = (∂H/∂p_y) = p_y/m        ṗ_y = −(∂H/∂y) = 0    (13.7.32)

    ż = (∂H/∂p_z) = p_z/m        ṗ_z = −(∂H/∂z) = 0    (13.7.33)
There is no potential energy. Thus all the equations for the momenta equal zero. This means that
the momenta are all constants.
There is a general point here. When a coordinate does not appear in the potential, then the
corresponding momentum equation is equal to zero and that momentum is constant. This can often
be used to make the solution of the remaining equations more simple.
So now we have:

    p_x = p_{x,o}        p_y = p_{y,o}        p_z = p_{z,o}    (13.7.34)
Substitution of these into the other Hamilton's Equations gives:

    ẋ = p_{x,o}/m        ẏ = p_{y,o}/m        ż = p_{z,o}/m    (13.7.35)
²⁴ But it would take us a bit afield to do so.
which immediately integrate to

    x = (p_{x,o}/m) t + x_o        y = (p_{y,o}/m) t + y_o        z = (p_{z,o}/m) t + z_o    (13.7.36)
which is the final result. Note that these are the same answers we had before, as we can see if we write the momentum in terms of the mass and the velocity.
Here's another of our previous examples:
Example 13.7:
Consider a pendulum swinging from the origin and hanging down in the direction of the negative y-axis. It swings back and forth in the x direction. The pendulum rod is of length R and is, of course, weightless. The pendulum bob is of mass m. The potential energy is:

    U(x, y) = −mgR cos θ    (13.7.37)

where g is a constant (the acceleration of gravity) and θ is the angle between the pendulum rod and the y-axis.
Convenient coordinates for this problem are r, the distance from the origin to the pendulum bob, and θ, the pendulum angle.
With these polar coordinates the kinetic energy is:

    (m/2) R² θ̇²    (13.7.38)

and the Lagrangian

    L(θ, θ̇) = K − U = (m/2) R² θ̇² + mgR cos θ    (13.7.39)
Remember that we need the Lagrangian in order to find the proper momentum conjugate to θ. Of course, with experience one can omit this step and simply write down the conjugate momentum. That momentum is given by:

    p_θ = (∂L/∂θ̇) = mR² θ̇    (13.7.40)
Solving for θ̇ and squaring gives

    θ̇² = p_θ²/(m²R⁴)    (13.7.41)
which now gets inserted into the kinetic energy above to make it a function of the angular momentum and not the angular velocity. We can then write the Hamiltonian:

    H(θ, p_θ) = p_θ²/(2mR²) − mgR cos θ    (13.7.42)
From Hamilton's equations we get:

    θ̇ = (∂H/∂p_θ) = p_θ/(mR²)    (13.7.43)

    ṗ_θ = −(∂H/∂θ) = −mgR sin θ    (13.7.44)
Sadly, just as with our example using the Lagrangian, these equations cannot be solved in terms of simple functions.²⁵ But once again we can simplify by assuming that θ is small. Then the potential energy becomes U = −mgR(1 − θ²/2) and the last equation above is then:

    ṗ_θ = −(∂H/∂θ) = −mgRθ    (13.7.45)
Once again we can assume a solution of the form:

    θ = A sin(ωt + B)    (13.7.46)

with exactly the same results as before.
²⁵ No, I'm not going to mention elliptical integrals again.
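For readers who want to see Equations (13.7.43) and (13.7.44) produce actual motion, the following numerical sketch (plain Python; the constants, the step size, and the symplectic Euler update are our own choices, not part of the text's development) integrates them and compares the result with the small-angle solution (13.7.46):

    import math

    m, R, g = 1.0, 1.0, 9.81        # illustrative values only
    theta, p = 0.1, 0.0             # small angle, released from rest
    dt, steps = 1.0e-3, 2000

    for _ in range(steps):
        # symplectic Euler: update p via (13.7.44), then theta via (13.7.43)
        p -= dt * m * g * R * math.sin(theta)
        theta += dt * p / (m * R * R)

    # small-angle solution for these initial conditions: theta = 0.1 cos(omega t)
    omega = math.sqrt(g / R)
    t = steps * dt
    print(theta, 0.1 * math.cos(omega * t))   # the two agree closely

For a larger initial angle the two printed numbers drift apart, which is just the failure of the small-angle approximation.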
13.8 Appendix: Standard Coordinate Systems
Transformation of coordinates happens so often that it seems a good idea to collect together in one place all needed information on such transformations.²⁶
Three will be dealt with here. These are the transformation from cartesian coordinates x and y to polar coordinates r and θ, and the transformations of the cartesian coordinates x, y, and z to cylindrical coordinates r, θ, and z and to the spherical coordinates r, θ, and φ.
13.8.1 Polar Coordinates
The polar (r, θ) coordinate system is superimposed on a two-dimensional cartesian system. The angle θ is measured as positive as it runs from the positive x-axis toward the positive y-axis.
The transformation equations are:

    x = r cos θ        y = r sin θ    (13.8.1)

    ẋ = ṙ cos θ − r θ̇ sin θ        ẏ = ṙ sin θ + r θ̇ cos θ    (13.8.2)
Then:

    ẋ² = ṙ² cos² θ − 2rṙθ̇ sin θ cos θ + r²θ̇² sin² θ    (13.8.3)

    ẏ² = ṙ² sin² θ + 2rṙθ̇ sin θ cos θ + r²θ̇² cos² θ    (13.8.4)

and

    ẋ² + ẏ² = ṙ² + r²θ̇²    (13.8.5)
The Jacobian can be found from

    (∂x/∂r) = cos θ        (∂x/∂θ) = −r sin θ    (13.8.6)

    (∂y/∂r) = sin θ        (∂y/∂θ) = r cos θ    (13.8.7)

which, when evaluated gives

    J = ∂(x, y)/∂(r, θ) = r    (13.8.8)
This means that the area element transforms as:

    dx dy = r dr dθ    (13.8.9)
13.8.2 Cylindrical Coordinates
Cylindrical coordinates r, θ, z are basically polar coordinates with a z-dimension added.
The transformation equations are:

    x = r cos θ        y = r sin θ        z = z    (13.8.10)
²⁶ If this material is not useful in this course, it almost certainly will be useful somewhere, sometime. As a student I bemoaned the fact that the material included here was often scattered in several places and hard to find.
and the time derivatives are

    ẋ = ṙ cos θ − r θ̇ sin θ        ẏ = ṙ sin θ + r θ̇ cos θ        ż = ż    (13.8.11)

The squares of the time derivatives are essentially identical to those for polar coordinates. The sum of those squares is:

    ẋ² + ẏ² + ż² = ṙ² + r²θ̇² + ż²    (13.8.12)
The Jacobian can be found from:

    (∂x/∂r) = cos θ        (∂x/∂θ) = −r sin θ        (∂x/∂z) = 0    (13.8.13)

    (∂y/∂r) = sin θ        (∂y/∂θ) = r cos θ        (∂y/∂z) = 0    (13.8.14)

    (∂z/∂r) = 0        (∂z/∂θ) = 0        (∂z/∂z) = 1    (13.8.15)

The Jacobian turns out to be J = r, thus

    dx dy dz = r dr dθ dz    (13.8.16)
13.8.3 Spherical Coordinates
The spherical coordinates r, θ, φ are superimposed on a three-dimensional cartesian coordinate system. The angle φ is measured as positive starting from the positive x-axis, increasing as it moves toward the positive y-axis and beyond.²⁷ It can range from 0 to 2π. The angle θ is measured from the positive z-axis down to the negative z-axis and measures the angle made with the positive z-axis. It ranges from 0 to π.
The transformation equations are:

    x = r sin θ cos φ        y = r sin θ sin φ        z = r cos θ    (13.8.17)

The time derivatives are a bit tedious but it is good to have them written out

    ẋ = ṙ sin θ cos φ + rθ̇ cos θ cos φ − rφ̇ sin θ sin φ
    ẏ = ṙ sin θ sin φ + rθ̇ cos θ sin φ + rφ̇ sin θ cos φ
    ż = ṙ cos θ − rθ̇ sin θ
    (13.8.18)
Squaring these is exceptionally tedious and very prone to error, but when done correctly many of the terms either combine or cancel. The result of summing the squares is:

    ẋ² + ẏ² + ż² = ṙ² + r²θ̇² + r²φ̇² sin² θ    (13.8.19)
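Because the squaring really is so error-prone, it is worth knowing that a computer algebra system will do it in a few lines. A sketch (Python with the sympy library; purely our own check, not part of the text):

    import sympy as sp

    t = sp.symbols('t')
    r, th, ph = (sp.Function(s)(t) for s in ('r', 'theta', 'phi'))

    x = r*sp.sin(th)*sp.cos(ph)
    y = r*sp.sin(th)*sp.sin(ph)
    z = r*sp.cos(th)

    # sum of the squared time derivatives; should reduce to (13.8.19)
    speed2 = sp.simplify(sp.diff(x, t)**2 + sp.diff(y, t)**2 + sp.diff(z, t)**2)
    print(speed2)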
The Jacobian can be found from:

    (∂x/∂r) = sin θ cos φ        (∂x/∂θ) = r cos θ cos φ        (∂x/∂φ) = −r sin θ sin φ    (13.8.20)

    (∂y/∂r) = sin θ sin φ        (∂y/∂θ) = r cos θ sin φ        (∂y/∂φ) = r sin θ cos φ    (13.8.21)

    (∂z/∂r) = cos θ        (∂z/∂θ) = −r sin θ        (∂z/∂φ) = 0    (13.8.22)
²⁷ Sadly, the angle φ here is the same as the angle θ in polar and cylindrical coordinates but has a different name. This is admittedly confusing.
so that J = r² sin θ.
The volume element transformation turns out then to be:

    dx dy dz = r² sin θ dr dθ dφ    (13.8.23)
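The Jacobians quoted in this appendix can be verified the same way. A minimal sketch (Python with sympy, again our own check) for the spherical case:

    import sympy as sp

    r, th, ph = sp.symbols('r theta phi', positive=True)
    X = sp.Matrix([r*sp.sin(th)*sp.cos(ph),
                   r*sp.sin(th)*sp.sin(ph),
                   r*sp.cos(th)])

    # determinant of the 3x3 matrix of partial derivatives
    J = X.jacobian([r, th, ph]).det()
    print(sp.simplify(J))     # r**2*sin(theta), as used in Equation (13.8.23)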
14. Classical Statistical Mechanics
14.1 Introduction
As we've said, the aim of statistical mechanics is to be able to derive all the mechanical properties of a macroscopic molecular system from the laws of molecular dynamics.
To do this statistical mechanics must therefore be concerned with both the microscopic and macroscopic representations of a system.
Systems come in two varieties, static and dynamic. Dynamic systems are those that have at least some macroscopic properties that change in time.¹ Although they can be treated by the methods of statistical mechanics, we shall not consider dynamic systems in any detail in this text.
On the other hand, static systems have macroscopic properties that are constant and do not change with time.²
Such properties are often called equilibrium or thermodynamic properties. The branch of statistical mechanics that deals with such systems is sometimes known as statistical thermodynamics.
The computation of equilibrium macroscopic properties from the microscopic properties seems very
complex.
In the microscopic view we have a set of initial conditions, either 3N positions and 3N velocities
(for a Lagrangian calculation) or 3N momenta and 3N coordinates (for a Hamiltonian calculation).
Either way there are 6N initial conditions.
We then use these initial conditions to integrate the equations of motion and find out how the position and momentum (or velocity) variables change in time.
If we use a Hamiltonian view and construct a phase space of 6N coordinates, the state of the system
at any time will be represented by a point. With time that point will move in the phase space.
The track it traces out is called its trajectory. In general the trajectory is an endless line continuing
onward as time increases without ever returning to any point in the phase space visited before. Thus
the trajectory in general is not closed except in some very simple (or very strange) systems where
it can be.
An example of a system with a closed trajectory is a system of N non-interacting identical pendula. If they all start with the same initial position and the same initial momenta, they will return to those positions and momenta at the same time, over and over again. Of course that's a trick system.³
What does happen is that a non-repeating trajectory comes back over and over again to the immediate neighborhood of any point that it once passed through.
¹ The reason for the imprecision is that a dynamic system may have some static properties as well as dynamic ones. The volume of such a system may, for example, be fixed while its pressure changes.
² One must be careful here. To be exactly correct, we'd have to observe a system forever in order to be sure that it was really static. Even graduate students do not have that much time. As a practical matter it is only necessary that the properties being observed not change in a noticeable way over a time period many times longer than the period of observation.
³ However, if the pendula are coupled in any way, the problem becomes both complex and very interesting, and the trajectories are not closed. It is, in fact, a simple model of a crystalline solid and will be discussed later.
So any small region around a point on the trajectory becomes in time densely packed with trajectories that almost repeat.⁴
In any event, on a microscopic level we are faced with 6N variables. Even assuming that we can
know how each of these change in time (and we will assume that), we still have the problem of how
to extract macroscopic thermodynamic information from this data.
We know that on a macroscopic level very few variables are needed to specify the system. That is, if we were to construct two macroscopically identical systems we'd have to make exactly F + 1 thermodynamic variables the same.
Here F is the number of independent intensive variables the system has.⁵ We have to add a single extensive variable to this in order to fix the size of the system.⁶ The quantity F is given by the Gibbs Phase Rule:

    F = C − P + 2    (14.1.1)
where here C is the number of independent chemical components and P is the number of distinct phases in the system. Thus a system consisting of a gas of argon atoms at a reasonable temperature and pressure requires three variables for its macroscopic specification, for example pressure, temperature, and volume. It requires 6N for its microscopic specification. There's a bit of a discrepancy there...
How do we make this transition from 6N to three? The insight we need is that the three thermodynamic variables are averages over an enormous number of microscopic variables. Hence the name of the field: statistical mechanics. The question is, how do we do this averaging? The key lies in the trajectories.
14.2 Liouville's Equation
We've already discussed classical mechanics and introduced both Lagrangian and Hamiltonian formulations of mechanics. And we've introduced phase space. We need now to return to those ideas in order to move ahead.
Consider a classical system containing N particles. We'll let q stand for the 3N generalized coordinates and p for the 3N generalized momenta.
At any given instant the exact state of this system is given by a single point in a 6N-dimensional
phase space.
The motion of the particles in this system is governed by the 6N first-order differential equations:

    q̇_j = (∂H/∂p_j)        ṗ_j = −(∂H/∂q_j)    (14.2.1)
so that, in principle, if the 6N initial conditions

    q = q_o        p = p_o    (14.2.2)
are known, the motion of each particle in the system is then known for all time. In phase space the point representing the state of the system moves in time as dictated by Hamilton's equations, tracing out a trajectory that can never intersect itself. This last is obvious since if a given point is the point of intersection, there can be, by Equations (14.2.1), only one next point.
⁴ The problem of repeating trajectories has a long and interesting history and a voluminous literature. It is known as the Poincaré Recurrence Theorem.
⁵ An intensive variable is one whose numerical value is not changed by dividing the system in two. Temperature and pressure are examples.
⁶ An extensive variable is one whose numerical value is cut in half when the size of the system is cut in half.
Since a crossing implies that there are two next points, the one originally taken and the one to be taken now, crossings must be impossible.
By the same token, two different trajectories also cannot intersect, since, with the systems identical, such an intersection also implies two next points.
In addition to not crossing itself, the trajectory of the system may or may not be closed. If it is closed, then the points lying on the trajectory will be revisited over and over again. The logic of this also depends on Equation (14.2.1).
If the trajectory is not closed, as is usually the case, then the system never repeats its state, but moves forever without ever exactly repeating itself. It can be shown that under these conditions the system will return to the neighborhood of a previous point on the trajectory infinitely often. How long this takes depends on the size of the neighborhood.⁷
Now let's consider an ensemble consisting of M macroscopically identical copies of our original system. These systems are independent and do not interact in any way. Each of course has identical macroscopic values of (say) N, V, and E. And each will have drastically different sets of microscopic variables. This is equivalent to saying that to each macrostate of a system there correspond a large number of microstates. To make this plausible,⁸ note that two macroscopically identical systems of gas molecules having identical Maxwell-Boltzmann velocity distributions will have the same macroscopic properties. Thus we can change the properties of individual molecules how we will as long as we keep the velocity distribution unchanged.
We will represent each of these systems as a point in our 6N-dimensional phase space. The ensemble will then be a cloud of points in phase space.
A very useful quantity is ρ(p, q, t), the density of ensemble members in a region of phase space of size dΓ = dq_1 . . . dq_3N dp_1 . . . dp_3N. This quantity will in general vary with position and momentum and perhaps change in time.
Often ρ is normalized so that its integral over the whole phase space is the total number of ensemble members M. Thus:

    ∫_Γ ρ(q, p, t) dq dp = M    (14.2.3)

where Γ indicates that the integration is to be taken over the entire phase space. Here ρ is a number density of ensemble members in phase space.
Sometimes ρ is also normalized to 1. In that case it is the fraction of ensemble members in a given volume element, or, what is the same thing, the probability that a randomly chosen system is in a given volume element. This is really a probability and to avoid confusion with Equation (14.2.3) we will define P(q, p, t) as:

    ∫_Γ P(q, p, t) dq dp = 1    (14.2.4)

This is often useful since ρ depends on M, which is a purely arbitrary (but large) number. The two are simply related: ρ = M P.
We're interested in the phase space density of states because with it we can calculate the value of any mechanical property.⁹ If R is a mechanical property, then its expected value is:

    ⟨R⟩ = ∫_Γ R(p, q) ρ(q, p, t) dq dp / ∫_Γ ρ(q, p, t) dq dp    (14.2.5)
⁷ This idea is embodied in the Poincaré Recurrence Theorem.
⁸ We will demonstrate it later when we talk about quantum systems because the demonstration is trivial there.
⁹ A mechanical property is one that can be computed from a knowledge of the coordinates and momenta of a system.
where clearly here we've used ρ as the density of ensemble members.
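Equation (14.2.5) is also the working recipe of a computer simulation: if we hold a large sample of ensemble members, the ratio of integrals is estimated by a plain average of R over the sample. A schematic sketch (Python with numpy; the one-dimensional oscillator, the Gaussian density, and the property R are all invented here purely for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    M = 100_000                      # number of ensemble members

    # invented ensemble: oscillator states with q, p drawn from a density
    # proportional to exp(-H), where H = p**2/2 + q**2/2 (unit mass and force constant)
    q = rng.normal(size=M)
    p = rng.normal(size=M)

    def R(p, q):
        # a mechanical property: the energy H of each member
        return 0.5*p**2 + 0.5*q**2

    # the discrete stand-in for Equation (14.2.5): sum of R over members, divided by M
    print(R(p, q).mean())            # close to 1.0 for this density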
Let us now ask how ρ changes in time as the various trajectories develop in phase space.¹⁰ First, we consider the rate of change of ρ with time at any point p, q in phase space. This point is described by a multidimensional hypercube with one corner at p, q and a diagonally opposite corner at p + dp, q + dq. The volume of this hypercube is dΓ = dq_1 . . . dq_3N dp_1 . . . dp_3N.
The number of phase points inside this volume at any time t is then given by:

    δN = ρ dq_1 . . . dq_3N dp_1 . . . dp_3N = ρ dΓ    (14.2.6)
This number will change in time because the number of phase points entering the volume element in unit time through one face will not necessarily be the same as the number leaving through the opposite face in that time element.
To be specific let us consider the two faces perpendicular to the q_1 axis, one located at q_1, the other at q_1 + dq_1.
Now in time dt, all systems outside the face at q_1 and moving toward it with speed q̇_1 = dq_1/dt will cross the boundary and enter the volume element. The number of such systems is:

    ρ q̇_1 dq_2 . . . dq_3N dp_1 . . . dp_3N    (14.2.7)
The same argument can be applied to the phase points leaving the volume element through the face at q_1 + dq_1. At that face ρ is slightly different, being at a slightly different location. We can develop that difference in a power series using an obvious notation:
    ρ(q_1 + dq_1) = ρ(q_1) + (∂ρ/∂q_1) dq_1 + (1/2!)(∂²ρ/∂q_1²)(dq_1)² + · · ·    (14.2.8)
Because the change from q_1 to q_1 + dq_1 was an infinitesimal one anyway, we can safely neglect terms in (dq_1)² and higher. Similarly q̇_1 is also slightly different and can be written (in the same obvious notation) as:
    q̇_1(q_1 + dq_1) = q̇_1(q_1) + (∂q̇_1/∂q_1) dq_1 + (1/2!)(∂²q̇_1/∂q_1²)(dq_1)² + · · ·    (14.2.9)
Then the equivalent of Equation (14.2.7) for the face at q_1 + dq_1 is:

    [ρ + (∂ρ/∂q_1) dq_1][q̇_1 + (∂q̇_1/∂q_1) dq_1] dq_2 . . . dq_3N dp_1 . . . dp_3N    (14.2.10)
Let us multiply Equation (14.2.10) out, dropping terms involving products of differentials. We then get:

    [ρ q̇_1 + ρ(∂q̇_1/∂q_1) dq_1 + q̇_1(∂ρ/∂q_1) dq_1] dq_2 . . . dq_3N dp_1 . . . dp_3N    (14.2.11)
The change in number of ensemble members due to ingress and egress through these two sides is found by subtracting Equation (14.2.11) from Equation (14.2.7) to get:

    −[ρ(∂q̇_1/∂q_1) + q̇_1(∂ρ/∂q_1)] dq_1 . . . dq_3N dp_1 . . . dp_3N    (14.2.12)

where dq_1 has been factored out of the square brackets.
¹⁰ Much of the following is taken from Richard C. Tolman, The Principles of Statistical Mechanics, Dover Publications, 1979, being a reprint of the 1938 edition published by Oxford.
Exactly the same arguments can be applied to the walls at p_1 and p_1 + dp_1. This will, in the end, give:

    −{ρ[(∂q̇_1/∂q_1) + (∂ṗ_1/∂p_1)] + [(∂ρ/∂q_1) q̇_1 + (∂ρ/∂p_1) ṗ_1]} dq dp    (14.2.13)
If we sum over all coordinate and momentum directions we get the change d(δN)/dt in number of ensemble members in the little volume element per unit time:

    d(δN)/dt = −Σ_{i=1}^{3N} {ρ[(∂q̇_i/∂q_i) + (∂ṗ_i/∂p_i)] + [(∂ρ/∂q_i) q̇_i + (∂ρ/∂p_i) ṗ_i]} dq dp    (14.2.14)
This simplifies a lot since we know that

    q̇_i = (∂H/∂p_i)    and    ṗ_i = −(∂H/∂q_i)    ((13.7.11), (13.7.12))
then, since the order of differentiation doesn't matter:

    (∂q̇_i/∂q_i) = ∂²H/∂q_i∂p_i = ∂²H/∂p_i∂q_i = −(∂ṗ_i/∂p_i)    (14.2.15)

and so

    [(∂q̇_i/∂q_i) + (∂ṗ_i/∂p_i)] = 0    (14.2.16)
With this simplification Equation (14.2.14) becomes:

    d(δN)/dt = −Σ_{i=1}^{3N} [(∂ρ/∂q_i) q̇_i + (∂ρ/∂p_i) ṗ_i] dq dp    (14.2.17)
Now we divide by the phase space volume element dq dp. Doing so will convert δN, the number of ensemble members in the volume element, to δN/dq dp, the number per unit volume. This is, of course, ρ. Since there is still a division by dt, what we have is:

    (∂ρ/∂t)_{p,q} = −Σ_{i=1}^{3N} [(∂ρ/∂q_i) q̇_i + (∂ρ/∂p_i) ṗ_i]    (14.2.18)
The meaning of the partial derivative on the left in Equation (14.2.18) is that it is the change in density in the volume element at a specific point p, q fixed in phase space.
Equation (14.2.18) is known as Liouville's Theorem¹¹ and applies to all incompressible fluids. It is of particular importance in the foundations of statistical mechanics.
If we now substitute the definitions of q̇_i and ṗ_i in terms of the Hamiltonian into Liouville's Theorem and move the sum to the left-hand side we get:

    (∂ρ/∂t)_{p,q} + Σ_{i=1}^{3N} [(∂H/∂p_i)(∂ρ/∂q_i) − (∂H/∂q_i)(∂ρ/∂p_i)] = 0    (14.2.19)
In classical mechanics the second term in Equation (14.2.19) is called a Poisson bracket and is written

    Σ_{i=1}^{3N} [(∂H/∂p_i)(∂ρ/∂q_i) − (∂H/∂q_i)(∂ρ/∂p_i)] = [ρ, H]    (14.2.20)
¹¹ Liouville, Journ. de Math. 3, 349 (1838)
The notation looks like the commutator in quantum mechanics. In fact it is the classical analog of the quantum mechanical commutator and plays a similar role.¹² Using this notation we can write Liouville's Equation in its more customary form:

    (∂ρ/∂t)_{p,q} + [ρ, H] = 0    (14.2.21)
The Liouville equation contains all of Hamiltonian mechanics in it. Indeed, it is not too much to
say that it is the fundamental equation of classical statistical mechanics. And it is especially useful
in time-dependent situations.
14.2.1 Incompressible Flow in Phase Space
The Liouville equation also allows some interesting and useful deductions. The most important of these is the following:
If we form the total derivative of ρ(p, q, t) we get:

    dρ/dt = (∂ρ/∂t) + Σ_{i=1}^{3N} [(∂ρ/∂q_i) q̇_i + (∂ρ/∂p_i) ṗ_i] = 0    (14.2.22)
we see that it must equal zero since the right hand side is identically zero by Equation (14.2.18).
The quantity dρ/dt is the rate at which the density of ensemble points in our volume element changes as the volume element moves in time. The fact that it is zero means that the density in that volume element does not change.
This is another way of saying that the fluid composed of phase space points is incompressible.
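This incompressibility can be watched directly in a simple case. For a one-dimensional harmonic oscillator with unit mass and force constant, the exact Hamiltonian flow is a rigid rotation of the (q, p) plane, so a square patch of phase points stays a square of the same area. The sketch below (plain Python; the patch and the times are arbitrary choices of ours) evolves the corners exactly and recomputes the area with the shoelace formula:

    import math

    def evolve(q, p, t):
        # exact flow of H = p**2/2 + q**2/2: rotation by angle t in phase space
        return (q*math.cos(t) + p*math.sin(t),
                -q*math.sin(t) + p*math.cos(t))

    def area(pts):
        # shoelace formula for the area of a polygon given its vertices in order
        n = len(pts)
        s = sum(pts[i][0]*pts[(i+1) % n][1] - pts[(i+1) % n][0]*pts[i][1]
                for i in range(n))
        return abs(s) / 2.0

    patch = [(1.0, 0.0), (1.1, 0.0), (1.1, 0.1), (1.0, 0.1)]
    for t in (0.0, 0.5, 2.0, 10.0):
        print(t, area([evolve(q, p, t) for (q, p) in patch]))   # always 0.01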
14.2.2 Conservation of Extension in Phase
Let us return to Equation (14.2.6)

    δN = ρ dq_1 . . . dq_3N dp_1 . . . dp_3N = ρ dΓ    ((14.2.6))
which is the number of ensemble members inside the volume element dΓ.
Let's turn this around and for a moment regard a volume element dΓ to be defined by the phase points inside of it.¹³ As time goes by these points move in slightly different directions. Thus the volume element changes shape with time. However its density remains ρ. Since trajectories cannot suddenly start at any arbitrary time, or disappear either, no new system points can suddenly appear inside or vanish from this volume element. And since Liouville's Theorem applies to this volume element even if its shape changes, no new trajectories enter or leave this volume element either.
Thus, taking the time derivative of Equation (14.2.6) we must have:

    d(δN)/dt = (dρ/dt) dΓ + ρ d(dΓ)/dt = 0    (14.2.23)

Of course the term dρ/dt is zero as shown by Equation (14.2.22). Thus d(dΓ)/dt must be zero as well and we have:

    d(dΓ)/dt = d/dt ∫···∫ dq_1 . . . dp_3N = 0    (14.2.24)
¹² It would be more correct to call the quantum mechanical commutator the analog of the Poisson bracket, mainly because the Poisson bracket was developed first and the quantum mechanical entity is a direct analog of it.
¹³ For the vividly minded among you, imagine that these points are painted red.
This result means that the size of our volume element does not change in time. So while our original region can become distorted, indeed quite contorted, its volume does not change.¹⁴ Gibbs called this the conservation of extension in phase.
14.3 The Virial Theorem
The Virial Theorem is not, strictly speaking, part of Hamiltonian mechanics. Nevertheless this is a reasonable place to discuss it, even though we won't use it for a while.¹⁵
Newton's Second Law is:

    F = ṗ = ma    (14.3.1)

We¹⁶ consider the quantity

    G = Σ_i p_i · r_i    (14.3.2)
where the subscript i denotes a particle in a system of many particles and the dot indicates the vector dot product. The total derivative of G with respect to time is:¹⁷

    dG/dt = Σ_i ṙ_i · p_i + Σ_i ṗ_i · r_i    (14.3.3)
The first term in Equation (14.3.3) is:

    Σ_i ṙ_i · p_i = Σ_i m_i ṙ_i · ṙ_i = Σ_i m_i s_i² = 2K    (14.3.4)
where s is the scalar speed of the particle and K is the kinetic energy. The second term can be rearranged using Equation (14.3.1):

    Σ_i ṗ_i · r_i = Σ_i F_i · r_i    (14.3.5)
so that the expression for dG/dt (Equation (14.3.3)) can be written as:

    d/dt Σ_i p_i · r_i = 2K + Σ_i F_i · r_i    (14.3.6)
It is instructive to look at the long time average of this. We get that by integrating over a time interval from 0 to t and then dividing by t:

    (1/t) ∫₀ᵗ (dG/dt) dt = ⟨dG/dt⟩ = 2⟨K⟩ + ⟨Σ_i F_i · r_i⟩    (14.3.7)
or

    2⟨K⟩ + ⟨Σ_i F_i · r_i⟩ = (1/t)[G(t) − G(0)]    (14.3.8)
where the right hand side comes from the fact that dG is an exact differential (see Equation (14.3.2)) and hence its integral depends only on the initial and final values.
Now Equation (14.3.8) is of no use without a bit of insight.¹⁸
¹⁴ Again, for the vividly minded, our little region defined by the red dots changes shape but not volume.
¹⁵ After all, we are not developing a text involving all knowledge. We pick and choose what to discuss primarily on the basis of what will be needed later and fit the chosen material in where we can. This looked like a good spot...
¹⁶ I'm following H. Goldstein, Classical Mechanics, Addison-Wesley, 1953 in this.
¹⁷ The reader may be wondering where we are going with all this. For now just follow along. The road is neither long nor tedious.
¹⁸ Isn't that so often the case in life.
If there is an upper bound to G, then [G(t) − G(0)] is less than some quantity B and the right hand side of Equation (14.3.8) will be less than or equal to B/t. Since we can take t to be as large as we like, B/t can be made as small as we like. In the limit as t → ∞, the right hand side then becomes zero. In that case we have:

    ⟨K⟩ = −(1/2) ⟨Σ_i F_i · r_i⟩    (14.3.9)

This is known as the Virial Theorem and the right-hand side is the virial of Clausius.
But is G in fact bounded? Since G involves both r and p we can consider these separately.
For any system confined to a box of finite dimensions, r is clearly limited to the maximum dimension of the box.
For any system with a finite energy E, p is clearly limited since even if one particle has all the energy in the system, its momentum p = (2mE)^{1/2}, which is bounded.
What if we deal with systems with no fixed energy? In such systems the probability of finding a particle with energy E falls off as exp(−E/kT). This causes the long-time integral to converge to a finite value.¹⁹ So G is in fact bounded and Equation (14.3.9) is correct.
¹⁹ I know that it would be best if I'd prove that right now, but we've not yet developed the machinery.
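A quick numerical illustration (our own, not part of the text's development) uses a single particle in a one-dimensional harmonic well, where F = −kx, so Equation (14.3.9) predicts ⟨K⟩ = (1/2)k⟨x²⟩. Time-averaging along the exact trajectory x(t) = A cos t (unit mass and force constant):

    import math

    m = k = 1.0                # so omega = sqrt(k/m) = 1
    A = 2.0                    # amplitude of the exact trajectory x = A*cos(t)
    N, T = 100_000, 50.0       # sample count and total averaging time

    K_avg = Fx_avg = 0.0
    for i in range(N):
        t = (i + 0.5) * T / N
        x = A*math.cos(t)
        v = -A*math.sin(t)
        K_avg += 0.5*m*v*v / N        # running average of the kinetic energy
        Fx_avg += (-k*x)*x / N        # running average of F*x, the virial

    # Equation (14.3.9): <K> = -(1/2) <sum F.r>; both sides come out near A**2/4
    print(K_avg, -0.5*Fx_avg)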
15. The Classical Microcanonical Ensemble
15.1 The Specification of a System
We've already discussed the fact that a macroscopic system requires that only F + 1 variables be specified, where F is the number of intensive variables given by the Gibbs Phase Rule (Equation (14.1.1)). The extra variable must be extensive, of course.¹
The specification is usually done by placing constraints on the walls of the system. For instance we can give a system a specified volume by having it surrounded by fixed rigid walls. And we can specify a number of particles in a system by placing that number of particles inside walls that are impenetrable to particles.
Fixing the energy is a bit more complex. This requires that the walls be both adiabatic and rigid. That is, that they do not allow heat to pass into or out of the system and they do not allow the system to do any work.
It isn't hard to see how to control volume, number, and energy all at the same time. All these are extensive variables.
Control of intensive variables is more complex. We can control temperature by using walls that conduct heat if we also place our system into a constant temperature bath. That does not interfere with controlling volume or number of particles, but a system cannot have both its energy and its temperature controlled in this way. Similarly pressure can be controlled by using movable walls and placing the system into a constant pressure bath.² But clearly we cannot control pressure and volume simultaneously.
And we can control the chemical potential of a given component by using a wall made of a semi-permeable membrane that allows passage of that component. And we also must place the system in a bath containing the specified component at a concentration such that it has the desired chemical potential.
In fact we can in principle allow several or all components to have a fixed chemical potential by suitable choice of membrane and bath.³ But it is clear that we cannot also control the number of particles of that component.
In other words thermodynamic variables come in conjugate pairs: p and V, N and μ, and E and T. Only one of each pair of variables can be specified. The other will then be what it is in the system.
There are times when certain types of systems are not possible. For example chemists call a system with fixed energy, volume, and number a closed system. But such a system may allow electromagnetic fields to penetrate. Thus a battery driven radio transmitter inside such a system could still lose energy to the outside. Such a system is not closed in spite of having what might seem as appropriate walls.
¹ Or you'd have a system of undefined extent, whose thermodynamic properties would be equally difficult to define.
² Think of a system contained in a balloon in the atmosphere. The pressure is constrained to remain atmospheric.
³ As a practical matter not all combinations of materials may be simultaneously controllable, mainly due to limitation on our technology rather than any theoretical reason for the limitation.
While we can shield against electrical and magnetic fields, we cannot shield against gravitational ones. Thus if gravitational effects are taken into account, there can be no closed systems.
15.2 Problems
The classical approach to statistical mechanics suffers from one major problem. It attempts to apply Newton's Laws, which are essentially macroscopic laws, to microscopic situations.
In many cases this works out well. But in others, it fails completely.
These failures were discovered by early researchers. But because quantum mechanics was still in the future, they did not really grasp the implications of these failures.
The first of these failures had to do with specification of position and momentum. Classically it was known that the probability of a particle being exactly at a given point q is zero. This is easy to see. For example consider a line segment running from q = 0 to q = 1 that contains one particle. If the probability of finding a particle exactly at q were finite, then the probability that the particle was someplace in the given interval would be that constant times the number of points in the interval, which is infinite. This is not a satisfactory result.
The classical fix for this was to assign a finite probability not to a point but to a range of points, usually given as running from q to q + dq. In doing this dq is thought of as being a very small but finite range. The same thing applies to the momentum p.
Classically though the chance of finding that particle with a position uncertainty dq and a momentum uncertainty dp could be made as small as desired as long as it remained non-zero.
So while in reality we can't talk about a system being at a point in phase space, we can talk about it being inside a volume element of arbitrarily small dimensions.
On the other hand we know that systems on a microscopic level are really governed by quantum mechanics. And so we also know that dq dp is in fact given by the Heisenberg Uncertainty Principle:

    dq dp ≥ ℏ/2    (15.2.1)

and so clearly the joint uncertainty dq dp cannot be made as small as desired. However, in classical mechanics in principle the volume elements in phase space could be made as small as desired.
Today, because of quantum mechanics, we know we cannot talk about a system being at a point in phase space at all. The system at best is a blob in phase space. And the blob has dimensions of roughly h^{3N}, where N is the number of particles in the three-dimensional system.
This is not just a difference in approach. It turns up, as we shall soon see, in the division of phase space into microstates. A microstate is a small hypervolume element in phase space supposedly large enough to contain one or more trajectories (Boltzmann) or system points (Gibbs). Unfortunately, the physical results obtained from a classical statistical mechanical calculation often depend on the size of this hypervolume element and there is no way, short of experiment, to determine the size of it.
Experimental measurements showed that nature required a hypervolume for a microstate that was approximately h^{3N}, where N was the number of particles in the system and h turns out to be Planck's Constant. That the hypervolume was exactly h^{3N} was first postulated by O. Sackur in 1911 and verified by comparison of theory to experiment a year later by Tetrode⁴ using data on gaseous mercury.⁵
⁴ Tetrode has three syllables.
Knowing the Uncertainty Principle, as we do, this is not a surprise to us.
To avoid the sort of complication and handwaving endemic to early presentations of classical statistical mechanics, we will choose the dimensions of our microstates to be h^{3N} right from the start.⁶
A second problem is of the same nature. When dealing with systems composed of identical molecules, it was classically assumed that these molecules were distinguishable. That is, in principle one could model each molecule as a small shape and, with a very very small pen, write a number on that shape. Thus the molecules could be distinguished. For instance, given three molecules on a line, it was known that there were six different ways of arranging them:

    (1, 2, 3) (1, 3, 2) (3, 1, 2) (2, 1, 3) (2, 3, 1) (3, 2, 1)

But in fact you can't write numbers on molecules. So the six different configurations⁷ are actually only one! In classical statistical mechanics one has to correct for this manually by inserting an N! in the appropriate place when dealing with systems of indistinguishable molecules.
The third problem is in a way more benign. It is due to the quantization of energy levels. Classical statistical mechanics works well at high temperatures but fails badly as the temperature drops. For example we know that C_V, the constant volume heat capacity, should go to zero as the temperature goes to zero. The classical C_V does not go to zero.
The cure for this is to use classical formulas properly. That is, at sufficiently high temperatures. Beyond that there is nothing that can be done.⁸
15.3 The Microcanonical Ensemble
Let us specify a system by fixing its energy, volume, and number of particles. We now duplicate that system so as to create an ensemble of M such systems. This ensemble is known as the microcanonical ensemble.
How are the system points of this ensemble distributed in phase space?
First, we have to recognize that in a mathematical sense we cannot specify the energy with infinite precision. The chance of our producing a system with, for example, an energy of exactly 105 1/3 kilojoules is zero.⁹ The best we can do is to produce a total energy H such that:

    E ≤ H(p, q) ≤ E + dE    (15.3.1)
where H is the Hamiltonian (the total energy) and dE is a small (infinitesimal) amount of energy.
⁵ The information on Sackur and Tetrode was taken from R.K. Pathria, Statistical Mechanics, Pergamon, 1972, page 43.
⁶ Most modern texts, if they mention classical statistical mechanics at all, simply assume that the volume of a microstate is h^{3N} without any discussion. Early writers had a great deal of difficulty with this volume.
⁷ Actually 3!. The number of configurations of n particles is, of course, n! as the reader doubtless knows.
⁸ It should be noted that the famous example of this is black body radiation. The appropriate description was quantized as was shown by Planck in 1900. But it took until Einstein's treatment of the photoelectric effect in 1905 (for which he received the Nobel Prize in 1921) for its implications to even start to affect classical statistical mechanics.
⁹ The same argument applies to the volume as well. However the number of particles is a different matter as even before quantum mechanics was discovered it was recognized that matter is quantized into small units called atoms. Thus while we can't produce exactly any decimal number of particles, we can produce a definite integer number of particles.
This means that all the system points of our ensemble are located in phase space on a thin surface shell of thickness dE and constant energy E. The probability P of finding a system in a given microstate is then:¹⁰

    P(p, q) = constant    if E ≤ H(p, q) ≤ E + dE
            = 0           otherwise
    (15.3.2)
This needs to be justified. It is clear from the definition of the microcanonical ensemble that the energy E is fixed. Thus the conditions on the right of Equation (15.3.2) are correct. But why do we make the probability of finding a system in a given microstate a constant?
The answer is, once again, the Principle of Democratic Ignorance, which was discussed back in Chapter 2, Section 2.3.
If R(p, q) is a mechanical property of these systems, and if the ensemble is large enough to sample all parts of the energy surface in phase space, then the value of R that we would expect to observe, ⟨R⟩, is given by:

    ⟨R⟩ = ∫∫_Γ R(p, q) P(p, q) dq dp    (15.3.3)
The expected value of R, ⟨R⟩, is not the only possible average value of R that we might measure. But if the distribution of values of R is sharply peaked around ⟨R⟩, all the different measures of the average such as the most probable value,¹¹ the median,¹² etc., will have the same numerical value.
One measure of the sharpness of the peak is the relative standard deviation:

    (⟨R²⟩ − ⟨R⟩²)/⟨R⟩² << 1    (15.3.4)
As long as Equation (15.3.4) holds, the actual measure used for the average does not matter; they will all be the same. When Equation (15.3.4) does not hold, things need to be investigated more closely.¹³
The fundamental quantity that gives us the connection between the microscopic world and macroscopic thermodynamics in the microcanonical ensemble is the number of microstates in the thin energy shell where P is non-zero.
We will denote this number by Ω(N, V, E), a dimensionless quantity. The connection to thermodynamics is simply this:

    S(N, V, E) = k ln Ω(N, V, E)    (15.3.5)

where S is the macroscopic entropy and k is a constant.¹⁴
Why would we think that the logarithm of Ω would have anything at all to do with the entropy? Right now, Equation (15.3.5) is an assumption.¹⁵ The primary justification for it now is that it works. One area in which it works is that it allows S to be an extensive property of a system. If, for example, a system is made up of two parts whose entropies are S_1 and S_2, then the entropy of the entire system is S = S_1 + S_2.
¹⁰ In this and the following material I am using the approach of Kerson Huang, Statistical Mechanics, Second Edition, Wiley, 1987.
¹¹ The value of R that occurs most often in the ensemble.
¹² The value of R that is the middle value of all the observed values of R.
¹³ Examples of situations where Equation (15.3.4) does not hold include phase transitions and systems either with a high boundary to volume ratio or a small number of particles (or both).
¹⁴ It will turn out that k is Boltzmann's Constant, as we shall shortly see.
¹⁵ It does not have to be an assumption. It will be shown to be true. But we can't do that right now.
To demonstrate this, we will consider a system divided into two independent subsystems.
The microcanonical ensemble corresponding to the first subsystem will be assumed to have N_1 particles, a volume V_1 and an energy E_1 lying between E_1 and E_1 + dE_1. The ensemble for the second will have N_2 particles, a volume V_2 and an energy E_2 lying between E_2 and E_2 + dE_2.
Then:

    S_1(N_1, E_1, V_1) = k ln Ω(E_1)    and    S_2(N_2, E_2, V_2) = k ln Ω(E_2)    (15.3.6)
Now the composite system made up of the two subsystems will have an Ω given by:

    Ω(E_1 + E_2) = Ω(E_1) Ω(E_2)    (15.3.7)
The Ω's occur as a product because the subsystems are independent. For instance if your right hand is an independent subsystem that can be in one of three microstates and your left hand is an independent subsystem that can be in one of four microstates, then the two together can be in a total of 3 × 4 = 12 different microstates.
Given this then

    S = S_1 + S_2 = k ln Ω(E_1) + k ln Ω(E_2) = k ln[Ω(E_1) Ω(E_2)]    (15.3.8)
which was to be proved.
To compute S we must compute Ω. The computation depends on the sort of system for which we do the computation. Here we will do this for the case of an ideal gas.
We make the gas ideal by assuming that there is no potential energy in the Hamiltonian for the system. Thus the Hamiltonian for this system in cartesian coordinates is:
    H(p, q) = Σ_i p_i²/2m    (15.3.9)
where m is the mass of the ideal gas molecules.
The Hamiltonian is the total energy of the system and is a constant in the microcanonical ensemble. That means that all the system points (and indeed all the trajectories) lie on a (6N − 1)-dimensional surface in the 6N-dimensional phase space for these systems. This surface has the specified energy E and a thickness of dE.
This surface has a hyperarea which we will denote by A(E).
One way to compute Ω is to compute the hyperarea A(E) and then divide it by the hyperarea of a single microstate, v:

    Ω = A(E)/v    (15.3.10)

Note that there is no way in classical mechanics to unambiguously decide on a value for v. But as discussed in Section 15.2, we will use the value v = h^{3N} as the hyperarea of a microstate since we know from quantum mechanics that this is correct.
What we need now is the hyperarea of the (6N − 1)-dimensional shell. That's given by:

    A(E) = ∫∫_{E ≤ H(p,q) ≤ E+dE} dq dp    (15.3.11)

where H(p, q) is the Hamiltonian for our systems. The limits on the integration reflect the parts of phase space where P(q, p) is non-zero.
The fact that the sums of the squares of the momenta (divided by m) must add up to a constant suggests that a change to spherical coordinates might simplify this problem. Then we could take the sums of the squares of the momenta as the square of the radius of a hypersphere. The hyperarea is simply related to the hypervolume of the sphere in the same way that the area of a sphere is related to the volume of the sphere.
So we find the hyperarea by first calculating the volume Σ of a hypersphere of radius E^{1/2} given by:¹⁶

    Σ(E) = ∫∫_{H ≤ E} dq dp    (15.3.12)
Because the systems are ideal gases, there is no potential energy and the Hamiltonian is independent of the coordinates q. Then the integral Equation (15.3.12) becomes the product of two integrals, one over the positions q and the other over the momenta p.
The integral over the positions is trivial. We've assumed that the systems have a fixed volume, so the integration over dx dy dz for any particle must give the volume V. And since there are N such particles, the result is V^N.
We now have simply:

    Σ(E) = V^N ∫∫_{H ≤ E} dp    (15.3.13)
Now we do the integrations over the momenta. Here we really want to go into spherical coordinates, so we first change variables, letting y_i = p_i/(2m)^{1/2}, so that dp_1 . . . dp_3N = (2m)^{3N/2} dy_1 . . . dy_3N. The momentum integral is then

    I = (2m)^{3N/2} ∫∫_{H ≤ E} dy    (15.3.14)
Now from the definition of y we can easily see that

    Σ_{i=1}^{3N} y_i² = E    (15.3.15)

so what we really have in Equation (15.3.14) is the contents of a hypersphere of radius R where

    R² = E = Σ_{i=1}^{3N} y_i²    (15.3.16)
The volume of such an n-dimensional hypersphere is (see Section 15.7):

    V_n = [π^{n/2}/(n/2)!] Rⁿ    (15.3.17)
so that in our case of 3N dimensions

    I = [(2πm)^{3N/2}/(3N/2)!] R^{3N}    (15.3.18)
and thus:

    Σ(R) = V^N [(2πm)^{3N/2}/(3N/2)!] R^{3N}    (15.3.19)
¹⁶ Note that this works for spheres, but not for all multidimensional figures such as hypercubes. See Appendix 15.7 for a discussion.
Converting back to energy, since R = E^{1/2} (see Equation (15.3.16)) yields:

    Σ(E) = V^N (2πmE)^{3N/2}/(3N/2)!    (15.3.20)
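In passing, note that in Equation (15.3.17) the factorial of a half-integer is to be read as a gamma function, (n/2)! = Γ(n/2 + 1). A tiny sketch (plain Python, our own check, not part of the derivation) confirms the formula against the familiar low-dimensional cases:

    import math

    def hypersphere_volume(n, R=1.0):
        # V_n = pi**(n/2) / (n/2)! * R**n, with (n/2)! read as Gamma(n/2 + 1)
        return math.pi**(n / 2) / math.gamma(n / 2 + 1) * R**n

    print(hypersphere_volume(2))   # pi       = area of the unit circle
    print(hypersphere_volume(3))   # 4*pi/3   = volume of the unit sphere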
We've now done the hard part. To get the hyperarea of the shell we need only differentiate Equation (15.3.20) with respect to E:

    A(E) = (3N/2) V^N (2πm) (2πmE)^{3N/2−1}/(3N/2)!    (15.3.21)
Of course we don't want A(E), we want Ω, which is A/h^{3N}:

    Ω = V^N (3N/2)(2πm) (2πmE/h²)^{3N/2−1}/(3N/2)! = V^N (2πm) (2πmE/h²)^{3N/2−1}/(3N/2 − 1)!    (15.3.22)
Since S = k ln Ω, we get:

    S(N, V, E)/k = N ln V + ln(2πm) + (3N/2 − 1) ln(2πmE/h²) − ln(3N/2 − 1)!    (15.3.23)
which can be simplified greatly. First, we recall that N is a macroscopic number of particles, something of the order of 10²³. Compared to that, 1 is so small as to be ignorable. So we will ignore it. Further, ln(2πm) is a modest negative number (the mass is the mass of one molecule in kilograms, of order 10⁻²⁶). Again, this is ignorable compared to 10²³. So after all that we have for the entropy:
for the entropy:
S(N, V, E)/k = N lnV +
3N
2
ln
_
2mE
h
2
_
ln(3N/2)! (15.3.24)
Using Stirling's Approximation (see Section 2.8) for the factorial and doing a bit of algebra gives us:

    S(N, V, E) = Nk ln[V (4πmE/3Nh²)^{3/2}] + 3Nk/2    (15.3.25)
This should be our final answer, but if we look closely, there's a problem. The entropy S is extensive. So if we double the size of the system (for instance) we double N, E, and V. Certainly if we double N and E, the entropy is doubled, since the factor of two cancels out inside the parentheses with E/N, leaving just the doubled N in both parts of Equation (15.3.25).
But the doubling of V causes the first term in Equation (15.3.25) to pick up an extra additive Nk ln 2 rather than simply doubling. This can't be right.
We discussed this as the second problem in Section 15.2. We need to include an N factorial in the denominator in Equation (15.3.22). This adds a term −N ln N + N to Equation (15.3.25), giving us:¹⁷

    S(N, V, E) = Nk ln[(V/N)(4πmE/3Nh²)^{3/2}] + 5Nk/2 = Nk ln[(V/N)(4πmE/3Nh²)^{3/2} e^{5/2}]    (15.3.26)

either of which is now our final answer.
¹⁷ Of course this sort of ad hockery is unsatisfactory. We will do all of this in a much more convenient manner when we talk about semi-classical statistical mechanics.
The reader should note (by going through the math) that our final answer, Equation (15.3.26), could have been obtained from Equation (15.3.20) instead of Equation (15.3.21).
How can this be? Equation (15.3.20) is for the volume of a 3N-dimensional hypersphere while Equation (15.3.21) is for the surface area of that hypersphere. Surely there is far more space in the volume than in the surface?
The answer is that yes, there is more space in the volume than in the surface. However the relative increase is negligible. Each minor increase in the radius R of the hypersphere increases the number of microstates involved by such a huge number that the difference between the volume and the surface just doesn't matter.
In fact we've demonstrated that in the terms we neglected in going from Equation (15.3.23) to Equation (15.3.24). Those neglected terms are the difference and they are vanishingly small.
15.4 The Thermodynamics of the Microcanonical Ensemble
Since the entropy in thermodynamics is a natural function of N, V, and E, we should be able to derive all of thermodynamics from an expression for S in those independent variables.
In fact we have:

    dS = (1/T) dE + (p/T) dV − (μ/T) dN    (15.4.1)
where we recognize that in thermodynamics the energy E is almost always known as the internal energy¹⁸ U. To avoid confusion with the potential energy we will use E for the internal energy. We then have:
    (∂S/∂E)_{N,V} = 1/T        (∂S/∂V)_{N,E} = p/T        (∂S/∂N)_{V,E} = −μ/T    (15.4.2)
We compute the temperature from Equation (15.3.26) and find that:

    (∂S/∂E)_{V,N} = 3Nk/2E = 1/T    (15.4.3)
which not only tells us that the temperature in our ideal gas system can be computed from

    T = 2E/3Nk    (15.4.4)

but that the energy is given by the familiar equation

    E = (3/2) NkT    (15.4.5)
which serves, as promised, to identify k as Boltzmann's constant.
It then comes as no surprise that

    (∂S/∂V)_{N,E} = Nk/V = p/T    (15.4.6)
¹⁸ The internal energy is called that because it does not include any energy of the system measured outside the system. Thus any kinetic energy due to the system hurtling through space is not part of E.
resulting in the fairly familiar¹⁹

    pV = NkT    (15.4.7)

Finally, the determination of the chemical potential from the appropriate one of Equations (15.4.2) is fiendishly left as an exercise for the reader.
Example 15.1:
As an exercise let's calculate²⁰ the entropy of exactly one mole of argon at a temperature of 298.15 K and a pressure of exactly 1 bar using Equation (15.3.26).
These are not the independent variables of the microcanonical ensemble, but given the formulas above it is not hard to generate those.
We use the following constants: Avogadro's number = 6.0221367 × 10²³ per mole, Boltzmann's constant = 1.380658 × 10⁻²³ Joules/K, and Planck's constant = 6.626075 × 10⁻³⁴ Joule-sec.
From Equation (15.4.5) we have an energy of 3718.4511 Joules/mol and from Equation (15.4.7) we have a volume of 0.0247897 meters³.
The atomic mass of argon is 0.039948 kg/mole and, for reference, its entropy under these conditions is 154.8 J/mol-K.
The calculation itself is best taken in stages.

    (4πmE/3Nh²)^{3/2} = [(12.566371 × 6.6335224 × 10⁻²⁶ × 3718.4511) / (1.8066410 × 10²⁴ × 4.3904877 × 10⁻⁶⁷)]^{3/2}
                      = [3.0996816 × 10⁻²¹ / 7.9320351 × 10⁻⁴³]^{3/2}
                      = [3.9078011 × 10²¹]^{3/2}
                      = 2.4428606 × 10³²    (15.4.8)
    V/N = 4.1164293 × 10⁻²⁶    (15.4.9)

    e^{5/2} = 12.182494    (15.4.10)
We now put the bits together:

    S = 8.3145112 × ln(4.1164293 × 10⁻²⁶ × 2.4428606 × 10³² × 12.182494) = 154.84 J/mol-K    (15.4.11)
This compares to the experimental value of 154.8. It is exactly this sort of calculation that allowed Sackur and Tetrode to conclude²¹ that the volume of a microstate was, in fact, h^{3N}.
^19 If this result is not familiar, the reader is in deep trouble and should switch to reading popular fiction immediately!
^20 The alert reader will note that the calculation below carries far more significant figures than necessary. In the old days folks carried only the minimum number of significant digits because calculations were done using paper and pencil, three significant digit slide rules, or five place logarithm tables. The result was that accumulated round-off errors often cost the calculation its last digit. Today it is trivial to keep all significant digits (up to the capacity of your calculator) and to do the rounding off to the correct number of significant digits only once, at the very end of the calculation. That way the result is good to 1 in the last place.
^21 See Section 15.2
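Such a calculation is also trivially scripted. A minimal Python version of Example 15.1, using the constants quoted above, might look like this (a sketch, not a prescription):

    import math

    # Constants as quoted in Example 15.1
    NA = 6.0221367e23        # Avogadro's number, 1/mol
    k  = 1.380658e-23        # Boltzmann's constant, J/K
    h  = 6.626075e-34        # Planck's constant, J s

    T, p = 298.15, 1.0e5     # temperature (K) and pressure (1 bar, in Pa)
    N = NA                   # exactly one mole
    m = 0.039948 / NA        # mass of one argon atom, kg

    E = 1.5 * N * k * T      # Eq. (15.4.5)
    V = N * k * T / p        # Eq. (15.4.7)

    # Eq. (15.3.26): S = Nk ln[ (V/N) (4 pi m E / 3 N h^2)^(3/2) e^(5/2) ]
    term = (4.0 * math.pi * m * E / (3.0 * N * h * h)) ** 1.5
    S = N * k * math.log((V / N) * term * math.exp(2.5))
    print(S)                 # ~154.84 J/mol-K, as in Eq. (15.4.11)

In the spirit of footnote 20, rounding happens only once, at the printed result.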
15.5 The Number of Microstates
Einstein, in 1905, realized that Equation (15.3.5) on page 196 can be inverted:

    \Omega = e^{S/k}    (15.5.1)

to give the number of microstates that a system can occupy. For instance, given the molar entropy of argon at 298.15 K and 1 bar pressure as 154.8 J/mol-K, we find that

    \Omega = e^{154.8/1.3807\times 10^{-23}} = e^{1.12\times 10^{25}} = 10^{4.86\times 10^{24}}    (15.5.2)

which is a 1 followed by about 8 moles of zeros and then a decimal point.
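A number of this size overflows any floating point representation, so on a computer one works with S/k itself rather than with Omega:

    import math

    S = 154.8                 # molar entropy of argon, J/mol-K
    k = 1.380658e-23          # Boltzmann's constant, J/K

    # Omega = exp(S/k) cannot be formed directly; keep the exponent instead
    exponent = S / k                        # ~1.12e25, as in Eq. (15.5.2)
    log10_omega = exponent / math.log(10)   # so Omega ~ 10^(4.86...e24)
    print(exponent, log10_omega)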
15.6 Discussion
We've solved the microcanonical ensemble for the case of an ideal gas. In doing so we did two integrals, one over the momenta p, the other over the coordinates q.

The coordinate integration was trivial and the integration over the momenta a bit of a mess. However, unless the potential energy depends on the momenta, the momentum integral Equation (15.3.13) on page 198 is always exactly the same. We've done it once and we never have to do it again. Its value is:

    I_p = \frac{(2\pi mE/h^2)^{3N/2}}{(3N/2)!}    (15.6.1)

The integration over coordinates is, in general, not at all simple. It was simple in the ideal gas case because there was no potential energy. If there is a potential energy, life gets very complex.

First, the potential energy is almost never a simple function of the positions q. In general it depends on the distance between particles. This makes it a function of q_i - q_j, and often a messy one at that.

So where we had an integral over a hyperspherical shell in phase space for the ideal gas situation, now we have an integral over a very complex shell in hyperspace, at least over the position coordinates. This is a serious difficulty and in general cannot be done analytically. For this reason the microcanonical ensemble is not often used for hand calculations.
15.7 Appendix: Volume of an n-Dimensional Hypersphere
The formula for the volume of an n-dimensional hypersphere isn't obvious. But it can be derived in various ways.

We will use the symbol V_n for the volume of an n-dimensional hypersphere and the symbol A_n for the surface area of the same hypersphere.

The results are clearly going to be proportional to the radius R to the nth power. Thus we can write:

    V_n = C_n R^n \qquad A_n = \frac{dV_n}{dR} = n C_n R^{n-1}    (15.7.1)

where C_n is a numerical constant independent of R.^22 The quantity C_n can be evaluated by a trick.
Consider the integral:

    \int_{-\infty}^{\infty} e^{-y^2}\,dy = \pi^{1/2}    (15.7.2)

which is a standard definite integral. Let I be

    I = \int_{-\infty}^{\infty}\!\cdots\!\int_{-\infty}^{\infty} e^{-y_1^2}\, e^{-y_2^2} \cdots e^{-y_n^2}\; dy_1\, dy_2 \ldots dy_n    (15.7.3)

which is Equation (15.7.2) repeated n times:

    \left[\int_{-\infty}^{\infty} e^{-y^2}\,dy\right]^n = \pi^{n/2}    (15.7.4)
Now I can also be written:

    I = \int_{-\infty}^{\infty}\!\cdots\!\int_{-\infty}^{\infty} e^{-(y_1^2 + y_2^2 + \cdots + y_n^2)}\; dy_1\, dy_2 \ldots dy_n    (15.7.5)

Here's the trick: If now I let

    R^2 = y_1^2 + y_2^2 + \cdots + y_n^2    (15.7.6)
then Equation (15.7.5) becomes

    I = \int_0^{\infty} e^{-R^2}\, dV_n = \int_0^{\infty} e^{-R^2} A_n\, dR = n C_n \int_0^{\infty} e^{-R^2} R^{n-1}\, dR    (15.7.7)

where we've made use of the fact that dy_1 \ldots dy_n = dV_n and the second relation of Equation (15.7.1).

This integral is in fact a standard integral. To see that we change variables letting t = R^2 so that dR = (t^{-1/2}/2)\,dt. Then:
    I = \frac{n}{2} C_n \int_0^{\infty} e^{-t}\, t^{n/2 - 1}\, dt = \frac{n}{2} C_n \left(\frac{n}{2} - 1\right)! = \left(\frac{n}{2}\right)!\, C_n = \pi^{n/2}    (15.7.8)

where the first integral is the gamma function^23 and the last equality comes from Equation (15.7.4).
^22 Equation (15.7.1) is true for hyperspheres. It is not necessarily true for other multidimensional figures. For example the area of a cube of side R is not dV/dR, which would be 3R^2, but 6R^2 instead. Indeed, the area of an N-dimensional hypercube is given by 2\,dV/dR.
^23 The gamma function is discussed in section 2.8 on page 23.
Thus we have the results:

    C_n = \frac{\pi^{n/2}}{(n/2)!}    (15.7.9)

    V_n = \frac{\pi^{n/2}}{(n/2)!}\, R^n    (15.7.10)

    A_n = \frac{n\,\pi^{n/2}}{(n/2)!}\, R^{n-1}    (15.7.11)

Factorials of non-integers are not usually seen in elementary work. Here when n is odd, we will have half-integer factorials. These work like this example:

    (5/2)! = (5/2)(3/2)! = (5/2)(3/2)(1/2)! = (5/2)(3/2)(\pi^{1/2}/2) = \frac{15\,\pi^{1/2}}{8}    (15.7.12)

because (1/2)! is \pi^{1/2}/2.
We can see how some of this works out. From these formulas we readily find that:

    V_2 = \pi R^2             A_2 = 2\pi R
    V_3 = \frac{4}{3}\pi R^3  A_3 = 4\pi R^2
    V_4 = \frac{\pi^2}{2} R^4 A_4 = 2\pi^2 R^3
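These formulas translate directly into a few lines of Python, with math.gamma supplying the half-integer factorials via (n/2)! = Gamma(n/2 + 1):

    import math

    def hypersphere_volume(n, R):
        # Eq. (15.7.10): V_n = pi^(n/2) / (n/2)! * R^n, with (n/2)! = Gamma(n/2 + 1)
        return math.pi ** (n / 2) / math.gamma(n / 2 + 1) * R ** n

    def hypersphere_area(n, R):
        # Eq. (15.7.1): A_n = dV_n/dR = n * V_n / R
        return n * hypersphere_volume(n, R) / R

    for n in (2, 3, 4):
        print(n, hypersphere_volume(n, 1.0), hypersphere_area(n, 1.0))
    # volumes: pi, 4*pi/3, pi^2/2; areas: 2*pi, 4*pi, 2*pi^2, as tabulated above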
16. The van der Waals Gas

16.1 Introduction
It is now time to consider systems that look somewhat like real gases. We will begin with a toy system, that of a van der Waals gas, and we will derive van der Waals equation using an approximate and incorrect derivation that works only because of the fortuitous cancellation of errors.

The reason for doing this is to give an insight into the analytical techniques that can be used to study real gases and liquids. There is a limit as to how far these techniques can be pushed but even so they provide great insight into the physical processes that take place in such systems.

Today these systems are mainly studied via computer simulation, either by following the trajectory in phase space of a representative system or by looking at the phase space distribution of representative systems. The former involves integrating the equations of motion; the latter uses techniques commonly called Monte Carlo methods.
The van der Waals equation is:^1

    \left(p + \frac{N^2 a}{V^2}\right)\left(V - Nb\right) = NkT    (16.1.1)

where a and b are constants special to each different species of gas. This equation was not derived from any fundamental principle, but was instead proposed as a corrected ideal gas equation. The constant b was to correct for the hard cores of molecules (which were assumed to be spherical) so that the corresponding ideal volume was V - Nb. The constant a corrects for attractive forces. They were assumed to be proportional to the number density in the gas N/V and to the number of surrounding molecules, also proportional to N/V. Since the attractions reduced the ability of a molecule to move and hence exert pressure, the corrected pressure was assumed to be p + (N/V)^2 a.
Van der Waals equation can be developed into a virial expansion

    p = \frac{NkT}{V - Nb} - \frac{N^2 a}{V^2}
      = \frac{NkT}{V(1 - Nb/V)} - \frac{N^2 a}{V^2}
      = \frac{NkT}{V}\left[1 + \frac{Nb}{V} + \left(\frac{Nb}{V}\right)^2 + \cdots\right] - \frac{N^2 a}{V^2}    (16.1.2)

which results in

    pV = NkT\left[1 + \frac{N}{V}\left(b - \frac{a}{kT}\right) + \left(\frac{N}{V}\right)^2 b^2 + \cdots\right]    (16.1.3)

The quantity

    B_2 = b - \frac{a}{kT}    (16.1.4)

is known as the second virial coefficient, and

    B_3 = b^2    (16.1.5)

is the third virial coefficient, and so on.
^1 J.D. van der Waals, Sr., Doctoral Dissertation, University of Leiden, 1873.
16.2 The Approximate Derivation
We shall assume that we have N identical spherical^2 molecules in a volume V. We will ignore any possible internal degrees of freedom so in effect we are dealing with a gas of atoms at temperatures too low to produce any electronic excitations.

There are 3N cartesian coordinates to be considered and the same number of momentum coordinates. The Hamiltonian for the system is:

    H(\vec p, \vec q) = \sum_{j=1}^{N} \frac{p_j^2}{2m} + U(\vec q)    (16.2.1)
and the classical canonical partition function is then:

    Q(N,V,T) = \frac{1}{h^{3N} N!} \int \exp\left[-\beta\left(\sum_{j=1}^{N} \frac{p_j^2}{2m} + U(\vec q)\right)\right] d\vec p\, d\vec q    (16.2.2)

The integral over the momenta can be done as usual and gives:

    Q(N,V,T) = \frac{1}{N!}\left(\frac{2\pi m k T}{h^2}\right)^{3N/2} Z(N,V,T)    (16.2.3)

where Z is the configuration integral

    Z(N,V,T) = \int e^{-\beta U(\vec q)}\, d\vec q    (16.2.4)
We will make certain assumptions about the potential energy U(\vec q). First we'll assume that it can be written as a sum of terms each depending only on the distance between a pair of molecules, say molecules i and j. This is usually called the Assumption of Pairwise Additivity because under this assumption if three molecules are close to each other, the potential is assumed to be the sum of three pair potentials with no contribution from three-body forces.

Second we'll assume that this pair potential is zero for large separations between a pair of molecules, becomes attractive as the distance between them becomes small, goes through a single minimum at some distance and then increases very rapidly as the molecules get even closer. We will also assume that the potential function goes to zero, as the separation increases, faster than the inverse third power of the distance.^3

The assumption of pairwise additivity means that the potential energy can be written as a sum over the N(N-1)/2 different pairs that can exist in a system of N molecules. Thus

    U(\vec q) = \sum_{j=1}^{N-1} \sum_{i>j}^{N} u(r_{i,j})    (16.2.5)

where u(r_{i,j}) is the potential energy of interaction between molecules i and j.
The integrand of the configuration integral is then a product. The coordinates of two molecules occur in each of the N(N-1)/2 terms and the coordinates of any particular molecule occur in N-1 different terms of the product. So the configuration integral is not writable as a product of integrals as is the case with the integration over the momenta in Equation (16.2.2). The best we can do is write:

    e^{-\beta U(\vec q)} = \prod_{N \ge i > j \ge 1} e^{-\beta u(r_{ij})}    (16.2.6)
^2 This assumption is implicit in the development of van der Waals equation.
^3 The reasons for this will be discussed below.
When any particular r_{ij} is large, the corresponding u(r_{ij}) goes to zero but \exp[-\beta u(r_{ij})] then goes to 1. This is inconvenient since it would be much more useful to deal with something that goes to zero. To make this happen we'll switch to Mayer f functions, a useful device for this sort of thing. The Mayer f function f_{ij} is defined by:

    f_{ij} = e^{-\beta u(r_{ij})} - 1    (16.2.7)

which has the behavior we want. Further, since

    e^{-\beta u(r_{ij})} = 1 + f_{ij}

Equation (16.2.6) on the preceding page can now be written as:

    e^{-\beta U(\vec q)} = \prod_{N \ge i > j \ge 1} (1 + f_{ij})    (16.2.8)
We can now expand the product into sums of terms:

    e^{-\beta U(\vec q)} = 1 + \sum_{N \ge i > j \ge 1} f_{ij} + \sum_{ij}\sum_{kl} f_{ij} f_{kl} + \cdots    (16.2.9)

Now it is time to make the unjustifiable assumption that was talked about above. That assumption is that it is a good approximation to keep only the first two terms in the equation above.

With this assumption the configuration integral becomes:

    Z(N,V,T) = \int\!\cdots\!\int \left[1 + \sum_{N \ge i > j \ge 1} f_{ij}\right] d\vec q_1 \cdots d\vec q_N    (16.2.10)
The integral over 1 is trivial. It leads to a factor of V (volume) for every three coordinates or a factor of V^N overall. If there were no potential energy, that is if the f_{ij} were identically 0, we would be left only with V^N and we would have recovered the ideal gas law.

But our f_{ij} are not identically zero. But we do know that all of the f_{ij} are identical in form since the molecules are identical. So what we have is N(N-1)/2 identical integrals:

    \int f_{ij}\, d\vec q_1 \cdots d\vec q_N

Integration over all coordinates but i and j is trivial. We get a factor of V for each of these. There are N - 2 such molecules, so what we have now is

    V^{N-2} \int f_{ij}\, d\vec q_i\, d\vec q_j
We are left with a pair of molecules. Let us (at least mentally) switch to the coordinates of the center of mass of the pair and to spherical coordinates for the separation and orientation of the pair of molecules.

The center of mass coordinates integrate to another factor of V. And assuming, as we have, that there is no angle dependence in the potential, integration over the relative angular coordinates gives a factor of 4\pi. So if r is the internuclear distance, the integral has become one over that distance and:

    \int f_{ij}\, d\vec q_i\, d\vec q_j = V \int_0^{\infty} 4\pi f(r)\, r^2\, dr    (16.2.11)

This integral will converge as long as f(r) goes to zero at large r faster than 1/r^3, as stipulated above.
Given the restriction on the potential,^4 this integral will have some finite value that we can call J. Equation (16.2.11) on the previous page will then be^5

    \int f_{ij}\, d\vec q_i\, d\vec q_j = V \int_0^{\infty} 4\pi f(r)\, r^2\, dr = VJ    (16.2.12)

There are, as we know, N(N-1)/2 such terms and since N is very large this is essentially N^2/2. The configuration integral is then:

    Z(N,V,T) = V^N\left[1 + \frac{1}{2}\frac{N^2 J}{V}\right]    (16.2.13)

If the volume per molecule v = V/N is introduced, this becomes

    Z(N,V,T) = N^N v^N \left[1 + \frac{1}{2}\frac{NJ}{v}\right]    (16.2.14)
Since we are interested in the thermodynamic properties of this gas, we will need the full partition function. What we have overall is:

    Q(N,V,T) = \frac{1}{N!}\left(\frac{2\pi m k T}{h^2}\right)^{3N/2} N^N v^N \left[1 + \frac{1}{2}\frac{NJ}{v}\right]    (16.2.15)

The pressure is given by

    P = kT\left(\frac{\partial \ln Q}{\partial V}\right)_{N,T}

and when one takes the log of Equation (16.2.15) and does the differentiation, one gets:

    P = \frac{kT}{v}\left(1 - \frac{J}{2v}\right)    (16.2.16)
which looks like van der Waals equation, but isn't quite yet.

What we want is for J = 2(\beta a - b), where a and b are the van der Waals constants. The question is: is there any actual potential that can give J in this form?

The answer is yes. Mayer and Mayer demonstrated one such potential.^6 The potential is:

    u(r) = \begin{cases} \infty & 0 \le r \le r_o \\ -u_o\,(r_o/r)^m & r_o \le r \end{cases}    (16.2.17)
where m is a positive integer greater than 3. With this choice the integral we have to do (see Equation (16.2.11) on the previous page) is:

    \int_0^{\infty} 4\pi f(r)\, r^2\, dr = -\int_0^{r_o} 4\pi r^2\, dr + \int_{r_o}^{\infty} 4\pi \left[e^{\beta u_o (r_o/r)^m} - 1\right] r^2\, dr    (16.2.18)

The first integral on the right is that of a hard sphere and we can use it to define b:

    b = \frac{1}{2}\int_0^{r_o} 4\pi r^2\, dr = \frac{2\pi}{3}\, r_o^3 = 4 v_o    (16.2.19)
^4 To be technically accurate, the potential must also have a finite minimum value at r = 0 as well as going to zero faster than 1/r^3 as r goes to infinity. The condition at r = 0 is necessary because if the minimum were minus infinity (as would happen, for instance, with a pure gravitational potential) all the molecules would end up together at one point.
^5 This assumes that the molecule is far enough from a wall for the wall not to interfere. Since this will be true for all but a vanishingly small fraction of the molecules, it is a reasonable assumption.
^6 Mayer and Mayer, Statistical Mechanics, John Wiley and Sons, 1940, page 267.
where v_o is the volume of a sphere with a radius of r_o/2, i.e.:

    v_o = \frac{4\pi}{3}\left(\frac{r_o}{2}\right)^3 = \frac{\pi}{6}\, r_o^3

In the second integral we can expand the exponential in a power series and keep only the first term:

    e^{-\beta u(r)} \approx 1 - \beta u(r) = 1 + \beta u_o \frac{r_o^m}{r^m}    (16.2.20)
and then we can use the second integral to define a:

    a = \frac{1}{2}\, 4\pi u_o r_o^m \int_{r_o}^{\infty} r^{-(m-2)}\, dr = \frac{2\pi}{m-3}\, u_o r_o^3 = \frac{12}{m-3}\, u_o v_o    (16.2.21)

which not only gives us a value of a

    a = \frac{12}{m-3}\, u_o v_o    (16.2.22)

but lets us see exactly why the potential has to fall off to zero with r faster than 1/r^3.

The derivation is, of course, flawed. Two errors were made that luckily cancel out (though we've not proven that). A correct, but more complex derivation is possible and will be given.
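The quality of the linearization in Equation (16.2.20) is easy to probe numerically. A rough Python sketch (the parameter values here are merely illustrative, not taken from the text) integrates the Mayer f function for the potential of Equation (16.2.17) by a crude midpoint rule and compares the result with J = 2(\beta a - b):

    import math

    # Illustrative parameters (assumed values, not from the text)
    r_o, u_o, m = 3.4e-10, 1.0e-21, 6     # core radius (m), well depth (J), exponent
    kT = 4.0e-21                          # chosen so beta*u_o = 0.25, small enough to linearize
    beta = 1.0 / kT

    def f(r):
        # Mayer f function for the potential of Eq. (16.2.17)
        if r <= r_o:
            return -1.0                   # hard core: exp(-beta*infinity) - 1
        return math.exp(beta * u_o * (r_o / r) ** m) - 1.0

    # J = integral of 4*pi*f(r)*r^2 dr, Eq. (16.2.12), by a midpoint rule
    n, R_max = 200000, 50 * r_o
    h = R_max / n
    J = sum(4 * math.pi * f((i + 0.5) * h) * ((i + 0.5) * h) ** 2 * h for i in range(n))

    b = (2 * math.pi / 3) * r_o ** 3                  # Eq. (16.2.19)
    a = (2 * math.pi / (m - 3)) * u_o * r_o ** 3      # Eq. (16.2.21)
    print(J, 2 * (beta * a - b))                      # nearly equal when beta*u_o is small

Raising u_o (or lowering the temperature) makes the two numbers drift apart, which is exactly the failure of the expansion in Equation (16.2.20).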
17. Real Gases

17.1 Introduction
Attempts to produce an equation of state suitable for real gases have been over the years both continuous and unavailing. No such universal equation has ever been demonstrated, though all sorts of more or less approximate (and useful) equations have been developed.

The problem lies in the two-phase region below the critical temperature as seen in a p-V diagram for a typical real gas. In this region liquid and gas coexist and any isotherm in this region is flat. However, once this region is left, say beyond the point where liquid disappears and only gas remains, the isotherm is nowhere flat.

Thus, while the isotherm itself is continuous, its slope is not, having a finite discontinuity at the boundary of the two-phase region.

It is not really possible to represent this sort of behavior using finite numbers of elementary functions such as powers, trigonometric functions, or logs and exponentials. None of these have a finite region in which their slope is zero. Indeed, none of these have any points at which they show a finite discontinuity either.

Thus one is forced to the conclusion that no equation of state can exist for real gases that contains a finite number of elementary functions alone as components. Thus modern attempts to produce useful equations of state for real gases focus on providing realistic behavior only in special regions such as around the critical point or at high pressures.

There is a way out of this dilemma. What if one considers an infinite number of terms? Perhaps that would at least provide better approximations over a wider range of temperatures and volumes. Infinite series can, for example, approximate behavior such as a function with a flat region. One only need think of the expansion of a square wave in a Fourier series.
The idea is simple. Let

    p = F(N, V, T)    (17.1.1)

be the actual (unknown) equation of state for a real gas. Then we can expand this in a power series in the number density \rho = N/V of the gas:

    \beta p = \rho + B_2 \rho^2 + B_3 \rho^3 + \cdots    (17.1.2)

where the B's are known as virial coefficients.^1 These virial coefficients are generally functions of the temperature but not the volume. If the density is allowed to go to zero, what is recovered is:

    \frac{p}{kT} = \rho = \frac{N}{V}    (17.1.3)

which is the ideal gas law.^2
^1 This is an expansion in reciprocal powers of the volume. It is also possible to do an expansion in powers of the pressure. The two series are related, as almost any Physical Chemistry textbook will show.
^2 Which shows that the coefficient of the first term in Equation (17.1.2) must be 1. This is the first virial coefficient B_1. Given that it is always 1 for any gas (or that gas would never obey the ideal gas law at low densities), it is always omitted, thus causing virial coefficients to start with the second one; B_1 is silently ignored, a fact often causing much concern to undergraduates.
The virial coefficients B_2, B_3, etc., are really thermodynamic functions evaluated at zero density. This can be seen by looking at a Taylor series development of the compressibility factor Z

    Z = \frac{p}{\rho kT}    (17.1.4)

around \rho = 0:

    Z = 1 + \left(\frac{\partial Z}{\partial \rho}\right)_{\rho=0}\rho + \frac{1}{2!}\left(\frac{\partial^2 Z}{\partial \rho^2}\right)_{\rho=0}\rho^2 + \cdots + \frac{1}{n!}\left(\frac{\partial^n Z}{\partial \rho^n}\right)_{\rho=0}\rho^n + \cdots    (17.1.5)
Since division of Equation (17.1.2) on the previous page by \rho gives Equation (17.1.5), we can see that the virial coefficients are given by:

    B_n = \frac{1}{(n-1)!}\left(\frac{\partial^{n-1} Z}{\partial \rho^{n-1}}\right)_{\rho=0}    (17.1.6)

So as already said, the virial coefficients are in principle calculable from thermodynamics. However this cannot be done without a theory giving us Z as a function of \rho, or experimental data sufficiently accurate to separate out the various derivatives.^3

What we now set out to do is to find a way to compute the virial coefficients from microscopic properties. This will be done in the next sections.
17.2 Virial Coefficients and Configuration Integrals
We will restrict ourselves to single component systems that possess a virial expansion.^4 The discussion below is long, not because the derivation is difficult, but because we must take many side roads to get to our destination. I will try to point out which discussions are side roads and which are not.

The most convenient approach is via the grand canonical partition function. We have:

    \Xi(\mu, V, T) = e^{pV/kT} = \sum_{N=0}^{\infty} Q(N,V,T)\,\lambda^N = 1 + \sum_{N=1}^{\infty} Q(N,V,T)\,\lambda^N    (17.2.1)

where \lambda = \exp(\beta\mu). This equation is a power series in \lambda, which we have called the absolute activity.

This expansion will be crucial in what follows. However, we must take the first side road here. In the material to come it will be useful to use another activity z which satisfies:

    \lim_{\rho \to 0} z = \rho    (17.2.2)

where \rho is the number density N/V. So we take the first side road to find out how to do this. Mentally bookmark Equation (17.2.1); we shall return to it below.
We can define an appropriate z in this way: We have in general

    \ln\Xi = \beta pV = \ln\left[1 + Q(1,V,T)\,\lambda + Q(2,V,T)\,\lambda^2 + \cdots\right]    (17.2.3)

If we let \lambda go to zero this becomes:

    \ln\Xi = \beta pV = \ln\left[1 + Q(1,V,T)\,\lambda\right] \approx Q(1,V,T)\,\lambda    (17.2.4)
^3 Experimental data can be used to find B_2 roughly and to find B_3 approximately in a few cases. See Dymond and Smith, The Virial Coefficients of Gases: A Critical Compilation, Oxford, 1969. Errors in the second virial coefficient are typically of the order of a percent and errors in the third virial coefficient are generally greater.
^4 In the following I generally follow the treatment given by T.L. Hill, Introduction to Statistical Thermodynamics, Dover, New York, 1986, which is a reprint of the second printing of the 1960 edition published by Addison-Wesley.
since \ln(1+x) \approx x for small x. Since

    \bar N = \lambda\left(\frac{\partial \ln\Xi}{\partial \lambda}\right) = Q(1,V,T)\,\lambda = \beta pV    (17.2.5)

then we can take

    z = \frac{Q(1,V,T)\,\lambda}{V}    (17.2.6)

The quantity Q(1,V,T) is the one-particle canonical partition function. It will occur often so we shall abbreviate it as Q_1. As such it contains all of the internal motions of the molecules. If we are dealing with atoms, Q(1,V,T) = V/\Lambda^3 and z = \lambda/\Lambda^3, which clearly goes to zero as \lambda goes to zero. For molecules both rotational and vibrational contributions will be present. For quantum gases Q(1,V,T) will take other forms. But in any event using Q(1,V,T) makes the following theory applicable to many different sorts of gases.
We now return to Equation (17.2.1) on the preceding page. We can convert it into a power series in z by replacing \lambda with its equivalent zV/Q_1 from Equation (17.2.6):

    \Xi = e^{pV/kT} = 1 + \sum_{N=1}^{\infty} \left[\frac{Q(N,V,T)\,V^N}{Q_1^N}\right] z^N    (17.2.7)

If we now define Z_N(V,T) = Z_N to be:

    \frac{Z_N}{N!} = \frac{Q(N,V,T)\,V^N}{Q_1^N}    (17.2.8)

we can rewrite Equation (17.2.1) on the preceding page as:

    \Xi = e^{pV/kT} = 1 + \sum_{N=1}^{\infty} \frac{Z_N}{N!}\, z^N    (17.2.9)

This is always a valid expansion since e^x = 1 + x + x^2/2! + \ldots is an absolutely convergent expansion for any value of x, real or imaginary.
In classical mechanics Z_N is simply the configuration integral, as can be seen by solving Equation (17.2.8) for Q(N,V,T) and replacing Q_1 by V/\Lambda^3. This gives:

    Q(N,V,T) = \frac{1}{N!}\left(\frac{V}{\Lambda^3}\right)^N \frac{Z_N}{V^N}

where the first factors come from an integration of \exp(-\beta H) over the momentum coordinates (which can always be done for a gas) and the configuration integral part from the integration over the spatial coordinates.

Up to this point we have done nothing but some algebra. Equation (17.2.9) is identical to Equation (17.2.1) on the previous page except for changes in variable (z for \lambda) and changes in notation.

What we are now going to do is the tricky part of all of this. We are going to expand the grand partition function in a power series in z.

Sadly, this cannot be done directly as the appropriate series do not converge. What one gets is, however, the correct result due to cancellation of large terms. We did things just this way in the discussion of the van der Waals equation in a previous Chapter. Now we shall do it correctly.

When we have the appropriate power series expansion in z we will then compare the terms in that series with the terms in Equation (17.2.9). Since the terms in Equation (17.2.9) are in principle
known, we can then know the terms in the pV/kT expansion. Those terms will be the virial coefficients while the terms we already have are known through the configuration integral.

Of course we still have to figure out how to integrate the appropriate configuration integrals, but one thing at a time.
We begin by expanding \exp(\beta pV) in a power series. This can always be done since the series for \exp(x) is convergent for all values of x. Then:

    e^{\beta pV} = 1 + (\beta pV) + \frac{1}{2!}(\beta pV)^2 + \cdots + \frac{1}{j!}(\beta pV)^j + \cdots    (17.2.10)

Now I assume that \beta p has a convergent power series expansion given by:

    \beta p = \sum_{j=1}^{\infty} b_j z^j    (17.2.11)

This is a reasonable assumption since p will properly go to zero as z goes to zero. Note that there is no constant term in Equation (17.2.11), a fact that will become important later.
This expansion is then plugged into Equation (17.2.10) to give:

    e^{\beta pV} = 1 + V\sum_{j=1}^{\infty} b_j z^j + \frac{1}{2!}V^2\left[\sum_{j=1}^{\infty} b_j z^j\right]^2 + \frac{1}{3!}V^3\left[\sum_{j=1}^{\infty} b_j z^j\right]^3 + \cdots + \frac{1}{m!}V^m\left[\sum_{j=1}^{\infty} b_j z^j\right]^m + \cdots    (17.2.12)
What we'd like is an expansion in powers of z. This equation is that, but the powers of z are scattered about in many terms. It is not particularly easy to see how those terms are gathered together. It helps to write the equation out in the form:

    e^{\beta pV} = 1 + V\sum_{j=1}^{\infty} b_j z^j + \frac{1}{2!}V^2\left[\sum_{j=1}^{\infty} b_j z^j\right]\left[\sum_{m=1}^{\infty} b_m z^m\right] + \frac{1}{3!}V^3\left[\sum_{j=1}^{\infty} b_j z^j\right]\left[\sum_{m=1}^{\infty} b_m z^m\right]\left[\sum_{s=1}^{\infty} b_s z^s\right] + \cdots    (17.2.13)
and then to multiply out the terms to see what they actually are:

    e^{\beta pV} = 1 + V\left[b_1 z + b_2 z^2 + b_3 z^3 + \cdots\right]
      + \frac{1}{2!}V^2\left[b_1 z + b_2 z^2 + b_3 z^3 + \cdots\right]\left[b_1 z + b_2 z^2 + b_3 z^3 + \cdots\right]
      + \frac{1}{3!}V^3\left[b_1 z + b_2 z^2 + b_3 z^3 + \cdots\right]\left[b_1 z + b_2 z^2 + b_3 z^3 + \cdots\right]\left[b_1 z + b_2 z^2 + b_3 z^3 + \cdots\right]
      + \cdots    (17.2.14)
Finally we collect like powers of z. As often happens this can be done fairly easily for the first few terms since low powers of z occur only in the first few products.

    e^{\beta pV} = 1 + \left[V b_1\right] z + \left[V b_2 + \frac{V^2}{2} b_1^2\right] z^2 + \left[V b_3 + 2\frac{V^2}{2} b_1 b_2 + \frac{V^3}{6} b_1^3\right] z^3 + \cdots    (17.2.15)
It should be noted that this gathering of terms would not be possible if Equation (17.2.11) on the preceding page had a constant term. In that case each term in Equation (17.2.13) on the previous page beyond the first would contain a single power of z. Similarly, all would contain z^2, z^3, etc., and the result in Equation (17.2.15) on the preceding page would not exist.

We now need another digression. Again, mentally bookmark Equation (17.2.15) on the previous page as we will return to it.
It is not easy to see in general what the form of the various terms in Equation (17.2.15) on the preceding page will be. In fact it is very difficult. The coefficient of z^N turns out to be

    \sum_{\{m_j\}} \left[\prod_{j=1}^{N} \frac{(V b_j)^{m_j}}{m_j!}\right]    (17.2.16)

where the sum is done over all sets of integers m_j that satisfy the condition

    \sum_{j=1}^{N} j\, m_j = N    (17.2.17)

There is a rather elegant method for computing these numbers using graphs. These are mathematical structures that can be created for various situations such as this one that enable one to transform a problem in, for example, algebra, to one of counting configurations.^5 We shall not go into it here. Suffice it to say that what we have done is enough to compute the second and third virial coefficients, which is about the practical limit of utility.
We now return to Equation (17.2.15) on the previous page. What we do next is to compare that equation to Equation (17.2.9) on page 212. To see the result let's write out Equation (17.2.9) on page 212

    e^{pV/kT} = 1 + \frac{Z_1}{1!} z + \frac{Z_2}{2!} z^2 + \frac{Z_3}{3!} z^3 + \cdots    (17.2.18)
And then we shall recall that the power series representation of a given function, here \exp(\beta pV), in a given set of variables (here p, V, T, and z) is unique. Thus the right-hand sides of Equations (17.2.18) and (17.2.15) on the previous page must be identical. That gives us:

    V b_1 = \frac{Z_1}{1!} \qquad V b_2 + \frac{V^2}{2} b_1^2 = \frac{Z_2}{2!} \qquad V b_3 + 2\frac{V^2}{2} b_1 b_2 + \frac{V^3}{6} b_1^3 = \frac{Z_3}{3!}    (17.2.19)
We assume that the Z's are known, so we actually want the b's in terms of the Z's. Since we can start at the top of the list and solve for the b's as we go, turning this around isn't hard. We get:

    1!\,V b_1 = Z_1
    2!\,V b_2 = Z_2 - Z_1^2
    3!\,V b_3 = Z_3 - 3 Z_1 Z_2 + 2 Z_1^3    (17.2.20)

These simplify a bit since the configuration integral Z_1 is just V, the volume. That makes b_1 = 1.
^5 The method is known as Mayer Cluster Theory. An introductory explanation of it is given in T.L. Hill, An Introduction to Statistical Thermodynamics, Dover, New York, 1986.
Our last task is to convert all of this to an expansion in powers of the number density \rho instead of z. Then we will have the virial coefficients B_n directly. This isn't hard to do. We recall that

    \bar N = z\left(\frac{\partial \ln\Xi}{\partial z}\right) \qquad \text{or} \qquad \rho = z\left(\frac{\partial (\beta p)}{\partial z}\right)    (17.2.21)

Since

    \beta p = \sum_{j=1}^{\infty} b_j z^j    ((17.2.11))

this results in

    \rho = \sum_{j=1}^{\infty} j\, b_j z^j    (17.2.22)

What we want is

    z = \rho + a_2 \rho^2 + a_3 \rho^3 + \cdots    (17.2.23)
We now substitute Equation (17.2.23) into Equation (17.2.22) and equate coefficients on both sides of the equation. This leads to:

    a_2 = -2 b_2 \qquad a_3 = -3 b_3 - 4 a_2 b_2 = -3 b_3 + 8 b_2^2    (17.2.24)

so that in the end we have for the virial coefficients

    B_2 = -b_2 \qquad B_3 = 4 b_2^2 - 2 b_3 \qquad \text{etc.}    (17.2.25)
Putting this all together, we have the virial coefficients B_n given in terms of the expansion coefficients b_j in Equation (17.2.25). And then we have the expansion coefficients b_j given in terms of the configuration integrals Z_n in Equation (17.2.20) on the previous page. While the configuration integrals are defined in Equation (17.2.8) on page 212, what remains to be done is to be more explicit about the potential energy of a system of interacting particles and to show explicit formulas for evaluating the second and third virial coefficients.^6
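The series inversion and coefficient matching above are easily verified by machine. Here is a sketch using the sympy symbolic algebra package (an assumed tool; any computer algebra system would do as well):

    import sympy as sp

    z, rho = sp.symbols('z rho')
    b2, b3, a2, a3 = sp.symbols('b2 b3 a2 a3')

    # beta*p and rho as series in z with b1 = 1, Eqs. (17.2.11) and (17.2.22)
    beta_p   = z + b2 * z**2 + b3 * z**3
    rho_of_z = z + 2 * b2 * z**2 + 3 * b3 * z**3

    # Invert: substitute z = rho + a2*rho^2 + a3*rho^3 (Eq. 17.2.23) into rho(z)
    z_of_rho = rho + a2 * rho**2 + a3 * rho**3
    residual = sp.expand(rho_of_z.subs(z, z_of_rho)) - rho
    sol = sp.solve([residual.coeff(rho, 2), residual.coeff(rho, 3)], [a2, a3])
    print(sol)                    # {a2: -2*b2, a3: 8*b2**2 - 3*b3}, Eq. (17.2.24)

    # Substitute back into beta*p and read off the virial coefficients
    bp_rho = sp.expand(beta_p.subs(z, z_of_rho).subs(sol))
    print(bp_rho.coeff(rho, 2))   # -b2             -> B2, Eq. (17.2.25)
    print(bp_rho.coeff(rho, 3))   # 4*b2**2 - 2*b3  -> B3, Eq. (17.2.25)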
17.3 Evaluating the Integrals
First we make some assumptions about the potential energies involved in these calculations. We will always assume that the potentials are pairwise additive. That means that all potentials, no matter how many particles are involved, can be expressed in terms of sums of potential energies involving two particles.

Thus if three particles, 1, 2, and 3, are interacting, the potential energy V(\vec q_1, \vec q_2, \vec q_3) is the sum of three potential energies of the form u(\vec q_i - \vec q_j):

    V(\vec q_1, \vec q_2, \vec q_3) = u(\vec q_2 - \vec q_1) + u(\vec q_3 - \vec q_1) + u(\vec q_3 - \vec q_2)    (17.3.1)
The same sort of formula will hold true no matter how many particles are involved in an interaction.

We will also assume that the intermolecular potentials have a hard core. That is, the potential energy never drops to negative infinity. If it did so for some value of the intermolecular distance, all the molecules would pile up together with a total energy of minus infinity.
^6 I've chosen not to go further because higher virial coefficients are not only hard to measure experimentally, they are also hard to compute numerically.
In addition we will assume that the intermolecular potential energy is short ranged. That is, it goes to zero faster than 1/r^3 and does so over a fairly short distance.

Last, we will make an assumption that makes the calculations much easier, but which is not really necessary: that the intermolecular potentials are spherically symmetric.

What this means is that

    V(\vec q_1, \vec q_2, \ldots, \vec q_N) = u(r_{12}) + u(r_{13}) + \ldots + u(r_{1N})
      + u(r_{23}) + u(r_{24}) + \ldots + u(r_{2N})
      + \cdots
      + u(r_{N-2,N}) + u(r_{N-2,N-1})
      + u(r_{N-1,N})    (17.3.2)

or, for short

    V(\vec q_1, \vec q_2, \ldots, \vec q_N) = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} u(r_{ij})    (17.3.3)
There are N(N-1)/2 terms in these last two equations.

The configuration integral Z_1 is easy to calculate. With only one particle there can be no potential energy and the integral simply gives V. So we have:

    Z_1 = \int_V d\vec r_1 = V    (17.3.4)

Now from Equation (17.2.25) on the previous page we have that B_2 = -b_2, and from Equation (17.2.20) on page 214 we have that b_2 = (Z_2 - Z_1^2)/2V. Or, using Z_1 = V, we get

    B_2 = -b_2 = -\frac{Z_2 - Z_1^2}{2V}    (17.3.5)
Since Z_2 is:

    Z_2 = \iint_V e^{-\beta u(r_{12})}\, d\vec r_1\, d\vec r_2    (17.3.6)

I can write Equation (17.3.5) as:

    B_2 = -\frac{Z_2 - Z_1^2}{2V} = -\frac{1}{2V}\iint_V \left[e^{-\beta u(r_{12})} - 1\right] d\vec r_1\, d\vec r_2    (17.3.7)

where the one comes from writing the square of the volume V^2 as:

    V^2 = \iint_V d\vec r_1\, d\vec r_2    (17.3.8)
In Equation (17.3.7) we can change the spatial coordinates from the coordinates of each of the two particles to the coordinates of the center of mass of the two particles and the relative spherical coordinates (r_{12}, \theta, \phi) of the two particles.

Integration over the center of mass coordinates is easy. They do not occur in the potential energy so they contribute another factor of V to the overall integral. As for the relative coordinates, neither of the two orientation angles is present in the potential energy either. They contribute a factor of 4\pi. We are then left with:

    B_2 = -\frac{1}{2}\int_0^{\infty} 4\pi r_{12}^2 \left[e^{-\beta u(r_{12})} - 1\right] dr_{12}    (17.3.9)
For any reasonable interaction potential this integral cannot be done in closed form.^7 However, it is not at all difficult to do either by series expansion or numerical integration. Which method is chosen depends on the actual potential energy u(r_{12}).

But for some very simple potentials the integral can be done.

Example 17.1:
Trivial Example: If there is no potential energy at all, then u(r_{12}) is identically zero, \exp(-\beta u(r_{12})) is then 1 and B_2 simply becomes 0. To the approximation of just the second virial coefficient^8 the ideal gas law is then obeyed.
We can also rather simply do this example:

Example 17.2:
Hard Sphere Gas: Here the potential is

    u(r) = \begin{cases} \infty & 0 < r \le \sigma \\ 0 & \sigma < r < \infty \end{cases}

where \sigma is the distance of closest approach. B_2 is then:

    B_2 = -2\pi\int_0^{\sigma} (-1)\, r_{12}^2\, dr_{12} - 2\pi\int_{\sigma}^{\infty} (1-1)\, r_{12}^2\, dr_{12} = \frac{2\pi}{3}\sigma^3    (17.3.10)

where the integral from \sigma to \infty contributes nothing to the total.

Since \sigma is the distance of closest approach of molecules 1 and 2 it is the distance between the centers of those molecules and hence twice the radius of one of them. If the radius of a single molecule is a, then \sigma = 2a and we see that B_2 is four times the volume of a single molecule.
We can also examine the square well potential.

Example 17.3:
Square Well Gas: Here the potential is

    u(r) = \begin{cases} \infty & 0 < r \le \sigma \\ -\epsilon & \sigma < r \le w\sigma \\ 0 & w\sigma < r < \infty \end{cases}

where again \sigma measures the distance of closest approach and w measures the size of the attractive region in multiples of the distance of closest approach.^9 The depth of the potential well is \epsilon, so that normally \epsilon is a positive number.
^7 Yes, I know that this seems to be a big disappointment after all this work, but the situation is not at all hopeless.
^8 Actually, all the virial coefficients will be zero in this case.
^9 Since \sigma is in fact the diameter of one of our necessarily spherical molecules, w is also the size in diameters.
The integral for B_2 breaks up into three integrals, each covering one of the regions of the potential above.

    B_2 = -2\pi\int_0^{\sigma} (-1)\, r_{12}^2\, dr_{12} - 2\pi\int_{\sigma}^{w\sigma} \left[e^{\beta\epsilon} - 1\right] r_{12}^2\, dr_{12} - 2\pi\int_{w\sigma}^{\infty} (1-1)\, r_{12}^2\, dr_{12}
      = \frac{2\pi}{3}\sigma^3 - \frac{2\pi}{3}\left[e^{\beta\epsilon} - 1\right]\sigma^3\left(w^3 - 1\right)
      = \frac{2\pi}{3}\sigma^3\left[1 - \left(e^{\beta\epsilon} - 1\right)\left(w^3 - 1\right)\right]    (17.3.11)
As checks we note that if \epsilon is zero or if w is 1 this reduces to the hard-sphere gas.

This gas also exhibits a Boyle temperature T_B at which the second virial coefficient vanishes. At this temperature the gas behaves in an ideal manner. Any combination of w and \beta\epsilon that satisfies the symmetric formula:

    w^3 e^{\beta\epsilon} - e^{\beta\epsilon} - w^3 = 0    (17.3.12)

will cause the second virial coefficient of the square-well potential gas to vanish. The only restriction is that w must be greater than 1 for the potential to make physical sense.

This second virial coefficient is similar in behavior to actually observed second virial coefficients in that it is negative at low temperatures and positive at high ones. The crossover point is, of course, the Boyle temperature. Further, at very high temperatures B_2 becomes constant, a behavior again observed in experiment. The limiting value for B_2 can easily be seen to be the hard-sphere value.
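Both the closed form, Equation (17.3.11), and the defining integral, Equation (17.3.9), are simple to compute. A minimal Python check, with merely illustrative square-well parameters:

    import math

    k = 1.380658e-23   # Boltzmann's constant, J/K

    def B2_closed(T, sigma, eps, w):
        # Eq. (17.3.11): B2 = (2 pi/3) sigma^3 [1 - (e^(beta*eps) - 1)(w^3 - 1)]
        return (2 * math.pi / 3) * sigma**3 * (1 - (math.exp(eps / (k * T)) - 1) * (w**3 - 1))

    def B2_numeric(T, sigma, eps, w, n=100000):
        # Direct midpoint quadrature of Eq. (17.3.9)
        beta = 1.0 / (k * T)
        def u(r):
            if r <= sigma:
                return float('inf')
            return -eps if r <= w * sigma else 0.0
        R = 5 * w * sigma
        h = R / n
        s = 0.0
        for i in range(n):
            r = (i + 0.5) * h
            s += 4 * math.pi * r * r * (math.exp(-beta * u(r)) - 1.0) * h
        return -0.5 * s

    sigma, eps, w = 3.4e-10, 1.4e-21, 1.5   # illustrative parameters, not from the text
    print(B2_closed(300.0, sigma, eps, w), B2_numeric(300.0, sigma, eps, w))

The two results should agree to within the quadrature error, and sweeping the temperature shows the sign change at the Boyle temperature.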
17.4 The Third Virial Coefficient
While the second virial coefficient is not too hard to evaluate, the third is another story. Indeed, as the order of the virial coefficients increases, the difficulty in evaluating them increases very fast.

But first we must obtain as simple an expression as possible for the third virial coefficient B_3. It is given by:

    B_3 = 4 b_2^2 - 2 b_3    (17.4.1)

which looks simple enough.
Before going ahead it is good to collect the appropriate formulas in one place. We have:

    Z_1 = \int d\vec r_1
    Z_2 = \iint e^{-\beta u(r_{12})}\, d\vec r_1\, d\vec r_2
    Z_3 = \iiint e^{-\beta u(r_{12}) - \beta u(r_{13}) - \beta u(r_{23})}\, d\vec r_1\, d\vec r_2\, d\vec r_3    (17.4.2)

where the \vec r's represent the three coordinates of a given particle, and

    1!\,V b_1 = Z_1
    2!\,V b_2 = Z_2 - Z_1^2
    3!\,V b_3 = Z_3 - 3 Z_1 Z_2 + 2 Z_1^3    (17.4.3)

It is helpful to shorten the notation slightly by writing

    x_{ij} = e^{-\beta u(r_{ij})}    (17.4.4)
With these things in mind we can write

    3!\,V b_3 = \iiint \left[x_{12} x_{13} x_{23} - 3(1)\, x_{12} + 2(1)^3\right] d\vec r_1\, d\vec r_2\, d\vec r_3    (17.4.5)

where the first term is Z_3, the second is three times Z_1 times Z_2, and the third is 2 times Z_1^3.
We can make this look more symmetric by realizing that integrating over x_{12} gives exactly the same result as integrating over x_{13} or x_{23}. So we can rewrite Equation (17.4.5) as:

    3!\,V b_3 = \iiint \left[x_{12} x_{13} x_{23} - x_{12} - x_{13} - x_{23} + 2\right] d\vec r_1\, d\vec r_2\, d\vec r_3    (17.4.6)
A glance at the definition of B_3 above shows that twice b_3 must be subtracted from four times b_2^2. It is convenient to get a simple expression for b_2^2 in the following way:

We note that

    (2!)^2\, V b_2^2 = \iiint (x_{12} - 1)(x_{13} - 1)\, d\vec r_1\, d\vec r_2\, d\vec r_3 = V\left[\int (x_{12} - 1)\, d\vec r_{12}\right]\left[\int (x_{13} - 1)\, d\vec r_{13}\right]    (17.4.7)

Since the equation for b_3 carries a factorial factor already and that for b_2^2 above does as well, we end up using the form

    3V B_3 = 12 V b_2^2 - 6 V b_3
Putting this all together gives:

    3V B_3 = \iiint \big[(x_{12}-1)(x_{13}-1) + (x_{12}-1)(x_{23}-1) + (x_{13}-1)(x_{23}-1)\big]
      - \big[x_{12} x_{13} x_{23} - x_{12} - x_{13} - x_{23} + 2\big]\, d\vec r_1\, d\vec r_2\, d\vec r_3    (17.4.8)
Multiplying out and canceling the terms that cancel,^10 we end up with:

    3V B_3 = -\iiint (x_{12} - 1)(x_{13} - 1)(x_{23} - 1)\, d\vec r_1\, d\vec r_2\, d\vec r_3    (17.4.9)

which really isn't so bad.^11
If we switch from d\vec r_1\, d\vec r_2\, d\vec r_3 to coordinates relative to particle 1, i.e. d\vec r_1\, d\vec r_{12}\, d\vec r_{13}, we can then integrate over particle 1 and get another factor of the volume V. We then have:

    B_3 = -\frac{1}{3}\iint (x_{12} - 1)(x_{13} - 1)(x_{23} - 1)\, d\vec r_{12}\, d\vec r_{13}    (17.4.10)

This is very difficult to integrate. To do so one must not only express r_{23} in terms of r_{12} and r_{13} (actually, their components)^12 but in addition then integrate the exponentials containing the potential energy. This is especially difficult for the term arising from u(r_{23}).
^10 Though students (and even faculty) sometimes end up canceling terms that really don't in messy expressions. This is a cause of great angst and wailing.
^11 One cannot look at Equation (17.4.9) without thinking that such a beautiful symmetric result cannot accidentally be the result of all that manipulation that preceded it. There must be another way to get it that emphasizes the symmetry. And there is. The method is called Mayer Cluster Theory and it is a way of getting directly to the virial coefficients in terms as simple as they can be. Sadly the theory itself is a bit complex. It is not included here because we (and most people) do not need to go past the third virial coefficient and because it is a subject worthy of an appendix all its own.
^12 Readers are invited to try this for themselves.
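For hard spheres, at least, the integral in Equation (17.4.10) can be estimated by brute force. The crude Monte Carlo sketch below places particle 1 at the origin and samples r12 and r13 uniformly over the cube where f_{12} and f_{13} can be nonzero; it should reproduce the known hard-sphere result B_3 = (5/18)\pi^2\sigma^6, about 2.74 for \sigma = 1:

    import math, random

    def f(r, sigma):
        # Mayer f function for hard spheres: -1 inside the core, 0 outside
        return -1.0 if r < sigma else 0.0

    def B3_hard_sphere(sigma=1.0, samples=500000, seed=1):
        # Monte Carlo estimate of Eq. (17.4.10):
        # B3 = -(1/3) * integral of f12*f13*f23 over r12 and r13.
        # f12 and f13 vanish unless |r12| < sigma and |r13| < sigma, so sampling
        # each vector uniformly in the cube [-sigma, sigma]^3 covers the support.
        random.seed(seed)
        cube = (2.0 * sigma) ** 3
        acc = 0.0
        for _ in range(samples):
            r12 = [random.uniform(-sigma, sigma) for _ in range(3)]
            r13 = [random.uniform(-sigma, sigma) for _ in range(3)]
            d12 = math.sqrt(sum(c * c for c in r12))
            d13 = math.sqrt(sum(c * c for c in r13))
            d23 = math.sqrt(sum((a - b) ** 2 for a, b in zip(r12, r13)))
            acc += f(d12, sigma) * f(d13, sigma) * f(d23, sigma)
        return -(cube * cube) * (acc / samples) / 3.0

    print(B3_hard_sphere())   # ~2.74 for sigma = 1

This is only a sketch; for a softer potential one would have to extend the sampling region and weight the integrand accordingly.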
17.5 The Lennard-Jones Potential
A very common intermolecular pairwise potential energy is one given by Lennard-Jones.^13 It is given by:

    u(r) = 4\epsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^6\right] = \epsilon\left[\left(\frac{r_o}{r}\right)^{12} - 2\left(\frac{r_o}{r}\right)^6\right]    (17.5.1)

The potential starts at infinity at r = 0, falls to 0 at r = \sigma, reaches a minimum of -\epsilon at r = r_o and then rises to zero as r goes to infinity. The relationship between \sigma and r_o is r_o = 2^{1/6}\sigma. Either form in Equation (17.5.1) may be used and both forms can be found in the literature.
With this form it turns out that B_2 is negative at low temperatures, rises to positive values as the temperature is increased, and then reaches a maximum and falls off slightly as the temperature is increased further. This agrees with what is observed experimentally.

The second virial coefficient for the Lennard-Jones potential cannot be obtained in a simple closed form. It can, however, be obtained exactly in a series representation that converges fairly rapidly.^14

The second virial coefficient is given by:

    B_2 = -2\pi\int_0^{\infty}\left[e^{-\beta u(r)} - 1\right] r^2\, dr    (17.5.2)
We begin by obtaining a modified form for the second virial coefficient. We integrate by parts. Thus we have

    \int_a^b U\, dV = UV\Big|_a^b - \int_a^b V\, dU    (17.5.3)

and we choose

    U = e^{-\beta u} - 1 \qquad \text{and} \qquad dV = r^2\, dr    (17.5.4)

then

    dU = -\beta\left(\frac{du}{dr}\right) e^{-\beta u}\, dr \qquad \text{and} \qquad V = \frac{r^3}{3}    (17.5.5)

Now if u(r) goes to 0 faster than r^{-3} as r goes to infinity, then UV goes to 0 as r goes to infinity. Since UV is zero at r = 0, the term UV in Equation (17.5.3) is then zero at both limits and hence vanishes. What is left is:
    B_2 = -\frac{2\pi\beta}{3}\int_0^{\infty}\left(\frac{du}{dr}\right) e^{-\beta u}\, r^3\, dr    (17.5.6)

With Equation (17.5.1) (in the \sigma, \epsilon form) inserted into this we have:
    B_2 = \frac{8\pi\beta\epsilon}{3}\int_0^{\infty}\left[12\left(\frac{\sigma}{r}\right)^{13} - 6\left(\frac{\sigma}{r}\right)^{7}\right]\exp\left[-4\beta\epsilon\left(\frac{\sigma}{r}\right)^{12}\right]\exp\left[4\beta\epsilon\left(\frac{\sigma}{r}\right)^{6}\right]\frac{r^3}{\sigma}\, dr    (17.5.7)
It is exceptionally convenient to introduce new variables and combinations of constants. In particular we will let

    x = r/\sigma, \qquad T^* = kT/\epsilon = 1/\beta\epsilon, \qquad \text{and} \qquad \gamma = 4/T^*    (17.5.8)

Further, the quantity

    B_2^* = \frac{B_2}{\frac{2}{3}\pi\sigma^3}    (17.5.9)

^13 Who is, in fact, a single person.
^14 In the following I use the development presented by Donald Rapp, Statistical Mechanics, Holt, Rinehart, and Winston, 1972. I've also worked it out in detail, something that is rather hard to find in most textbooks.
is known as the reduced second virial coefficient and is in common use in the literature. With these substitutions we get:

    B_2^* = \gamma\int_0^{\infty}\left[12 x^{-10} - 6 x^{-4}\right] e^{-\gamma x^{-12}}\, e^{\gamma x^{-6}}\, dx    (17.5.10)

The integral is evaluated by expanding the second exponential in a power series thusly:

    e^{\gamma x^{-6}} = \sum_{n=0}^{\infty} \frac{1}{n!}\, \gamma^n x^{-6n}    (17.5.11)
We now insert this into Equation (17.5.10) in place of the appropriate exponential to get

    B_2^* = \gamma\int_0^{\infty}\left[12 x^{-10} - 6 x^{-4}\right] e^{-\gamma x^{-12}} \sum_{n=0}^{\infty} \frac{1}{n!}\, \gamma^n x^{-6n}\, dx    (17.5.12)

We can interchange the integration and the summation and then break the result into two integrals:

    B_2^* = 12\sum_{n=0}^{\infty}\frac{1}{n!}\,\gamma^{n+1}\int_0^{\infty} x^{-6n-10}\, e^{-\gamma x^{-12}}\, dx - 6\sum_{n=0}^{\infty}\frac{1}{n!}\,\gamma^{n+1}\int_0^{\infty} x^{-6n-4}\, e^{-\gamma x^{-12}}\, dx    (17.5.13)
Now we need a small digression. The integrals in Equation (17.5.13) can be done in terms of the Gamma function \Gamma(z). This is why we've been going through all these contortions and why we've expanded one exponential and not the other. The Gamma function is a well-known function that is fully tabulated in many places and is closely related to the factorial function.

What we have above is an integral that can be written generally as:

    I = \int_0^{\infty} r^{-a}\, e^{-c/r^b}\, dr    (17.5.14)

The Gamma function is defined to be

    \Gamma(z) = \int_0^{\infty} t^{z-1} e^{-t}\, dt    (17.5.15)

where the real part of z must be positive. We need to get the integral in Equation (17.5.14) into the form of Equation (17.5.15). This isn't hard. We take the blunt approach and let
    t = c r^{-b} \qquad \text{thus} \qquad r = \left(\frac{c}{t}\right)^{1/b} \qquad \text{and} \qquad dr = -\frac{1}{b}\left(\frac{c}{t}\right)^{1/b} t^{-1}\, dt    (17.5.16)

Substituting (the minus sign goes into swapping the limits of integration) we get:

    I = \int_0^{\infty}\left(\frac{t}{c}\right)^{a/b} e^{-t}\, \frac{1}{b}\left(\frac{c}{t}\right)^{1/b} t^{-1}\, dt    (17.5.17)

which, after some algebra, simplifies to

    I = \frac{1}{b}\, c^{(1-a)/b}\int_0^{\infty} t^{(a-1)/b}\, t^{-1}\, e^{-t}\, dt    (17.5.18)

where the integral is now in the form of Equation (17.5.15) if we let z = (a-1)/b. Thus

    I = \frac{1}{b}\, c^{(1-a)/b}\, \Gamma\left(\frac{a-1}{b}\right)    (17.5.19)
With the digression over we can evaluate the integrals in Equation (17.5.13) on the preceding page:

    \int_0^{\infty} x^{-6n-10}\, e^{-\gamma x^{-12}}\, dx = \frac{1}{12}\,\gamma^{-(6n+9)/12}\,\Gamma\left(\frac{6n+9}{12}\right)    (17.5.20)

    \int_0^{\infty} x^{-6n-4}\, e^{-\gamma x^{-12}}\, dx = \frac{1}{12}\,\gamma^{-(6n+3)/12}\,\Gamma\left(\frac{6n+3}{12}\right)    (17.5.21)
and insert the results into Equation (17.5.13) on the previous page:

    B_2^* = 12\sum_{n=0}^{\infty}\frac{1}{n!}\,\gamma^{n+1}\,\frac{1}{12}\,\gamma^{-(2n+3)/4}\,\Gamma\left(\frac{2n+3}{4}\right) - 6\sum_{n=0}^{\infty}\frac{1}{n!}\,\gamma^{n+1}\,\frac{1}{12}\,\gamma^{-(2n+1)/4}\,\Gamma\left(\frac{2n+1}{4}\right)    (17.5.22)
This simplifies slightly:

    B_2^* = \sum_{n=0}^{\infty}\frac{1}{n!}\,\gamma^{(2n+1)/4}\,\Gamma\left(\frac{2n+3}{4}\right) - \sum_{n=0}^{\infty}\frac{1}{n!}\,\frac{1}{2}\,\gamma^{(2n+3)/4}\,\Gamma\left(\frac{2n+1}{4}\right)    (17.5.23)
What I am now trying to do is to combine the two sums into one to give us a more compact formula. To do this I need to manipulate Equation (17.5.23) a bit. We play with each sum separately. First I'll split out the first (n = 0) term from the first sum:

    B_2^* = \gamma^{1/4}\,\Gamma\left(\frac{3}{4}\right) + \sum_{n=1}^{\infty}\frac{1}{n!}\,\gamma^{(2n+1)/4}\,\Gamma\left(\frac{2n+3}{4}\right) - \sum_{n=0}^{\infty}\frac{1}{n!}\,\frac{1}{2}\,\gamma^{(2n+3)/4}\,\Gamma\left(\frac{2n+1}{4}\right)    (17.5.24)
and then I'll re-index the second sum by setting k = n + 1:

    B_2^* = \gamma^{1/4}\,\Gamma\left(\frac{3}{4}\right) + \sum_{n=1}^{\infty}\frac{1}{n!}\,\gamma^{(2n+1)/4}\,\Gamma\left(\frac{2n+3}{4}\right) - \sum_{k=1}^{\infty}\frac{k}{k!}\,\frac{1}{2}\,\gamma^{(2k+1)/4}\,\Gamma\left(\frac{2k-1}{4}\right)    (17.5.25)
Now, realizing that k can be renamed n, we can factor out the terms in \gamma and n! to get:

    B_2^* = \gamma^{1/4}\,\Gamma\left(\frac{3}{4}\right) + \sum_{n=1}^{\infty}\frac{1}{n!}\,\gamma^{(2n+1)/4}\left[\Gamma\left(\frac{2n+3}{4}\right) - \frac{n}{2}\,\Gamma\left(\frac{2n-1}{4}\right)\right]    (17.5.26)
The term in the square brackets can be simplified if we use a property of the gamma function

    \Gamma(z+1) = z\,\Gamma(z)    (17.5.27)

which is easy to prove using Equation (17.5.15) on the previous page. With this we can write:

    \Gamma\left(\frac{2n+3}{4}\right) = \Gamma\left(\frac{2n-1}{4} + 1\right) = \frac{2n-1}{4}\,\Gamma\left(\frac{2n-1}{4}\right)    (17.5.28)

With this the term in the square brackets in Equation (17.5.26) becomes

    \Gamma\left(\frac{2n-1}{4}\right)\left[\frac{2n-1}{4} - \frac{n}{2}\right] = -\frac{1}{4}\,\Gamma\left(\frac{2n-1}{4}\right)
and so we succeed in putting the two sums together:

    B_2^* = \gamma^{1/4}\,\Gamma\left(\frac{3}{4}\right) - \sum_{n=1}^{\infty}\frac{1}{4\,n!}\,\gamma^{(2n+1)/4}\,\Gamma\left(\frac{2n-1}{4}\right)    (17.5.29)

We can even simplify this by noting that

    \Gamma\left(\frac{3}{4}\right) = -\frac{1}{4}\,\Gamma\left(-\frac{1}{4}\right)

and if this is used in the first term, we note that it fits the pattern of the sum and is, in fact, unsurprisingly, the zeroth term of it. Thus our final result is the amazingly simple:

    B_2^* = -\sum_{n=0}^{\infty}\frac{1}{4\,n!}\left(\frac{4}{T^*}\right)^{(2n+1)/4}\,\Gamma\left(\frac{2n-1}{4}\right)    (17.5.30)

where I've substituted 4/T^* for \gamma since the explicit temperature dependence is good to know.^15

The result converges quite rapidly for values of T^* of about 4 or more. As T^* gets smaller, more and more terms are needed. And while it looks as if the second virial coefficient is always negative, in fact the first term in Equation (17.5.30) is positive (\Gamma(-1/4) is negative), and at high temperatures that term dominates the others.

Evaluating this series is not a problem on a computer. Given any system interacting via Lennard-Jones potentials, Equation (17.5.30) will provide the second virial coefficient to any desired degree of accuracy if both \sigma and \epsilon are known.
^15 Recall that T^* = kT/\epsilon.
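For example, a direct transcription of Equation (17.5.30) needs nothing beyond the Python standard library (math.gamma happily accepts the negative argument that occurs at n = 0):

    import math

    def B2_star(T_star, nmax=40):
        # Reduced second virial coefficient, Eq. (17.5.30):
        # B2* = -sum_{n>=0} (1/(4 n!)) (4/T*)^((2n+1)/4) Gamma((2n-1)/4)
        g = 4.0 / T_star
        total = 0.0
        for n in range(nmax):
            total -= g ** ((2 * n + 1) / 4) * math.gamma((2 * n - 1) / 4) / (4 * math.factorial(n))
        return total

    print(B2_star(1.0))    # negative at low reduced temperature
    print(B2_star(3.42))   # ~0: the well-known Boyle temperature of this gas
    print(B2_star(10.0))   # positive at high reduced temperature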
18. Wide versus Deep

18.1 Introduction
For many years the field of applied statistical thermodynamics (also known as molecular dynamics) has been dominated by the search for the equilibrium shape of a molecule. This has usually been taken to be that conformation that has the lowest possible potential energy.

This is the wrong point of view. The stable conformation is the one that has the lowest free energy, which is a combination of both energy and entropy.

This is not the only problem. There is also the problem of defining what is actually meant by a conformation. Take, for example, rotation around a carbon-carbon bond. Both carbon atoms are assumed to have three other substituents each. The result is that the rotation about the carbon-carbon bond has a potential energy with three not necessarily identical minima.^1 Ideally, the minima in these potentials correspond to three rotational angles \phi_1, \phi_2, and \phi_3. Each of these corresponds to a given conformation of the molecule or what is sometimes termed a rotamer.

This is an obvious idea. But it is best not pushed too far, considering that there is always movement around the ideal bond angles. The reader is asked to consider how far a given bond angle is allowed to deviate from one of the three \phi's before the molecule is assumed to have changed rotameric states.
18.2 The Model
We can generalize this problem in the following way:^2 Let us assume that we have a single particle moving on a line of length L. There are two square potential wells along this line. One runs from position a_1 to position b_1 and is of depth \epsilon_1; the other runs from a_2 to b_2 and is of depth \epsilon_2. For completeness we will assume that the line begins at a_0 and ends at b_0. In fact, the connection to rotation can be made explicit if we assume that points a_0 and b_0 are the same point, that is, that the movement is on a circle instead of a straight line.

The situation is depicted in Figure 18.1.

Figure 18.1: Model Potential Energy Surface
The canonical partition function for this potential energy can be written as:

    Q(N,V,T) = \frac{1}{N!}\left(\frac{V}{\Lambda^3}\right)^N Z_N    (18.2.1)

^1 There is no direct potential energy associated with rotation. The potential energy here is due to interactions among the substituents on the two carbon atoms.
^2 This is not the most general way to formulate this problem, but it is sufficiently general for our discussion here.
where, for convenience, I will call the part coming from translation Q_t. Thus

    Q(N,V,T) = Q_t Z_N    (18.2.2)

To avoid complications, we will restrict ourselves to a single particle since, as shall be seen, this does not otherwise affect our analysis. Then the configuration integral Z_N becomes simply Z, given by:

    Z = \int_{a_0}^{b_0} e^{-\beta u(r)}\, dr    (18.2.3)
where u(r) is the potential energy shown in Figure 18.1. The integral is simple. It breaks into five segments:

    Z = \int_{a_0}^{a_1} dr + \int_{a_1}^{b_1} e^{\beta\epsilon_1}\, dr + \int_{b_1}^{a_2} dr + \int_{a_2}^{b_2} e^{\beta\epsilon_2}\, dr + \int_{b_2}^{b_0} dr    (18.2.4)
which is even more boring than it looks.^3 The first integral gives a_1 - a_0, the third, a_2 - b_1, and the fifth, b_0 - b_2. We can add these lengths up and simply call the sum L_0.

The other two are only slightly more interesting. Since the exponentials are not, in fact, functions of r (u(r) is a constant in each of them), we get (writing these two integrals first):

    Z = e^{\beta\epsilon_1} L_1 + e^{\beta\epsilon_2} L_2 + L_0    (18.2.5)

where L_1 = b_1 - a_1 and L_2 = b_2 - a_2.
We now have everything we need. The probability p_1 that we will find our particle in potential well 1 is simply:

    p_1 = \frac{e^{\beta\epsilon_1} L_1}{Z}    (18.2.6)

There should be a factor of Q_t in both the numerator and the denominator of Equation (18.2.6), but they cancel out. Similarly:

    p_2 = \frac{e^{\beta\epsilon_2} L_2}{Z}    (18.2.7)
Now, as drawn above, potential well 1 is wide and shallow, while potential well 2 is narrow and deep. In which one will we find the particle, if not all the time, at least most of the time?

The answer is particularly simple. The ratio of the chance of finding the particle in well 1 against the chance of finding it in well 2 is simply:

    \frac{p_1}{p_2} = \frac{L_1}{L_2}\, e^{-\beta(\epsilon_2 - \epsilon_1)}    (18.2.8)

Note that both the well depths (the \epsilon's) and the well lengths (the L's) are involved in the answer.
18.3 But What Does It All Mean?
We can do some simple calculations. First, clearly, if the \epsilon's are the same, then p_1/p_2 = L_1/L_2, which is what we'd expect. Note that this is a pure entropy effect, since the L's are directly the a priori chance of getting the particle into a particular well.
^3 The point here is to have a very simple model, not to be a training ground for doing messy integrals!
What if we make the lengths the same? Then we have:

    \frac{p_1}{p_2} = e^{-\beta(\epsilon_2 - \epsilon_1)}    (18.3.1)

For small potential energy differences we can expand the exponential in a power series and keep only the first two terms to get:

    \frac{p_1}{p_2} = 1 - \beta(\epsilon_2 - \epsilon_1)    (18.3.2)

so it is clear that the probabilities depend directly on the lengths of the wells but exponentially on their energies. So deep is usually better than wide, but not always. Ultimately one has to compare:

    L_1 e^{\beta\epsilon_1} \qquad \text{with} \qquad L_2 e^{\beta\epsilon_2}    (18.3.3)
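Equation (18.3.3) is trivially explored numerically. A tiny sketch, with energies measured in units of kT so that beta = 1:

    import math

    def ratio(L1, eps1, L2, eps2, kT=1.0):
        # p1/p2 from Eq. (18.2.8); the depths eps are positive numbers
        return (L1 / L2) * math.exp(-(eps2 - eps1) / kT)

    # Well 1 is ten times wider but one kT shallower than well 2:
    print(ratio(L1=10.0, eps1=1.0, L2=1.0, eps2=2.0))   # ~3.68: here wide beats deep

One extra kT of depth costs a factor of e, so a well must be about 2.7 times narrower for each kT of extra depth before depth wins.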
History
The field of statistical mechanics (and what, at the elementary level, is almost the same thing, statistical thermodynamics) is a field with a long history. It is also a field that has undergone major changes in the past two decades.

It has also undergone major changes in the past. Since the late 19th century, its very language has changed as well. Thus old books and papers are hard to read today as both the terminology and the notation are strange to us. Anyone interested in the development of classical statistical mechanics needs to know not only this language^1 but that the field grew primarily out of the study of the kinetic theory of gases and was developed into statistical mechanics first by James Clerk Maxwell, Ludwig Boltzmann, and then by J. Willard Gibbs. Maxwell's work consisted primarily in the discovery of the Maxwell-Boltzmann distribution law. A major review of Boltzmann's work has been written by Paul and Tatiana Ehrenfest,^2 which served to make the field a standard part of both theoretical chemistry and theoretical physics.

While I know of no satisfactory written history of statistical mechanics,^3 an exposition of Boltzmann's work is given in The Conceptual Foundations of the Statistical Approach in Mechanics by Paul and Tatiana Ehrenfest, originally an article in the Encyklopadie Der Mathematischen Wissenschaften, Leipzig, 1912, where it is No. 6 of Volume VI:2:II. It is currently available under the English title given above as translated by Michael J. Moravesik in a Dover Phoenix Edition published by Dover Publications, Mineola, 2002. This is not an easy book for novices but it repays close study. It also contains a bibliography of the literature up to 1912 that is no longer well known (or, sadly, easily available).
Gibbs wrote extensively on statistical mechanics. His book, Elementary Principles in Statistical Mechanics, was first published in 1902 and is kept in print today by Dover Publications. It is much more accessible to students than the Encyclopedia article by the Ehrenfests. The article by the Ehrenfests mentioned above contains a review of the work by Gibbs on this subject^4 that is interesting in the contrasts between the two views of statistical mechanics.

Development of the field continued for the next 60 years, with the advent of quantum mechanics providing a major creative impulse.

The problem of the theory of heat capacities is an example. It is easy to show classically that the constant volume heat capacity of monatomic solids is 3R per mole for all temperatures, but this disagrees with experiments, which show that the heat capacity falls to zero as the temperature goes to absolute zero.

A satisfactory theory for the heat capacity of solids was worked out early in the 20th century by application of quantization to the problem.^5
^1 Even the now standard symbols of thermodynamics have changed. Gibbs, for instance, used \eta for the entropy.
^2 Tatiana Ehrenfest's role in the development of statistical mechanics is often played down. In the preface to the English translation of their article that I reference below, Tatiana Ehrenfest-Afanassjewa modestly claims that "The great task of collecting the literature and of organizing the Encyklopadie article was done by Paul Ehrenfest. My contribution consisted only in discussing with him all the problems involved and I feel that I succeeded in clarifying some concepts that were often incorrectly used." This was written in 1959. Paul Ehrenfest had died in 1933, a suicide as his mentor Boltzmann had been in 1906. Tatiana Ehrenfest-Afanassjewa's modesty was well-known during Paul Ehrenfest's lifetime. It is not so well known today. She did much more than "discussing with him all the problems involved...."
^3 A skeleton history exists in R.K. Pathria, Statistical Mechanics, Pergamon Press, 1972, in the section titled "Historical Introduction," pages 1-7.
^4 It was Gibbs's last major scientific publication.
^5 First by A. Einstein, Ann. Physik [4], 22, 180 (1907) and in a much more satisfactory way by Peter P. Debye, Ann. Physik, 39, 789 (1912).
In the 20s and 30s of the last century the quantum approach to statistical mechanics was used very successfully to provide answers to a number of other problems. Very readable books written in that period are those by Fowler and Guggenheim^6 and by Tolman.^7

However, the systems solved were primarily ideal or idealized problems. More difficult ones such as the behavior of real gases were not attacked until the late 30s when Mayer and Mayer introduced a formalism for determining the behavior of non-ideal gases.^8 But while the formalism is very pretty, it was not suitable for actual calculations much beyond the second virial coefficient.

Textbooks such as that written by T.L. Hill^9 made the field accessible to the less mathematically sophisticated students of chemistry as opposed to those in physics and applied mathematics. As a result the theories and methods of thinking of statistical mechanics became part of the armament of every physical and biological scientist.

However, by this time the calculations that could be done with pencil and paper had mostly been done. Theories such as the Mayer Cluster Theory mentioned above could not be used for calculations beyond the simple because of the computational complexity required. Pure theory could be further extended, but without calculated results to compare to experiment, such extensions were dry and sterile.

Several signposts to the future did occur. For example, in 1957 B.J. Alder and T.E. Wainwright programmed an early digital computer (an IBM 704) to simulate a gas of hard spheres. In doing so they used systems of 32 and 108 particles moving under classical equations of motion.^10 This computation could not then (and can not now) be done by hand. The labor involved would be prohibitive.

But it was clear that even a simple problem involving hard spheres with no other interaction potential between them required huge amounts of computer time. And while computers were improving in speed, they remained relatively slow until the 1990s.^11

Today, all that has changed. Very fast, very large memory, multiple CPU machines are readily available. Computations only dreamed about a few years ago are now feasible. And as a result statistical mechanics is once again a thriving research area.
^6 R. Fowler and E.A. Guggenheim, Statistical Thermodynamics, Cambridge, 1939.
^7 R.C. Tolman, The Principles of Statistical Mechanics, Dover Publications, 1979, a reprint of the edition published by Oxford in 1938.
^8 Statistical Mechanics by Mayer and Mayer, Wiley, 1940. The theory is developed in Chapter 13. As is the case with Tatiana Ehrenfest, it is usually the husband, in this case Joseph Mayer, that is remembered. In fact, his wife Maria Goeppert Mayer (who won the Nobel Prize for the development of the shell theory of the atomic nucleus) was an equal contributor.
^9 An Introduction to Statistical Thermodynamics by Terrell L. Hill, Addison-Wesley, 1960. This text, as of this writing, is still in print as a Dover Publications reprint of 1986.
^10 B.J. Alder and T.E. Wainwright, "Phase transition for a hard sphere system," J. Chem. Phys., 27, 1208-1209 (1957).
^11 The author notes that an early IBM machine, the IBM 650, could multiply two ten digit numbers together in a millisecond. Today's single core machines alone operate in the range of a billion such multiplications a second, an increase in speed by a factor of a million.
Fundamental and Derived Physical Constants
The physical constants used in this work are given below. Those above the horizontal line are
considered fundamental. Those below are derived from fundamental constants. The listing is not
exhaustive. Some constants used in the text are not given here, but are derivable from those that
are.
The numbers in parentheses are the uncertainties in the last two digits given.
Table 1: Fundamental and Derived Physical Constants

    Quantity                 Symbol   Value                             Units
    --------------------------------------------------------------------------
    Speed of light           c        299792458 (exact)                 m/s
    Electric constant [1]    ε_0      8.854187817 × 10^-12 (exact)      F/m
    Planck's constant        h        6.6260693(11) × 10^-34            J s
    Electron charge          e        1.60217653(14) × 10^-19           C
    Electron mass            m_e      9.1093826(16) × 10^-31            kg
    Proton mass              m_p      1.67262171(29) × 10^-27           kg
    Avogadro's number        N_o      6.0221415(10) × 10^23             1/mol
    Boltzmann's constant     k        1.3806505(24) × 10^-23            J/K
    --------------------------------------------------------------------------
    Faraday constant         F        96485.3383(83)                    C/mol
    Gas constant             R        8.314472(15)                      J/mol-K
    Atomic mass unit         u        1.66053886(28) × 10^-27           kg
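For example, the gas constant below the line follows directly from two constants above it; as a quick arithmetic check with the tabulated values,

\[
R = N_o k = \left(6.0221415 \times 10^{23}\ \mathrm{mol^{-1}}\right)\left(1.3806505 \times 10^{-23}\ \mathrm{J/K}\right) \approx 8.31447\ \mathrm{J/(mol\,K)},
\]

in agreement with the tabulated 8.314472(15). Similarly, F = N_o e, and u = (10^-3 kg/mol)/N_o.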
The constants given above were taken from the National Institute of Standards and Technology (NIST) web pages at:
http://physics.nist.gov/constants.html
These derive from Peter J. Mohr and Barry N. Taylor, CODATA Recommended Values of the Fundamental Physical Constants: 2002, published in Reviews of Modern Physics, 77, 1 (2005).
1. The electric constant ε_0 and the magnetic constant μ_0 are related by ε_0 μ_0 = 1/c^2, where c is the speed of light and μ_0 is defined to be exactly 4π × 10^-7 N/A^2.
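This is why ε_0 can be listed as exact in Table 1: since μ_0 and c are both defined quantities, ε_0 follows by arithmetic alone,

\[
\epsilon_0 = \frac{1}{\mu_0 c^2} = \frac{1}{\left(4\pi \times 10^{-7}\ \mathrm{N/A^2}\right)\left(299792458\ \mathrm{m/s}\right)^2} \approx 8.854187817 \times 10^{-12}\ \mathrm{F/m}.
\]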
List of Works Consulted
Alder, B. J., and Wainwright, T. E., Phase transition for a hard sphere system, J. Chem. Phys., 27, 1208-1209 (1957).
Callen, Herbert B., Thermodynamics and an Introduction to Thermostatistics, 2nd Edition, Wiley, 1985.
Dymond, J. H., and Smith, E. B., The Virial Coefficients of Gases: A Critical Compilation, Oxford, 1969.
Ehrenfest, Paul and Ehrenfest-Afanassjewa, Tatiana, The Conceptual Foundations of the Statistical Approach in Mechanics, translated by Michael J. Moravcsik, Dover, 2002. This is a translation of the original article Begriffliche Grundlagen der statistischen Auffassung in der Mechanik, originally published in Encyklopädie der mathematischen Wissenschaften, Leipzig, 1912, No. 6 of Volume VI:2:II. The Dover edition is a reprint of the translation by Moravcsik published by Cornell in 1959.
Fowler, R., and Guggenheim, E. A., Statistical Thermodynamics, Cambridge, 1939.
Gibbs, J. Willard, Elementary Principles in Statistical Mechanics, Dover, 1960. Originally
published by Yale, 1902.
Goldstein, Herbert, Classical Mechanics, Addison-Wesley, 1953.
Hill, Terrell L., Statistical Mechanics, McGraw-Hill, 1956.
Hill, Terrell L., An Introduction to Statistical Thermodynamics, Dover, 1986. Originally published by Addison-Wesley, 1960.
Margenau, Henry and Murphy, George M., The Mathematics of Physics and Chemistry, D. Van Nostrand, 1943.
Massieu, M. F., Sur les fonctions des divers fluides, Comptes Rendus Acad. Sci., 69, 858-862 (1869).
Mathews, Jon and Walker, R. L., Mathematical Methods of Physics, Benjamin, 1964.
Mayer, Joseph E. and Mayer, Maria Goeppert, Statistical Mechanics, Wiley, 1940.
McQuarrie, Donald A., Statistical Thermodynamics, Harper & Row, 1973.
McQuarrie, Donald A., Statistical Mechanics, Harper & Row, 1976.
McQuarrie, Donald A., Mathematical Methods for Scientists and Engineers, University Science, 2003.
Pathria, R. K., Statistical Mechanics, Pergamon Press, 1972.
Pitzer, Kenneth S., Quantum Chemistry, Prentice-Hall, 1953.
Planes, Antoni and Vives, Eduard, Entropic Formulation of Statistical Mechanics, J. Stat. Phys., 106, Nos. 3/4, 827 (2002).
Slater, John C., and Frank, N. H., Mechanics, McGraw-Hill, 1947.
Whittaker, Edmund Taylor, Analytical Dynamics, 4th Edition, Dover, 1944. Originally published by Cambridge University Press, 1937.