
Lecture Notes on Thermodynamics and Statistical Mechanics

(A Work in Progress)
Daniel Arovas
Department of Physics
University of California, San Diego
April 14, 2011
Contents
0.1 Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
1 Thermodynamics 1
1.1 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 What is Thermodynamics? . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2.1 Thermodynamic systems and state variables . . . . . . . . . . . . . 2
1.2.2 Heat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.3 Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.4 Pressure and Temperature . . . . . . . . . . . . . . . . . . . . . . . 6
1.2.5 Standard temperature and pressure . . . . . . . . . . . . . . . . . . 8
1.2.6 Exact and Inexact Differentials . . . . . . . . . . . . . . . . . . . . 9
1.3 The Zeroth Law of Thermodynamics . . . . . . . . . . . . . . . . . . . . . . 11
1.4 The First Law of Thermodynamics . . . . . . . . . . . . . . . . . . . . . . . 11
1.4.1 Single component systems . . . . . . . . . . . . . . . . . . . . . . . 12
1.4.2 Ideal gases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.4.3 Adiabatic transformations of ideal gases . . . . . . . . . . . . . . . 17
1.4.4 Adiabatic free expansion . . . . . . . . . . . . . . . . . . . . . . . . 18
1.5 Heat Engines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.5.1 Engines and refrigerators . . . . . . . . . . . . . . . . . . . . . . . . 20
1.5.2 Nothing beats a Carnot engine . . . . . . . . . . . . . . . . . . . . . 22
1.5.3 The Carnot cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.5.4 The Stirling cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.5.5 The Otto and Diesel cycles . . . . . . . . . . . . . . . . . . . . . . . 27
1.5.6 The Joule-Brayton cycle . . . . . . . . . . . . . . . . . . . . . . . . 30
1.5.7 Carnot engine at maximum power output . . . . . . . . . . . . . . . 32
1.6 The Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
1.6.1 The Third Law of Thermodynamics . . . . . . . . . . . . . . . . . . 35
1.6.2 Entropy changes in cyclic processes . . . . . . . . . . . . . . . . . . 35
1.6.3 Gibbs-Duhem relation . . . . . . . . . . . . . . . . . . . . . . . . . 36
1.6.4 Entropy for an ideal gas . . . . . . . . . . . . . . . . . . . . . . . . 37
1.6.5 Example system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
1.6.6 Measuring the entropy of a substance . . . . . . . . . . . . . . . . . 41
1.7 Thermodynamic Potentials . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
1.7.1 Energy E . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
1.7.2 Helmholtz free energy F . . . . . . . . . . . . . . . . . . . . . . . . 42
1.7.3 Enthalpy H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
1.7.4 Gibbs free energy G . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
1.7.5 Grand potential Ω . . . . . . . . . . . . . . . . . . . . . . . . . . 44
1.8 Maxwell Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
1.8.1 Relations deriving from E(S, V, N) . . . . . . . . . . . . . . . . . . 45
1.8.2 Relations deriving from F(T, V, N) . . . . . . . . . . . . . . . . . . 46
1.8.3 Relations deriving from H(S, p, N) . . . . . . . . . . . . . . . . . . 46
1.8.4 Relations deriving from G(T, p, N) . . . . . . . . . . . . . . . . . . 47
1.8.5 Relations deriving from Ω(T, V, µ) . . . . . . . . . . . . . . . . . . . 47
1.8.6 Generalized thermodynamic potentials . . . . . . . . . . . . . . . . 48
1.9 Equilibrium and Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
1.10 Applications of Thermodynamics . . . . . . . . . . . . . . . . . . . . . . . . 51
1.10.1 Adiabatic free expansion revisited . . . . . . . . . . . . . . . . . . . 52
1.10.2 Maxwell relations from S(E, V, N) . . . . . . . . . . . . . . . . . . . 53
1.10.3 van der Waals equation of state . . . . . . . . . . . . . . . . . . . . 54
1.10.4 Thermodynamic response functions . . . . . . . . . . . . . . . . . . 55
1.10.5 Joule effect: free expansion of a gas . . . . . . . . . . . . . . . . . . 57
1.10.6 Throttling: the Joule-Thompson effect . . . . . . . . . . . . . . . . 59
1.11 Entropy of Mixing and the Gibbs Paradox . . . . . . . . . . . . . . . . . . 62
1.11.1 Entropy and combinatorics . . . . . . . . . . . . . . . . . . . . . . . 64
1.11.2 Weak solutions and osmotic pressure . . . . . . . . . . . . . . . . . 66
1.11.3 Effect of impurities on boiling and freezing points . . . . . . . . . . 68
1.12 Some Concepts in Thermochemistry . . . . . . . . . . . . . . . . . . . . . . 69
1.12.1 Chemical reactions and the law of mass action . . . . . . . . . . . . 69
1.12.2 Enthalpy of formation . . . . . . . . . . . . . . . . . . . . . . . . . 71
1.12.3 Bond enthalpies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
1.13 Phase Transitions and Phase Equilibria . . . . . . . . . . . . . . . . . . . . 76
1.13.1 p-v-T surfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
1.13.2 The Clausius-Clapeyron relation . . . . . . . . . . . . . . . . . . . . 78
1.13.3 Liquid-solid line in H₂O . . . . . . . . . . . . . . . . . . . . . . . . 80
1.13.4 Slow melting of ice : a quasistatic but irreversible process . . . . . . 82
1.13.5 Gibbs phase rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
1.13.6 Binary solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
1.13.7 The van der Waals system . . . . . . . . . . . . . . . . . . . . . . . 93
1.14 Appendix I : Integrating factors . . . . . . . . . . . . . . . . . . . . . . . . 100
1.15 Appendix II : Legendre Transformations . . . . . . . . . . . . . . . . . . . . 101
1.16 Appendix III : Useful Mathematical Relations . . . . . . . . . . . . . . . . 103
2 Ergodicity and the Approach to Equilibrium 109
2.1 Equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
2.2 The Master Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
2.2.1 Example: radioactive decay . . . . . . . . . . . . . . . . . . . . . . 110
2.2.2 Decomposition of Γ_ij . . . . . . . . . . . . . . . . . . . . . . . . . . 112
2.3 Boltzmann's H-theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
2.4 Hamiltonian Evolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
2.5 Evolution of Phase Space Volumes . . . . . . . . . . . . . . . . . . . . . . . 116
2.5.1 Liouville's Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
2.6 Irreversibility and Poincaré Recurrence . . . . . . . . . . . . . . . . . . . . 120
2.6.1 Poincaré recurrence theorem . . . . . . . . . . . . . . . . . . . . . . 120
2.7 Kac Ring Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
2.8 Remarks on Ergodic Theory . . . . . . . . . . . . . . . . . . . . . . . . . . 125
2.8.1 The microcanonical ensemble . . . . . . . . . . . . . . . . . . . . . 128
2.8.2 Ergodicity and mixing . . . . . . . . . . . . . . . . . . . . . . . . . 128
3 Statistical Ensembles 133
3.1 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
3.2 Probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
3.2.1 Central limit theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 136
3.2.2 Multidimensional Gaussian integral . . . . . . . . . . . . . . . . . . 137
3.3 Microcanonical Ensemble (µCE) . . . . . . . . . . . . . . . . . . . . . . . . 138
3.3.1 Density of states . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
3.3.2 Arbitrariness in the definition of S(E) . . . . . . . . . . . . . . . . 142
3.3.3 Ultra-relativistic ideal gas . . . . . . . . . . . . . . . . . . . . . . . 143
3.4 The Quantum Mechanical Trace . . . . . . . . . . . . . . . . . . . . . . . . 143
3.4.1 The density matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
3.4.2 Averaging the DOS . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
3.4.3 Coherent states . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
3.5 Thermal Equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
3.6 Ordinary Canonical Ensemble (OCE) . . . . . . . . . . . . . . . . . . . . . 149
3.6.1 Averages within the OCE . . . . . . . . . . . . . . . . . . . . . . . 150
3.6.2 Entropy and free energy . . . . . . . . . . . . . . . . . . . . . . . . 150
3.6.3 Fluctuations in the OCE . . . . . . . . . . . . . . . . . . . . . . . . 151
3.6.4 Thermodynamics revisited . . . . . . . . . . . . . . . . . . . . . . . 153
3.6.5 Generalized susceptibilities . . . . . . . . . . . . . . . . . . . . . . . 154
3.7 Grand Canonical Ensemble (GCE) . . . . . . . . . . . . . . . . . . . . . . . 155
3.7.1 Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
3.7.2 Gibbs-Duhem relation . . . . . . . . . . . . . . . . . . . . . . . . . 157
3.7.3 Generalized susceptibilities in the GCE . . . . . . . . . . . . . . . . 157
3.7.4 Fluctuations in the GCE . . . . . . . . . . . . . . . . . . . . . . . . 158
3.8 Gibbs Ensemble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
3.9 Statistical Ensembles from Maximum Entropy . . . . . . . . . . . . . . . . 159
3.9.1 µCE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
3.9.2 OCE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
3.9.3 GCE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
3.10 Ideal Gas Statistical Mechanics . . . . . . . . . . . . . . . . . . . . . . . . . 161
3.10.1 Maxwell velocity distribution . . . . . . . . . . . . . . . . . . . . . . 162
3.10.2 Equipartition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
3.11 Selected Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
3.11.1 Spins in an external magnetic field . . . . . . . . . . . . . . . . . . 165
3.11.2 Negative temperature (!) . . . . . . . . . . . . . . . . . . . . . . . . 167
3.11.3 Adsorption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
3.11.4 Elasticity of wool . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
3.11.5 Noninteracting spin dimers . . . . . . . . . . . . . . . . . . . . . . . 171
3.12 Quantum Statistics and the Boltzmann Limit . . . . . . . . . . . . . . . . . 172
3.13 Statistical Mechanics of Molecular Gases . . . . . . . . . . . . . . . . . . . 174
3.13.1 Ideal gas law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
3.13.2 The internal coordinate partition function . . . . . . . . . . . . . . 176
3.13.3 Rotations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
3.13.4 Vibrations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
3.13.5 Two-level systems : Schottky anomaly . . . . . . . . . . . . . . . . 179
3.13.6 Electronic and nuclear excitations . . . . . . . . . . . . . . . . . . . 181
3.14 Dissociation of Molecular Hydrogen . . . . . . . . . . . . . . . . . . . . . . 183
3.15 Lee-Yang Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
3.15.1 Electrostatic analogy . . . . . . . . . . . . . . . . . . . . . . . . . . 186
3.15.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
3.16 Appendix I : Additional Examples . . . . . . . . . . . . . . . . . . . . . . . 188
3.16.1 Three state system . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
3.16.2 Spins and vacancies on a surface . . . . . . . . . . . . . . . . . . . . 189
3.16.3 Fluctuating interface . . . . . . . . . . . . . . . . . . . . . . . . . . 191
3.17 Appendix II : Canonical Transformations in Hamiltonian Mechanics . . . . 193
4 Noninteracting Quantum Systems 195
4.1 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
4.2 Grand Canonical Ensemble for Quantum Systems . . . . . . . . . . . . . . 196
4.2.1 Maxwell-Boltzmann limit . . . . . . . . . . . . . . . . . . . . . . . . 197
4.2.2 Single particle density of states . . . . . . . . . . . . . . . . . . . . 198
4.3 Quantum Ideal Gases : Low Density Expansions . . . . . . . . . . . . . . . 199
4.3.1 Virial expansion of the equation of state . . . . . . . . . . . . . . . 201
4.3.2 Ballistic dispersion . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
4.4 Photon Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
4.4.1 Classical arguments for the photon gas . . . . . . . . . . . . . . . . 206
4.4.2 Surface temperature of the earth . . . . . . . . . . . . . . . . . . . . 207
4.4.3 Distribution of blackbody radiation . . . . . . . . . . . . . . . . . . 208
4.4.4 What if the sun emitted ferromagnetic spin waves? . . . . . . . . . 209
4.5 Lattice Vibrations : Einstein and Debye Models . . . . . . . . . . . . . . . 210
4.5.1 One-dimensional chain . . . . . . . . . . . . . . . . . . . . . . . . . 210
4.5.2 General theory of lattice vibrations . . . . . . . . . . . . . . . . . . 212
4.5.3 Einstein and Debye models . . . . . . . . . . . . . . . . . . . . . . . 216
4.5.4 Melting and the Lindemann criterion . . . . . . . . . . . . . . . . . 217
4.5.5 Goldstone bosons . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
4.6 The Ideal Bose Gas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
4.6.1 Isotherms for the ideal Bose gas . . . . . . . . . . . . . . . . . . . . 223
4.6.2 The λ-transition in Liquid ⁴He . . . . . . . . . . . . . . . . . . . . . 225
4.6.3 Fountain effect in superfluid ⁴He . . . . . . . . . . . . . . . . . . . . 227
4.6.4 Bose condensation in optical traps . . . . . . . . . . . . . . . . . . . 228
4.6.5 Example problem from Fall 2004 UCSD graduate written exam . . 230
4.7 The Ideal Fermi Gas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
4.7.1 The Fermi distribution . . . . . . . . . . . . . . . . . . . . . . . . . 233
4.7.2 T = 0 and the Fermi surface . . . . . . . . . . . . . . . . . . . . . . 233
4.7.3 Spin-split Fermi surfaces . . . . . . . . . . . . . . . . . . . . . . . . 235
4.7.4 The Sommerfeld expansion . . . . . . . . . . . . . . . . . . . . . . . 237
4.7.5 Chemical potential shift . . . . . . . . . . . . . . . . . . . . . . . . 239
4.7.6 Specific heat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
4.7.7 Magnetic susceptibility and Pauli paramagnetism . . . . . . . . . . 241
4.7.8 Landau diamagnetism . . . . . . . . . . . . . . . . . . . . . . . . . . 243
4.7.9 White dwarf stars . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
5 Interacting Systems 249
5.1 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
5.2 Ising Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
5.2.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
5.2.2 Ising model in one dimension . . . . . . . . . . . . . . . . . . . . . . 250
5.2.3 H = 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
5.2.4 Chain with free ends . . . . . . . . . . . . . . . . . . . . . . . . . . 252
5.3 Potts Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
5.3.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
5.3.2 Transfer matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
5.4 Weakly Nonideal Gases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
5.4.1 Mayer cluster expansion . . . . . . . . . . . . . . . . . . . . . . . . 257
5.4.2 Cookbook recipe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
5.4.3 Lowest order expansion . . . . . . . . . . . . . . . . . . . . . . . . . 261
5.4.4 Hard sphere gas in three dimensions . . . . . . . . . . . . . . . . . . 263
5.4.5 Weakly attractive tail . . . . . . . . . . . . . . . . . . . . . . . . . . 264
5.4.6 Spherical potential well . . . . . . . . . . . . . . . . . . . . . . . . . 265
5.4.7 Hard spheres with a hard wall . . . . . . . . . . . . . . . . . . . . . 266
5.5 Liquid State Physics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
5.5.1 The many-particle distribution function . . . . . . . . . . . . . . . . 269
5.5.2 Averages over the distribution . . . . . . . . . . . . . . . . . . . . . 270
5.5.3 Virial equation of state . . . . . . . . . . . . . . . . . . . . . . . . . 274
5.5.4 Correlations and scattering . . . . . . . . . . . . . . . . . . . . . . . 276
5.5.5 Correlation and response . . . . . . . . . . . . . . . . . . . . . . . . 279
5.5.6 BBGKY hierarchy . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
5.5.7 Ornstein-Zernike theory . . . . . . . . . . . . . . . . . . . . . . . . . 282
5.5.8 Percus-Yevick equation . . . . . . . . . . . . . . . . . . . . . . . . . 283
5.5.9 Long wavelength behavior and the Ornstein-Zernike approximation 285
5.6 Coulomb Systems : Plasmas and the Electron Gas . . . . . . . . . . . . . . 287
5.6.1 Electrostatic potential . . . . . . . . . . . . . . . . . . . . . . . . . 287
5.6.2 Debye-Hückel theory . . . . . . . . . . . . . . . . . . . . . . . . . . 288
5.6.3 The electron gas : Thomas-Fermi screening . . . . . . . . . . . . . . 290
6 Mean Field Theory 295
6.1 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
6.2 The Lattice Gas and the Ising Model . . . . . . . . . . . . . . . . . . . . . 296
6.2.1 Fluid and magnetic phase diagrams . . . . . . . . . . . . . . . . . . 297
6.2.2 Gibbs-Duhem relation for magnetic systems . . . . . . . . . . . . . 299
6.3 Order-Disorder Transitions . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
6.4 Mean Field Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301
6.4.1 h = 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
6.4.2 Specific heat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304
6.4.3 h ≠ 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
6.4.4 Magnetization dynamics . . . . . . . . . . . . . . . . . . . . . . . . 307
6.4.5 Beyond nearest neighbors . . . . . . . . . . . . . . . . . . . . . . . . 310
6.5 Ising Model with Long-Ranged Forces . . . . . . . . . . . . . . . . . . . . . 310
6.6 Variational Density Matrix Method . . . . . . . . . . . . . . . . . . . . . . 312
6.6.1 Variational density matrix for the Ising model . . . . . . . . . . . . 313
6.6.2 Mean Field Theory of the Potts Model . . . . . . . . . . . . . . . . 316
6.6.3 Mean Field Theory of the XY Model . . . . . . . . . . . . . . . . . 318
6.7 Landau Theory of Phase Transitions . . . . . . . . . . . . . . . . . . . . . . 321
6.7.1 Cubic terms in Landau theory : first order transitions . . . . . . . . 323
6.7.2 Magnetization dynamics . . . . . . . . . . . . . . . . . . . . . . . . 324
6.7.3 Sixth order Landau theory : tricritical point . . . . . . . . . . . . . 326
6.7.4 Hysteresis for the sextic potential . . . . . . . . . . . . . . . . . . . 328
6.8 Correlation and Response in Mean Field Theory . . . . . . . . . . . . . . . 330
6.8.1 Calculation of the response functions . . . . . . . . . . . . . . . . . 332
6.9 Global Symmetries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
6.9.1 Lower critical dimension . . . . . . . . . . . . . . . . . . . . . . . . 336
6.9.2 Continuous symmetries . . . . . . . . . . . . . . . . . . . . . . . . . 338
6.10 Random Systems : Imry-Ma Argument . . . . . . . . . . . . . . . . . . . . 339
6.11 Ginzburg-Landau Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
6.11.1 Domain wall profile . . . . . . . . . . . . . . . . . . . . . . . . . . . 343
6.11.2 Derivation of Ginzburg-Landau free energy . . . . . . . . . . . . . . 344
6.12 Ginzburg Criterion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
6.13 Appendix I : Equivalence of the Mean Field Descriptions . . . . . . . . . . 349
6.13.1 Variational Density Matrix . . . . . . . . . . . . . . . . . . . . . . . 350
6.13.2 Mean Field Approximation . . . . . . . . . . . . . . . . . . . . . . . 352
6.14 Appendix II : Blume-Capel Model . . . . . . . . . . . . . . . . . . . . . . . 353
6.15 Appendix III : Ising Antiferromagnet in an External Field . . . . . . . . . . 354
6.16 Appendix IV : Canted Quantum Antiferromagnet . . . . . . . . . . . . . . 358
6.17 Appendix V : Coupled Order Parameters . . . . . . . . . . . . . . . . . . . 360
7 Nonequilibrium Phenomena 367
7.1 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
7.2 Equilibrium, Nonequilibrium and Local Equilibrium . . . . . . . . . . . . . 368
7.3 Boltzmann Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
7.3.1 Collisionless Boltzmann equation . . . . . . . . . . . . . . . . . . . 371
7.3.2 Collisional invariants . . . . . . . . . . . . . . . . . . . . . . . . . . 372
7.3.3 Scattering processes . . . . . . . . . . . . . . . . . . . . . . . . . . . 373
7.3.4 Detailed balance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375
7.4 H-Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
7.5 Weakly Inhomogeneous Gas . . . . . . . . . . . . . . . . . . . . . . . . . . . 378
7.6 Relaxation Time Approximation . . . . . . . . . . . . . . . . . . . . . . . . 380
7.6.1 Computation of the scattering time . . . . . . . . . . . . . . . . . . 381
7.6.2 Thermal conductivity . . . . . . . . . . . . . . . . . . . . . . . . . . 382
7.6.3 Viscosity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
7.6.4 Quick and Dirty Treatment of Transport . . . . . . . . . . . . . . . 386
7.6.5 Thermal diffusivity, kinematic viscosity, and Prandtl number . . . . 387
7.6.6 Oscillating external force . . . . . . . . . . . . . . . . . . . . . . . . 388
7.7 Nonequilibrium Quantum Transport . . . . . . . . . . . . . . . . . . . . . . 389
7.8 Linearized Boltzmann Equation . . . . . . . . . . . . . . . . . . . . . . . . 391
7.8.1 Linear algebraic properties of L . . . . . . . . . . . . . . . . . . . . 392
7.8.2 Currents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393
7.9 Stochastic Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 394
7.9.1 Langevin equation and Brownian motion . . . . . . . . . . . . . . . 394
7.9.2 Langevin equation for a particle in a harmonic well . . . . . . . . . 399
7.9.3 General Linear Autonomous Inhomogeneous ODEs . . . . . . . . . 400
7.9.4 Discrete random walk . . . . . . . . . . . . . . . . . . . . . . . . . . 403
7.9.5 Fokker-Planck equation . . . . . . . . . . . . . . . . . . . . . . . . . 404
7.9.6 Brownian motion redux . . . . . . . . . . . . . . . . . . . . . . . . . 405
7.10 Appendix I : Example Problem (advanced) . . . . . . . . . . . . . . . . . . 406
7.11 Appendix II : Distributions and Functionals . . . . . . . . . . . . . . . . . . 409
7.12 Appendix III : More on Inhomogeneous Autonomous Linear ODEs . . . . 412
7.13 Appendix IV : Kramers-Krönig Relations . . . . . . . . . . . . . . . . . . . 415
0.1 Preface
This is a proto-preface. A more complete preface will be written after these notes are
completed.
These lecture notes are intended to supplement a course in statistical physics at the upper
division undergraduate or beginning graduate level.
I was fortunate to learn this subject from one of the great statistical physicists of our time,
John Cardy.
I am grateful to my wife Joyce and to my children Ezra and Lily for putting up with all the
outrageous lies I've told them about getting off the computer in just a few minutes while
working on these notes.
These notes are dedicated to the only two creatures I know who are never angry with me:
my father and my dog.
Figure 1: My father (Louis) and my dog (Henry).
Chapter 1
Thermodynamics
1.1 References
E. Fermi, Thermodynamics (Dover, 1956)
This outstanding and inexpensive little book is a model of clarity.
A. H. Carter, Classical and Statistical Thermodynamics
(Benjamin Cummings, 2000)
A very relaxed treatment appropriate for undergraduate physics majors.
H. B. Callen, Thermodynamics and an Introduction to Thermostatistics
(2nd edition, Wiley, 1985)
A comprehensive text appropriate for an extended course on thermodynamics.
D. V. Schroeder, An Introduction to Thermal Physics (Addison-Wesley, 2000)
An excellent thermodynamics text appropriate for upper division undergraduates.
Contains many illustrative practical applications.
D. Kondepudi and I. Prigogine, Modern Thermodynamics: From Heat Engines to
Dissipative Structures (Wiley, 1998)
Lively modern text with excellent choice of topics and good historical content. More
focus on chemical and materials applications than in Callen.
L. E. Reichl, A Modern Course in Statistical Physics (2nd edition, Wiley, 1998)
A graduate level text with an excellent and crisp section on thermodynamics.
1.2 What is Thermodynamics?
Thermodynamics is the study of relations among the state variables describing a thermo-
dynamic system, and of transformations of heat into work and vice versa.
1.2.1 Thermodynamic systems and state variables
Thermodynamic systems contain large numbers of constituent particles, and are described
by a set of state variables which describe the system's properties in an average sense.
State variables, which describe bulk or average properties of a thermodynamic system, are
classied as being either extensive or intensive.
Extensive variables, such as volume V , particle number N, total internal energy E, magnetization M, etc., scale linearly with the system size, i.e. as the first power of the system
volume. If we take two identical thermodynamic systems, place them next to each other,
and remove any barriers between them, then all the extensive variables will double in size.
Intensive variables, such as the pressure p, the temperature T, the chemical potential µ,
the electric field E, etc., are independent of system size, scaling as the zeroth power of the
volume. They are the same throughout the system, if that system is in an appropriate
state of equilibrium. The ratio of any two extensive variables is an intensive variable. For
example, we write n = N/V for the number density, which scales as V⁰.
Classically, the full motion of a system of N point particles requires 6N variables to fully
describe it (3N positions and 3N velocities or momenta, in three space dimensions)¹. Since
the constituents are very small, N is typically very large. A typical solid or liquid, for
example, has a mass density on the order of 1 g/cm³; for gases, roughly 10⁻³ g/cm³. The
constituent atoms have masses of 10⁰ to 10² g per mole, where one mole of X contains N_A
of X, and N_A = 6.0221415 × 10²³ is Avogadro's number. Thus, we roughly expect number
densities n of 10⁻² to 10⁰ mol/cm³ for solids and liquids, and 10⁻⁵ to 10⁻³ mol/cm³
for gases. Clearly we are dealing with fantastically large numbers of
constituent particles in a typical thermodynamic system. The underlying theoretical basis
for thermodynamics, where we use a small number of state variables to describe a system,
is provided by the microscopic theory of statistical mechanics, which we shall study in the
weeks ahead.
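As a quick numerical aside (not part of the original notes), these order-of-magnitude estimates follow from n = ρ/M, the mass density divided by the molar mass. The substances and values below are standard textbook figures chosen for illustration:

```python
# Molar number densities n = rho / M (in mol/cm^3) for a few representative
# substances.  Densities and molar masses are standard handbook values.
substances = {
    # name: (mass density rho in g/cm^3, molar mass M in g/mol)
    "water (liquid)": (1.00, 18.0),
    "copper (solid)": (8.96, 63.5),
    "air at STP (gas)": (1.3e-3, 29.0),
}

for name, (rho, M) in substances.items():
    n = rho / M  # mol/cm^3
    print(f"{name:18s}  n = {n:.2e} mol/cm^3")
```

The condensed phases land in the quoted 10⁻² to 10⁰ mol/cm³ window, and the gas in the 10⁻⁵ to 10⁻³ mol/cm³ window.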
Intensive quantities such as p, T, and n ultimately involve averages over both space and
time. Consider for example the case of a gas enclosed in a container. We can measure the
pressure (relative to atmospheric pressure) by attaching a spring to a moveable wall, as
shown in fig. 1.2. From the displacement of the spring and the value of its spring constant
k we determine the force F. This force is due to the difference in pressures, so p = p₀ + F/A.
Microscopically, the gas consists of constituent atoms or molecules, which are constantly
undergoing collisions with each other and with the walls of the container. When a particle
¹ For a system of N molecules which can freely rotate, we must then specify 3N additional orientational
variables (the Euler angles) and their 3N conjugate momenta. The dimension of phase space is then 12N.
Figure 1.1: From microscale to macroscale : physical versus social sciences.
bounces off a wall, it imparts an impulse 2n̂(n̂·p), where p is the particle's momentum
and n̂ is the unit vector normal to the wall. (Only particles with p·n̂ > 0 will hit the
wall.) Multiply this by the number of particles colliding with the wall per unit time, and
one finds the net force on the wall; dividing by the area gives the pressure p. Within the
gas, each particle travels for a distance ℓ, called the mean free path, before it undergoes a
collision. We can write ℓ = v̄τ, where v̄ is the average particle speed and τ is the mean
free time. When we study the kinetic theory of gases, we will derive formulas for ℓ and v̄
(and hence τ). For now it is helpful to quote some numbers to get an idea of the relevant
distance and time scales. For O₂ gas at standard temperature and pressure (T = 0° C,
p = 1 atm), the mean free path is ℓ ≈ 1.1 × 10⁻⁵ cm, the average speed is v̄ ≈ 480 m/s,
and the mean free time is τ ≈ 2.5 × 10⁻¹⁰ s. Thus, particles in the gas undergo collisions
at a rate τ⁻¹ ≈ 4.0 × 10⁹ s⁻¹. A measuring device, such as our spring, or a thermometer,
effectively performs time and space averages. If there are N_c collisions with a particular
patch of wall during some time interval on which our measurement device responds, then
the root mean square relative fluctuations in the local pressure will be on the order of N_c^(−1/2)
times the average. Since N_c is a very large number, the fluctuations are negligible.
If the system is in steady state, the state variables do not change with time. If furthermore
there are no macroscopic currents of energy or particle number flowing through the system,
the system is said to be in equilibrium. A continuous succession of equilibrium states is
known as a thermodynamic path, which can be represented as a smooth curve in a multi-
dimensional space whose axes are labeled by state variables. A thermodynamic process is
any change or succession of changes which results in a change of the state variables. In a
cyclic process, the initial and final states are the same. In a quasistatic process, the system
passes through a continuous succession of equilibria. A reversible process is one where the
external conditions and the thermodynamic path of the system can be reversed (at first this
seems to be a tautology). All reversible processes are quasistatic, but not all quasistatic
processes are reversible. For example, the slow expansion of a gas against a piston head,
whose counter-force is always infinitesimally less than the force pA exerted by the gas, is
reversible. To reverse this process, we simply add infinitesimally more force to pA and the
gas compresses. A quasistatic process which is not reversible: slowly dragging a block across
the floor, or the slow leak of air from a tire. Irreversible processes, as a rule, are dissipative.
Other special processes include isothermal (dT = 0), isobaric (dp = 0), isochoric (dV = 0),
Figure 1.2: The pressure p of a gas is due to an average over space and time of the impulses
due to the constituent particles.
and adiabatic (đQ = 0, i.e. no heat exchange):

    reversible:  đQ = T dS       isothermal: dT = 0
    spontaneous: đQ < T dS       isochoric:  dV = 0
    adiabatic:   đQ = 0          isobaric:   dp = 0
    quasistatic: infinitely slowly
We shall discuss later the entropy S and its connection with irreversibility.
How many state variables are necessary to fully specify the equilibrium state of a thermo-
dynamic system? For a single component system, such as water which is composed of one
constituent molecule, the answer is three. These can be taken to be T, p, and V . One
always must specify at least one extensive variable, else we cannot determine the overall
size of the system. For a multicomponent system with g different species, we must specify
g + 2 state variables, which may be T, p, N_1, . . . , N_g, where N_a is the number of particles
of species a. Another possibility is the set (T, p, V, x_1, . . . , x_{g−1}), where the concentration
of species a is x_a = N_a/N. Here, N = Σ_{a=1}^{g} N_a is the total number of particles. Note that
Σ_{a=1}^{g} x_a = 1.
It then follows that if we specify more than g + 2 state variables, there must exist a relation
among them. Such relations are known as equations of state. The most famous example is
the ideal gas law,

    pV = N k_B T ,   (1.1)

relating the four state variables T, p, V, and N. Here k_B = 1.3806503 × 10^{-16} erg/K is
Boltzmann's constant. Another example is the van der Waals equation,

    ( p + a N^2/V^2 ) ( V - bN ) = N k_B T ,   (1.2)
where a and b are constants which depend on the molecule which forms the gas. For a third
example, consider a paramagnet, where

    M/V = CH/T ,   (1.3)

where M is the magnetization, H the magnetic field, and C the Curie constant.

Any quantity which, in equilibrium, depends only on the state variables is called a state
function. For example, the total internal energy E of a thermodynamic system is a state
function, and we may write E = E(T, p, V). State functions can also serve as state variables,
although the most natural state variables are those which can be directly measured.
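These equations of state are easy to explore numerically. The Python sketch below solves the van der Waals equation 1.2 for p and compares it against the ideal gas law 1.1; the constants a and b used here are illustrative placeholders (not values quoted in the text), in cgs units to match the value of k_B above.

```python
# A minimal sketch comparing the ideal gas law (eqn. 1.1) with the
# van der Waals equation of state (eqn. 1.2), solved for the pressure.
# The van der Waals constants a, b below are illustrative placeholders.

k_B = 1.3806503e-16   # erg/K, Boltzmann's constant (cgs, as in the text)
N_A = 6.022e23        # particles per mole

def p_ideal(N, V, T):
    """Ideal gas pressure p = N k_B T / V (cgs: erg/cm^3 = barye)."""
    return N * k_B * T / V

def p_vdw(N, V, T, a, b):
    """Pressure from (p + a N^2/V^2)(V - b N) = N k_B T."""
    return N * k_B * T / (V - b * N) - a * N**2 / V**2

# one mole in 1000 cm^3 at 300 K; a, b are per-particle values (assumed)
N, V, T = N_A, 1000.0, 300.0
a, b = 1.0e-35, 7.0e-23   # erg cm^3 and cm^3 (illustrative)
print(p_ideal(N, V, T))       # ~2.49e7 barye, about 24.6 atm
print(p_vdw(N, V, T, a, b))   # lower: the attraction term dominates here
```

For these made-up constants the attractive a-term outweighs the excluded-volume b-term, so the van der Waals pressure comes out below the ideal one; for small enough a the correction changes sign.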
1.2.2 Heat

Once thought to be a type of fluid, heat is now understood in terms of the kinetic theory of
gases, liquids, and solids as a form of energy stored in the disordered motion of constituent
particles. The units of heat are therefore units of energy, and it is appropriate to speak of
heat energy, which we shall simply abbreviate as heat:²

    1 J = 10^7 erg = 6.242 × 10^{18} eV = 2.390 × 10^{-4} kcal = 9.478 × 10^{-4} BTU .   (1.4)

We will use the symbol Q to denote the amount of heat energy absorbed by a system
during some given thermodynamic process, and đQ to denote a differential amount of heat
energy. The symbol đ indicates an inexact differential, about which we shall have more
to say presently. This means that heat is not a state function: there is no heat function
Q(T, p, V).
1.2.3 Work

In general we will write the differential element of work đW done by the system as

    đW = Σ_i F_i dX_i ,   (1.5)

where F_i is a generalized force and dX_i a generalized displacement. The generalized forces
and displacements are themselves state variables, and by convention we will take the
generalized forces to be intensive and the generalized displacements to be extensive. As an
example, in a simple one-component system, we have đW = p dV. More generally, we write

    đW = ( p dV - H·dM - E·dP - σ dA + ... ) - ( μ_1 dN_1 + μ_2 dN_2 + ... ) ,   (1.6)

where the first group of terms is Σ_j y_j dX_j and the second group is Σ_a μ_a dN_a.
² One calorie (cal) is the amount of heat needed to raise 1 g of H₂O from T_0 = 14.5°C to T_1 = 15.5°C at
a pressure of p_0 = 1 atm. One British Thermal Unit (BTU) is the amount of heat needed to raise 1 lb. of
H₂O from T_0 = 63°F to T_1 = 64°F at a pressure of p_0 = 1 atm.
Figure 1.3: The constant volume gas thermometer. The gas is placed in thermal contact
with an object of temperature T. An incompressible fluid of density ρ is used to measure
the pressure difference Δp = p_gas - p_0.
Here we distinguish between two types of work. The first involves changes in quantities such
as volume, magnetization, electric polarization, area, etc. The conjugate forces y_i applied
to the system are then -p, the magnetic field H, the electric field E, and the surface tension σ,
respectively. The second type of work involves changes in the number of constituents of a
given species. For example, energy is required in order to dissociate two hydrogen atoms in
an H₂ molecule. The effect of such a process is dN_{H₂} = -1 and dN_H = +2.

As with heat, đW is an inexact differential, and work W is not a state variable, since it is
path-dependent. There is no work function W(T, p, V).
1.2.4 Pressure and Temperature

The units of pressure (p) are force per unit area. The SI unit is the Pascal (Pa): 1 Pa =
1 N/m^2 = 1 kg/(m s^2). Other units of pressure we will encounter:

    1 bar  ≡ 10^5 Pa
    1 atm  ≡ 1.01325 × 10^5 Pa
    1 torr ≡ 133.3 Pa .

Temperature (T) has a very precise definition from the point of view of statistical mechanics,
as we shall see. Many physical properties depend on the temperature; such properties
are called thermometric properties. For example, the resistivity of a metal ρ(T, p) or the
number density of a gas n(T, p) are both thermometric properties, and can be used to define
Figure 1.4: A sketch of the phase diagram of H₂O (water). Two special points are identified:
the triple point (T_t, p_t), at which there is three phase coexistence, and the critical point
(T_c, p_c), where the latent heat of transformation from liquid to gas vanishes. Not shown are
transitions between several different solid phases.
a temperature scale. Consider the device known as the constant volume gas thermometer
depicted in fig. 1.3, in which the volume or pressure of a gas may be used to measure
temperature. The gas is assumed to be in equilibrium at some pressure p, volume V,
and temperature T. An incompressible fluid of density ρ is used to measure the pressure
difference Δp = p - p_0, where p_0 is the ambient pressure at the top of the reservoir:

    p - p_0 = ρ g (h_2 - h_1) ,   (1.7)

where g is the acceleration due to gravity. The height h_1 of the left column of fluid in the
U-tube provides a measure of the change in the volume of the gas:

    V(h_1) = V(0) - A h_1 ,   (1.8)

where A is the (assumed constant) cross-sectional area of the left arm of the U-tube. The
device can operate in two modes:

Constant pressure mode : The height of the reservoir is adjusted so that the height
difference h_2 - h_1 is held constant. This fixes the pressure p of the gas. The gas
volume still varies with temperature T, and we can define

    T/T_ref = V/V_ref ,   (1.9)

where T_ref and V_ref are the reference temperature and volume, respectively.
Figure 1.5: As the gas density tends to zero, the readings of the constant volume gas
thermometer converge.
Constant volume mode : The height of the reservoir is adjusted so that h_1 = 0, hence
the volume of the gas is held fixed, and the pressure varies with temperature. We
then define

    T/T_ref = p/p_ref ,   (1.10)

where T_ref and p_ref are the reference temperature and pressure, respectively.
What should we use for a reference? One might think that a pot of boiling water will
do, but anyone who has gone camping in the mountains knows that water boils at lower
temperatures at high altitude (lower pressure). This phenomenon is reflected in the phase
diagram for H₂O, depicted in fig. 1.4. There are two special points in the phase diagram,
however. One is the triple point, where the solid, liquid, and vapor (gas) phases all coexist.
The second is the critical point, which is the terminus of the curve separating liquid from gas.
At the critical point, the latent heat of transition between liquid and gas phases vanishes
(more on this later on). The triple point temperature T_t is thus unique and is by definition
T_t = 273.16 K. The pressure at the triple point is 611.7 Pa = 6.056 × 10^{-3} atm.

A question remains: are the two modes of the thermometer compatible? E.g. if we boil
water at p = p_0 = 1 atm, do they yield the same value for T? And what if we use a different
gas in our measurements? In fact, all these measurements will in general be incompatible,
yielding different results for the temperature T. However, in the limit that we use a very
low density gas, all the results converge. This is because all low density gases behave as
ideal gases, and obey the ideal gas equation of state pV = N k_B T.
1.2.5 Standard temperature and pressure

It is customary in the physical sciences to define certain standard conditions with respect to
which conditions may be compared. In thermodynamics, there is a notion of standard
temperature and pressure, abbreviated STP. Unfortunately, there are two different definitions
of STP currently in use, one from the International Union of Pure and Applied Chemistry
(IUPAC), and the other from the U.S. National Institute of Standards and Technology
(NIST). The two standards are:

    IUPAC : T_0 = 0°C = 273.15 K ,  p_0 = 10^5 Pa
    NIST  : T_0 = 20°C = 293.15 K , p_0 = 1 atm = 1.01325 × 10^5 Pa

To make matters worse, in the past it was customary to define STP as T_0 = 0°C and
p_0 = 1 atm. We will use the NIST definition in this course. Unless I slip and use the IUPAC
definition. Figuring out what I mean by STP will keep you on your toes.
The volume of one mole of ideal gas at STP is then

    V = N_A k_B T_0 / p_0 = 22.711 ℓ (IUPAC) or 24.055 ℓ (NIST) ,   (1.11)

where 1 ℓ = 10^3 cm^3 = 10^{-3} m^3 is one liter. Under the old definition of STP as T_0 = 0°C
and p_0 = 1 atm, the volume of one mole of gas at STP is 22.414 ℓ, which is a figure I
remember from my 10th grade chemistry class with Mr. Lawrence.
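These molar volumes follow from one line of arithmetic, V = N_A k_B T_0/p_0 = R T_0/p_0; a minimal sketch (note that the NIST inputs T_0 = 293.15 K, p_0 = 1 atm give 24.055 ℓ):

```python
# Molar volume of an ideal gas, V = R T0 / p0, under the various STP
# conventions discussed in the text.

R = 8.314472  # J/(mol K), gas constant N_A k_B

def molar_volume_liters(T0, p0):
    """One mole: V = R T0 / p0 in m^3, converted to liters."""
    return R * T0 / p0 * 1000.0

V_iupac = molar_volume_liters(273.15, 1.0e5)      # IUPAC: 0 C, 10^5 Pa
V_nist  = molar_volume_liters(293.15, 1.01325e5)  # NIST: 20 C, 1 atm
V_old   = molar_volume_liters(273.15, 1.01325e5)  # old STP: 0 C, 1 atm
print(V_iupac, V_nist, V_old)  # ~22.711, ~24.055, ~22.414 liters
```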
1.2.6 Exact and Inexact Differentials

The differential

    dF = Σ_{i=1}^{k} A_i dx_i   (1.12)

is called exact if there is a function F(x_1, ..., x_k) whose differential gives the right hand
side of eqn. 1.12. In this case, we have

    A_i = ∂F/∂x_i  ⟺  ∂A_i/∂x_j = ∂A_j/∂x_i  for all i, j .   (1.13)

For exact differentials, the integral between fixed endpoints is path-independent:

    ∫_A^B dF = F(x_1^B, ..., x_k^B) - F(x_1^A, ..., x_k^A) ,   (1.14)

from which it follows that the integral of dF around any closed path must vanish:

    ∮ dF = 0 .   (1.15)

When the cross derivatives are not identical, i.e. when ∂A_i/∂x_j ≠ ∂A_j/∂x_i, the differential
is inexact. In this case, the integral of dF is path dependent, and does not depend solely
on the endpoints.
Figure 1.6: Two distinct paths with identical endpoints.
As an example, consider the differential

    dF = K_1 y dx + K_2 x dy .   (1.16)

Let's evaluate the integral of dF, which is the work done, along each of the two paths in
fig. 1.6:

    W^(I)  = K_1 ∫_{x_A}^{x_B} dx y_A + K_2 ∫_{y_A}^{y_B} dy x_B
           = K_1 y_A (x_B - x_A) + K_2 x_B (y_B - y_A)   (1.17)

    W^(II) = K_1 ∫_{x_A}^{x_B} dx y_B + K_2 ∫_{y_A}^{y_B} dy x_A
           = K_1 y_B (x_B - x_A) + K_2 x_A (y_B - y_A) .   (1.18)
Note that in general W^(I) ≠ W^(II). Thus, if we start at point A, the kinetic energy at point
B will depend on the path taken, since the work done is path-dependent.

The difference between the work done along the two paths is

    W^(I) - W^(II) = ∮ dF = (K_2 - K_1) (x_B - x_A) (y_B - y_A) .   (1.19)

Thus, we see that if K_1 = K_2, the work is the same for the two paths. In fact, if K_1 = K_2,
the work would be path-independent, and would depend only on the endpoints. This is
true for any path, and not just piecewise linear paths of the type depicted in fig. 1.6. Thus,
if K_1 = K_2, we are justified in using the notation dF for the differential in eqn. 1.16;
explicitly, we then have F = K_1 xy. However, if K_1 ≠ K_2, the differential is inexact, and
we will henceforth write đF in such cases.
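The path dependence worked out in eqns. 1.17-1.19 can be checked directly in a few lines of Python, using the two piecewise-linear paths of fig. 1.6:

```python
# Integrating dF = K1 y dx + K2 x dy along the two paths of fig. 1.6.
# Each path is piecewise linear, so the integrals reduce to the closed
# forms of eqns. 1.17 and 1.18.

def W_path_I(K1, K2, A, B):
    """Path I: x first (at y = yA), then y (at x = xB); eqn. 1.17."""
    (xA, yA), (xB, yB) = A, B
    return K1 * yA * (xB - xA) + K2 * xB * (yB - yA)

def W_path_II(K1, K2, A, B):
    """Path II: y first (at x = xA), then x (at y = yB); eqn. 1.18."""
    (xA, yA), (xB, yB) = A, B
    return K1 * yB * (xB - xA) + K2 * xA * (yB - yA)

A, B = (0.0, 0.0), (2.0, 3.0)
for K1, K2 in [(1.0, 1.0), (1.0, 2.0)]:
    WI, WII = W_path_I(K1, K2, A, B), W_path_II(K1, K2, A, B)
    # eqn. 1.19 predicts WI - WII = (K2 - K1)(xB - xA)(yB - yA)
    print(K1, K2, WI - WII)
```

For K_1 = K_2 the difference vanishes (the differential is exact, F = K_1 xy); for K_1 ≠ K_2 it equals (K_2 - K_1)(x_B - x_A)(y_B - y_A).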
1.3 The Zeroth Law of Thermodynamics

Equilibrium is established by the exchange of energy, volume, or particle number between
different systems or subsystems:

    energy exchange   ⟹ T = constant ⟹ thermal equilibrium
    volume exchange   ⟹ p = constant ⟹ mechanical equilibrium
    particle exchange ⟹ μ = constant ⟹ chemical equilibrium

Equilibrium is transitive, so

    If A is in equilibrium with B, and B is in equilibrium with C, then A is in
    equilibrium with C.

This is known as the Zeroth Law of Thermodynamics.³
1.4 The First Law of Thermodynamics

The first law is a statement of energy conservation, and is depicted in fig. 1.7. It says, quite
simply, that during a thermodynamic process, the change in a system's internal energy E
is given by the heat energy Q added to the system, minus the work W done by the system:

    ΔE = Q - W .   (1.20)

The differential form of this, the First Law of Thermodynamics, is

    dE = đQ - đW .   (1.21)

Consider a volume V of fluid held in a flask, initially at temperature T_0, and held at
atmospheric pressure. The internal energy is then E_0 = E(T_0, p, V). Now let us contemplate
changing the temperature in two different ways. The first method (A) is to place the flask
on a hot plate until the temperature of the fluid rises to a value T_1. The second method (B)
is to stir the fluid vigorously. In the first case, we add heat Q_A > 0 but no work is done, so
W_A = 0. In the second case, if we thermally insulate the flask and use a stirrer of very low
thermal conductivity, then no heat is added, i.e. Q_B = 0. However, the stirrer does work
-W_B > 0 on the fluid (remember W is the work done by the system). If we end up at the
same temperature T_1, then the final energy is E_1 = E(T_1, p, V) in both cases. We then have

    ΔE = E_1 - E_0 = Q_A = -W_B .   (1.22)

It also follows that for any cyclic transformation, where the state variables are the same at
the beginning and the end, we have

    ΔE_cyclic = Q - W = 0  ⟹  Q = W (cyclic) .   (1.23)

³ As we shall see further below, mechanical equilibrium in fact leads to constant p/T, and chemical
equilibrium to constant μ/T. If there is thermal equilibrium, then T is already constant, and so mechanical
and chemical equilibria guarantee, respectively, the additional constancy of p and μ.
Figure 1.7: The first law of thermodynamics is a statement of energy conservation.
1.4.1 Single component systems

A single component system is specified by three state variables. In many applications,
the total number of particles N is conserved, so it is useful to take N as one of the state
variables. The remaining two can be (T, V) or (T, p) or (p, V). The differential form of the
first law says

    dE = đQ - đW = đQ - p dV + μ dN .   (1.24)

The quantity μ is called the chemical potential. Here we shall be interested in the case
dN = 0, so the last term will not enter into our considerations. We ask: how much heat is
required in order to make an infinitesimal change in temperature, pressure, or volume? We
start by rewriting eqn. 1.24 as

    đQ = dE + p dV - μ dN .   (1.25)

We now must roll up our sleeves and do some work with partial derivatives.
(T, V, N) systems : If the state variables are (T, V, N), we write

    dE = (∂E/∂T)_{V,N} dT + (∂E/∂V)_{T,N} dV + (∂E/∂N)_{T,V} dN .   (1.26)

Then

    đQ = (∂E/∂T)_{V,N} dT + [ (∂E/∂V)_{T,N} + p ] dV + [ (∂E/∂N)_{T,V} - μ ] dN .   (1.27)
(T, p, N) systems : If the state variables are (T, p, N), we write

    dE = (∂E/∂T)_{p,N} dT + (∂E/∂p)_{T,N} dp + (∂E/∂N)_{T,p} dN .   (1.28)

We also write

    dV = (∂V/∂T)_{p,N} dT + (∂V/∂p)_{T,N} dp + (∂V/∂N)_{T,p} dN .   (1.29)

Then

    đQ = [ (∂E/∂T)_{p,N} + p (∂V/∂T)_{p,N} ] dT + [ (∂E/∂p)_{T,N} + p (∂V/∂p)_{T,N} ] dp
       + [ (∂E/∂N)_{T,p} + p (∂V/∂N)_{T,p} - μ ] dN .   (1.30)
(p, V, N) systems : If the state variables are (p, V, N), we write

    dE = (∂E/∂p)_{V,N} dp + (∂E/∂V)_{p,N} dV + (∂E/∂N)_{p,V} dN .   (1.31)

Then

    đQ = (∂E/∂p)_{V,N} dp + [ (∂E/∂V)_{p,N} + p ] dV + [ (∂E/∂N)_{p,V} - μ ] dN .   (1.32)
The heat capacity of a body, C, is by definition the ratio đQ/dT of the amount of heat
absorbed by the body to the associated infinitesimal change in temperature dT. The heat
capacity will in general be different if the body is heated at constant volume or at constant
pressure. Setting dV = 0 gives, from eqn. 1.27,

    C_{V,N} = (đQ/dT)_{V,N} = (∂E/∂T)_{V,N} .   (1.33)

Similarly, if we set dp = 0, then eqn. 1.30 yields

    C_{p,N} = (đQ/dT)_{p,N} = (∂E/∂T)_{p,N} + p (∂V/∂T)_{p,N} .   (1.34)

Unless explicitly stated as otherwise, we shall assume that N is fixed, and will write C_V for
C_{V,N} and C_p for C_{p,N}.
The units of heat capacity are energy divided by temperature, e.g. J/K. The heat capacity
is an extensive quantity, scaling with the size of the system. If we divide by the number of
moles N/N_A, we obtain the molar heat capacity, sometimes called the molar specific heat:
c = C/ν, where ν = N/N_A is the number of moles of substance. Specific heat is also
sometimes quoted in units of heat capacity per gram of substance. We shall define

    c̃ = C/(mN) = c/M = (heat capacity per mole)/(mass per mole) .   (1.35)

Here m is the mass per particle and M is the mass per mole: M = N_A m.
Suppose we raise the temperature of a body from T = T_A to T = T_B. How much heat is
required? We have

    Q = ∫_{T_A}^{T_B} dT C(T) ,   (1.36)
    SUBSTANCE       c_p (J/mol K)  c̃_p (J/g K)    SUBSTANCE       c_p (J/mol K)  c̃_p (J/g K)
    Air             29.07          1.01           H₂O (25°C)      75.34          4.181
    Aluminum        24.2           0.897          H₂O (100°+ C)   37.47          2.08
    Copper          24.47          0.385          Iron            25.1           0.450
    CO₂             36.94          0.839          Lead            26.4           0.127
    Diamond         6.115          0.509          Lithium         24.8           3.58
    Ethanol         112            2.44           Neon            20.786         1.03
    Gold            25.42          0.129          Oxygen          29.38          0.918
    Helium          20.786         5.193          Paraffin (wax)  900            2.5
    Hydrogen        28.82          5.19           Uranium         27.7           0.116
    H₂O (-10°C)     38.09          2.05           Zinc            25.3           0.387

Table 1.1: Specific heat (at 25°C, unless otherwise noted) of some common substances.
(Source: Wikipedia)
where C = C_V or C = C_p depending on whether volume or pressure is held constant. For
ideal gases, as we shall discuss below, C(T) is constant, and thus

    Q = C (T_B - T_A)  ⟹  T_B = T_A + Q/C .   (1.37)

In metals at very low temperatures one finds C = γT, where γ is a constant.⁴ We then
have

    Q = ∫_{T_A}^{T_B} dT C(T) = (1/2) γ (T_B^2 - T_A^2) ,   (1.38)

    T_B = ( T_A^2 + 2 γ^{-1} Q )^{1/2} .   (1.39)
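Eqns. 1.38 and 1.39 can be checked with a short sketch; the value of γ below is illustrative, not a tabulated coefficient for any particular metal:

```python
# Heating a metal at low temperature, where C(T) = gamma * T.

import math

def heat_required(gamma, TA, TB):
    """Q = (1/2) gamma (TB^2 - TA^2), eqn. 1.38."""
    return 0.5 * gamma * (TB**2 - TA**2)

def final_temperature(gamma, TA, Q):
    """TB = sqrt(TA^2 + 2 Q / gamma), eqn. 1.39."""
    return math.sqrt(TA**2 + 2.0 * Q / gamma)

gamma = 1.0e-3      # J/K^2 (illustrative)
TA, TB = 1.0, 5.0   # kelvin
Q = heat_required(gamma, TA, TB)
print(Q)                                # 0.012 J
print(final_temperature(gamma, TA, Q))  # recovers TB = 5.0 K
```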
1.4.2 Ideal gases

The ideal gas equation of state is pV = N k_B T. In order to invoke the formulae in eqns. 1.27,
1.30, and 1.32, we need to know the state function E(T, V, N). A landmark experiment by
Joule in the mid-19th century established that the energy of a low density gas is independent
of its volume.⁵ Essentially, a gas at temperature T was allowed to freely expand from one
volume V to a larger volume V′ > V, with no added heat Q and no work W done. Therefore
the energy cannot change. What Joule found was that the temperature also did not change.
This means that E(T, V, N) = E(T, N) cannot be a function of the volume.

⁴ In most metals, the difference between C_V and C_p is negligible.
⁵ See the description in E. Fermi, Thermodynamics, pp. 22-23.
Figure 1.8: Heat capacity C_V for one mole of hydrogen (H₂) gas. At the lowest temperatures,
only translational degrees of freedom are relevant, and f = 3. At around 200 K, two
rotational modes are excitable and f = 5. Above 1000 K, the vibrational excitations begin
to contribute. Note the logarithmic temperature scale. (Data from H. W. Wooley et al.,
Jour. Natl. Bureau of Standards, 41, 379 (1948).)
Since E is extensive, we conclude that

    E(T, V, N) = ν ε(T) ,   (1.40)

where ν = N/N_A is the number of moles of substance and ε(T) is the molar internal energy.
Note that ν is an extensive variable. From eqns. 1.33 and 1.34, we conclude

    C_V(T) = ν ε′(T) ,  C_p(T) = C_V(T) + νR ,   (1.41)

where we invoke the ideal gas law to obtain the second of these. Empirically it is found that
C_V(T) is temperature independent over a wide range of T, far enough from the boiling point.
We can then write C_V = ν c_V, where ν ≡ N/N_A is the number of moles, and where c_V is
the molar heat capacity. We then have

    c_p = c_V + R ,   (1.42)

where R = N_A k_B = 8.31457 J/mol K is the gas constant. We denote by γ = c_p/c_V the ratio
of specific heat at constant pressure and at constant volume.
From the kinetic theory of gases, one can show that

    monatomic gases:  c_V = (3/2) R ,  c_p = (5/2) R ,  γ = 5/3
    diatomic gases:   c_V = (5/2) R ,  c_p = (7/2) R ,  γ = 7/5
    polyatomic gases: c_V = 3R ,       c_p = 4R ,       γ = 4/3 .
Figure 1.9: Molar heat capacities c_V for three solids. The solid curves correspond to the
predictions of the Debye model, which we shall discuss later.
Digression : kinetic theory of gases

We will conclude in general from noninteracting classical statistical mechanics that the
specific heat of a substance is c_V = (1/2) f R, where f is the number of phase space coordinates,
per particle, for which there is a quadratic kinetic or potential energy function. For example,
a point particle has three translational degrees of freedom, and the kinetic energy is a
quadratic function of their conjugate momenta: H_0 = (p_x^2 + p_y^2 + p_z^2)/2m. Thus, f = 3.
Diatomic molecules have two additional rotational degrees of freedom (we don't count
rotations about the symmetry axis), and their conjugate momenta also appear quadratically
in the kinetic energy, leading to f = 5. For polyatomic molecules, all three Euler angles
and their conjugate momenta are in play, and f = 6.

The reason that f = 5 for diatomic molecules rather than f = 6 is due to quantum
mechanics. While translational eigenstates form a continuum, or are quantized in a box with
Δk_α = 2π/L_α being very small, since the dimensions L_α are macroscopic, angular
momentum, and hence rotational kinetic energy, is quantized. For rotations about a principal axis
with very low moment of inertia I, the corresponding energy scale ħ^2/2I is very large, and
a high temperature is required in order to thermally populate these states. Thus, degrees
of freedom with a quantization energy on the order of or greater than ε_0 are frozen out for
temperatures T ≲ ε_0/k_B.
In solids, each atom is effectively connected to its neighbors by springs; such a potential
arises from quantum mechanical and electrostatic consideration of the interacting atoms.
Thus, each degree of freedom contributes to the potential energy, and its conjugate
momentum contributes to the kinetic energy. This results in f = 6. Assuming only lattice
vibrations, then, the high temperature limit for c_V(T) for any solid is predicted to be
3R = 24.944 J/mol K. This is called the Dulong-Petit law. The high temperature limit is
reached above the so-called Debye temperature, which is roughly proportional to the melting
temperature of the solid.
In table 1.1, we list c_p and c̃_p for some common substances at T = 25°C (unless otherwise
noted). Note that c_p for the monatomic gases He and Ne is to high accuracy given by the
value from kinetic theory, c_p = (5/2) R = 20.7864 J/mol K. For the diatomic gases oxygen (O₂)
and air (mostly N₂ and O₂), kinetic theory predicts c_p = (7/2) R = 29.10, which is close to
the measured values. Kinetic theory predicts c_p = 4R = 33.258 for polyatomic gases; the
measured values for CO₂ and H₂O are both about 10% higher.
1.4.3 Adiabatic transformations of ideal gases

Assuming dN = 0 and E = ν ε(T), eqn. 1.27 tells us that

    đQ = C_V dT + p dV .   (1.43)

Invoking the ideal gas law to write p = νRT/V, and remembering C_V = ν c_V, we have,
setting đQ = 0,

    dT/T + (R/c_V)(dV/V) = 0 .   (1.44)

We can immediately integrate to obtain

    đQ = 0  ⟹  T V^{γ-1} = constant   (1.45)
            ⟹  p V^γ = constant   (1.46)
            ⟹  T^γ p^{1-γ} = constant ,   (1.47)

where the second two equations are obtained from the first by invoking the ideal gas law.
These are all adiabatic equations of state. Note the difference between the adiabatic equation
of state d(pV^γ) = 0 and the isothermal equation of state d(pV) = 0. Equivalently, we can
write these three conditions as

    V^2 T^f = V_0^2 T_0^f ,  p^f V^{f+2} = p_0^f V_0^{f+2} ,  T^{f+2} p^{-2} = T_0^{f+2} p_0^{-2} .   (1.48)
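The three invariants of eqns. 1.45-1.47 can be verified numerically along a model adiabat. A minimal sketch for a diatomic ideal gas (γ = 7/5), with illustrative initial conditions:

```python
# Adiabatic compression of one mole of a diatomic ideal gas:
# T V^{gamma-1}, p V^gamma, and T^gamma p^{1-gamma} all stay constant.

gamma = 7.0 / 5.0
R = 8.314   # J/(mol K)
nu = 1.0    # moles

V1, T1 = 0.024, 293.0   # initial state (m^3, K); illustrative
p1 = nu * R * T1 / V1   # ideal gas law

V2 = 0.5 * V1                          # halve the volume adiabatically
T2 = T1 * (V1 / V2) ** (gamma - 1.0)   # from T V^{gamma-1} = const
p2 = nu * R * T2 / V2

print(T2)       # ~386.6 K: the gas heats up
print(p2 / p1)  # ~2.639 = 2**gamma
```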
It turns out that air is a rather poor conductor of heat. This suggests the following model
for an adiabatic atmosphere. The hydrostatic pressure decrease associated with an increase
dz in height is dp = -ρg dz, where ρ is the density and g the acceleration due to gravity.
Assuming the gas is ideal, the density can be written as ρ = Mp/RT, where M is the molar
mass. Thus,

    dp/p = -(Mg/RT) dz .   (1.49)

If the height changes are adiabatic, then, from d(T^γ p^{1-γ}) = 0, we have

    dT = [(γ-1)/γ] (T dp/p) = -[(γ-1)/γ] (Mg/R) dz ,   (1.50)

with the solution

    T(z) = T_0 - [(γ-1)/γ] (Mg/R) z = [ 1 - ((γ-1)/γ)(z/λ) ] T_0 ,   (1.51)
where T_0 = T(0) is the temperature at the earth's surface, and

    λ = R T_0 / Mg .   (1.52)

With M = 28.88 g/mol and γ = 7/5 for air, and assuming T_0 = 293 K, we find λ = 8.6 km, and
dT/dz = -(1 - γ^{-1}) T_0/λ = -9.7 K/km. Note that in this model the atmosphere ends at
a height z_max = γλ/(γ-1) = 30 km.
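The numbers quoted above follow directly from eqns. 1.51 and 1.52; a short sketch (g = 9.8 m/s² is assumed):

```python
# Adiabatic atmosphere: scale height lambda (eqn. 1.52), lapse rate
# dT/dz (eqn. 1.51), and the height z_max at which T(z) vanishes.

R = 8.314      # J/(mol K)
M = 28.88e-3   # kg/mol, molar mass of air (value used in the text)
g = 9.8        # m/s^2 (assumed)
gamma = 7.0 / 5.0
T0 = 293.0     # K

lam = R * T0 / (M * g)                   # ~8.6 km
lapse = -(1.0 - 1.0 / gamma) * T0 / lam  # dT/dz, in K/m
z_max = gamma * lam / (gamma - 1.0)      # ~30 km
print(lam / 1e3, lapse * 1e3, z_max / 1e3)  # km, K/km, km
```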
Again invoking the adiabatic equation of state, we can find p(z):

    p(z)/p_0 = (T/T_0)^{γ/(γ-1)} = [ 1 - ((γ-1)/γ)(z/λ) ]^{γ/(γ-1)}   (1.53)

Recall that

    e^x = lim_{k→∞} (1 + x/k)^k .   (1.54)

Thus, in the limit γ → 1, where k = γ/(γ-1) → ∞, we have p(z) = p_0 exp(-z/λ). Finally,
since ρ ∝ p/T from the ideal gas law, we have

    ρ(z)/ρ_0 = [ 1 - ((γ-1)/γ)(z/λ) ]^{1/(γ-1)} .   (1.55)
1.4.4 Adiabatic free expansion

Consider the situation depicted in fig. 1.10. A quantity (ν moles) of gas in equilibrium
at temperature T and volume V_1 is allowed to expand freely into an evacuated chamber of
volume V_2 by the removal of a barrier. Clearly no work is done on or by the gas during
this process, hence W = 0. If the walls are everywhere insulating, so that no heat can pass
through them, then Q = 0 as well. The First Law then gives ΔE = Q - W = 0, and there
is no change in energy.

If the gas is ideal, then since E(T, V, N) = ν c_V T, then ΔE = 0 gives ΔT = 0, and there
is no change in temperature. (If the walls are insulating against the passage of heat, they
must also prevent the passage of particles, so ΔN = 0.) There is of course a change in
volume: ΔV = V_2, hence there is a change in pressure. The initial pressure is p = N k_B T/V_1
and the final pressure is p′ = N k_B T/(V_1 + V_2).
If the gas is nonideal, then the temperature will in general change. Suppose, for example,
that E(T, V, N) = α V^x N^{1-x} T^y, where α, x, and y are constants. This form is properly
extensive: if V and N double, then E doubles. If the volume changes from V to V′ under
an adiabatic free expansion, then we must have, from ΔE = 0,

    (V/V′)^x = (T′/T)^y  ⟹  T′ = T (V/V′)^{x/y} .   (1.56)

If x/y > 0, the temperature decreases upon the expansion. If x/y < 0, the temperature
increases. Without an equation of state, we can't say what happens to the pressure.
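Eqn. 1.56 is simple enough to evaluate directly; the sketch below uses illustrative values of x and y (the prefactor α drops out of the temperature ratio):

```python
# Temperature change in adiabatic free expansion of the model nonideal
# gas with E = a V^x N^{1-x} T^y: from Delta E = 0, T' = T (V/V')^{x/y}.

def T_final(T, V, Vp, x, y):
    return T * (V / Vp) ** (x / y)

T, V, Vp = 300.0, 1.0, 2.0   # double the volume
print(T_final(T, V, Vp, x=1.0, y=2.0))    # x/y > 0: cools, ~212.1 K
print(T_final(T, V, Vp, x=-1.0, y=2.0))   # x/y < 0: warms, ~424.3 K
```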
Figure 1.10: In the adiabatic free expansion of a gas, there is volume expansion with no
work or heat exchange with the environment: ΔE = Q = W = 0.
Adiabatic free expansion of a gas is a spontaneous process, arising due to the natural internal
dynamics of the system. It is also irreversible. If we wish to take the gas back to its original
state, we must do work on it to compress it. If the gas is ideal, then we can follow a
thermodynamic path along an isotherm. The work done on the gas during compression is
then

    𝒲 = -N k_B T ∫_{V_f}^{V_i} dV/V = N k_B T ln(V_f/V_i) = N k_B T ln(1 + V_2/V_1) .   (1.57)

The work done by the gas is W = ∫ p dV = -𝒲. During the compression, heat energy
Q = W < 0 is transferred to the gas. Thus, 𝒬 = 𝒲 > 0 is given off by the gas to its
environment.
1.5 Heat Engines

A heat engine is a device which takes a thermodynamic system through a repeated cycle
which can be represented as a succession of equilibrium states: A → B → C → ··· → A. The
net result of such a cyclic process is to convert heat into mechanical work, or vice versa.

For a system in equilibrium at temperature T, there is a thermodynamically large amount
of internal energy stored in the random internal motion of its constituent particles. Later,
when we study statistical mechanics, we will see how each quadratic degree of freedom
in the Hamiltonian contributes (1/2) k_B T to the total internal energy. An immense body in
equilibrium at temperature T has an enormous heat capacity C, hence extracting a finite
quantity of heat Q from it results in a temperature change ΔT = -Q/C which is utterly
negligible. Such a body is called a heat bath, or thermal reservoir. A perfect engine would,
in each cycle, extract an amount of heat Q from the bath and convert it into work. Since
ΔE = 0 for a cyclic process, the First Law then gives W = Q. This situation is depicted
schematically in fig. 1.11. One could imagine running this process virtually indefinitely,
Figure 1.11: A perfect engine would extract heat Q from a thermal reservoir at some
temperature T and convert it into useful mechanical work W. This process is alas impossible,
according to the Second Law of thermodynamics. The inverse process, where work 𝒲 is
converted into heat Q, is always possible.
slowly sucking energy out of an immense heat bath, converting the random thermal motion
of its constituent molecules into useful mechanical work. Sadly, this is not possible:

    A transformation whose only final result is to extract heat from a source at fixed
    temperature and transform that heat into work is impossible.

This is known as the Postulate of Lord Kelvin. It is equivalent to the postulate of Clausius,

    A transformation whose only result is to transfer heat from a body at a given
    temperature to a body at higher temperature is impossible.

These postulates, which have been repeatedly validated by empirical observations, constitute
the Second Law of Thermodynamics.
1.5.1 Engines and refrigerators

While it is not possible to convert heat into work with 100% efficiency, it is possible to
transfer heat from one thermal reservoir to another one, at lower temperature, and to
convert some of that heat into work. This is what an engine does. The energy accounting
for one cycle of the engine is depicted in the left hand panel of fig. 1.12. An amount of heat
Q_2 > 0 is extracted from the reservoir at temperature T_2. Since the reservoir is assumed
to be enormous, its temperature change ΔT_2 = -Q_2/C_2 is negligible, and its temperature
remains constant; this is what it means for an object to be a reservoir. A lesser amount of
heat, Q_1, with 0 < Q_1 < Q_2, is deposited in a second reservoir at a lower temperature T_1.
Its temperature change ΔT_1 = +Q_1/C_1 is also negligible. The difference W = Q_2 - Q_1 is
extracted as useful work. We define the efficiency, η, of the engine as the ratio of the work
done to the heat extracted from the upper reservoir, per cycle:

    η = W/Q_2 = 1 - Q_1/Q_2 .   (1.58)
Figure 1.12: An engine (left) extracts heat Q_2 from a reservoir at temperature T_2 and
deposits a smaller amount of heat Q_1 into a reservoir at a lower temperature T_1, during
each cycle. The difference W = Q_2 - Q_1 is transformed into mechanical work. A refrigerator
(right) performs the inverse process, drawing heat 𝒬_1 from a low temperature reservoir and
depositing heat 𝒬_2 = 𝒬_1 + 𝒲 into a high temperature reservoir, where 𝒲 is the mechanical
(or electrical) work done per cycle.
This is a natural definition of efficiency, since it will cost us fuel to maintain the temperature
of the upper reservoir over many cycles of the engine. Thus, the efficiency is proportional
to the ratio of the work done to the cost of the fuel.

A refrigerator works according to the same principles, but the process runs in reverse. An
amount of heat 𝒬_1 is extracted from the lower reservoir (the inside of our refrigerator)
and is pumped into the upper reservoir. As Clausius' form of the Second Law asserts, it
is impossible for this to be the only result of our cycle. Some amount of work 𝒲 must be
performed on the refrigerator in order for it to extract the heat 𝒬_1. Since ΔE = 0 for the
cycle, a heat 𝒬_2 = 𝒲 + 𝒬_1 must be deposited into the upper reservoir during each cycle.
The analog of efficiency here is called the coefficient of refrigeration, κ, defined as

    κ = 𝒬_1/𝒲 = 𝒬_1/(𝒬_2 - 𝒬_1) .   (1.59)

Thus, κ is proportional to the ratio of the heat extracted to the cost of electricity, per cycle.

Please note the deliberate notation here. I am using symbols Q and W to denote the heat
supplied to the engine (or refrigerator) and the work done by the engine, respectively, and
𝒬 and 𝒲 to denote the heat taken from the engine and the work done on the engine.

A perfect engine has Q_1 = 0 and η = 1; a perfect refrigerator has 𝒬_1 = 𝒬_2 and κ = ∞.
Both violate the Second Law. Sadi Carnot (1796-1832) realized that a reversible cyclic
engine operating between two thermal reservoirs must produce the maximum amount of
work W, and that the amount of work produced is independent of the material properties
of the engine. We call any such engine a Carnot engine.
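The per-cycle bookkeeping of eqns. 1.58 and 1.59 can be made concrete with a short sketch; the heats below are illustrative:

```python
# Engine efficiency (eqn. 1.58) and coefficient of refrigeration
# (eqn. 1.59), for given per-cycle heats Q2 (upper) and Q1 (lower).

def engine_efficiency(Q2, Q1):
    """eta = W/Q2 = 1 - Q1/Q2, with W = Q2 - Q1."""
    return 1.0 - Q1 / Q2

def refrigeration_coefficient(Q2, Q1):
    """kappa = Q1 / W = Q1 / (Q2 - Q1)."""
    return Q1 / (Q2 - Q1)

Q2, Q1 = 100.0, 60.0   # joules per cycle (illustrative)
print(engine_efficiency(Q2, Q1))          # 0.4
print(refrigeration_coefficient(Q2, Q1))  # 1.5
```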
The efficiency of a Carnot engine may be used to define a temperature scale. We know from
Carnot's observations that the efficiency η_C can only be a function of the temperatures T_1
and T_2: η_C = η_C(T_1, T_2). We can then define

    T_1/T_2 ≡ 1 - η_C(T_1, T_2) .   (1.60)

Below, in §1.5.3, we will see how, using an ideal gas as the working substance of the
Carnot engine, this temperature scale coincides precisely with the ideal gas temperature
scale from §1.2.4.
1.5.2 Nothing beats a Carnot engine

The Carnot engine is the most efficient engine possible operating between two thermal reservoirs. To see this, let's suppose that an amazing wonder engine has an efficiency even greater than that of the Carnot engine. A key feature of the Carnot engine is its reversibility: we can just go around its cycle in the opposite direction, creating a Carnot refrigerator. Let's use our notional wonder engine to drive a Carnot refrigerator, as depicted in fig. 1.13.

We assume that

    W/Q_2 = η_wonder > η_Carnot = J/Q′_2 .        (1.61)

But from the figure, we have

    W = Q_2 − Q_1 = Q′_2 − Q′_1 = J .        (1.62)

Therefore Q′_2 > Q_2, and we have transferred heat from the lower reservoir to the upper:

    Q′_2 − Q_2 = Q′_1 − Q_1 > 0 .        (1.63)

Clearly Q′_2 − Q_2 is the total heat deposited into the upper reservoir, while Q′_1 − Q_1 is the total heat extracted from the lower reservoir. These quantities must be equal, since there is no net work done, and by our argument both are positive. Therefore, the existence of the wonder engine entails a violation of the Second Law. Since the Second Law is correct (Lord Kelvin articulated it, and who are we to argue with a Lord?), the wonder engine cannot exist.

We further conclude that all reversible engines running between two thermal reservoirs have the same efficiency, which is the efficiency of a Carnot engine. For an irreversible engine, we must have

    η = W/Q_2 = 1 − Q_1/Q_2 ≤ 1 − T_1/T_2 = η_C .        (1.64)

Thus,

    Q_2/T_2 − Q_1/T_1 ≤ 0 .        (1.65)
1.5. HEAT ENGINES 23
Figure 1.13: A wonder engine driving a Carnot refrigerator.
1.5.3 The Carnot cycle

Let us now consider a specific cycle, known as the Carnot cycle, depicted in fig. 1.14. The cycle consists of two adiabats and two isotherms. The work done per cycle is simply the area inside the curve on our p-V diagram:

    W = ∮ p dV .        (1.66)

The gas inside our Carnot engine is called the working substance. Whatever it may be, the system obeys the First Law,

    dE = dQ − dW = dQ − p dV .        (1.67)

We will now assume that the working material is an ideal gas, and we compute W as well as Q_1 and Q_2 to find the efficiency of this cycle. In order to do this, we will rely upon the ideal gas equations,

    E = νRT/(γ − 1)        (1.68)
    pV = νRT ,        (1.69)

where γ = c_p/c_v = 1 + 2/f, where f is the effective number of molecular degrees of freedom contributing to the internal energy. Recall f = 3 for monatomic gases, f = 5 for diatomic gases, and f = 6 for polyatomic gases. The finite difference form of the First Law is

    ΔE = E_f − E_i = Q_if − W_if ,        (1.70)

where i denotes the initial state and f the final state.

Figure 1.14: The Carnot cycle consists of two adiabats (dark red) and two isotherms (blue).

AB: This stage is an isothermal expansion at temperature T_2. It is the power stroke of the engine. We have

    W_AB = ∫_{V_A}^{V_B} dV νRT_2/V = νRT_2 ln(V_B/V_A)        (1.71)

    E_A = E_B = νRT_2/(γ − 1) ,        (1.72)

hence

    Q_AB = ΔE_AB + W_AB = νRT_2 ln(V_B/V_A) .        (1.73)
BC: This stage is an adiabatic expansion. We have

    Q_BC = 0        (1.74)

    ΔE_BC = E_C − E_B = νR(T_1 − T_2)/(γ − 1) .        (1.75)

The energy change is negative, and the heat exchange is zero, so the engine still does some work during this stage:

    W_BC = Q_BC − ΔE_BC = νR(T_2 − T_1)/(γ − 1) .        (1.76)

CD: This stage is an isothermal compression, and we may apply the analysis of the isothermal expansion, mutatis mutandis:

    W_CD = ∫_{V_C}^{V_D} dV νRT_1/V = νRT_1 ln(V_D/V_C)        (1.77)

    E_C = E_D = νRT_1/(γ − 1) ,        (1.78)

hence

    Q_CD = ΔE_CD + W_CD = νRT_1 ln(V_D/V_C) .        (1.79)
DA: This last stage is an adiabatic compression, and we may draw on the results from the adiabatic expansion in BC:

    Q_DA = 0        (1.80)

    ΔE_DA = E_A − E_D = νR(T_2 − T_1)/(γ − 1) .        (1.81)

The energy change is positive, and the heat exchange is zero, so work is done on the engine:

    W_DA = Q_DA − ΔE_DA = νR(T_1 − T_2)/(γ − 1) .        (1.82)

We now add up all the work values from the individual stages to get for the cycle

    W = W_AB + W_BC + W_CD + W_DA        (1.83)
      = νRT_2 ln(V_B/V_A) + νRT_1 ln(V_D/V_C) .        (1.84)

Since we are analyzing a cyclic process, we must have ΔE = 0, hence Q = W, which can of course be verified explicitly, by computing Q = Q_AB + Q_BC + Q_CD + Q_DA. To finish up, recall the adiabatic ideal gas equation of state, d(TV^{γ−1}) = 0. This tells us that

    T_2 V_B^{γ−1} = T_1 V_C^{γ−1}        (1.85)
    T_2 V_A^{γ−1} = T_1 V_D^{γ−1} .        (1.86)

Dividing these two equations, we find

    V_B/V_A = V_C/V_D ,        (1.87)

and therefore

    W = νR(T_2 − T_1) ln(V_B/V_A)        (1.88)
    Q_AB = νRT_2 ln(V_B/V_A) .        (1.89)

Finally, the efficiency is given by the ratio of these two quantities:

    η = W/Q_AB = 1 − T_1/T_2 .        (1.90)
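The bookkeeping above is easy to check numerically. The sketch below (all parameter values are made up; a monatomic ideal gas with f = 3, hence γ = 5/3, is assumed) steps through the four stages and recovers η = 1 − T_1/T_2:

```python
# Numeric check of the ideal-gas Carnot cycle above; nu, T1, T2, VA, VB are
# made-up values. The adiabats fix VC and VD through T V^(gamma-1) = const.
from math import log

nu, R = 1.0, 8.314          # moles, gas constant (J/mol K)
T2, T1 = 600.0, 300.0       # hot and cold reservoir temperatures (K)
VA, VB = 1.0, 3.0           # volumes bounding the power stroke (m^3)
f = 3
gamma = 1.0 + 2.0/f

VC = VB * (T2/T1)**(1.0/(gamma - 1.0))
VD = VA * (T2/T1)**(1.0/(gamma - 1.0))

W_AB = nu*R*T2*log(VB/VA)                # isothermal expansion at T2, eqn 1.71
W_BC = nu*R/(gamma - 1.0) * (T2 - T1)    # adiabatic expansion, eqn 1.76
W_CD = nu*R*T1*log(VD/VC)                # isothermal compression at T1, eqn 1.77
W_DA = nu*R/(gamma - 1.0) * (T1 - T2)    # adiabatic compression, eqn 1.82

W = W_AB + W_BC + W_CD + W_DA
Q_AB = W_AB                              # Delta E = 0 on an isotherm
eta = W / Q_AB

print(eta, 1.0 - T1/T2)                  # the two agree
```

Note that the two adiabatic works W_BC and W_DA cancel exactly, as the derivation requires.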
Figure 1.15: A Stirling cycle consists of two isotherms (blue) and two isochores (green).
1.5.4 The Stirling cycle

Many other engine cycles are possible. The Stirling cycle, depicted in fig. 1.15, consists of two isotherms and two isochores. Recall the isothermal ideal gas equation of state, d(pV) = 0. Thus, for an ideal gas Stirling cycle, we have

    p_A V_1 = p_B V_2  ,  p_D V_1 = p_C V_2 ,        (1.91)

which says

    p_B/p_A = p_C/p_D = V_1/V_2 .        (1.92)

AB: This isothermal expansion is the power stroke. Assuming ν moles of ideal gas throughout, we have pV = νRT_2 = p_A V_1, hence

    W_AB = ∫_{V_1}^{V_2} dV νRT_2/V = νRT_2 ln(V_2/V_1) .        (1.93)

Since AB is an isotherm, we have E_A = E_B, and from ΔE_AB = 0 we conclude Q_AB = W_AB.
BC: Isochoric cooling. Since dV = 0 we have W_BC = 0. The energy change is given by

    ΔE_BC = E_C − E_B = νR(T_1 − T_2)/(γ − 1) ,        (1.94)

which is negative. Since W_BC = 0, we have Q_BC = ΔE_BC.

CD: Isothermal compression. Clearly

    W_CD = ∫_{V_2}^{V_1} dV νRT_1/V = −νRT_1 ln(V_2/V_1) .        (1.95)

Since CD is an isotherm, we have E_C = E_D, and from ΔE_CD = 0 we conclude Q_CD = W_CD.

DA: Isochoric heating. Since dV = 0 we have W_DA = 0. The energy change is given by

    ΔE_DA = E_A − E_D = νR(T_2 − T_1)/(γ − 1) ,        (1.96)

which is positive, and opposite to ΔE_BC. Since W_DA = 0, we have Q_DA = ΔE_DA.

We now add up all the work contributions to obtain

    W = W_AB + W_BC + W_CD + W_DA        (1.97)
      = νR(T_2 − T_1) ln(V_2/V_1) .        (1.98)

The cycle efficiency is once again

    η = W/Q_AB = 1 − T_1/T_2 .        (1.99)
1.5.5 The Otto and Diesel cycles

The Otto cycle is a rough approximation to the physics of a gasoline engine. It consists of two adiabats and two isochores, and is depicted in fig. 1.16. Assuming an ideal gas, along the adiabats we have d(pV^γ) = 0. Thus,

    p_A V_1^γ = p_B V_2^γ  ,  p_D V_1^γ = p_C V_2^γ ,        (1.100)

which says

    p_B/p_A = p_C/p_D = (V_1/V_2)^γ .        (1.101)
Figure 1.16: An Otto cycle consists of two adiabats (dark red) and two isochores (green).
AB: Adiabatic expansion, the power stroke. The heat transfer is Q_AB = 0, so from the First Law we have W_AB = −ΔE_AB = E_A − E_B, thus

    W_AB = (p_A V_1 − p_B V_2)/(γ − 1) = [p_A V_1/(γ − 1)] [1 − (V_1/V_2)^{γ−1}] .        (1.102)

Note that this result can also be obtained from the adiabatic equation of state pV^γ = p_A V_1^γ:

    W_AB = ∫_{V_1}^{V_2} p dV = p_A V_1^γ ∫_{V_1}^{V_2} dV V^{−γ} = [p_A V_1/(γ − 1)] [1 − (V_1/V_2)^{γ−1}] .        (1.103)

BC: Isochoric cooling (exhaust); dV = 0 hence W_BC = 0. The heat Q_BC absorbed is then

    Q_BC = E_C − E_B = V_2 (p_C − p_B)/(γ − 1) .        (1.104)

In a realistic engine, this is the stage in which the old burned gas is ejected and new gas is inserted.
CD: Adiabatic compression; Q_CD = 0 and W_CD = E_C − E_D:

    W_CD = (p_C V_2 − p_D V_1)/(γ − 1) = −[p_D V_1/(γ − 1)] [1 − (V_1/V_2)^{γ−1}] .        (1.105)

DA: Isochoric heating, i.e. the combustion of the gas. As with BC we have dV = 0, and thus W_DA = 0. The heat Q_DA absorbed by the gas is then

    Q_DA = E_A − E_D = V_1 (p_A − p_D)/(γ − 1) .        (1.106)

The total work done per cycle is then

    W = W_AB + W_BC + W_CD + W_DA        (1.107)
      = [(p_A − p_D) V_1/(γ − 1)] [1 − (V_1/V_2)^{γ−1}] ,        (1.108)

and the efficiency is defined to be

    η ≡ W/Q_DA = 1 − (V_1/V_2)^{γ−1} .        (1.109)

The ratio V_2/V_1 is called the compression ratio. We can make our Otto cycle more efficient simply by increasing the compression ratio. The problem with this scheme is that if the fuel mixture becomes too hot, it will spontaneously preignite, and the pressure will jump up before point D in the cycle is reached. A Diesel engine avoids preignition by compressing the air only, and then later spraying the fuel into the cylinder when the air temperature is sufficient for fuel ignition. The rate at which fuel is injected is adjusted so that the ignition process takes place at constant pressure. Thus, in a Diesel engine, step DA is an isobar. The compression ratio is r ≡ V_B/V_D, and the cutoff ratio is s ≡ V_A/V_D. This refinement of the Otto cycle allows for higher compression ratios (of about 20) in practice, and greater engine efficiency.
For the Diesel cycle, we have, briefly,

    W = p_A (V_A − V_D) + (p_A V_A − p_B V_B)/(γ − 1) + (p_C V_C − p_D V_D)/(γ − 1)
      = γ p_A (V_A − V_D)/(γ − 1) − (p_B − p_C) V_B/(γ − 1)        (1.110)

and

    Q_DA = γ p_A (V_A − V_D)/(γ − 1) .        (1.111)

To find the efficiency, we will need to eliminate p_B and p_C in favor of p_A using the adiabatic equation of state d(pV^γ) = 0. Thus,

    p_B = p_A (V_A/V_B)^γ  ,  p_C = p_A (V_D/V_B)^γ ,        (1.112)
Figure 1.17: A Diesel cycle consists of two adiabats (dark red), one isobar (light blue), and
one isochore (green).
where we've used p_D = p_A and V_C = V_B. Putting it all together, the efficiency of the Diesel cycle is

    η = W/Q_DA = 1 − (1/γ) r^{1−γ} (s^γ − 1)/(s − 1) .        (1.113)
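The two efficiency formulas are easy to evaluate side by side. In the sketch below, γ = 7/5 (diatomic air) and the compression/cutoff ratios are illustrative values, not taken from the text:

```python
# Illustrative evaluation of the Otto (eqn 1.109) and Diesel (eqn 1.113)
# efficiencies; gamma = 7/5 assumed, ratios are made-up typical values.
gamma = 7.0/5.0

def eta_otto(compression_ratio, gamma=gamma):
    # eta = 1 - (V1/V2)^(gamma-1), with V2/V1 the compression ratio
    return 1.0 - compression_ratio**(1.0 - gamma)

def eta_diesel(r, s, gamma=gamma):
    # eta = 1 - (1/gamma) r^(1-gamma) (s^gamma - 1)/(s - 1)
    return 1.0 - (1.0/gamma) * r**(1.0 - gamma) * (s**gamma - 1.0)/(s - 1.0)

print(eta_otto(8.0))           # a typical gasoline-engine compression ratio
print(eta_diesel(20.0, 2.0))   # Diesel: higher compression, cutoff ratio s = 2
```

As a consistency check, in the limit s → 1 the factor (s^γ − 1)/(s − 1) tends to γ, and the Diesel efficiency reduces to the Otto result at the same compression ratio.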
1.5.6 The Joule-Brayton cycle

Our final example is the Joule-Brayton cycle, depicted in fig. 1.18, consisting of two adiabats and two isobars. Along the adiabats we have d(pV^γ) = 0. Thus,

    p_2 V_A^γ = p_1 V_D^γ  ,  p_2 V_B^γ = p_1 V_C^γ ,        (1.114)

which says

    V_D/V_A = V_C/V_B = (p_2/p_1)^{1/γ} .        (1.115)
Figure 1.18: A Joule-Brayton cycle consists of two adiabats (dark red) and two isobars
(light blue).
AB: This isobaric expansion at p = p_2 is the power stroke. We have

    W_AB = ∫_{V_A}^{V_B} dV p_2 = p_2 (V_B − V_A)        (1.116)

    ΔE_AB = E_B − E_A = p_2 (V_B − V_A)/(γ − 1)        (1.117)

    Q_AB = ΔE_AB + W_AB = γ p_2 (V_B − V_A)/(γ − 1) .        (1.118)

BC: Adiabatic expansion; Q_BC = 0 and W_BC = E_B − E_C. The work done by the gas is

    W_BC = (p_2 V_B − p_1 V_C)/(γ − 1) = [p_2 V_B/(γ − 1)] [1 − (p_1/p_2)(V_C/V_B)]
         = [p_2 V_B/(γ − 1)] [1 − (p_1/p_2)^{1−γ^{−1}}] .        (1.119)
CD: Isobaric compression at p = p_1.

    W_CD = ∫_{V_C}^{V_D} dV p_1 = p_1 (V_D − V_C) = −p_2 (V_B − V_A) (p_1/p_2)^{1−γ^{−1}}        (1.120)

    ΔE_CD = E_D − E_C = p_1 (V_D − V_C)/(γ − 1)        (1.121)

    Q_CD = ΔE_CD + W_CD = −[γ p_2/(γ − 1)] (V_B − V_A) (p_1/p_2)^{1−γ^{−1}} .        (1.122)

DA: Adiabatic compression; Q_DA = 0 and W_DA = E_D − E_A. The work done by the gas is

    W_DA = (p_1 V_D − p_2 V_A)/(γ − 1) = −[p_2 V_A/(γ − 1)] [1 − (p_1/p_2)(V_D/V_A)]
         = −[p_2 V_A/(γ − 1)] [1 − (p_1/p_2)^{1−γ^{−1}}] .        (1.123)

The total work done per cycle is then

    W = W_AB + W_BC + W_CD + W_DA        (1.124)
      = [γ p_2 (V_B − V_A)/(γ − 1)] [1 − (p_1/p_2)^{1−γ^{−1}}]        (1.125)

and the efficiency is defined to be

    η ≡ W/Q_AB = 1 − (p_1/p_2)^{1−γ^{−1}} .        (1.126)
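The stage-by-stage results can again be checked numerically. In this sketch γ = 7/5, and the pressures and volumes are made-up values; the adiabats fix V_C and V_D through pV^γ = const:

```python
# Numeric check of the Joule-Brayton cycle above; p1, p2, VA, VB are made-up.
gamma = 7.0/5.0
p2, p1 = 5.0e5, 1.0e5        # upper and lower isobar pressures (Pa)
VA, VB = 1.0, 2.0            # volumes bounding the upper isobar (m^3)

VC = VB * (p2/p1)**(1.0/gamma)
VD = VA * (p2/p1)**(1.0/gamma)

W_AB = p2*(VB - VA)                        # eqn 1.116
W_BC = (p2*VB - p1*VC)/(gamma - 1.0)       # eqn 1.119
W_CD = p1*(VD - VC)                        # eqn 1.120
W_DA = (p1*VD - p2*VA)/(gamma - 1.0)       # eqn 1.123
W = W_AB + W_BC + W_CD + W_DA

Q_AB = gamma*p2*(VB - VA)/(gamma - 1.0)    # eqn 1.118
eta = W/Q_AB

print(eta, 1.0 - (p1/p2)**(1.0 - 1.0/gamma))   # the two agree
```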
1.5.7 Carnot engine at maximum power output

While the Carnot engine described above in §1.5.3 has maximum efficiency, it is practically useless, because the isothermal processes must take place infinitely slowly in order for the working material to remain in thermal equilibrium with each reservoir. Thus, while the work done per cycle is finite, the cycle period is infinite, and the engine power is zero.

A modification of the ideal Carnot cycle is necessary to create a practical engine. The idea⁶ is as follows. During the isothermal expansion stage, the working material is maintained at a temperature T_2w < T_2. The temperature difference between the working material and the hot reservoir drives a thermal current,

    dQ_2/dt = κ_2 (T_2 − T_2w) .        (1.127)

⁶ See F. L. Curzon and B. Ahlborn, Am. J. Phys. 43, 22 (1975).
Here, κ_2 is a transport coefficient which describes the thermal conductivity of the chamber walls, multiplied by a geometric parameter (which is the ratio of the total wall area to its thickness). Similarly, during the isothermal compression, the working material is maintained at a temperature T_1w > T_1, which drives a thermal current to the cold reservoir,

    dQ_1/dt = κ_1 (T_1w − T_1) .        (1.128)

Now let us assume that the upper isothermal stage requires a duration Δt_2 and the lower isotherm a duration Δt_1. Then

    Q_2 = κ_2 Δt_2 (T_2 − T_2w)        (1.129)
    Q_1 = κ_1 Δt_1 (T_1w − T_1) .        (1.130)

Since the engine is reversible, we must have

    Q_1/T_1w = Q_2/T_2w ,        (1.131)

which says

    Δt_1/Δt_2 = κ_2 T_1w (T_2 − T_2w) / [κ_1 T_2w (T_1w − T_1)] .        (1.132)

The power is

    P = (Q_2 − Q_1) / [(1 + α)(Δt_1 + Δt_2)] ,        (1.133)

where we assume that the adiabatic stages require a combined time of α(Δt_1 + Δt_2). Thus, we find

    P = κ_1 κ_2 (T_2w − T_1w)(T_1w − T_1)(T_2 − T_2w) / {(1 + α) [κ_1 T_2 (T_1w − T_1) + κ_2 T_1 (T_2 − T_2w) + (κ_2 − κ_1)(T_1w − T_1)(T_2 − T_2w)]} .        (1.134)

We optimize the engine by maximizing P with respect to the temperatures T_1w and T_2w. This yields

    T_2w = T_2 − [T_2 − √(T_1 T_2)] / [1 + √(κ_2/κ_1)]        (1.135)
    T_1w = T_1 + [√(T_1 T_2) − T_1] / [1 + √(κ_1/κ_2)] .        (1.136)

The efficiency at maximum power is then

    η = (Q_2 − Q_1)/Q_2 = 1 − T_1w/T_2w = 1 − √(T_1/T_2) .        (1.137)

One also finds at maximum power

    Δt_2/Δt_1 = √(κ_1/κ_2) .        (1.138)
    Power source                           T_1 (°C)   T_2 (°C)   η_Carnot   η (theor.)   η (obs.)
    West Thurrock (UK)
      Coal Fired Steam Plant                  25        565        0.641       0.40        0.36
    CANDU (Canada)
      PHW Nuclear Reactor                     25        300        0.480       0.28        0.30
    Larderello (Italy)
      Geothermal Steam Plant                  80        250        0.323       0.175       0.16

Table 1.2: Observed performances of real heat engines, taken from table 1 of Curzon and Ahlborn (1975).
Finally, the maximized power is

    P_max = [κ_1 κ_2 / (1 + α)] · [(√T_2 − √T_1) / (√κ_1 + √κ_2)]² .        (1.139)

Table 1.2, taken from the article of Curzon and Ahlborn (1975), shows how the efficiency of this practical Carnot cycle, given by eqn. 1.137, rather accurately predicts the efficiencies of functioning power plants.
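The comparison in Table 1.2 can be reproduced directly from eqn. 1.137. The sketch below converts the tabulated Celsius temperatures to Kelvin and evaluates η = 1 − √(T_1/T_2) for each plant:

```python
# Reproduce the "theoretical" column of Table 1.2 from eta = 1 - sqrt(T1/T2).
from math import sqrt

plants = [
    ("West Thurrock coal",     25.0, 565.0, 0.36),
    ("CANDU nuclear",          25.0, 300.0, 0.30),
    ("Larderello geothermal",  80.0, 250.0, 0.16),
]

for name, t1_C, t2_C, eta_obs in plants:
    T1, T2 = t1_C + 273.15, t2_C + 273.15   # Celsius -> Kelvin
    eta_CA = 1.0 - sqrt(T1/T2)              # Curzon-Ahlborn efficiency
    print(f"{name:24s} eta_CA = {eta_CA:.3f}   observed = {eta_obs:.2f}")
```

The computed values land close to both the tabulated theoretical and observed efficiencies, which is the point of the Curzon-Ahlborn analysis.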
1.6 The Entropy
The Second Law guarantees us that an engine operating between two heat baths at temperatures T_1 and T_2 must satisfy

    Q_1/T_1 + Q_2/T_2 ≤ 0 ,        (1.140)

with the equality holding for reversible processes. This is a restatement of eqn. 1.65, after writing Q_1 = −Q̄_1 for the heat transferred to the engine from reservoir #1. Consider now an arbitrary curve in the p-V plane. We can describe such a curve, to arbitrary accuracy, as a combination of Carnot cycles, as shown in fig. 1.19. Each little Carnot cycle consists of two adiabats and two isotherms. We then conclude

    Σ_i Q_i/T_i  →  ∮ dQ/T ≤ 0 ,        (1.141)

with equality holding if all the cycles are reversible. Rudolf Clausius, in 1865, realized that one could then define a new state function, which he called the entropy, S, that depended only on the initial and final states of a reversible process:

    dS = dQ/T  ⟹  S_B − S_A = ∫_A^B dQ/T .        (1.142)

Since Q is extensive, so is S; the units of entropy are [S] = J/K.
Figure 1.19: An arbitrarily shaped cycle in the p-V plane can be decomposed into a number of smaller Carnot cycles. Red curves indicate isotherms and blue curves adiabats, with γ = 5/3.
1.6.1 The Third Law of Thermodynamics

Eqn. 1.142 determines the entropy up to a constant. By choosing a standard state ∅, we can define S_∅ = 0, and then by taking A = ∅ in the above equation, we can define the absolute entropy S for any state. However, it turns out that this seemingly arbitrary constant S_∅ in the entropy does have consequences, for example in the theory of gaseous equilibrium. The proper definition of entropy, from the point of view of statistical mechanics, will lead us to understand how the zero temperature entropy of a system is related to its quantum mechanical ground state degeneracy. Walther Nernst, in 1906, articulated a principle which is sometimes called the Third Law of Thermodynamics:

    The entropy of every system at absolute zero temperature always vanishes.

Again, this is not quite correct, and quantum mechanics tells us that S(T = 0) = k_B ln g, where g is the ground state degeneracy. Nernst's law holds when g = 1.

We can combine the First and Second Laws to write

    dE + dW = dQ ≤ T dS ,        (1.143)

where the equality holds for reversible processes.
1.6.2 Entropy changes in cyclic processes

For a cyclic process, whether reversible or not, the change in entropy around a cycle is zero: ΔS_CYC = 0. This is because the entropy S is a state function, with a unique value for every equilibrium state. A cyclical process returns to the same equilibrium state, hence S must return as well to its corresponding value from the previous cycle.
Consider now a general engine, as in fig. 1.12. Let us compute the total entropy change in the entire Universe over one cycle. We have

    (ΔS)_TOTAL = (ΔS)_ENGINE + (ΔS)_HOT + (ΔS)_COLD ,        (1.144)

written as a sum over entropy changes of the engine itself, the hot reservoir, and the cold reservoir⁷. Clearly (ΔS)_ENGINE = 0. The changes in the reservoir entropies are

    (ΔS)_HOT = ∫_{T=T_2} dQ_HOT/T = −Q_2/T_2 < 0        (1.145)

    (ΔS)_COLD = ∫_{T=T_1} dQ_COLD/T = Q̄_1/T_1 = −Q_1/T_1 > 0 ,        (1.146)

because the hot reservoir loses heat Q_2 > 0 to the engine, and the cold reservoir gains heat Q̄_1 = −Q_1 > 0 from the engine. Therefore,

    (ΔS)_TOTAL = −(Q_1/T_1 + Q_2/T_2) ≥ 0 .        (1.147)

Thus, for a reversible cycle, the net change in the total entropy of the engine plus reservoirs is zero. For an irreversible cycle, there is an increase in total entropy, due to spontaneous processes.
1.6.3 Gibbs-Duhem relation

Recall eqn. 1.6:

    dW = −Σ_j y_j dX_j − Σ_a μ_a dN_a .        (1.148)

For reversible systems, we can therefore write

    dE = T dS + Σ_j y_j dX_j + Σ_a μ_a dN_a .        (1.149)

This says that the energy E is a function of the entropy S, the generalized displacements {X_j}, and the particle numbers {N_a}:

    E = E(S, {X_j}, {N_a}) .        (1.150)

Furthermore, we have

    T = (∂E/∂S)_{X_j, N_a}  ,  y_j = (∂E/∂X_j)_{S, X_{i(≠j)}, N_a}  ,  μ_a = (∂E/∂N_a)_{S, X_j, N_{b(≠a)}} .        (1.151)

⁷ We neglect any interfacial contributions to the entropy change, which will be small compared with the bulk entropy change in the thermodynamic limit of large system size.
Since E and all its arguments are extensive, we have

    λE = E(λS, {λX_j}, {λN_a}) .        (1.152)

We now differentiate the LHS and RHS above with respect to λ, setting λ = 1 afterward. The result is

    E = S (∂E/∂S) + Σ_j X_j (∂E/∂X_j) + Σ_a N_a (∂E/∂N_a)        (1.153)
      = TS + Σ_j y_j X_j + Σ_a μ_a N_a .        (1.154)

Mathematically astute readers will recognize this result as an example of Euler's theorem for homogeneous functions. Taking the differential of eqn. 1.154, and then subtracting eqn. 1.149, we obtain

    S dT + Σ_j X_j dy_j + Σ_a N_a dμ_a = 0 .        (1.155)

This is called the Gibbs-Duhem relation. It says that there is one equation of state which may be written in terms of all the intensive quantities alone. For example, for a single component system, we must have p = p(T, μ), which follows from

    S dT − V dp + N dμ = 0 .        (1.156)
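The Euler relation E = TS − pV + μN (the single-component p-V form of eqn. 1.154) is easy to verify on a concrete model. The sketch below uses E(S, V, N) = aS³/(NV), the system analyzed in §1.6.5; the value of a and the state are made up:

```python
# Verify the Euler relation E = T S - p V + mu N for the model E = a S^3/(N V),
# along with the homogeneity E(lam S, lam V, lam N) = lam E that underlies it.
a = 2.0
S, V, N = 1.3, 0.7, 2.1          # arbitrary made-up state

E  = a*S**3/(N*V)
T  =  3*a*S**2/(N*V)             #  (dE/dS)_{V,N}
p  =  a*S**3/(N*V**2)            # -(dE/dV)_{S,N}
mu = -a*S**3/(N**2*V)            #  (dE/dN)_{S,V}

print(E, T*S - p*V + mu*N)       # Euler relation: the two agree

lam = 3.0
print(a*(lam*S)**3/((lam*N)*(lam*V)), lam*E)   # first-order homogeneity
```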
1.6.4 Entropy for an ideal gas

For an ideal gas, we have E = (1/2) f N k_B T, and

    dS = (1/T) dE + (p/T) dV − (μ/T) dN
       = (1/2) f N k_B (dT/T) + (p/T) dV + [(1/2) f k_B − μ/T] dN .        (1.157)

Invoking the ideal gas equation of state pV = N k_B T, we have

    dS|_N = (1/2) f N k_B d ln T + N k_B d ln V .        (1.158)

Integrating, we obtain

    S(T, V, N) = (1/2) f N k_B ln T + N k_B ln V + φ(N) ,        (1.159)

where φ(N) is an arbitrary function. Extensivity of S places restrictions on φ(N), so that the most general case is

    S(T, V, N) = (1/2) f N k_B ln T + N k_B ln(V/N) + N a ,        (1.160)
where a is a constant. Equivalently, we could write

    S(E, V, N) = (1/2) f N k_B ln(E/N) + N k_B ln(V/N) + N b ,        (1.161)

where b = a − (1/2) f k_B ln((1/2) f k_B) is another constant. When we study statistical mechanics, we will find that for the monatomic ideal gas the entropy is

    S(T, V, N) = N k_B [5/2 + ln(V/(N λ_T³))] ,        (1.162)

where λ_T = √(2πℏ²/m k_B T) is the thermal wavelength, which involves Planck's constant.
Let's now contrast two illustrative cases.

• Adiabatic free expansion: Suppose the volume freely expands from V_i to V_f = r V_i, with r > 1. Such an expansion can be effected by a removal of a partition between two chambers that are otherwise thermally insulated (see fig. 1.10). We have already seen how this process entails

    ΔE = Q = W = 0 .        (1.163)

But the entropy changes! According to eqn. 1.161, we have

    ΔS = S_f − S_i = N k_B ln r .        (1.164)

• Reversible adiabatic expansion: If the gas expands quasistatically and reversibly, then S = S(E, V, N) holds everywhere along the thermodynamic path. We then have, assuming dN = 0,

    0 = dS = (1/2) f N k_B (dE/E) + N k_B (dV/V) = N k_B d ln(V E^{f/2}) .        (1.165)

Integrating, we find

    E/E_0 = (V_0/V)^{2/f} .        (1.166)

Thus,

    E_f = r^{−2/f} E_i  ,  T_f = r^{−2/f} T_i .        (1.167)
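The contrast between the two expansions can be made concrete with a few lines of arithmetic. The sketch below takes a monatomic gas (f = 3) and a made-up expansion ratio r, and evaluates the entropy changes from eqns. 1.160 and 1.164:

```python
# Free expansion vs. reversible adiabatic expansion of a monatomic ideal gas.
# N k_B, r, and Ti are made-up values; entropies use eqn 1.160 at fixed N
# (the constant N a drops out of differences).
from math import log

f = 3
N_kB = 1.0                 # N k_B, arbitrary units
r = 2.0                    # expansion ratio, V_f = r V_i
Ti = 300.0

# Free expansion: E (hence T) is unchanged, and the entropy grows:
dS_free = N_kB * log(r)                          # eqn 1.164

# Reversible adiabatic expansion: the temperature drops (eqn 1.167) ...
Tf = r**(-2.0/f) * Ti
# ... in exactly the way that keeps the entropy constant:
dS_rev = 0.5*f*N_kB*log(Tf/Ti) + N_kB*log(r)

print(dS_free, dS_rev, Tf)    # dS_rev vanishes; Tf < Ti
```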
1.6.5 Example system

Consider a model thermodynamic system for which

    E(S, V, N) = a S³ / (N V) ,        (1.168)

where a is a constant. We have

    dE = T dS − p dV + μ dN ,        (1.169)

and therefore

    T = (∂E/∂S)_{V,N} = 3a S² / (N V)        (1.170)

    p = −(∂E/∂V)_{S,N} = a S³ / (N V²)        (1.171)

    μ = (∂E/∂N)_{S,V} = −a S³ / (N² V) .        (1.172)

Choosing any two of these equations, we can eliminate S, which is inconvenient for experimental purposes. This yields three equations of state,

    T³/p² = 27a (V/N)  ,  T³/μ² = 27a (N/V)  ,  p/μ = −N/V ,        (1.173)

only two of which are independent.
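A quick numeric spot-check of eqn. 1.173 (with made-up values of a and the state) confirms that eliminating S was done correctly:

```python
# Spot-check the three equations of state (1.173) for E = a S^3/(N V).
a = 1.5
S, V, N = 0.9, 2.0, 1.2          # arbitrary made-up state

T  =  3*a*S**2/(N*V)             # eqn 1.170
p  =  a*S**3/(N*V**2)            # eqn 1.171
mu = -a*S**3/(N**2*V)            # eqn 1.172

print(T**3/p**2,  27*a*V/N)      # first equation of state
print(T**3/mu**2, 27*a*N/V)      # second
print(p/mu,       -N/V)          # third
```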
What about C_V and C_p? To find C_V, we recast eqn. 1.170 as

    S = (N V T / 3a)^{1/2} .        (1.174)

We then have

    C_V = T (∂S/∂T)_{V,N} = (1/2) (N V T / 3a)^{1/2} = (N/18a) (T²/p) ,        (1.175)

where the last equality on the RHS follows upon invoking the first of the equations of state in eqn. 1.173. To find C_p, we eliminate V from eqns. 1.170 and 1.171, obtaining T²/p = 9aS/N. From this we obtain

    C_p = T (∂S/∂T)_{p,N} = (2N/9a) (T²/p) .        (1.176)

Thus, C_p/C_V = 4.
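The ratio C_p/C_V = 4 can be checked without any algebra, by differentiating S numerically at fixed V and at fixed p. In this sketch a, N, and the state point are made-up values, and central finite differences stand in for the partial derivatives:

```python
# Finite-difference check of C_V = T (dS/dT)_V, C_p = T (dS/dT)_p, and
# the ratio C_p/C_V = 4 for the model system.
a, N = 1.0, 1.0
T0, V0 = 2.0, 3.0
h = 1e-6

def S_of_TV(T, V):                 # eqn 1.174
    return (N*V*T/(3*a))**0.5

C_V = T0 * (S_of_TV(T0 + h, V0) - S_of_TV(T0 - h, V0)) / (2*h)

# At fixed p, use T^2/p = 9 a S / N, i.e. S = N T^2/(9 a p):
p0 = a * S_of_TV(T0, V0)**3 / (N * V0**2)     # pressure at the state point
def S_of_Tp(T, p):
    return N*T*T/(9*a*p)

C_p = T0 * (S_of_Tp(T0 + h, p0) - S_of_Tp(T0 - h, p0)) / (2*h)

print(C_p / C_V)                   # 4, up to finite-difference error
```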
We can derive still more. To find the isothermal compressibility κ_T = −(1/V)(∂V/∂p)_{T,N}, use the first of the equations of state in eqn. 1.173. To derive the adiabatic compressibility κ_S = −(1/V)(∂V/∂p)_{S,N}, use eqn. 1.171, and then eliminate the inconvenient variable S.
Suppose we use this system as the working substance for a Carnot engine. Let's compute the work done and the engine efficiency. To do this, it is helpful to eliminate S in the expression for the energy, and to rewrite the equation of state:

    E = pV = (N/27a)^{1/2} V^{1/2} T^{3/2}  ,  p = (N/27a)^{1/2} T^{3/2} V^{−1/2} .        (1.177)
We assume dN = 0 throughout. We now see that for isotherms,

    dT = 0 :  E/√V = constant .        (1.178)

Furthermore, since

    dW|_T = (N/27a)^{1/2} T^{3/2} (dV/V^{1/2}) = 2 dE|_T ,        (1.179)

we conclude that

    dT = 0 :  W_if = 2(E_f − E_i)  ,  Q_if = E_f − E_i + W_if = 3(E_f − E_i) .        (1.180)

For adiabats, eqn. 1.170 says d(TV) = 0, and therefore

    dQ = 0 :  TV = constant  ,  E/T = constant  ,  EV = constant ,        (1.181)

as well as W_if = E_i − E_f. We can use these relations to derive the following:

    E_B = √(V_B/V_A) E_A  ,  E_C = (T_1/T_2) √(V_B/V_A) E_A  ,  E_D = (T_1/T_2) E_A .        (1.182)

Now we can write

    W_AB = 2(E_B − E_A) = 2 (√(V_B/V_A) − 1) E_A        (1.183)

    W_BC = (E_B − E_C) = √(V_B/V_A) (1 − T_1/T_2) E_A        (1.184)

    W_CD = 2(E_D − E_C) = 2 (T_1/T_2)(1 − √(V_B/V_A)) E_A        (1.185)

    W_DA = (E_D − E_A) = (T_1/T_2 − 1) E_A .        (1.186)

Adding up all the work, we obtain

    W = W_AB + W_BC + W_CD + W_DA        (1.188)
      = 3 (√(V_B/V_A) − 1)(1 − T_1/T_2) E_A .        (1.189)

Since

    Q_AB = 3(E_B − E_A) = (3/2) W_AB = 3 (√(V_B/V_A) − 1) E_A ,        (1.190)

we find once again

    η = W/Q_AB = 1 − T_1/T_2 .        (1.191)
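As with the ideal gas, this result is easy to confirm by stepping around the cycle numerically. The sketch below uses made-up values for the temperatures, the volume ratio, and E_A, and the energy relations of eqn. 1.182:

```python
# Numeric run of the Carnot cycle for the working substance E = a S^3/(N V).
T2, T1 = 500.0, 350.0
VB_over_VA = 4.0
E_A = 10.0                       # energy at point A, arbitrary units

root = VB_over_VA**0.5
E_B = root * E_A                 # eqn 1.182
E_C = (T1/T2) * root * E_A
E_D = (T1/T2) * E_A

W_AB = 2*(E_B - E_A)             # isotherm: W = 2 Delta E (eqn 1.180)
W_BC = E_B - E_C                 # adiabat: W = E_i - E_f
W_CD = 2*(E_D - E_C)
W_DA = E_D - E_A

W = W_AB + W_BC + W_CD + W_DA
Q_AB = 3*(E_B - E_A)             # eqn 1.190

print(W/Q_AB, 1.0 - T1/T2)       # the two agree
```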
1.6.6 Measuring the entropy of a substance

If we can measure the heat capacity C_V(T) or C_p(T) of a substance as a function of temperature down to the lowest temperatures, then we can measure the entropy. At constant pressure, for example, we have T dS = C_p dT, hence

    S(p, T) = S(p, T = 0) + ∫_0^T dT′ C_p(T′)/T′ .        (1.192)

The zero temperature entropy is S(p, T = 0) = k_B ln g where g is the quantum ground state degeneracy at pressure p. In all but highly unusual cases, g = 1 and S(p, T = 0) = 0.
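Eqn. 1.192 is simple to carry out numerically once C_p(T) is known. As an example (not from the text), assume a Debye-like low-temperature solid with C_p = A T³, for which the integral can also be done by hand, giving S = A T³/3:

```python
# Sketch of eqn 1.192: accumulate S(T) = int_0^T dT' C_p(T')/T' numerically
# for an assumed heat capacity C_p = A T^3 (A is a made-up coefficient).
A = 2.0e-4                  # J/K^4, hypothetical

def C_p(T):
    return A * T**3

Tmax, n = 50.0, 100000
dT = Tmax / n
S = 0.0
for i in range(n):
    Tmid = (i + 0.5) * dT   # midpoint rule; C_p/T' = A T'^2 is finite at T' = 0
    S += C_p(Tmid) / Tmid * dT

print(S, A * Tmax**3 / 3.0)  # numerical quadrature vs the exact A T^3/3
```

The integrand C_p/T is what makes low-temperature data essential: any residual entropy at T = 0 (the k_B ln g term) is invisible to this measurement and must be supplied by the Third Law.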
1.7 Thermodynamic Potentials
Thermodynamic systems may do work on their environments. Under certain constraints, the work done may be bounded from above by the change in an appropriately defined thermodynamic potential.
1.7.1 Energy E

Suppose we wish to create a thermodynamic system from scratch. Let's imagine that we create it from scratch in a thermally insulated box of volume V. The work we must do to assemble the system is then

    J = E .        (1.193)

After we bring all the constituent particles together, pulling them in from infinity (say), the system will have total energy E. After we finish, the system may not be in thermal equilibrium. Spontaneous processes will then occur so as to maximize the system's entropy, but the internal energy remains at E.

We have, from the First Law, dE = dQ − dW. For equilibrium systems, we have

    dE = T dS − p dV + μ dN ,        (1.194)

which says that E = E(S, V, N), and

    T = (∂E/∂S)_{V,N}  ,  p = −(∂E/∂V)_{S,N}  ,  μ = (∂E/∂N)_{S,V} .        (1.195)

The Second Law, in the form dQ ≤ T dS, then yields

    dE ≤ T dS − p dV + μ dN .        (1.196)

This form is valid for single component systems and is easily generalized to multicomponent systems, or magnetic systems, etc. Now consider a process at fixed (S, V, N). We then have dE ≤ 0. This says that spontaneous processes in a system with dS = dV = dN = 0 always lead to a reduction in the internal energy E. Therefore, spontaneous processes drive the internal energy E to a minimum in systems at fixed (S, V, N).

Allowing for other work processes, we have

    dW ≤ T dS − dE .        (1.197)

Hence, the work done by a thermodynamic system under conditions of constant entropy is bounded above by −dE, and the maximum dW is achieved for a reversible process.

It is useful to define the quantity

    dW_free = dW − p dV ,        (1.198)

which is the differential work done by the system other than that required to change its volume. Then

    dW_free ≤ T dS − p dV − dE ,        (1.199)

and we conclude for systems at fixed (S, V) that dW_free ≤ −dE.
1.7.2 Helmholtz free energy F

Suppose that we create our system while it is in constant contact with a thermal reservoir at temperature T. Then as we create our system, it will absorb heat from the reservoir. Therefore, we don't have to supply the full internal energy E, but rather only E − Q, since the system receives heat energy Q from the reservoir. In other words, we must perform work

    J = E − TS        (1.200)

to create our system, if it is constantly in equilibrium at temperature T. The quantity E − TS is known as the Helmholtz free energy, F, which is related to the energy E by a Legendre transformation,

    F = E − TS .        (1.201)

The general properties of Legendre transformations are discussed in Appendix II, §1.15.

Under equilibrium conditions, we have

    dF = −S dT − p dV + μ dN .        (1.202)

Thus, F = F(T, V, N), whereas E = E(S, V, N), and

    S = −(∂F/∂T)_{V,N}  ,  p = −(∂F/∂V)_{T,N}  ,  μ = (∂F/∂N)_{T,V} .        (1.203)

In general, the Second Law tells us that

    dF ≤ −S dT − p dV + μ dN .        (1.204)

The equality holds for reversible processes, and the inequality for spontaneous processes. Therefore, spontaneous processes drive the Helmholtz free energy F to a minimum in systems at fixed (T, V, N).

We may also write

    dW ≤ −S dT − dF .        (1.205)

In other words, the work done by a thermodynamic system under conditions of constant temperature is bounded above by −dF, and the maximum dW is achieved for a reversible process. We also have

    dW_free ≤ −S dT − p dV − dF ,        (1.206)

and we conclude, for systems at fixed (T, V), that dW_free ≤ −dF.
1.7.3 Enthalpy H

Suppose that we create our system while it is thermally insulated, but in constant mechanical contact with a volume bath at pressure p. For example, we could create our system inside a thermally insulated chamber with one movable wall where the external pressure is fixed at p. Thus, when creating the system, in addition to the system's internal energy E, we must also perform work pV in order to make room for it. In other words, we must perform work

    J = E + pV .        (1.207)

The quantity E + pV is known as the enthalpy, H.

The enthalpy is obtained from the energy via a different Legendre transformation:

    H = E + pV .        (1.208)

In equilibrium, then,

    dH = T dS + V dp + μ dN ,        (1.209)

which says H = H(S, p, N), with

    T = (∂H/∂S)_{p,N}  ,  V = (∂H/∂p)_{S,N}  ,  μ = (∂H/∂N)_{S,p} .        (1.210)

In general, we have

    dH ≤ T dS + V dp + μ dN ,        (1.211)

hence spontaneous processes drive the enthalpy H to a minimum in systems at fixed (S, p, N).

For general systems,

    dH ≤ T dS − dW + p dV + V dp ,        (1.212)

hence

    dW_free ≤ T dS + V dp − dH ,        (1.213)

and we conclude, for systems at fixed (S, p), that dW_free ≤ −dH.
1.7.4 Gibbs free energy G

If we create a thermodynamic system at conditions of constant temperature T and constant pressure p, then it absorbs heat energy Q = TS from the reservoir and we must expend work energy pV in order to make room for it. Thus, the total amount of work we must do in assembling our system is

    J = E − TS + pV .        (1.214)

This is the Gibbs free energy, G.

The Gibbs free energy is obtained by a second Legendre transformation:

    G = E − TS + pV .        (1.215)

Note that G = F + pV = H − TS. For equilibrium systems, the differential of G is

    dG = −S dT + V dp + μ dN ,        (1.216)

therefore G = G(T, p, N), with

    S = −(∂G/∂T)_{p,N}  ,  V = (∂G/∂p)_{T,N}  ,  μ = (∂G/∂N)_{T,p} .        (1.217)

From eqn. 1.154, we have

    E = TS − pV + μN ,        (1.218)

therefore

    G = μN .        (1.219)

The Second Law says that

    dG ≤ −S dT + V dp + μ dN ,        (1.220)

hence spontaneous processes drive the Gibbs free energy G to a minimum in systems at fixed (T, p, N). For general systems,

    dW_free ≤ −S dT + V dp − dG .        (1.221)

Accordingly, we conclude, for systems at fixed (T, p), that dW_free ≤ −dG.
1.7.5 Grand potential Ω

The grand potential, sometimes called the Landau free energy, is defined by

    Ω = E − TS − μN .        (1.222)

Its differential is

    dΩ = −S dT − p dV − N dμ ,        (1.223)

hence

    S = −(∂Ω/∂T)_{V,μ}  ,  p = −(∂Ω/∂V)_{T,μ}  ,  N = −(∂Ω/∂μ)_{T,V} .        (1.224)

Again invoking eqn. 1.154, we find

    Ω = −pV .        (1.225)

The Second Law tells us

    dΩ ≤ −dW − S dT − μ dN − N dμ ,        (1.226)

hence

    dW̃_free ≡ dW_free + μ dN ≤ −S dT − p dV − N dμ − dΩ .        (1.227)

We conclude, for systems at fixed (T, V, μ), that dW̃_free ≤ −dΩ.
1.8 Maxwell Relations
Maxwell relations are conditions equating certain derivatives of state variables which follow from the exactness of the differentials of the various state functions.
1.8.1 Relations deriving from E(S, V, N)

The energy E(S, V, N) is a state function, with

    dE = T dS − p dV + μ dN ,        (1.228)

and therefore

    T = (∂E/∂S)_{V,N}  ,  p = −(∂E/∂V)_{S,N}  ,  μ = (∂E/∂N)_{S,V} .        (1.229)

Taking the mixed second derivatives, we find

    ∂²E/∂S ∂V = (∂T/∂V)_{S,N} = −(∂p/∂S)_{V,N}        (1.230)

    ∂²E/∂S ∂N = (∂T/∂N)_{S,V} = (∂μ/∂S)_{V,N}        (1.231)

    ∂²E/∂V ∂N = −(∂p/∂N)_{S,V} = (∂μ/∂V)_{S,N} .        (1.232)
1.8.2 Relations deriving from F(T, V, N)

The free energy F(T, V, N) is a state function, with

    dF = −S dT − p dV + μ dN ,        (1.233)

and therefore

    S = −(∂F/∂T)_{V,N}  ,  p = −(∂F/∂V)_{T,N}  ,  μ = (∂F/∂N)_{T,V} .        (1.234)

Taking the mixed second derivatives, we find

    ∂²F/∂T ∂V = −(∂S/∂V)_{T,N} = −(∂p/∂T)_{V,N}        (1.235)

    ∂²F/∂T ∂N = −(∂S/∂N)_{T,V} = (∂μ/∂T)_{V,N}        (1.236)

    ∂²F/∂V ∂N = −(∂p/∂N)_{T,V} = (∂μ/∂V)_{T,N} .        (1.237)
1.8.3 Relations deriving from H(S, p, N)

The enthalpy H(S, p, N) satisfies

    dH = T dS + V dp + μ dN ,        (1.238)

which says H = H(S, p, N), with

    T = (∂H/∂S)_{p,N}  ,  V = (∂H/∂p)_{S,N}  ,  μ = (∂H/∂N)_{S,p} .        (1.239)

Taking the mixed second derivatives, we find

    ∂²H/∂S ∂p = (∂T/∂p)_{S,N} = (∂V/∂S)_{p,N}        (1.240)

    ∂²H/∂S ∂N = (∂T/∂N)_{S,p} = (∂μ/∂S)_{p,N}        (1.241)

    ∂²H/∂p ∂N = (∂V/∂N)_{S,p} = (∂μ/∂p)_{S,N} .        (1.242)
1.8.4 Relations deriving from G(T, p, N)

The Gibbs free energy G(T, p, N) satisfies

    dG = −S dT + V dp + μ dN ,        (1.243)

therefore G = G(T, p, N), with

    S = −(∂G/∂T)_{p,N}  ,  V = (∂G/∂p)_{T,N}  ,  μ = (∂G/∂N)_{T,p} .        (1.244)

Taking the mixed second derivatives, we find

    ∂²G/∂T ∂p = −(∂S/∂p)_{T,N} = (∂V/∂T)_{p,N}        (1.245)

    ∂²G/∂T ∂N = −(∂S/∂N)_{T,p} = (∂μ/∂T)_{p,N}        (1.246)

    ∂²G/∂p ∂N = (∂V/∂N)_{T,p} = (∂μ/∂p)_{T,N} .        (1.247)
1.8.5 Relations deriving from Ω(T, V, μ)

The grand potential Ω(T, V, μ) satisfies

    dΩ = −S dT − p dV − N dμ ,        (1.248)

hence

    S = −(∂Ω/∂T)_{V,μ}  ,  p = −(∂Ω/∂V)_{T,μ}  ,  N = −(∂Ω/∂μ)_{T,V} .        (1.249)

Taking the mixed second derivatives, we find

    ∂²Ω/∂T ∂V = −(∂S/∂V)_{T,μ} = −(∂p/∂T)_{V,μ}        (1.250)

    ∂²Ω/∂T ∂μ = −(∂S/∂μ)_{T,V} = −(∂N/∂T)_{V,μ}        (1.251)

    ∂²Ω/∂V ∂μ = −(∂p/∂μ)_{T,V} = −(∂N/∂V)_{T,μ} .        (1.252)
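Any of these relations can be verified numerically once explicit state functions are in hand. As an example (the relation (∂S/∂V)_{T,N} = (∂p/∂T)_{V,N} from §1.8.2, rewritten without the minus signs of eqn. 1.235), take the ideal gas entropy of eqn. 1.160 with f = 3; the constant Na drops out of the V-derivative:

```python
# Finite-difference check of a Maxwell relation, (dS/dV)_{T,N} = (dp/dT)_{V,N},
# for the ideal gas; f = 3 and N k_B, T0, V0 are made-up values.
import math

f = 3
N_kB = 1.0

def S(T, V):                       # eqn 1.160 at fixed N, constant term dropped
    return 0.5*f*N_kB*math.log(T) + N_kB*math.log(V)

def p(T, V):                       # ideal gas equation of state
    return N_kB*T/V

T0, V0, h = 300.0, 2.0, 1e-5
dS_dV = (S(T0, V0 + h) - S(T0, V0 - h)) / (2*h)
dp_dT = (p(T0 + h, V0) - p(T0 - h, V0)) / (2*h)

print(dS_dV, dp_dT)                # both equal N k_B / V
```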
48 CHAPTER 1. THERMODYNAMICS
1.8.6 Generalized thermodynamic potentials
We have up until now assumed a generalized force-displacement pair (y, X) = (-p, V). But the above results also generalize to e.g. magnetic systems, where (y, X) = (\vec{H}, \vec{M}). In general, we have

$$dE = T\,dS + y\,dX + \mu\,dN \qquad (1.253)$$
$$F = E - TS \quad\Rightarrow\quad dF = -S\,dT + y\,dX + \mu\,dN \qquad (1.254)$$
$$H = E - yX \quad\Rightarrow\quad dH = T\,dS - X\,dy + \mu\,dN \qquad (1.255)$$
$$G = E - TS - yX \quad\Rightarrow\quad dG = -S\,dT - X\,dy + \mu\,dN \qquad (1.256)$$
$$\Omega = E - TS - \mu N \quad\Rightarrow\quad d\Omega = -S\,dT + y\,dX - N\,d\mu \ . \qquad (1.257)$$

Generalizing (-p, V) \to (y, X), we also obtain, mutatis mutandis, the following Maxwell relations:
$$\left(\frac{\partial T}{\partial X}\right)_{S,N} = \left(\frac{\partial y}{\partial S}\right)_{X,N} \qquad \left(\frac{\partial T}{\partial N}\right)_{S,X} = \left(\frac{\partial \mu}{\partial S}\right)_{X,N} \qquad \left(\frac{\partial y}{\partial N}\right)_{S,X} = \left(\frac{\partial \mu}{\partial X}\right)_{S,N}$$

$$\left(\frac{\partial T}{\partial y}\right)_{S,N} = -\left(\frac{\partial X}{\partial S}\right)_{y,N} \qquad \left(\frac{\partial T}{\partial N}\right)_{S,y} = \left(\frac{\partial \mu}{\partial S}\right)_{y,N} \qquad \left(\frac{\partial X}{\partial N}\right)_{S,y} = -\left(\frac{\partial \mu}{\partial y}\right)_{S,N}$$

$$\left(\frac{\partial S}{\partial X}\right)_{T,N} = -\left(\frac{\partial y}{\partial T}\right)_{X,N} \qquad \left(\frac{\partial S}{\partial N}\right)_{T,X} = -\left(\frac{\partial \mu}{\partial T}\right)_{X,N} \qquad \left(\frac{\partial y}{\partial N}\right)_{T,X} = \left(\frac{\partial \mu}{\partial X}\right)_{T,N}$$

$$\left(\frac{\partial S}{\partial y}\right)_{T,N} = \left(\frac{\partial X}{\partial T}\right)_{y,N} \qquad \left(\frac{\partial S}{\partial N}\right)_{T,y} = -\left(\frac{\partial \mu}{\partial T}\right)_{y,N} \qquad \left(\frac{\partial X}{\partial N}\right)_{T,y} = -\left(\frac{\partial \mu}{\partial y}\right)_{T,N}$$

$$\left(\frac{\partial S}{\partial X}\right)_{T,\mu} = -\left(\frac{\partial y}{\partial T}\right)_{X,\mu} \qquad \left(\frac{\partial S}{\partial \mu}\right)_{T,X} = \left(\frac{\partial N}{\partial T}\right)_{X,\mu} \qquad \left(\frac{\partial y}{\partial \mu}\right)_{T,X} = -\left(\frac{\partial N}{\partial X}\right)_{T,\mu} \ .$$
1.9 Equilibrium and Stability
Suppose we have two systems, A and B, which are free to exchange energy, volume, and particle number, subject to overall conservation rules

$$E_A + E_B = E \ , \quad V_A + V_B = V \ , \quad N_A + N_B = N \ , \qquad (1.258)$$

where E, V, and N are fixed. Now let us compute the change in the total entropy of the combined systems when they are allowed to exchange energy, volume, or particle number.
Figure 1.20: To check for an instability, we compare the energy of a system to its total
energy when we reapportion its energy, volume, and particle number slightly unequally.
We assume that the entropy is additive, i.e.
$$dS = \left[\left(\frac{\partial S_A}{\partial E_A}\right)_{V_A,N_A} - \left(\frac{\partial S_B}{\partial E_B}\right)_{V_B,N_B}\right] dE_A + \left[\left(\frac{\partial S_A}{\partial V_A}\right)_{E_A,N_A} - \left(\frac{\partial S_B}{\partial V_B}\right)_{E_B,N_B}\right] dV_A + \left[\left(\frac{\partial S_A}{\partial N_A}\right)_{E_A,V_A} - \left(\frac{\partial S_B}{\partial N_B}\right)_{E_B,V_B}\right] dN_A \ . \qquad (1.259)$$
Note that we have used dE_B = -dE_A, dV_B = -dV_A, and dN_B = -dN_A. Now we know from the Second Law that spontaneous processes result in T dS > 0, which means that S tends to a maximum. If S is a maximum, it must be that the coefficients of dE_A, dV_A, and dN_A all vanish, else we could increase the total entropy of the system by a judicious choice of these three differentials. From T\,dS = dE + p\,dV - \mu\,dN, we have

$$\frac{1}{T} = \left(\frac{\partial S}{\partial E}\right)_{V,N} \ , \quad \frac{p}{T} = \left(\frac{\partial S}{\partial V}\right)_{E,N} \ , \quad \frac{\mu}{T} = -\left(\frac{\partial S}{\partial N}\right)_{E,V} \ . \qquad (1.260)$$
Thus, we conclude that in order for the system to be in equilibrium, so that S is maximized and can increase no further under spontaneous processes, we must have

$$T_A = T_B \quad \text{(thermal equilibrium)} \qquad (1.261)$$
$$\frac{p_A}{T_A} = \frac{p_B}{T_B} \quad \text{(mechanical equilibrium)} \qquad (1.262)$$
$$\frac{\mu_A}{T_A} = \frac{\mu_B}{T_B} \quad \text{(chemical equilibrium)} \qquad (1.263)$$
Now consider a uniform system with energy E' = 2E, volume V' = 2V, and particle number N' = 2N. We wish to check that this system is not unstable with respect to spontaneously becoming inhomogeneous. To that end, we imagine dividing the system in half. Each half would have energy E, volume V, and particle number N. But suppose we divided up these quantities differently, so that the left half had slightly different energy, volume, and particle number than the right, as depicted in fig. 1.20. Does the entropy increase or decrease? We have
$$\Delta S = S(E+\Delta E,\, V+\Delta V,\, N+\Delta N) + S(E-\Delta E,\, V-\Delta V,\, N-\Delta N) - S(2E, 2V, 2N)$$
$$\quad = \tfrac{1}{2}\,\frac{\partial^2 S}{\partial E^2}\,(\Delta E)^2 + \tfrac{1}{2}\,\frac{\partial^2 S}{\partial V^2}\,(\Delta V)^2 + \tfrac{1}{2}\,\frac{\partial^2 S}{\partial N^2}\,(\Delta N)^2 \qquad (1.264)$$
$$\qquad + \frac{\partial^2 S}{\partial E\,\partial V}\,\Delta E\,\Delta V + \frac{\partial^2 S}{\partial E\,\partial N}\,\Delta E\,\Delta N + \frac{\partial^2 S}{\partial V\,\partial N}\,\Delta V\,\Delta N \ . \qquad (1.265)$$

Thus, we can write

$$\Delta S = \tfrac{1}{2}\,Q_{ij}\,(\Delta X_i)(\Delta X_j) \ , \qquad (1.266)$$
where

$$Q = \begin{pmatrix} \frac{\partial^2 S}{\partial E^2} & \frac{\partial^2 S}{\partial E\,\partial V} & \frac{\partial^2 S}{\partial E\,\partial N} \\[1ex] \frac{\partial^2 S}{\partial E\,\partial V} & \frac{\partial^2 S}{\partial V^2} & \frac{\partial^2 S}{\partial V\,\partial N} \\[1ex] \frac{\partial^2 S}{\partial E\,\partial N} & \frac{\partial^2 S}{\partial V\,\partial N} & \frac{\partial^2 S}{\partial N^2} \end{pmatrix} \qquad (1.267)$$

is the matrix of second derivatives, known in mathematical parlance as the Hessian, and X = (E, V, N). Note that Q is a symmetric matrix.

Since S must be a maximum in order for the system to be in equilibrium, we conclude that the homogeneous system is stable if and only if all of the eigenvalues of Q are negative. If one or more of the eigenvalues is positive, then it is possible to choose a set of variations \Delta X such that \Delta S > 0, which would contradict the assumption that the homogeneous state is one of maximum entropy. A matrix with this restriction is said to be negative definite.
Suppose we set \Delta N = 0 and we just examine the stability with respect to inhomogeneities in energy and volume. Then we have a 2 \times 2 matrix to deal with, which is much simpler. A general symmetric 2 \times 2 matrix may be written

$$Q = \begin{pmatrix} a & b \\ b & c \end{pmatrix} \ . \qquad (1.268)$$

It is easy to solve for the eigenvalues of Q. One finds

$$\lambda_{\pm} = \left(\frac{a+c}{2}\right) \pm \sqrt{\left(\frac{a-c}{2}\right)^{\!2} + b^2} \ . \qquad (1.269)$$
In order for Q to be negative definite, we require \lambda_+ < 0 and \lambda_- < 0. Clearly we must have a + c < 0, or else \lambda_+ > 0 for sure. If a + c < 0 then clearly \lambda_- < 0, but there still is a possibility that \lambda_+ > 0, if the radical is larger than -\frac{1}{2}(a+c). Demanding that \lambda_+ < 0 therefore yields two conditions:

$$a + c < 0 \quad \text{and} \quad ac > b^2 \ . \qquad (1.270)$$

Clearly both a and c must be negative, else one of the above two conditions is violated. So in the end we have three conditions which are necessary and sufficient in order that Q be negative definite:

$$a < 0 \ , \quad c < 0 \ , \quad ac > b^2 \ . \qquad (1.271)$$
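The equivalence between these three conditions and the negativity of both eigenvalues can be checked numerically; the following sketch (my own, not part of the original text) scans a grid of symmetric 2 × 2 matrices and compares the two criteria using the explicit eigenvalue formula of eqn. 1.269.

```python
# Numerical sanity check: the conditions a < 0, c < 0, ac > b^2 (eqn. 1.271)
# hold exactly when both eigenvalues of [[a, b], [b, c]] (eqn. 1.269) are negative.
import math

def eigenvalues(a, b, c):
    """Eigenvalues lambda_{+-} of the symmetric matrix [[a, b], [b, c]]."""
    mean = 0.5 * (a + c)
    rad = math.sqrt((0.5 * (a - c))**2 + b**2)
    return mean + rad, mean - rad

def negative_definite(a, b, c):
    """The three conditions of eqn. 1.271."""
    return a < 0 and c < 0 and a * c > b**2

# scan a grid of half-integer matrix entries and compare the two criteria
vals = [x / 2.0 for x in range(-6, 7)]   # -3.0, -2.5, ..., 3.0
for a in vals:
    for b in vals:
        for c in vals:
            lp, lm = eigenvalues(a, b, c)
            assert negative_definite(a, b, c) == (lp < 0 and lm < 0)
```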
Going back to thermodynamic variables, this requires

$$\frac{\partial^2 S}{\partial E^2} < 0 \ , \quad \frac{\partial^2 S}{\partial V^2} < 0 \ , \quad \frac{\partial^2 S}{\partial E^2} \cdot \frac{\partial^2 S}{\partial V^2} > \left(\frac{\partial^2 S}{\partial E\,\partial V}\right)^{\!2} \ . \qquad (1.272)$$
Another way to say it: the entropy is a concave function of (E, V, N).
Many thermodynamic systems are held at fixed (T, p, N), which suggests we examine the stability criteria for G(T, p, N). Suppose our system is in equilibrium with a reservoir at temperature T_0 and pressure p_0. Then, suppressing N (which is assumed constant), we have

$$G(T_0, p_0) = E - T_0 S + p_0 V \ . \qquad (1.273)$$
Now suppose there is a fluctuation in the entropy and the volume of our system. Going to second order in \Delta S and \Delta V, we have

$$\Delta G = \left[\left(\frac{\partial E}{\partial S}\right)_V - T_0\right]\Delta S + \left[\left(\frac{\partial E}{\partial V}\right)_S + p_0\right]\Delta V + \frac{1}{2}\left[\frac{\partial^2 E}{\partial S^2}\,(\Delta S)^2 + 2\,\frac{\partial^2 E}{\partial S\,\partial V}\,\Delta S\,\Delta V + \frac{\partial^2 E}{\partial V^2}\,(\Delta V)^2\right] + \ldots \ . \qquad (1.274)$$

The condition for equilibrium is that \Delta G > 0 for all (\Delta S, \Delta V). The linear terms vanish by definition, since T = T_0 and p = p_0. Stability then requires that the Hessian matrix Q be positive definite, with
$$Q = \begin{pmatrix} \frac{\partial^2 E}{\partial S^2} & \frac{\partial^2 E}{\partial S\,\partial V} \\[1ex] \frac{\partial^2 E}{\partial S\,\partial V} & \frac{\partial^2 E}{\partial V^2} \end{pmatrix} \ . \qquad (1.275)$$
Thus, we have the following three conditions:
$$\frac{\partial^2 E}{\partial S^2} = \left(\frac{\partial T}{\partial S}\right)_V = \frac{T}{C_V} > 0 \qquad (1.276)$$

$$\frac{\partial^2 E}{\partial V^2} = -\left(\frac{\partial p}{\partial V}\right)_S = \frac{1}{V \kappa_S} > 0 \qquad (1.277)$$

$$\frac{\partial^2 E}{\partial S^2} \cdot \frac{\partial^2 E}{\partial V^2} - \left(\frac{\partial^2 E}{\partial S\,\partial V}\right)^{\!2} = \frac{T}{V \kappa_S C_V} - \left(\frac{\partial T}{\partial V}\right)^{\!2}_S > 0 \ . \qquad (1.278)$$
1.10 Applications of Thermodynamics
A discussion of various useful mathematical relations among partial derivatives may be found in the appendix, §1.16. Some facility with multivariable differential calculus is extremely useful in the analysis of thermodynamics problems.
Figure 1.21: Adiabatic free expansion via a thermal path. The initial and final states do not lie along an adiabat! Rather, for an ideal gas, the initial and final states lie along an isotherm.
1.10.1 Adiabatic free expansion revisited
Consider once again the adiabatic free expansion of a gas from initial volume V_i to final volume V_f = rV_i. Since the system is not in equilibrium during the free expansion process, the initial and final states do not lie along an adiabat, i.e. they do not have the same entropy. Rather, as we found, from Q = W = 0, we have that E_i = E_f, which means they have the same energy, and, in the case of an ideal gas, the same temperature (assuming N is constant). Thus, the initial and final states lie along an isotherm. The situation is depicted in fig. 1.21. Now let us compute the change in entropy \Delta S = S_f - S_i by integrating along this isotherm. Note that the actual dynamics are irreversible and do not quasistatically follow any continuous thermodynamic path. However, we can use what is a fictitious thermodynamic path as a means of comparing S in the initial and final states.
We have
$$\Delta S = S_f - S_i = \int\limits_{V_i}^{V_f}\! dV \left(\frac{\partial S}{\partial V}\right)_{T,N} \ . \qquad (1.279)$$

But from a Maxwell equation deriving from F, we have

$$\left(\frac{\partial S}{\partial V}\right)_{T,N} = \left(\frac{\partial p}{\partial T}\right)_{V,N} \ , \qquad (1.280)$$

hence

$$\Delta S = \int\limits_{V_i}^{V_f}\! dV \left(\frac{\partial p}{\partial T}\right)_{V,N} \ . \qquad (1.281)$$
For an ideal gas, we can use the equation of state pV = Nk_B T to obtain

$$\left(\frac{\partial p}{\partial T}\right)_{V,N} = \frac{Nk_B}{V} \ . \qquad (1.282)$$

The integral can now be computed:

$$\Delta S = \int\limits_{V_i}^{rV_i}\! dV\, \frac{Nk_B}{V} = Nk_B \ln r \ , \qquad (1.283)$$
as we found before, in eqn. 1.164. What is different about this derivation? Previously, we derived the entropy change from the explicit formula for S(E, V, N). Here, we did not need to know this function. The Maxwell relation allowed us to compute the entropy change using only the equation of state.
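The calculation can be made concrete with a short numerical sketch (my own, not from the text; working per mole, so Nk_B → R): integrate the Maxwell-relation integrand of eqn. 1.281 and recover ΔS = R ln r.

```python
# Sketch: numerically integrate (dp/dT)_V = R/V (one mole of ideal gas)
# from V_i to V_f = r V_i and compare with Delta S = R ln r (eqn. 1.283).
import math

R = 8.314              # J / (mol K), gas constant
r = 5.0                # expansion ratio V_f / V_i
V_i, V_f = 1.0, 5.0    # arbitrary units; only the ratio matters

def integrand(V):
    return R / V       # (dp/dT)_V for one mole of ideal gas

# simple midpoint rule
n = 100000
dV = (V_f - V_i) / n
dS = sum(integrand(V_i + (k + 0.5) * dV) for k in range(n)) * dV

assert abs(dS - R * math.log(r)) < 1e-6   # matches R ln r
```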
1.10.2 Maxwell relations from S(E, V, N)
We can also derive Maxwell relations based on the entropy S(E, V, N) itself. For example, we have

$$dS = \frac{1}{T}\,dE + \frac{p}{T}\,dV - \frac{\mu}{T}\,dN \ . \qquad (1.284)$$

Therefore S = S(E, V, N) and

$$\frac{\partial^2 S}{\partial E\,\partial V} = \left(\frac{\partial(T^{-1})}{\partial V}\right)_{E,N} = \left(\frac{\partial(pT^{-1})}{\partial E}\right)_{V,N} \ , \qquad (1.285)$$
et cetera. Suppose we are given an energy function E(T, V, N). Then
$$dE = T\,dS - p\,dV + \mu\,dN$$
$$\quad = T\left(\frac{\partial S}{\partial T}\right)_{V,N} dT + \left[T\left(\frac{\partial S}{\partial V}\right)_{T,N} - p\right] dV + \left[T\left(\frac{\partial S}{\partial N}\right)_{T,V} + \mu\right] dN$$
$$\quad = T\left(\frac{\partial S}{\partial T}\right)_{V,N} dT + \left[T\left(\frac{\partial p}{\partial T}\right)_{V,N} - p\right] dV - \left[T\left(\frac{\partial \mu}{\partial T}\right)_{V,N} - \mu\right] dN \ , \qquad (1.286)$$
where we've used the Maxwell relations deriving from F to go from the second line to the third line above. How did we know to use those particular Maxwell relations? Because the variables being held constant were T, V, N, which are the natural state variables for the Helmholtz free energy F. At any rate, we now have the relation

$$\left(\frac{\partial E}{\partial V}\right)_{T,N} = T\left(\frac{\partial p}{\partial T}\right)_{V,N} - p \ . \qquad (1.287)$$

The ideal gas law pV = Nk_B T results in the vanishing of the RHS, hence for any substance obeying the ideal gas law we must have E = \nu\,\varepsilon(T) = N\varepsilon(T)/N_A, which is the only possibility for an extensive, volume-independent function E(T, V, N).
1.10.3 van der Waals equation of state
It is clear that the same conclusion follows for any equation of state of the form p(T, V, N) = T f(V/N), where f(V/N) is an arbitrary function of its argument: the ideal gas law remains valid^8. This is not true, however, for the van der Waals equation of state,

$$\left(p + \frac{a}{v^2}\right)(v - b) = RT \ , \qquad (1.288)$$
for which we find (always assuming constant N),

$$\left(\frac{\partial E}{\partial V}\right)_T = \left(\frac{\partial \varepsilon}{\partial v}\right)_T = T\left(\frac{\partial p}{\partial T}\right)_V - p = \frac{a}{v^2} \ , \qquad (1.289)$$

where E(T, V, N) \equiv \nu\,\varepsilon(T, v). We can integrate this to obtain

$$\varepsilon(T, v) = \omega(T) - \frac{a}{v} \ , \qquad (1.290)$$

where \omega(T) is arbitrary. From eqn. 1.33, we immediately have

$$c_V = \left(\frac{\partial \varepsilon}{\partial T}\right)_v = \omega'(T) \ . \qquad (1.291)$$
What about c_p? This requires a bit of work. We start with eqn. 1.34,

$$c_p = \left(\frac{\partial \varepsilon}{\partial T}\right)_p + p\left(\frac{\partial v}{\partial T}\right)_p \qquad (1.292)$$
$$\quad = \omega'(T) + \left(p + \frac{a}{v^2}\right)\left(\frac{\partial v}{\partial T}\right)_p \ . \qquad (1.293)$$

We next take the differential of the equation of state (at constant N):

$$R\,dT = \left(p + \frac{a}{v^2}\right) dv + (v - b)\left(dp - \frac{2a}{v^3}\,dv\right) = \left(p - \frac{a}{v^2} + \frac{2ab}{v^3}\right) dv + (v - b)\,dp \ . \qquad (1.294)$$
We can now read off the result for the volume expansion coefficient,

$$\alpha_p = \frac{1}{v}\left(\frac{\partial v}{\partial T}\right)_p = \frac{1}{v}\,\frac{R}{p - \frac{a}{v^2} + \frac{2ab}{v^3}} \ . \qquad (1.295)$$

We now have for c_p,

$$c_p = \omega'(T) + \left(p + \frac{a}{v^2}\right)\frac{R}{p - \frac{a}{v^2} + \frac{2ab}{v^3}} = \omega'(T) + \frac{R^2 T v^3}{RTv^3 - 2a(v-b)^2} \ , \qquad (1.296)$$
^8 Note V/N = v/N_A.
where v = V N_A/N is the molar volume.

To fix \omega(T), we consider the v \to \infty limit, where the density of the gas vanishes. In this limit, the gas must be ideal, hence eqn. 1.290 says that \omega(T) = \frac{1}{2} f RT. Therefore c_V(T, v) = \frac{1}{2} f R, just as in the case of an ideal gas. However, rather than c_p = c_V + R, which holds for ideal gases, c_p(T, v) is given by eqn. 1.296. Thus,
$$c_V^{\text{VDW}} = \tfrac{1}{2} f R \qquad (1.297)$$

$$c_p^{\text{VDW}} = \tfrac{1}{2} f R + \frac{R^2 T v^3}{RTv^3 - 2a(v-b)^2} \ . \qquad (1.298)$$

Note that c_p(a \to 0) = c_V + R, which is the ideal gas result.
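As a rough numerical illustration (my own sketch, not from the text; the N₂ parameters from table 1.3 and f = 5 are the assumed inputs), eqn. 1.298 gives c_p − c_V slightly in excess of R at ordinary densities, and exactly R in the a → 0 limit:

```python
# Sketch: evaluate eqn. 1.298 for a van der Waals gas.
# Assumed inputs: nitrogen parameters (a, b) from table 1.3, f = 5 (diatomic).
R = 0.08314   # L bar / (mol K)

def cp_vdw(T, v, a, b, f):
    """Molar c_p from eqn. 1.298 (same units as R)."""
    return 0.5 * f * R + R**2 * T * v**3 / (R * T * v**3 - 2.0 * a * (v - b)**2)

a, b, f = 1.408, 0.03913, 5        # nitrogen
cV = 0.5 * f * R                   # eqn. 1.297
excess = cp_vdw(300.0, 24.0, a, b, f) - cV   # c_p - c_V near 1 atm

assert excess > R                           # vdW correction is positive here
assert abs(excess - R) / R < 0.01           # but small at this low density
assert abs(cp_vdw(300.0, 24.0, 0.0, b, f) - (cV + R)) < 1e-12   # a -> 0: ideal gas
```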
1.10.4 Thermodynamic response functions
Consider the entropy S expressed as a function of T, V, and N:

$$dS = \left(\frac{\partial S}{\partial T}\right)_{V,N} dT + \left(\frac{\partial S}{\partial V}\right)_{T,N} dV + \left(\frac{\partial S}{\partial N}\right)_{T,V} dN \ . \qquad (1.299)$$

Dividing by dT, multiplying by T, and assuming dN = 0 throughout, we have

$$C_p - C_V = T\left(\frac{\partial S}{\partial V}\right)_T \left(\frac{\partial V}{\partial T}\right)_p \ . \qquad (1.300)$$
Appealing to a Maxwell relation derived from F(T, V, N), and then appealing to eqn. 1.556, we have

$$\left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial p}{\partial T}\right)_V = -\left(\frac{\partial p}{\partial V}\right)_T \left(\frac{\partial V}{\partial T}\right)_p \ . \qquad (1.301)$$

This allows us to write

$$C_p - C_V = -T\left(\frac{\partial p}{\partial V}\right)_T \left(\frac{\partial V}{\partial T}\right)^{\!2}_p \ . \qquad (1.302)$$
We define the response functions,

$$\text{isothermal compressibility:} \quad \kappa_T = -\frac{1}{V}\left(\frac{\partial V}{\partial p}\right)_T = -\frac{1}{V}\,\frac{\partial^2 G}{\partial p^2} \qquad (1.303)$$

$$\text{adiabatic compressibility:} \quad \kappa_S = -\frac{1}{V}\left(\frac{\partial V}{\partial p}\right)_S = -\frac{1}{V}\,\frac{\partial^2 H}{\partial p^2} \qquad (1.304)$$

$$\text{thermal expansivity:} \quad \alpha_p = \frac{1}{V}\left(\frac{\partial V}{\partial T}\right)_p \ . \qquad (1.305)$$

Thus,

$$C_p - C_V = V\,\frac{T\,\alpha_p^2}{\kappa_T} \ , \qquad (1.306)$$
or, in terms of intensive quantities,

$$c_p - c_V = \frac{v\,T\,\alpha_p^2}{\kappa_T} \ , \qquad (1.307)$$

where, as always, v = V N_A/N is the molar volume.
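A quick consistency check of eqn. 1.307 (my own sketch, not from the text): for an ideal gas, α_p = 1/T and κ_T = 1/p, so vTα_p²/κ_T = pv/T = R.

```python
# Sketch: eqn. 1.307 reduces to c_p - c_V = R for an ideal gas.
R = 8.314  # J / (mol K)

def response_ideal(T, p):
    """Molar volume and response functions of an ideal gas at (T, p)."""
    v = R * T / p            # from p v = R T
    alpha_p = 1.0 / T        # (1/v)(dv/dT)_p
    kappa_T = 1.0 / p        # -(1/v)(dv/dp)_T
    return v, alpha_p, kappa_T

for T, p in [(300.0, 1.0e5), (77.0, 2.5e5)]:
    v, alpha_p, kappa_T = response_ideal(T, p)
    assert abs(v * T * alpha_p**2 / kappa_T - R) < 1e-9
```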
The above relation generalizes to any conjugate force-displacement pair (-p, V) \to (y, X):

$$C_y - C_X = -T\left(\frac{\partial y}{\partial T}\right)_X \left(\frac{\partial X}{\partial T}\right)_y \qquad (1.308)$$
$$\qquad\quad = T\left(\frac{\partial y}{\partial X}\right)_T \left(\frac{\partial X}{\partial T}\right)^{\!2}_y \ . \qquad (1.309)$$

For example, we could have (y, X) = (H, M).
A similar relationship can be derived between the compressibilities \kappa_T and \kappa_S. We then clearly must start with the volume, writing

$$dV = \left(\frac{\partial V}{\partial p}\right)_{S,N} dp + \left(\frac{\partial V}{\partial S}\right)_{p,N} dS + \left(\frac{\partial V}{\partial N}\right)_{S,p} dN \ . \qquad (1.310)$$

Dividing by dp, multiplying by -V^{-1}, and keeping N constant, we have

$$\kappa_T - \kappa_S = -\frac{1}{V}\left(\frac{\partial V}{\partial S}\right)_p \left(\frac{\partial S}{\partial p}\right)_T \ . \qquad (1.311)$$
Again we appeal to a Maxwell relation, writing

$$\left(\frac{\partial S}{\partial p}\right)_T = -\left(\frac{\partial V}{\partial T}\right)_p \ , \qquad (1.312)$$

and after invoking the chain rule,

$$\left(\frac{\partial V}{\partial S}\right)_p = \left(\frac{\partial V}{\partial T}\right)_p \left(\frac{\partial T}{\partial S}\right)_p = \frac{T}{C_p}\left(\frac{\partial V}{\partial T}\right)_p \ , \qquad (1.313)$$

we obtain

$$\kappa_T - \kappa_S = \frac{v\,T\,\alpha_p^2}{c_p} \ . \qquad (1.314)$$
Comparing eqns. 1.307 and 1.314, we find

$$(c_p - c_V)\,\kappa_T = (\kappa_T - \kappa_S)\,c_p = v\,T\,\alpha_p^2 \ . \qquad (1.315)$$

This result entails

$$\frac{c_p}{c_V} = \frac{\kappa_T}{\kappa_S} \ . \qquad (1.316)$$
The corresponding result for magnetic systems is

$$(c_H - c_M)\,\chi_T = (\chi_T - \chi_S)\,c_H = T\left(\frac{\partial m}{\partial T}\right)^{\!2}_H \ , \qquad (1.317)$$

where m = M/\nu is the magnetization per mole of substance, and

$$\text{isothermal susceptibility:} \quad \chi_T = \left(\frac{\partial m}{\partial H}\right)_T = -\frac{1}{\nu}\,\frac{\partial^2 G}{\partial H^2} \qquad (1.318)$$

$$\text{adiabatic susceptibility:} \quad \chi_S = \left(\frac{\partial m}{\partial H}\right)_S = -\frac{1}{\nu}\,\frac{\partial^2 H}{\partial H^2} \ . \qquad (1.319)$$

Here the enthalpy and Gibbs free energy are

$$H = E - HM \ , \quad dH = T\,dS - M\,dH \qquad (1.320)$$
$$G = E - TS - HM \ , \quad dG = -S\,dT - M\,dH \ . \qquad (1.321)$$
Remark: The previous discussion has assumed an isotropic magnetic system where \vec{M} and \vec{H} are collinear, hence \vec{H} \cdot \vec{M} = HM. More generally, one has the tensor susceptibilities

$$\chi^{\alpha\beta}_T = \left(\frac{\partial m^\alpha}{\partial H^\beta}\right)_T = -\frac{1}{\nu}\,\frac{\partial^2 G}{\partial H^\alpha\,\partial H^\beta} \qquad (1.322)$$

$$\chi^{\alpha\beta}_S = \left(\frac{\partial m^\alpha}{\partial H^\beta}\right)_S = -\frac{1}{\nu}\,\frac{\partial^2 H}{\partial H^\alpha\,\partial H^\beta} \ . \qquad (1.323)$$

Here the enthalpy and Gibbs free energy are

$$H = E - \vec{H} \cdot \vec{M} \ , \quad dH = T\,dS - \vec{M} \cdot d\vec{H} \qquad (1.324)$$
$$G = E - TS - \vec{H} \cdot \vec{M} \ , \quad dG = -S\,dT - \vec{M} \cdot d\vec{H} \ . \qquad (1.325)$$
1.10.5 Joule effect: free expansion of a gas
Previously we considered the adiabatic free expansion of an ideal gas. We found that Q = W = 0, hence \Delta E = 0, which means the process is isothermal, since E = \nu\varepsilon(T) is volume-independent. The entropy changes, however, since S(E, V, N) = Nk_B \ln(V/N) + \frac{1}{2} f Nk_B \ln(E/N) + N s_0. Thus,

$$S_f = S_i + Nk_B \ln\!\left(\frac{V_f}{V_i}\right) \ . \qquad (1.326)$$

What happens if the gas is nonideal?

We integrate along a fictitious thermodynamic path connecting initial and final states, where dE = 0 along the path. We have

$$0 = dE = \left(\frac{\partial E}{\partial V}\right)_T dV + \left(\frac{\partial E}{\partial T}\right)_V dT \ , \qquad (1.327)$$
gas              a (L^2 bar/mol^2)   b (L/mol)   p_c (bar)   T_c (K)   v_c (L/mol)
Acetone          14.09               0.0994      52.82       505.1     0.2982
Argon            1.363               0.03219     48.72       150.9     0.0966
Carbon dioxide   3.640               0.04267     74.04       304.0     0.1280
Ethanol          12.18               0.08407     63.83       516.3     0.2522
Freon            10.78               0.0998      40.09       384.9     0.2994
Helium           0.03457             0.0237      2.279       5.198     0.0711
Hydrogen         0.2476              0.02661     12.95       33.16     0.0798
Mercury          8.200               0.01696     1055        1723      0.0509
Methane          2.283               0.04278     46.20       190.2     0.1283
Nitrogen         1.408               0.03913     34.06       128.2     0.1174
Oxygen           1.378               0.03183     50.37       154.3     0.0955
Water            5.536               0.03049     220.6       647.0     0.0915

Table 1.3: Van der Waals parameters for some common gases. (Source: Wikipedia)
hence

$$\left(\frac{\partial T}{\partial V}\right)_E = -\frac{(\partial E/\partial V)_T}{(\partial E/\partial T)_V} = -\frac{1}{C_V}\left(\frac{\partial E}{\partial V}\right)_T \ . \qquad (1.328)$$

We also have

$$\left(\frac{\partial E}{\partial V}\right)_T = T\left(\frac{\partial S}{\partial V}\right)_T - p = T\left(\frac{\partial p}{\partial T}\right)_V - p \ . \qquad (1.329)$$

Thus,

$$\left(\frac{\partial T}{\partial V}\right)_E = \frac{1}{C_V}\left[p - T\left(\frac{\partial p}{\partial T}\right)_V\right] \ . \qquad (1.330)$$
Note that the term in square brackets vanishes for any system obeying the ideal gas law.
For a nonideal gas,
$$\Delta T = \int\limits_{V_i}^{V_f}\! dV \left(\frac{\partial T}{\partial V}\right)_E \ , \qquad (1.331)$$

which is in general nonzero.

Now consider a van der Waals gas, for which

$$\left(p + \frac{a}{v^2}\right)(v - b) = RT \ .$$

We then have

$$p - T\left(\frac{\partial p}{\partial T}\right)_V = -\frac{a}{v^2} = -\frac{a\,\nu^2}{V^2} \ . \qquad (1.332)$$
In §1.10.3 we concluded that C_V = \frac{1}{2} f \nu R for the van der Waals gas, hence

$$\Delta T = -\frac{2a}{fR} \int\limits_{v_i}^{v_f} \frac{dv}{v^2} = \frac{2a}{fR}\left(\frac{1}{v_f} - \frac{1}{v_i}\right) \ . \qquad (1.333)$$

Thus, if V_f > V_i, we have T_f < T_i and the gas cools upon expansion.
Consider O_2 gas with an initial specific volume of v_i = 22.4 L/mol, which is the STP value for an ideal gas, freely expanding to a volume v_f = \infty for maximum cooling. According to table 1.3, a = 1.378 L^2 bar/mol^2, and we have \Delta T = -2a/(fRv_i) = -0.296 K, which is a pitifully small amount of cooling. Adiabatic free expansion is a very inefficient way to cool a gas.
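The estimate above can be reproduced in a few lines (a sketch, not from the text; the O₂ parameter a from table 1.3 and f = 5 are the assumed inputs):

```python
# Sketch: Joule cooling of O2 under adiabatic free expansion,
# Delta T = (2a/fR)(1/v_f - 1/v_i) (eqn. 1.333) with v_f -> infinity.
R = 0.08314          # L bar / (mol K)
a_O2 = 1.378         # L^2 bar / mol^2, table 1.3
f = 5                # diatomic
v_i = 22.4           # L / mol

dT = 2.0 * a_O2 / (f * R) * (0.0 - 1.0 / v_i)   # 1/v_f = 0
assert abs(dT - (-0.296)) < 0.001               # about -0.3 K: tiny cooling
```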
1.10.6 Throttling: the Joule-Thompson effect
In a throttle, depicted in fig. 1.22, a gas is forced through a porous plug which separates regions of different pressures. According to the figure, the work done by a given element of gas is

$$W = \int\limits_0^{V_f}\! dV\, p_f - \int\limits_0^{V_i}\! dV\, p_i = p_f V_f - p_i V_i \ . \qquad (1.334)$$
Now we assume that the system is thermally isolated so that the gas exchanges no heat with its environment, nor with the plug. Then Q = 0, so \Delta E = -W, and

$$E_i + p_i V_i = E_f + p_f V_f \qquad (1.335)$$
$$\Longrightarrow \quad H_i = H_f \ , \qquad (1.336)$$

where H is enthalpy. Thus, the throttling process is isenthalpic. We can therefore study it by defining a fictitious thermodynamic path along which dH = 0. Then, choosing T and p as state variables,

$$0 = dH = \left(\frac{\partial H}{\partial T}\right)_p dT + \left(\frac{\partial H}{\partial p}\right)_T dp \ , \qquad (1.337)$$

hence

$$\left(\frac{\partial T}{\partial p}\right)_H = -\frac{(\partial H/\partial p)_T}{(\partial H/\partial T)_p} \ . \qquad (1.338)$$
The numerator on the RHS is computed by writing dH = T\,dS + V\,dp and then dividing by dp, to obtain

$$\left(\frac{\partial H}{\partial p}\right)_T = V + T\left(\frac{\partial S}{\partial p}\right)_T = V - T\left(\frac{\partial V}{\partial T}\right)_p \ . \qquad (1.339)$$

The denominator is

$$\left(\frac{\partial H}{\partial T}\right)_p = \left(\frac{\partial H}{\partial S}\right)_p \left(\frac{\partial S}{\partial T}\right)_p = T\left(\frac{\partial S}{\partial T}\right)_p = C_p \ . \qquad (1.340)$$
Figure 1.22: In a throttle, a gas is pushed through a porous plug separating regions of different pressure. The change in energy is the work done, hence enthalpy is conserved during the throttling process.
Thus,

$$\left(\frac{\partial T}{\partial p}\right)_H = \frac{1}{c_p}\left[T\left(\frac{\partial v}{\partial T}\right)_p - v\right] = \frac{v}{c_p}\left(T\alpha_p - 1\right) \ , \qquad (1.341)$$

where \alpha_p = \frac{1}{V}\left(\frac{\partial V}{\partial T}\right)_p is the volume expansion coefficient.
From the van der Waals equation of state, we obtain, from eqn. 1.295,

$$T\alpha_p = \frac{T}{v}\left(\frac{\partial v}{\partial T}\right)_p = \frac{RT/v}{p - \frac{a}{v^2} + \frac{2ab}{v^3}} = \frac{v - b}{v - \frac{2a}{RT}\left(\frac{v-b}{v}\right)^{\!2}} \ . \qquad (1.342)$$

Assuming v \gg \frac{a}{RT}, b, we have

$$\left(\frac{\partial T}{\partial p}\right)_H = \frac{1}{c_p}\left(\frac{2a}{RT} - b\right) \ . \qquad (1.343)$$
Thus, for T > T^* = \frac{2a}{Rb}, we have \left(\frac{\partial T}{\partial p}\right)_H < 0 and the gas heats up upon an isenthalpic pressure decrease. For T < T^*, the gas cools under such conditions.

In fact, there are two inversion temperatures T^*_{1,2} for the van der Waals gas. To see this, we set T\alpha_p = 1, which is the criterion for inversion. From eqn. 1.342 it is easy to derive

$$\frac{b}{v} = 1 - \sqrt{\frac{bRT}{2a}} \ . \qquad (1.344)$$
We insert this into the van der Waals equation of state to derive a relationship T = T^*(p) at which T\alpha_p = 1 holds. After a little work, we find

$$p = -\frac{3RT}{2b} + \sqrt{\frac{8aRT}{b^3}} - \frac{a}{b^2} \ . \qquad (1.345)$$

This is a quadratic equation for \sqrt{T}, the solution of which is

$$T^*(p) = \frac{2a}{9\,bR}\left(2 \pm \sqrt{1 - \frac{3b^2 p}{a}}\,\right)^{\!2} \ . \qquad (1.346)$$
Figure 1.23: Inversion temperature T^*(p) for the van der Waals gas. Pressure and temperature are given in terms of p_c = a/27b^2 and T_c = 8a/27bR, respectively.
In fig. 1.23 we plot pressure versus temperature in scaled units, showing the curve along which \left(\frac{\partial T}{\partial p}\right)_H = 0. The volume, pressure, and temperature scales defined are

$$v_c = 3b \ , \quad p_c = \frac{a}{27\,b^2} \ , \quad T_c = \frac{8a}{27\,bR} \ . \qquad (1.347)$$
Values for p_c, T_c, and v_c are provided in table 1.3. If we define \bar{v} = v/v_c, \bar{p} = p/p_c, and \bar{T} = T/T_c, then the van der Waals equation of state may be written in dimensionless form:

$$\left(\bar{p} + \frac{3}{\bar{v}^2}\right)\left(3\bar{v} - 1\right) = 8\bar{T} \ . \qquad (1.348)$$

In terms of the scaled parameters, the equation for the inversion curve \left(\frac{\partial T}{\partial p}\right)_H = 0 becomes

$$\bar{p} = 9 - 36\left(1 - \sqrt{\tfrac{1}{3}\bar{T}}\,\right)^{\!2} \quad \Longleftrightarrow \quad \bar{T} = 3\left(1 \pm \tfrac{1}{2}\sqrt{1 - \tfrac{1}{9}\bar{p}}\,\right)^{\!2} \ . \qquad (1.349)$$
Thus, there is no inversion for p > 9\,p_c. We are usually interested in the upper inversion temperature, T^*_2, corresponding to the upper sign in eqn. 1.346. The maximum inversion temperature occurs for p = 0, where T^*_{\text{max}} = \frac{2a}{bR} = \frac{27}{4}\,T_c. For H_2, from the data in table 1.3, we find T^*_{\text{max}}(H_2) = 224 K, which is within 10% of the experimentally measured value of 205 K.

What happens when H_2 gas leaks from a container with T > T^*_2? Since \left(\frac{\partial T}{\partial p}\right)_H < 0 and \Delta p < 0, we have \Delta T > 0. The gas warms up, and the heat facilitates the reaction 2 H_2 + O_2 \to 2 H_2O, which releases energy, and we have a nice explosion.
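The quoted numbers are easy to check directly (a sketch, not part of the original text, using the H₂ parameters of table 1.3):

```python
# Sketch: maximum inversion temperature T*_max = 2a/(bR) = (27/4) T_c for H2,
# and the upper branch of the inversion curve (eqn. 1.349) at pbar = 0.
R = 0.08314                      # L bar / (mol K)
a, b = 0.2476, 0.02661           # H2, table 1.3

T_max = 2.0 * a / (b * R)
T_c = 8.0 * a / (27.0 * b * R)

assert abs(T_max - 224.0) < 1.0          # roughly 224 K, as quoted
assert abs(T_max - 6.75 * T_c) < 1e-9    # T*_max = (27/4) T_c

Tbar = 3.0 * (1.0 + 0.5 * (1.0 - 0.0 / 9.0) ** 0.5) ** 2
assert abs(Tbar - 6.75) < 1e-12          # eqn. 1.349 at pbar = 0 gives 27/4
```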
1.11 Entropy of Mixing and the Gibbs Paradox
Entropy is widely understood as a measure of disorder. Of course, such a definition should be supplemented by a more precise definition of disorder; after all, one man's trash is another man's treasure. To gain some intuition about entropy, let us explore the mixing of a multicomponent gas. Let N = \sum_a N_a be the total number of particles of all species, and let x_a = N_a/N be the concentration of species a. Note that \sum_a x_a = 1. For a single component ideal gas, we have

$$G(T, p, N) = Nk_B T \left[\ln p + \phi(T)\right] \ , \qquad (1.350)$$
where \phi(T) is a function of T alone. To see this, start with the energy, assumed to be of the form

$$E(T, V, N) = Nk_B\,\omega(T) \ , \qquad (1.351)$$

where \omega(T) is arbitrary. Then, invoking the First Law, we have

$$dS = \frac{Nk_B\,\omega'(T)}{T}\,dT + \frac{p}{T}\,dV = \frac{Nk_B\,\omega'(T)}{T}\,dT + \frac{p}{T}\,d\!\left(\frac{Nk_B T}{p}\right) = Nk_B\left[\omega'(T) + 1\right]\frac{dT}{T} - Nk_B\,\frac{dp}{p} \ , \qquad (1.352)$$
where we used the ideal gas law pV = Nk_B T. Thus, we can find S(T, p, N):

$$S(T, p, N) = Nk_B \!\int\limits^T\! dT'\, \frac{\omega'(T')}{T'} + Nk_B \ln T - Nk_B \ln p + N s_0 \ , \qquad (1.353)$$

where s_0 is a constant. From Gibbs-Duhem, we know

$$G = E - TS + pV = Nk_B\,\omega(T) - Nk_B T \!\int\limits^T\! dT'\, \frac{\omega'(T')}{T'} - Nk_B T \ln T + Nk_B T \ln p - N T s_0 + Nk_B T \equiv Nk_B T \left[\ln p + \phi(T)\right] \ , \qquad (1.354)$$

where

$$\phi(T) = \phi_0 - \ln T - \int\limits^T\! dT'\, \frac{\omega(T')}{T'^2} \ , \qquad (1.355)$$

where \phi_0 is a constant. For an ideal gas, \omega(T) = \frac{1}{2} f T, and

$$\phi(T) = \phi_0 - \left(\tfrac{1}{2} f + 1\right) \ln T \ . \qquad (1.356)$$
Figure 1.24: A multicomponent system consisting of isolated gases, each at temperature T and pressure p. The system entropy increases when all the walls between the different subsystems are removed.
Now consider a multicomponent system, with each subsystem at temperature T and pressure p, as depicted in fig. 1.24. We can imagine that the individual components are separated from each other by partitions. We then have

$$G_{\text{unmixed}} = \sum_a N_a k_B T \left[\ln p + \phi_a(T)\right] \ . \qquad (1.357)$$

Now remove the partitions and allow the gases to mix. The components can now exchange volume, and will come to mechanical equilibrium at a constant overall pressure p. The net pressure is a sum over partial pressures from all the components:

$$p = \sum_a p_a \ , \quad p_a = x_a\,p \ . \qquad (1.358)$$

Therefore

$$G_{\text{mixed}} = \sum_a N_a k_B T \left[\ln x_a + \ln p + \phi_a(T)\right] \ , \qquad (1.359)$$

and we conclude

$$G_{\text{mixed}} - G_{\text{unmixed}} = Nk_B T \sum_a x_a \ln x_a \ . \qquad (1.360)$$
Since E and \sum_a p\,V_a = pV do not change, we conclude that there is a change in entropy:

$$\Delta S = -Nk_B \sum_a x_a \ln x_a \geq 0 \ . \qquad (1.361)$$
This is called the entropy of mixing.

Now for the Gibbs paradox: what if all the components were initially identical? Why should the entropy change? The answer to this paradox will be found when we discuss quantum statistics!
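For concreteness, eqn. 1.361 is trivial to evaluate; the following sketch (mine, not from the text, with hypothetical concentrations) computes ΔS per particle in units of k_B:

```python
# Sketch: entropy of mixing per particle, Delta S/(N k_B) = -sum_a x_a ln x_a.
import math

def mixing_entropy(x):
    """Delta S / (N k_B) for concentrations x_a (must sum to 1)."""
    assert abs(sum(x) - 1.0) < 1e-12
    return -sum(xa * math.log(xa) for xa in x if xa > 0)

# equimolar binary mixture: ln 2 per particle
assert abs(mixing_entropy([0.5, 0.5]) - math.log(2)) < 1e-12
# a pure system has no mixing entropy
assert mixing_entropy([1.0]) == 0.0
# mixing entropy is largest for equal concentrations
assert mixing_entropy([0.9, 0.1]) < math.log(2)
```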
1.11.1 Entropy and combinatorics
As we shall learn when we study statistical mechanics, the entropy may be interpreted in terms of the number of ways W(E, V, N) a system at fixed energy and volume can arrange itself. One has

$$S(E, V, N) = k_B \ln W(E, V, N) \ . \qquad (1.362)$$

Consider a system composed of M boxes, each of which can accommodate at most one particle. If there are N particles, then the number of ways the system can be arranged is

$$W(M, N) = \binom{M}{N} = \frac{M!}{N!\,(M-N)!} \ . \qquad (1.363)$$
This result assumes that the N particles are all indistinguishable from one another. Thus, if we have M = 3 boxes and N = 2 particles, there are only \binom{3}{2} = 3 ways the system can be arranged: the empty box can be either #1, #2, or #3, and in each case the remaining two boxes are full. If box #1 is empty, then we don't generate a new state by permuting the particles in boxes #2 and #3. Were the particles all distinguishable, then we'd have to multiply W by the N! possible arrangements of the occupied boxes, and we'd instead obtain

$$W_{\text{distinct}}(M, N) = \frac{M!}{(M-N)!} \ . \qquad (1.364)$$

Now let us write N = \rho M, where \rho \in [0, 1] is a dimensionless measure of the density, and let us use Stirling's approximation,

$$\ln K! = K \ln K - K + \tfrac{1}{2} \ln K + \tfrac{1}{2} \ln(2\pi) + \mathcal{O}(K^{-1}) \ , \qquad (1.365)$$
which is asymptotically correct when K is large. We'll only need the first two terms on the RHS, since the remaining terms are not extensive in K. We have

$$\ln W = \left(M \ln M - M\right) - \left(\rho M \ln(\rho M) - \rho M\right) - \left((1-\rho)M \ln\!\big((1-\rho)M\big) - (1-\rho)M\right) = -M\left[\rho \ln \rho + (1-\rho) \ln(1-\rho)\right] \ . \qquad (1.366)$$

Note that S = k_B \ln W is extensive, scaling as M^1.
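The quality of the extensive Stirling estimate is easy to check numerically (a sketch of mine, not from the text), comparing eqn. 1.366 against the exact logarithm of the binomial coefficient in eqn. 1.363:

```python
# Sketch: exact ln binom(M, N) vs the extensive estimate of eqn. 1.366,
# ln W ~ -M [rho ln rho + (1 - rho) ln(1 - rho)].
import math

def lnW_exact(M, N):
    return math.lgamma(M + 1) - math.lgamma(N + 1) - math.lgamma(M - N + 1)

def lnW_stirling(M, rho):
    return -M * (rho * math.log(rho) + (1 - rho) * math.log(1 - rho))

M = 10**6
N = 300_000
rho = N / M

# the neglected terms are O(ln M), i.e. a tiny relative error for large M
rel_err = abs(lnW_exact(M, N) - lnW_stirling(M, rho)) / lnW_exact(M, N)
assert rel_err < 1e-4
```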
Now suppose we have a system composed of \sigma isolated subsystems, each labeled by an index a, with a \in \{1, \ldots, \sigma\}. Each subsystem is composed of M_a boxes containing N_a = \rho M_a particles. It is important here that the dimensionless density \rho_a = \rho is the same for each subsystem; otherwise the analysis is more tedious. If all the subsystems are independent, we must have

$$W_{\text{total}} = \prod_{a=1}^{\sigma} W_a \ , \qquad (1.367)$$

which says that the entropies add:

$$S = \sum_{a=1}^{\sigma} S_a \qquad (1.368)$$
$$\quad = -\left(\sum_{a=1}^{\sigma} M_a\right) k_B \left[\rho \ln \rho + (1-\rho) \ln(1-\rho)\right] \ . \qquad (1.369)$$
Figure 1.25: Two chambers with different species of particles, each at the same density and temperature, are permitted to mix. The resulting entropy is greater by an amount \Delta S_{\text{mix}}, the entropy of mixing.
This is exactly the result we would have obtained had we removed all the walls between the different subsystems, allowing them to mix. In that case, the total number of boxes would be M_{\text{total}} = \sum_a M_a, and the number of particles is N_{\text{total}} = \sum_a N_a = \rho M_{\text{total}}, and applying eqn. 1.366 we obtain the desired result. We see that mixing the particles, if they are all indistinguishable, does not lead to a change in entropy.
However, suppose the different subsystems each contained a different species of particle. That is, within a given subsystem, all the particles are the same and are indistinguishable (say all O_2 molecules), but different subsystems contain different species (e.g. O_2, N_2, H_2, He, etc.). The number of possible configurations for the mixed system is now much larger, by a factor

$$\frac{W_{\text{distinguishable}}}{W_{\text{identical}}} = \frac{(N_1 + N_2 + \ldots + N_\sigma)!}{N_1!\,N_2! \cdots N_\sigma!} \ , \qquad (1.370)$$

where \sigma is the number of species, i.e. the number of subsystems. Why is this the correct combinatoric factor? Well, we have N_{\text{total}} = \sum_{a=1}^{\sigma} N_a occupied boxes, and if all of the particles were distinguishable, each such configuration would allow for N_{\text{total}}! possible arrangements within the boxes. Since not all the particles are distinguishable, this correction factor is itself too big. We must divide by the product \prod_{a=1}^{\sigma}(N_a!), because for each species there are N_a! ways to arrange those particles among themselves, were they distinguishable; this is the degree of overcounting for each species.
We conclude that the entropy of mixing is given by

$$\Delta S_{\text{mix}} = 0 \quad \text{(all species identical)} \qquad (1.371)$$
$$\Delta S_{\text{mix}} = -Nk_B \sum_{a=1}^{\sigma} x_a \ln x_a \quad \text{(all species distinct)} \qquad (1.372)$$

where x_a = N_a/N, and N = \sum_{b=1}^{\sigma} N_b is the total number of particles among all species.
1.11.2 Weak solutions and osmotic pressure
Suppose one of the species is much more plentiful than all the others, and label it with a = 0. We will call this the solvent. The entropy of mixing is then

$$\Delta S_{\text{mix}} = -k_B \left[N_0 \ln\!\left(\frac{N_0}{N_0 + N'}\right) + \sum_{a=1}^{\sigma} N_a \ln\!\left(\frac{N_a}{N_0 + N'}\right)\right] \ , \qquad (1.373)$$

where N' = \sum_{a=1}^{\sigma} N_a is the total number of solute molecules, summed over all species. We assume the solution is weak, which means N_a \leq N' \ll N_0. Expanding in powers of N'/N_0 and N_a/N_0, we find

$$\Delta S_{\text{mix}} = -k_B \sum_{a=1}^{\sigma} \left[N_a \ln\!\left(\frac{N_a}{N_0}\right) - N_a\right] + \mathcal{O}\!\left(N'^2/N_0\right) \ . \qquad (1.374)$$
Consider now a solution consisting of N_0 molecules of a solvent and N_a molecules of species a of solute, where a = 1, \ldots, K. We can expand the Gibbs free energy G(T, p, N_0, N_1, \ldots, N_K), where there are K species of solutes, as a power series in the small quantities N_a. We have

$$G\big(T, p, N_0, \{N_a\}\big) = N_0\,g_0(T, p) + k_B T \sum_a N_a \ln\!\left(\frac{N_a}{e N_0}\right) + \sum_a N_a\,\psi_a(T, p) + \frac{1}{2 N_0} \sum_{a,b} A_{ab}(T, p)\,N_a N_b \ . \qquad (1.375)$$
The first term on the RHS corresponds to the Gibbs free energy of the solvent. The second term is due to the entropy of mixing. The third term is the contribution to the total free energy from the individual species. Note the factor of e in the denominator inside the logarithm, which accounts for the second term in the brackets on the RHS of eqn. 1.374. The last term is due to interactions between the species; it is truncated at second order in the solute numbers.
The chemical potential for the solvent is

$$\mu_0(T, p) = \frac{\partial G}{\partial N_0} = g_0(T, p) - k_B T \sum_a x_a - \frac{1}{2} \sum_{a,b} A_{ab}(T, p)\,x_a x_b \ , \qquad (1.376)$$

and the chemical potential for species a is

$$\mu_a(T, p) = \frac{\partial G}{\partial N_a} = k_B T \ln x_a + \psi_a(T, p) + \sum_b A_{ab}(T, p)\,x_b \ , \qquad (1.377)$$

where x_a = N_a/N_0 is the concentration of solute species a. By assumption, the last term on the RHS of each of these equations is small, since N_{\text{solute}} \ll N_0, where N_{\text{solute}} = \sum_{a=1}^{K} N_a
Figure 1.26: Osmotic pressure causes the column on the right side of the U-tube to rise higher than the column on the left by an amount \Delta h = \pi/\varrho g.
is the total number of solute molecules. To lowest order, then, we have

$$\mu_0(T, p) = g_0(T, p) - x\,k_B T \qquad (1.378)$$
$$\mu_a(T, p) = k_B T \ln x_a + \psi_a(T, p) \ , \qquad (1.379)$$

where x = \sum_a x_a is the total solute concentration.
If we add sugar to a solution confined by a semipermeable membrane^9, the pressure increases! To see why, consider a situation where a rigid semipermeable membrane separates a solution (solvent plus solutes) from a pure solvent. There is energy exchange through the membrane, so the temperature is T throughout. There is no volume exchange, however: dV = dV' = 0, hence the pressure need not be the same. Since the membrane is permeable to the solvent, we have that the chemical potential \mu_0 is the same on each side. This means

$$g_0(T, p_R) - x\,k_B T = g_0(T, p_L) \ , \qquad (1.380)$$

where p_{L,R} is the pressure on the left and right sides of the membrane, and x = N/N_0 is again the total solute concentration. This equation once again tells us that the pressure p cannot be the same on both sides of the membrane. If the pressure difference is small, we can expand in powers of the osmotic pressure, \pi \equiv p_R - p_L, and we find

$$\pi = \frac{x\,k_B T}{\left(\partial \mu_0/\partial p\right)_T} \ . \qquad (1.381)$$
But a Maxwell relation (cf. §1.8) guarantees

$$\left(\frac{\partial \mu}{\partial p}\right)_{T,N} = \left(\frac{\partial V}{\partial N}\right)_{T,p} = v(T, p)/N_A \ , \qquad (1.382)$$

where v(T, p) is the molar volume of the solvent. Thus,

$$\pi v = x R T \ , \qquad (1.383)$$

^9 Semipermeable in this context means permeable to the solvent but not the solute(s).
which looks very much like the ideal gas law, even though we are talking about dense (but weak) solutions! The resulting pressure has a demonstrable effect, as sketched in fig. 1.26. Consider a solution containing \nu moles of sucrose (C_{12}H_{22}O_{11}) per kilogram (55.52 mol) of water at 30° C. We find \pi = 2.5 atm when \nu = 0.1.
One might worry about the expansion in powers of \pi when \pi is much larger than the ambient pressure. But in fact the next term in the expansion is smaller than the first term by a factor of \pi \kappa_T, where \kappa_T is the isothermal compressibility. For water one has \kappa_T \approx 4.4 \times 10^{-5}\ (\text{atm})^{-1}, hence we can safely ignore the higher order terms in the Taylor expansion.
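The sucrose estimate above is reproduced by the following sketch (mine, not from the text; the molar volume of water is taken as 0.018 L/mol):

```python
# Sketch: osmotic pressure from pi * v = x R T (eqn. 1.383) for nu = 0.1 mol
# of sucrose per 55.52 mol (1 kg) of water at 30 C.
R = 0.08206        # L atm / (mol K)
T = 303.0          # K, i.e. 30 C
v = 0.018          # L / mol, molar volume of water
nu = 0.1           # mol sucrose per kg water
x = nu / 55.52     # total solute concentration

pi = x * R * T / v
assert abs(pi - 2.5) < 0.1   # about 2.5 atm, as quoted
```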
1.11.3 Effect of impurities on boiling and freezing points
Along the coexistence curve separating liquid and vapor phases, the chemical potentials of the two phases are identical:

$$\mu^0_L(T, p) = \mu^0_V(T, p) \ . \qquad (1.384)$$

Here we write \mu^0 for \mu to emphasize that we are talking about a phase with no impurities present. This equation provides a single constraint on the two variables T and p, hence one can, in principle, solve to obtain T = T^*_0(p), which is the equation of the liquid-vapor coexistence curve in the (T, p) plane. Now suppose there is a solute present in the liquid. We then have

$$\mu_L(T, p, x) = \mu^0_L(T, p) - x\,k_B T \ , \qquad (1.385)$$

where x is the dimensionless solute concentration, summed over all species. The condition for liquid-vapor coexistence now becomes

$$\mu^0_L(T, p) - x\,k_B T = \mu^0_V(T, p) \ . \qquad (1.386)$$
This will lead to a shift in the boiling temperature at fixed p. Assuming this shift is small, let us expand to lowest order in \left(T - T^*_0(p)\right), writing

$$\mu^0_L(T^*_0, p) + \left(\frac{\partial \mu^0_L}{\partial T}\right)_p \left(T - T^*_0\right) - x\,k_B T = \mu^0_V(T^*_0, p) + \left(\frac{\partial \mu^0_V}{\partial T}\right)_p \left(T - T^*_0\right) \ . \qquad (1.387)$$
Note that

$$\left(\frac{\partial \mu}{\partial T}\right)_{p,N} = -\left(\frac{\partial S}{\partial N}\right)_{T,p} \qquad (1.388)$$

from a Maxwell relation deriving from exactness of dG. Since S is extensive, we can write S = (N/N_A)\,s(T, p), where s(T, p) is the molar entropy. Solving for T, we obtain

$$T^*(p, x) = T^*_0(p) + \frac{x R \left[T^*_0(p)\right]^2}{\ell_v(p)} \ , \qquad (1.389)$$

where \ell_v = T^*_0\,(s_V - s_L) is the latent heat of the liquid-vapor transition^{10}. The shift \Delta T^* = T^* - T^*_0 is called the boiling point elevation.

^{10} We shall discuss latent heat again in §1.13.2 below.
As an example, consider seawater, which contains approximately 35 g of dissolved Na⁺Cl⁻ per kilogram of H₂O. The atomic masses of Na and Cl are 23.0 and 35.4, respectively, hence the total ionic concentration in seawater (neglecting everything but sodium and chlorine) is

$x = \frac{2 \cdot 35}{23.0 + 35.4} \cdot \frac{18}{1000} \approx 0.022$ .  (1.390)
The latent heat of vaporization of H₂O at atmospheric pressure is $\ell = 40.7\,$kJ/mol, hence

$\Delta T^* = \frac{(0.022)\,(8.3\,{\rm J/mol\,K})\,(373\,{\rm K})^2}{4.1\times 10^4\,{\rm J/mol}} \approx 0.6\,{\rm K}$ .  (1.391)
Put another way, the boiling point elevation of H₂O at atmospheric pressure is about 0.28°C per percent solute. We can express this as $\Delta T^* = Km$, where the molality $m$ is the number of moles of solute per kilogram of solvent. For H₂O, we find $K = 0.51\,$°C·kg/mol.
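The numbers above are easy to check; a minimal sketch using eqn. 1.389 and the seawater concentration and latent heat as quoted in the text:

```python
# Boiling point elevation, Delta T* = x R T0^2 / ell  (eqn. 1.389),
# evaluated with the seawater numbers from the text.

R = 8.3          # gas constant, J / (mol K)
T0 = 373.0       # boiling point of pure water at 1 atm, K
ell = 40.7e3     # latent heat of vaporization of H2O, J/mol

# dimensionless ionic concentration: 35 g NaCl per kg H2O, fully dissociated
x = (2 * 35 / (23.0 + 35.4)) / (1000 / 18)

dT = x * R * T0**2 / ell
print(f"x = {x:.3f}, boiling point elevation = {dT:.2f} K")   # ~0.6 K
```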
Similar considerations apply at the freezing point. The latent heat of fusion for H₂O is about $\ell_{\rm f} = T^0_{\rm f}\,(s_{\rm LIQUID} - s_{\rm SOLID}) = 6.01\,$kJ/mol¹¹. We thus predict a freezing point depression of $\Delta T^* = -x R\,\big(T^*_0\big)^2/\ell_{\rm f} = -1.03\,°{\rm C}\cdot x[\%]$. This can be expressed once again as $\Delta T^* = -Km$, with $K = 1.86\,$°C·kg/mol¹².
1.12 Some Concepts in Thermochemistry
1.12.1 Chemical reactions and the law of mass action
Suppose we have a chemical reaction among $\sigma$ species, written as

$\zeta_1\,A_1 + \zeta_2\,A_2 + \cdots + \zeta_\sigma\,A_\sigma = 0$ ,  (1.392)

where

$A_a$ = chemical formula
$\zeta_a$ = stoichiometric coefficient .
For example, we could have

$-3\,{\rm H}_2 - {\rm N}_2 + 2\,{\rm NH}_3 = 0 \qquad (3\,{\rm H}_2 + {\rm N}_2 \rightleftharpoons 2\,{\rm NH}_3)$  (1.393)

for which

$\zeta({\rm H}_2) = -3~,\quad \zeta({\rm N}_2) = -1~,\quad \zeta({\rm NH}_3) = 2$ .  (1.394)
When $\zeta_a > 0$, the corresponding $A_a$ is a product; when $\zeta_a < 0$, the corresponding $A_a$ is a reactant.
¹¹See table 1.6, and recall $M = 18\,$g is the molar mass of H₂O.
¹²It is more customary to write $\Delta T^* = T^*_{\rm pure~solvent} - T^*_{\rm solution}$ in the case of the freezing point depression, in which case $\Delta T^*$ is positive.
70 CHAPTER 1. THERMODYNAMICS
Now we ask: what are the conditions for equilibrium? At constant $T$ and $p$, which is typical for many chemical reactions, the condition is that $G\big(T, p, \{N_a\}\big)$ be a minimum. Now

$dG = -S\,dT + V\,dp + \sum_a \mu_a\,dN_a$ ,  (1.395)

so if we let the reaction go forward, we have $dN_a = \zeta_a$, and if it runs in reverse we have $dN_a = -\zeta_a$. Thus, setting $dT = dp = 0$, we have the equilibrium condition

$\sum_{a=1}^{\sigma} \zeta_a\,\mu_a = 0$ .  (1.396)
Let us investigate the consequences of this relation for ideal gases. The chemical potential of the $a^{\rm th}$ species is

$\mu_a(T, p) = k_{\rm B}T\,\phi_a(T) + k_{\rm B}T \ln p_a$ ,  (1.397)

as we found above in eqn. 1.359. Here $p_a = p\,x_a$ is the partial pressure of species $a$, where $x_a = N_a\big/\sum_b N_b$ is the concentration of species $a$. Chemists sometimes write $x_a = [A_a]$ for
the concentration of species $a$. In equilibrium we must have

$\sum_a \zeta_a\,\big(\ln p + \ln x_a + \phi_a(T)\big) = 0$ ,  (1.398)

which says

$\sum_a \zeta_a \ln x_a = -\sum_a \zeta_a\,\big(\ln p + \phi_a(T)\big)$ .  (1.399)
Exponentiating, we obtain the law of mass action:

$\prod_a x_a^{\zeta_a} = p^{-\sum_a \zeta_a}\,\exp\Big(-\sum_a \zeta_a\,\phi_a(T)\Big) \equiv \kappa(p, T)$ .  (1.400)
The quantity $\kappa(p,T)$ is called the equilibrium constant. When $\kappa$ is large, the LHS of the above equation is large. This favors maximal concentration $x_a$ for the products ($\zeta_a > 0$) and minimal concentration $x_a$ for the reactants ($\zeta_a < 0$). This means that the equation REACTANTS ⇌ PRODUCTS is shifted to the right, i.e. the products are plentiful and the reactants are scarce. When $\kappa$ is small, the LHS is small and the reaction is shifted to the left, i.e. the reactants are plentiful and the products are scarce. Remember we are describing equilibrium conditions here. Now we observe that reactions for which $\sum_a \zeta_a > 0$ shift to the left with increasing pressure and to the right with decreasing pressure, while for reactions with $\sum_a \zeta_a < 0$ the situation is reversed: they shift to the right with increasing pressure and to the left with decreasing pressure. When $\sum_a \zeta_a = 0$ there is no shift upon increasing or decreasing pressure.
The rate at which the equilibrium constant changes with temperature is given by

$\Big(\frac{\partial \ln \kappa}{\partial T}\Big)_p = -\sum_a \zeta_a\,\phi'_a(T)$ .  (1.401)
Now from eqn. 1.397 we have that the enthalpy per particle for species $a$ is

$h_a = \mu_a - T\,\Big(\frac{\partial \mu_a}{\partial T}\Big)_p$ ,  (1.402)

since $H = G + TS$ and $S = -\big(\frac{\partial G}{\partial T}\big)_p$. We find

$h_a = -k_{\rm B}T^2\,\phi'_a(T)$ ,  (1.403)
and thus

$\Big(\frac{\partial \ln \kappa}{\partial T}\Big)_p = \frac{\sum_a \zeta_a\,h_a}{k_{\rm B}T^2} = \frac{\Delta h}{k_{\rm B}T^2}$ ,  (1.404)

where $\Delta h$ is the enthalpy of the reaction, which is the heat absorbed or emitted as a result of the reaction.

When $\Delta h > 0$ the reaction is endothermic and the yield increases with increasing $T$. When $\Delta h < 0$ the reaction is exothermic and the yield decreases with increasing $T$.
As an example, consider the reaction H₂ + I₂ ⇌ 2 HI. We have

$\zeta({\rm H}_2) = -1~,\quad \zeta({\rm I}_2) = -1~,\quad \zeta({\rm HI}) = 2$ .  (1.405)
Suppose our initial system consists of $\nu^0_1$ moles of H₂, $\nu^0_2 = 0$ moles of I₂, and $\nu^0_3$ moles of undissociated HI. These mole numbers determine the initial concentrations $x^0_a$, where $x_a = \nu_a\big/\sum_b \nu_b$. Define

$\alpha \equiv \frac{x^0_3 - x_3}{x^0_3}$ ,  (1.406)
in which case we have

$x_1 = x^0_1 + \tfrac{1}{2}\alpha\,x^0_3~,\quad x_2 = \tfrac{1}{2}\alpha\,x^0_3~,\quad x_3 = (1 - \alpha)\,x^0_3$ .  (1.407)
Then the law of mass action gives

$\frac{4\,(1 - \alpha)^2}{\alpha\,(\alpha + 2r)} = \kappa$ ,  (1.408)

where $r \equiv x^0_1/x^0_3 = \nu^0_1/\nu^0_3$. This yields a quadratic equation, which can be solved to find $\alpha(\kappa, r)$. Note that $\kappa = \kappa(T)$ for this reaction since $\sum_a \zeta_a = 0$. The enthalpy of this reaction is positive: $\Delta h > 0$.
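The quadratic just mentioned can be solved explicitly. A minimal sketch: rearranging $4(1-\alpha)^2 = \kappa\,\alpha(\alpha + 2r)$ gives $(4-\kappa)\,\alpha^2 - (8 + 2\kappa r)\,\alpha + 4 = 0$, whose physical root lies in $[0,1]$.

```python
# Dissociation fraction alpha(kappa, r) for H2 + I2 <-> 2 HI  (eqn. 1.408).
# Rearranged: (4 - kappa) a^2 - (8 + 2 kappa r) a + 4 = 0.
import math

def dissociation_fraction(kappa, r):
    A = 4.0 - kappa
    B = -(8.0 + 2.0 * kappa * r)
    C = 4.0
    if abs(A) < 1e-12:          # kappa = 4: the equation degenerates to linear
        return -C / B
    disc = B * B - 4 * A * C
    roots = [(-B + s * math.sqrt(disc)) / (2 * A) for s in (+1, -1)]
    # only the root in [0, 1] is a physical concentration fraction
    return next(a for a in roots if 0.0 <= a <= 1.0)

a = dissociation_fraction(kappa=1.0, r=0.0)   # pure HI initially
print(f"alpha = {a:.4f}")                     # 2/3 for kappa = 1, r = 0
```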
1.12.2 Enthalpy of formation
Most chemical reactions take place under constant pressure. The heat $Q_{if}$ associated with a given isobaric process is

$Q_{if} = \int_i^f dE + \int_i^f p\,dV = (E_f - E_i) + p\,(V_f - V_i) = H_f - H_i$ ,  (1.409)
where $H$ is the enthalpy,

$H = E + pV$ .  (1.410)

Note that the enthalpy $H$ is a state function, since $E$ is a state function and $p$ and $V$ are state variables. Hence, we can meaningfully speak of changes in enthalpy: $\Delta H = H_f - H_i$. If $\Delta H < 0$ for a given reaction, we call it exothermic; this is the case when $Q_{if} < 0$, and thus heat is transferred to the surroundings. Such reactions can occur spontaneously, and, in really fun cases, can produce explosions. The combustion of fuels is always exothermic. If $\Delta H > 0$, the reaction is called endothermic. Endothermic reactions require that heat be supplied in order for the reaction to proceed. Photosynthesis is an example of an endothermic reaction.
Suppose we have two reactions

${\rm A} + {\rm B} \xrightarrow{(\Delta H)_1} {\rm C}$  (1.411)

and

${\rm C} + {\rm D} \xrightarrow{(\Delta H)_2} {\rm E}$ .  (1.412)

Then we may write

${\rm A} + {\rm B} + {\rm D} \xrightarrow{(\Delta H)_3} {\rm E}$ ,  (1.413)

with

$(\Delta H)_1 + (\Delta H)_2 = (\Delta H)_3$ .  (1.414)
We can use this additivity of reaction enthalpies to define a standard molar enthalpy of formation. We first define the standard state of a pure substance at a given temperature to be its state (gas, liquid, or solid) at a pressure $p = 1\,$bar. The standard reaction enthalpies at a given temperature are then defined to be the reaction enthalpies when the reactants and products are all in their standard states. Finally, we define the standard molar enthalpy of formation $\Delta H^0_{\rm f}({\rm X})$ of a compound X at temperature $T$ as the reaction enthalpy for the compound X to be produced by its constituents when they are in their standard state. For example, if X = SO₂, then we write

${\rm S} + {\rm O}_2 \xrightarrow{\Delta H^0_{\rm f}[{\rm SO}_2]} {\rm SO}_2$ .  (1.415)
Formula | Name | State | ΔH⁰_f (kJ/mol) || Formula | Name | State | ΔH⁰_f (kJ/mol)
Ag | Silver | crystal | 0.0 || NiSO₄ | Nickel sulfate | crystal | -872.9
Al₂O₃ | Aluminum oxide | crystal | -1657.7 || O₃ | Ozone | gas | 142.7
H₃BO₃ | Boric acid | crystal | -1094.3 || ZnSO₄ | Zinc sulfate | crystal | -982.8
CaCl₂ | Calcium chloride | crystal | -795.4 || SF₆ | Sulfur hexafluoride | gas | -1220.5
CaF₂ | Calcium fluoride | crystal | -1228.0 || Ca₃P₂O₈ | Calcium phosphate | gas | -4120.8
H₂O | Water | liquid | -285.8 || C | Graphite | crystal | 0.0
HCN | Hydrogen cyanide | liquid | 108.9 || C | Diamond | crystal | 1.9

Table 1.4: Enthalpies of formation of some common substances.
Figure 1.27: Left panel: reaction enthalpy and activation energy (exothermic case shown). Right panel: reaction enthalpy as a difference between enthalpy of formation of reactants and products.
The enthalpy of formation of any substance in its standard state is zero at all temperatures, by definition: $\Delta H^0_{\rm f}[{\rm O}_2] = \Delta H^0_{\rm f}[{\rm He}] = \Delta H^0_{\rm f}[{\rm K}] = \Delta H^0_{\rm f}[{\rm Mn}] = 0$, etc.
Suppose now we have a reaction

$a\,{\rm A} + b\,{\rm B} \xrightarrow{\Delta H} c\,{\rm C} + d\,{\rm D}$ .  (1.416)

To compute the reaction enthalpy $\Delta H$, we can imagine forming the components A and B from their standard state constituents. Similarly, we can imagine doing the same for C and D. Since the number of atoms of a given kind is conserved in the process, the constituents of the reactants must be the same as those of the products, and we have

$\Delta H = -a\,\Delta H^0_{\rm f}({\rm A}) - b\,\Delta H^0_{\rm f}({\rm B}) + c\,\Delta H^0_{\rm f}({\rm C}) + d\,\Delta H^0_{\rm f}({\rm D})$ .  (1.417)

A list of a few enthalpies of formation is provided in table 1.4. Note that the reaction enthalpy is independent of the actual reaction path. That is, the difference in enthalpy between A and B is the same whether the reaction is ${\rm A} \longrightarrow {\rm B}$ or ${\rm A} \longrightarrow {\rm X} \longrightarrow ({\rm Y} + {\rm Z}) \longrightarrow {\rm B}$. This statement is known as Hess's Law.
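The sum over formation enthalpies weighted by stoichiometric signs is mechanical enough to encode directly; a minimal sketch, with values taken from table 1.4:

```python
# Reaction enthalpy from standard enthalpies of formation: products enter
# with positive stoichiometric coefficients, reactants with negative ones.

def reaction_enthalpy(stoich, h_f):
    """stoich: {species: zeta_a}; h_f: {species: DeltaH_f0 in kJ/mol}."""
    return sum(zeta * h_f[sp] for sp, zeta in stoich.items())

# formation enthalpies from table 1.4 (kJ/mol)
h_f = {"C(graphite)": 0.0, "C(diamond)": 1.9, "O2": 0.0, "O3": 142.7}

# graphite -> diamond is (very slightly) endothermic
dH = reaction_enthalpy({"C(graphite)": -1, "C(diamond)": +1}, h_f)
print(f"Delta H = {dH:.1f} kJ/mol")   # 1.9

# 2 O3 -> 3 O2 is exothermic
dH_ozone = reaction_enthalpy({"O3": -2, "O2": +3}, h_f)
print(f"Delta H = {dH_ozone:.1f} kJ/mol")   # -285.4
```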
Note that

$dH = dE + p\,dV + V\,dp = đQ + V\,dp$ ,  (1.418)

hence

$C_p = \Big(\frac{đQ}{dT}\Big)_p = \Big(\frac{\partial H}{\partial T}\Big)_p$ .  (1.419)
We therefore have

$H(T, p, \nu) = H(T_0, p, \nu) + \nu \int_{T_0}^{T}\! dT'\, c_p(T')$ .  (1.420)
bond | enthalpy (kJ/mol) || bond | enthalpy (kJ/mol) || bond | enthalpy (kJ/mol) || bond | enthalpy (kJ/mol)
H-H | 436 || C-C | 348 || C-S | 259 || F-F | 155
H-C | 412 || C=C | 612 || N-N | 163 || F-Cl | 254
H-N | 388 || C≡C | 811 || N=N | 409 || Cl-Br | 219
H-O | 463 || C-N | 305 || N≡N | 945 || Cl-I | 210
H-F | 565 || C=N | 613 || N-O | 157 || Cl-S | 250
H-Cl | 431 || C≡N | 890 || N-F | 270 || Br-Br | 193
H-Br | 366 || C-O | 360 || N-Cl | 200 || Br-I | 178
H-I | 299 || C=O | 743 || N-Si | 374 || Br-S | 212
H-S | 338 || C-F | 484 || O-O | 146 || I-I | 151
H-P | 322 || C-Cl | 338 || O=O | 497 || S-S | 264
H-Si | 318 || C-Br | 276 || O-F | 185 || P-P | 172
 | || C-I | 238 || O-Cl | 203 || Si-Si | 176

Table 1.5: Average bond enthalpies for some common bonds. (Source: L. Pauling, The Nature of the Chemical Bond (Cornell Univ. Press, NY, 1960).)
For ideal gases, we have $c_p(T) = \big(1 + \tfrac{1}{2}f\big)R$. For real gases, over a range of temperatures, there are small variations:

$c_p(T) = \alpha + \beta\,T + \gamma\,T^2$ .  (1.421)

Two examples ($300\,{\rm K} < T < 1500\,{\rm K}$, $p = 1\,$atm):

O₂ : $\alpha = 25.503\,\frac{\rm J}{\rm mol\,K}$ , $\beta = 13.612\times 10^{-3}\,\frac{\rm J}{\rm mol\,K^2}$ , $\gamma = -42.553\times 10^{-7}\,\frac{\rm J}{\rm mol\,K^3}$

H₂O : $\alpha = 30.206\,\frac{\rm J}{\rm mol\,K}$ , $\beta = 9.936\times 10^{-3}\,\frac{\rm J}{\rm mol\,K^2}$ , $\gamma = 11.14\times 10^{-7}\,\frac{\rm J}{\rm mol\,K^3}$
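The quadratic fit integrates trivially to give the molar enthalpy change between two temperatures. A sketch using the O₂ coefficients quoted above (the sign of γ is taken as negative for O₂, an assumption since the extraction lost the sign):

```python
# Heat capacity fit c_p(T) = alpha + beta*T + gamma*T^2 and the molar
# enthalpy change Delta h = integral_{T0}^{T1} c_p(T) dT.
# Coefficients for O2; gamma < 0 assumed (sign lost in the source scan).
alpha, beta, gamma = 25.503, 13.612e-3, -42.553e-7   # J/mol K^{1,2,3}

def c_p(T):
    return alpha + beta * T + gamma * T**2

def delta_h(T0, T1):
    # antiderivative of c_p: alpha*T + beta*T^2/2 + gamma*T^3/3
    F = lambda T: alpha * T + beta * T**2 / 2 + gamma * T**3 / 3
    return F(T1) - F(T0)

print(f"c_p(300 K) = {c_p(300):.2f} J/mol K")
print(f"Delta h (300 -> 600 K) = {delta_h(300, 600):.0f} J/mol")
```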
If all the gaseous components in a reaction can be approximated as ideal, then we may write

$(\Delta H)_{\rm rxn} = (\Delta E)_{\rm rxn} + \Big(\sum_a \zeta_a\Big) RT$ ,  (1.422)

where the subscript 'rxn' stands for 'reaction'. Here $(\Delta E)_{\rm rxn}$ is the change in energy from reactants to products.
1.12.3 Bond enthalpies

The enthalpy needed to break a chemical bond is called the bond enthalpy, $h[\,\cdot\,]$. The bond enthalpy is the energy required to dissociate one mole of gaseous bonds to form gaseous atoms. A table of bond enthalpies is given in tab. 1.5. Bond enthalpies are endothermic, since energy is required to break chemical bonds. Of course, the actual bond energies can depend on the location of a bond in a given molecule, and the values listed in the table reflect averages over the possible bond environment.
Figure 1.28: Calculation of reaction enthalpy for the hydrogenation of ethene (ethylene), C₂H₄.
The bond enthalpies in tab. 1.5 may be used to compute reaction enthalpies. Consider, for example, the reaction $2\,{\rm H}_2({\rm g}) + {\rm O}_2({\rm g}) \longrightarrow 2\,{\rm H}_2{\rm O}({\rm l})$. We then have, from the table,

$(\Delta H)_{\rm rxn} = 2\,h[{\rm H{-}H}] + h[{\rm O{=}O}] - 4\,h[{\rm H{-}O}] = -483\,{\rm kJ/mol~O}_2$ .  (1.423)
Thus, 483 kJ of heat would be released for every two moles of H₂O produced, if the H₂O were in the gaseous phase. Since H₂O is liquid at STP, we should also include the condensation energy of the gaseous water vapor into liquid water. At $T = 100\,$°C the latent heat of vaporization is $\tilde\ell = 2270\,$J/g, but at $T = 20\,$°C, one has $\tilde\ell = 2450\,$J/g, hence with $M = 18$ we have $\ell = 44.1\,$kJ/mol. Therefore, the heat produced by the reaction $2\,{\rm H}_2({\rm g}) + {\rm O}_2({\rm g}) \longrightarrow 2\,{\rm H}_2{\rm O}({\rm l})$ is $(\Delta H)_{\rm rxn} = -571.2\,$kJ/mol O₂. Since the reaction produces two moles of water, we conclude that the enthalpy of formation of liquid water at STP is half this value: $\Delta H^0_{\rm f}[{\rm H}_2{\rm O}] = -285.6\,$kJ/mol.
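The bookkeeping above (bonds broken enter with $+$, bonds formed with $-$, plus the condensation correction) can be sketched as:

```python
# Reaction enthalpy for 2 H2(g) + O2(g) -> 2 H2O from average bond
# enthalpies (table 1.5), then corrected for condensation to liquid water.
bond = {"H-H": 436, "O=O": 497, "H-O": 463}   # kJ/mol

# gaseous product: break 2 H-H and 1 O=O, form 4 H-O bonds
dH_gas = 2 * bond["H-H"] + bond["O=O"] - 4 * bond["H-O"]
print(f"(Delta H)_rxn = {dH_gas} kJ/mol O2")          # -483

# condensing two moles of water vapor releases a further 2 * 44.1 kJ
dH_liq = dH_gas - 2 * 44.1
print(f"with condensation: {dH_liq:.1f} kJ/mol O2")   # -571.2
```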
Consider next the hydrogenation of ethene (ethylene): ${\rm C}_2{\rm H}_4 + {\rm H}_2 \longrightarrow {\rm C}_2{\rm H}_6$. The product is known as ethane. The energy accounting is shown in fig. 1.28. To compute the enthalpies of formation of ethene and ethane from the bond enthalpies, we need one more bit of information, which is the standard enthalpy of formation of C(g) from C(s), since the solid is the standard state at STP. This value is $\Delta H^0_{\rm f}[{\rm C(g)}] = 718\,$kJ/mol. We may now write

$2\,{\rm C(g)} + 4\,{\rm H(g)} \xrightarrow{-2260\,{\rm kJ}} {\rm C}_2{\rm H}_4({\rm g})$
$2\,{\rm C(s)} \xrightarrow{1436\,{\rm kJ}} 2\,{\rm C(g)}$
$2\,{\rm H}_2({\rm g}) \xrightarrow{872\,{\rm kJ}} 4\,{\rm H(g)}$ .
Figure 1.29: Typical thermodynamic phase diagram of a single component p-V-T system, showing triple point (three phase coexistence) and critical point. (Source: Univ. of Helsinki)
Thus, using Hess's law, i.e. adding up these reaction equations, we have

$2\,{\rm C(s)} + 2\,{\rm H}_2({\rm g}) \xrightarrow{48\,{\rm kJ}} {\rm C}_2{\rm H}_4({\rm g})$ .
Thus, the formation of ethene is endothermic. For ethane,

$2\,{\rm C(g)} + 6\,{\rm H(g)} \xrightarrow{-2820\,{\rm kJ}} {\rm C}_2{\rm H}_6({\rm g})$
$2\,{\rm C(s)} \xrightarrow{1436\,{\rm kJ}} 2\,{\rm C(g)}$
$3\,{\rm H}_2({\rm g}) \xrightarrow{1308\,{\rm kJ}} 6\,{\rm H(g)}$ ,

hence

$2\,{\rm C(s)} + 3\,{\rm H}_2({\rm g}) \xrightarrow{-76\,{\rm kJ}} {\rm C}_2{\rm H}_6({\rm g})$ ,

which is exothermic.
1.13 Phase Transitions and Phase Equilibria

A typical phase diagram of a p-V-T system is shown in fig. 1.29. The solid lines delineate boundaries between distinct thermodynamic phases. These lines are called coexistence curves. Along these curves, we can have coexistence of two phases, and the thermodynamic potentials are singular. The order of the singularity is often taken as a classification of the phase transition. I.e. if the thermodynamic potentials $E$, $F$, $G$, and $H$ have discontinuous or divergent $m^{\rm th}$ derivatives, the transition between the respective phases is said to be $m^{\rm th}$ order. Modern theories of phase transitions generally only recognize two possibilities: first order transitions, where the order parameter changes discontinuously through the transition, and second order transitions, where the order parameter vanishes
1.13. PHASE TRANSITIONS AND PHASE EQUILIBRIA 77
Figure 1.30: Phase diagrams for ³He (left) and ⁴He (right). What a difference a neutron makes! (Source: Britannica)
continuously at the boundary from ordered to disordered phases¹³. We'll discuss order parameters during Physics 140B.
For a more interesting phase diagram, see fig. 1.30, which displays the phase diagrams for ³He and ⁴He. The only difference between these two atoms is that the former has one fewer neutron: (2p + 1n + 2e) in ³He versus (2p + 2n + 2e) in ⁴He. As we shall learn when we study quantum statistics, this extra neutron makes all the difference, because ³He is a fermion while ⁴He is a boson.
1.13.1 p-v-T surfaces

The equation of state for a single component system may be written as

$f(p, v, T) = 0$ .  (1.424)

This may in principle be inverted to yield $p = p(v, T)$ or $v = v(T, p)$ or $T = T(p, v)$. The single constraint $f(p, v, T) = 0$ on the three state variables defines a surface in $\{p, v, T\}$ space. An example of such a surface is shown in fig. 1.31, for the ideal gas.
Real p-v-T surfaces are much richer than that for the ideal gas, because real systems undergo phase transitions in which thermodynamic properties are singular or discontinuous along certain curves on the p-v-T surface. An example is shown in fig. 1.32. The high temperature isotherms resemble those of the ideal gas, but as one cools below the critical temperature $T_{\rm c}$, the isotherms become singular. Precisely at $T = T_{\rm c}$, the isotherm $p = p(v, T_{\rm c})$ becomes perfectly horizontal at $v = v_{\rm c}$, which is the critical molar volume. This means that the isothermal compressibility, $\kappa_T = -\frac{1}{v}\big(\frac{\partial v}{\partial p}\big)_T$, diverges at $T = T_{\rm c}$. Below $T_{\rm c}$, the isotherms
¹³Some exotic phase transitions in quantum matter, which do not quite fit the usual classification schemes, have recently been proposed.
Figure 1.31: The surface $p(v, T) = RT/v$ corresponding to the ideal gas equation of state, and its projections onto the $(p, T)$, $(p, v)$, and $(T, v)$ planes.
have a flat portion, as shown in fig. 1.33, corresponding to a two-phase region where liquid and vapor coexist. In the $(p, T)$ plane, sketched for H₂O in fig. 1.4 and shown for CO₂ in fig. 1.34, this liquid-vapor phase coexistence occurs along a curve, called the vaporization (or boiling) curve. The density changes discontinuously across this curve; for H₂O, the liquid is approximately 1000 times denser than the vapor at atmospheric pressure. The density discontinuity vanishes at the critical point. Note that one can continuously transform between liquid and vapor phases, without encountering any phase transitions, by going around the critical point and avoiding the two-phase region.
In addition to liquid-vapor coexistence, solid-liquid and solid-vapor coexistence also occur, as shown in fig. 1.32. The triple point $(T_{\rm t}, p_{\rm t})$ lies at the confluence of these three coexistence regions. For H₂O, the locations of the triple point and critical point are given by

$T_{\rm t} = 273.16\,{\rm K}$ ,  $T_{\rm c} = 647\,{\rm K}$
$p_{\rm t} = 611.7\,{\rm Pa} = 6.037\times 10^{-3}\,{\rm atm}$ ,  $p_{\rm c} = 22.06\,{\rm MPa} = 217.7\,{\rm atm}$
1.13.2 The Clausius-Clapeyron relation

Recall that the homogeneity of $E(S, V, N)$ guaranteed $E = TS - pV + \mu N$, from Euler's theorem. It also guarantees a relation between the intensive variables $T$, $p$, and $\mu$, according
Figure 1.32: A p-v-T surface for a substance which contracts upon freezing. The red dot is the critical point and the red dashed line is the critical isotherm. The yellow dot is the triple point at which there is three phase coexistence of solid, liquid, and vapor.
to eqn. 1.156. Let us define $g \equiv G/\nu = N_{\rm A}\,\mu$, the Gibbs free energy per mole. Then

$dg = -s\,dT + v\,dp$ ,  (1.425)

where $s = S/\nu$ and $v = V/\nu$ are the molar entropy and molar volume, respectively. Along a coexistence curve between phase #1 and phase #2, we must have $g_1 = g_2$, since the phases are free to exchange energy and particle number, i.e. they are in thermal and chemical equilibrium. This means

$dg_1 = -s_1\,dT + v_1\,dp = -s_2\,dT + v_2\,dp = dg_2$ .  (1.426)
Therefore, along the coexistence curve we must have

$\Big(\frac{dp}{dT}\Big)_{\rm coex} = \frac{s_2 - s_1}{v_2 - v_1} = \frac{\ell}{T\,\Delta v}$ ,  (1.427)

where

$\ell \equiv T\,\Delta s = T\,(s_2 - s_1)$  (1.428)

is the molar latent heat of transition. A heat $\ell$ must be supplied in order to change from phase #1 to phase #2, even without changing $p$ or $T$. If $\ell$ is the latent heat per mole, then we write $\tilde\ell$ as the latent heat per gram: $\tilde\ell = \ell/M$, where $M$ is the molar mass.
Figure 1.33: Projection of the p-v-T surface of fig. 1.32 onto the (p, v) plane.
Along the liquid-gas coexistence curve, we typically have $v_{\rm gas} \gg v_{\rm liquid}$, and assuming the vapor is ideal, we may write $\Delta v \approx v_{\rm gas} \approx RT/p$. Thus,

$\Big(\frac{dp}{dT}\Big)_{\rm liq-gas} = \frac{\ell}{T\,\Delta v} \approx \frac{p\,\ell}{RT^2}$ .  (1.429)

If $\ell$ remains constant throughout a section of the liquid-gas coexistence curve, we may integrate the above equation to get

$\frac{dp}{p} = \frac{\ell}{R}\,\frac{dT}{T^2} \quad\Longrightarrow\quad p(T) = p(T_0)\,e^{\ell/RT_0}\,e^{-\ell/RT}$ .  (1.430)
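The integrated relation is handy for quick estimates of vapor pressure and boiling temperature; a minimal sketch using the latent heat of water quoted earlier (the "mountain at 0.7 atm" scenario is a hypothetical illustration):

```python
# Vapor pressure from the integrated Clausius-Clapeyron relation:
# p(T) = p(T0) * exp[(ell/R)(1/T0 - 1/T)], constant latent heat assumed.
import math

R = 8.314              # J / (mol K)
ell = 40.7e3           # latent heat of vaporization of water, J/mol
T0, p0 = 373.15, 1.0   # reference point: water boils at 1 atm

def p_vap(T):
    return p0 * math.exp((ell / R) * (1.0 / T0 - 1.0 / T))

# boiling temperature at reduced pressure, by inverting p(T) analytically
p = 0.7   # atm, e.g. at altitude
T_boil = 1.0 / (1.0 / T0 - (R / ell) * math.log(p / p0))
print(f"p(353 K) = {p_vap(353.0):.2f} atm; boils at {T_boil:.1f} K when p = 0.7 atm")
```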
1.13.3 Liquid-solid line in H₂O

Life on planet earth owes much of its existence to a peculiar property of water: the solid is less dense than the liquid along the coexistence curve. For example, at $T = 273.1\,$K and $p = 1\,$atm,

$v_{\rm water} = 1.00013\,{\rm cm}^3/{\rm g}~,\quad v_{\rm ice} = 1.0907\,{\rm cm}^3/{\rm g}$ .  (1.431)

The latent heat of the transition is $\tilde\ell = 333\,{\rm J/g} = 79.5\,{\rm cal/g}$. Thus,

$\Big(\frac{dp}{dT}\Big)_{\rm liq-sol} = \frac{\tilde\ell}{T\,\Delta \tilde v} = \frac{333\,{\rm J/g}}{(273.1\,{\rm K})\,(-9.05\times 10^{-2}\,{\rm cm}^3/{\rm g})} = -1.35\times 10^8\,\frac{\rm dyn}{{\rm cm}^2\,{\rm K}} = -134\,\frac{\rm atm}{°{\rm C}}$ .  (1.432)
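Keeping the CGS units of the text straight is the only subtlety here; a quick numerical check of eqn. 1.432 (using $1\,{\rm atm} \approx 1.013\times 10^6\,{\rm dyn/cm}^2$):

```python
# Slope of the H2O solid-liquid coexistence curve, dp/dT = ell~ / (T dv~),
# in CGS units as in the text.
v_water = 1.00013   # specific volume of liquid water, cm^3/g
v_ice   = 1.0907    # specific volume of ice, cm^3/g
ell     = 333.0e7   # latent heat of fusion, erg/g  (= 333 J/g)
T       = 273.1     # K

dpdT = ell / (T * (v_water - v_ice))   # dyn / (cm^2 K); negative for water
print(f"dp/dT = {dpdT:.3g} dyn/cm^2/K = {dpdT / 1.013e6:.0f} atm/K")
```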
Figure 1.34: Phase diagram for CO₂ in the (p, T) plane. (Source: www.scifun.org)
The negative slope of the melting curve is invoked to explain the movement of glaciers: as glaciers slide down a rocky slope, they generate enormous pressure at obstacles¹⁴. Due to this pressure, the melting temperature decreases, and the glacier melts around the obstacle, so it can flow past it, after which it refreezes. It is not the case that the bottom of the glacier melts under the pressure, for consider a glacier of height $h = 1\,$km. The pressure at the bottom is $p \approx gh/\tilde v \approx 10^7\,$Pa, which is only about 100 atmospheres. Such a pressure can produce only a small shift in the melting temperature of about $\Delta T_{\rm melt} = -0.75\,$°C.

Does the Clausius-Clapeyron relation explain how we can skate on ice? My seven year old daughter has a mass of about $M = 20\,$kg. Her ice skates have blades of width about 5 mm and length about 10 cm. Thus, even on one foot, she only imparts an additional pressure of

$\Delta p = \frac{Mg}{A} \approx \frac{20\,{\rm kg} \times 9.8\,{\rm m/s}^2}{(5\times 10^{-3}\,{\rm m})\,(10^{-1}\,{\rm m})} = 3.9\times 10^5\,{\rm Pa} = 3.9\,{\rm atm}$ .  (1.433)
The change in the melting temperature is thus minuscule: $\Delta T_{\rm melt} \approx -0.03\,$°C.

So why can my daughter skate so nicely? The answer isn't so clear!¹⁵ There seem to be two relevant issues in play. First, friction generates heat which can locally melt the surface of the ice. Second, the surface of ice, and of many solids, is naturally slippery. Indeed, this is the case for ice even if one is standing still, generating no frictional forces. Why is this so? It turns out that the Gibbs free energy of the ice-air interface is larger than the sum of free energies of ice-water and water-air interfaces. That is to say, ice, as well as many simple solids, prefers to have a thin layer of liquid on its surface, even at temperatures well below its
¹⁴The melting curve has a negative slope at relatively low pressures, where the solid has the so-called Ih hexagonal crystal structure. At pressures above about 2500 atmospheres, the crystal structure changes, and the slope of the melting curve becomes positive.
¹⁵For a recent discussion, see R. Rosenberg, Physics Today 58, 50 (2005).
Substance | Latent heat of fusion ℓ̃_f (J/g) | Melting point (°C) | Latent heat of vaporization ℓ̃_v (J/g) | Boiling point (°C)
C₂H₅OH | 108 | -114 | 855 | 78.3
NH₃ | 339 | -75 | 1369 | -33.34
CO₂ | 184 | -57 | 574 | -78
He | — | — | 21 | -268.93
H | 58 | -259 | 455 | -253
Pb | 24.5 | 372.3 | 871 | 1750
N₂ | 25.7 | -210 | 200 | -196
O₂ | 13.9 | -219 | 213 | -183
H₂O | 334 | 0 | 2270 | 100

Table 1.6: Latent heats of fusion and vaporization at p = 1 atm.
bulk melting point. If the intermolecular interactions are not short-ranged¹⁶, theory predicts a surface melt thickness $d \propto (T_{\rm m} - T)^{-1/3}$. In fig. 1.35 we show measurements by Gilpin (1980) of the surface melt on ice, down to about $-50\,$°C. Near 0°C the melt layer thickness is about 40 nm, but this decreases to about 1 nm at $T = -35\,$°C. At very low temperatures, skates stick rather than glide. Of course, the skate material is also important, since that will affect the energetics of the second interface. The 19th century novel Hans Brinker, or The Silver Skates by Mary Mapes Dodge tells the story of the poor but stereotypically decent and hardworking Dutch boy Hans Brinker, who dreams of winning an upcoming ice skating race, along with the top prize: a pair of silver skates. All he has are some lousy wooden skates, which won't do him any good in the race. He has money saved to buy steel skates, but of course his father desperately needs an operation because (I am not making this up) he fell off a dike and lost his mind. The family has no other way to pay for the doctor. What a story! At this point, I imagine the suspense must be too much for you to bear, but this isn't an American Literature class, so you can use Google to find out what happens (or rent the 1958 movie, directed by Sidney Lumet). My point here is that Hans' crappy wooden skates can't compare to the metal ones, even though the surface melt between the ice and the air is the same. The skate blade material also makes a difference, both for the interface energy and, perhaps more importantly, for the generation of friction as well.
1.13.4 Slow melting of ice: a quasistatic but irreversible process

Suppose we have an ice cube initially at temperature $T_0 < \Theta \equiv 0\,$°C and we toss it into a pond of water. We regard the pond as a heat bath at some temperature $T_1 > 0\,$°C. Let the mass of the ice be $M$. How much heat $Q$ is absorbed by the ice in order to raise its
¹⁶For example, they could be of the van der Waals form, due to virtual dipole fluctuations, with an attractive $-1/r^6$ tail.
Figure 1.35: Left panel: data from R. R. Gilpin, J. Colloid Interface Sci. 77, 435 (1980) showing measured thickness of the surface melt on ice at temperatures below 0°C. The straight line has slope $-\frac{1}{3}$, as predicted by theory. Right panel: phase diagram of H₂O, showing various high pressure solid phases. (Source: Physics Today, December 2005)
temperature to $T_1$? Clearly

$Q = M c_{\rm S}\,(\Theta - T_0) + M \tilde\ell + M c_{\rm L}\,(T_1 - \Theta)$ ,  (1.434)
where $c_{\rm S}$ and $c_{\rm L}$ are the specific heats of ice (solid) and water (liquid), respectively¹⁷, and $\tilde\ell$ is the latent heat of melting per unit mass. The pond must give up this much heat to the ice, hence the entropy of the pond, discounting the new water which will come from the melted ice, must decrease:

$\Delta S_{\rm pond} = -\frac{Q}{T_1}$ .  (1.435)
Now we ask what is the entropy change of the H₂O in the ice. We have

$\Delta S_{\rm ice} = \int \frac{đQ}{T} = \int_{T_0}^{\Theta}\! dT\,\frac{M c_{\rm S}}{T} + \frac{M \tilde\ell}{\Theta} + \int_{\Theta}^{T_1}\! dT\,\frac{M c_{\rm L}}{T} = M c_{\rm S} \ln\!\Big(\frac{\Theta}{T_0}\Big) + \frac{M \tilde\ell}{\Theta} + M c_{\rm L} \ln\!\Big(\frac{T_1}{\Theta}\Big)$ .  (1.436)
¹⁷We assume $c_{\rm S}(T)$ and $c_{\rm L}(T)$ have no appreciable temperature dependence, and we regard them both as constants.
The total entropy change of the system is then

$\Delta S_{\rm total} = \Delta S_{\rm pond} + \Delta S_{\rm ice}$  (1.437)
$= M c_{\rm S} \ln\!\Big(\frac{\Theta}{T_0}\Big) - M c_{\rm S}\,\frac{\Theta - T_0}{T_1} + M \tilde\ell\,\Big(\frac{1}{\Theta} - \frac{1}{T_1}\Big) + M c_{\rm L} \ln\!\Big(\frac{T_1}{\Theta}\Big) - M c_{\rm L}\,\frac{T_1 - \Theta}{T_1}$  (1.438)
Now since $T_0 < \Theta < T_1$, we have

$M c_{\rm S}\,\Big(\frac{\Theta - T_0}{T_1}\Big) < M c_{\rm S}\,\Big(\frac{\Theta - T_0}{\Theta}\Big)$ .  (1.439)
Therefore,

$\Delta S > M \tilde\ell\,\Big(\frac{1}{\Theta} - \frac{1}{T_1}\Big) + M c_{\rm S}\, f\big(T_0/\Theta\big) + M c_{\rm L}\, f\big(\Theta/T_1\big)$ ,  (1.440)

where

$f(x) = x - 1 - \ln x$ .  (1.441)
Clearly $f'(x) = 1 - x^{-1}$ is negative on the interval $(0, 1)$, which means that the maximum of $f(x)$ occurs at $x = 0$ and the minimum at $x = 1$. But $f(0) = \infty$ and $f(1) = 0$, which means that $f(x) \geq 0$ for $x \in [0, 1]$. Therefore, we conclude

$\Delta S_{\rm total} > 0$ .  (1.442)
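The inequality is easy to confirm numerically. A sketch, assuming illustrative values for the specific heats of ice and water and a hypothetical cube mass (the text leaves $c_{\rm S}$, $c_{\rm L}$, and $M$ general):

```python
# Numerical check that Delta S_total > 0 for the slow melting of ice.
# c_S, c_L, and M below are assumed illustrative values, not from the text.
import math

c_S, c_L = 2.1, 4.18      # specific heats of ice and water, J/(g K) (assumed)
ell = 333.0               # latent heat of melting, J/g
M = 10.0                  # mass of the ice cube, g (hypothetical)
Theta = 273.15            # K, melting point
T0, T1 = 263.15, 293.15   # initial ice temperature, pond temperature

Q = M * c_S * (Theta - T0) + M * ell + M * c_L * (T1 - Theta)
dS_pond = -Q / T1
dS_ice = (M * c_S * math.log(Theta / T0) + M * ell / Theta
          + M * c_L * math.log(T1 / Theta))
dS_total = dS_pond + dS_ice
print(f"dS_pond = {dS_pond:.2f} J/K, dS_ice = {dS_ice:.2f} J/K, "
      f"total = {dS_total:.2f} J/K")
```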
1.13.5 Gibbs phase rule

Equilibrium between two phases means that $p$, $T$, and $\mu(p, T)$ are identical. From

$\mu_1(p, T) = \mu_2(p, T)$ ,  (1.443)

we derive an equation for the slope of the coexistence curve, the Clausius-Clapeyron relation. Note that we have one equation in two unknowns $(T, p)$, so the solution set is a curve. For three phase coexistence, we have

$\mu_1(p, T) = \mu_2(p, T) = \mu_3(p, T)$ ,  (1.444)

which gives us two equations in two unknowns. The solution is then a point (or a set of points). A critical point also is a solution of two simultaneous equations:

critical point $\Longrightarrow$ $v_1(p, T) = v_2(p, T)~,\quad \mu_1(p, T) = \mu_2(p, T)$ .  (1.445)

Recall $v = N_{\rm A}\,\big(\frac{\partial \mu}{\partial p}\big)_T$. Note that there can be no four phase coexistence for a simple p-V-T system.
Now for the general result. Suppose we have $\sigma$ species, with particle numbers $N_a$, where $a = 1, \ldots, \sigma$. It is useful to briefly recapitulate the derivation of the Gibbs-Duhem relation. The energy $E(S, V, N_1, \ldots, N_\sigma)$ is a homogeneous function of degree one:

$E(\lambda S, \lambda V, \lambda N_1, \ldots, \lambda N_\sigma) = \lambda\,E(S, V, N_1, \ldots, N_\sigma)$ .  (1.446)
From Euler's theorem for homogeneous functions (just differentiate with respect to $\lambda$ and then set $\lambda = 1$), we have

$E = TS - pV + \sum_{a=1}^{\sigma} \mu_a\,N_a$ .  (1.447)

Taking the differential, and invoking the First Law,

$dE = T\,dS - p\,dV + \sum_{a=1}^{\sigma} \mu_a\,dN_a$ ,  (1.448)

we arrive at the relation

$S\,dT - V\,dp + \sum_{a=1}^{\sigma} N_a\,d\mu_a = 0$ ,  (1.449)

of which eqn. 1.155 is a generalization to additional internal 'work' variables. This says that the $\sigma + 2$ quantities $(T, p, \mu_1, \ldots, \mu_\sigma)$ are not all independent. We can therefore write

$\mu_\sigma = \mu_\sigma\big(T, p, \mu_1, \ldots, \mu_{\sigma-1}\big)$ .  (1.450)
If there are $\varphi$ different phases, then in each phase $j$, with $j = 1, \ldots, \varphi$, there is a chemical potential $\mu_a^{(j)}$ for each species $a$. We then have

$\mu_\sigma^{(j)} = \mu_\sigma^{(j)}\big(T, p, \mu_1^{(j)}, \ldots, \mu_{\sigma-1}^{(j)}\big)$ .  (1.451)

Here $\mu_a^{(j)}$ is the chemical potential of the $a^{\rm th}$ species in the $j^{\rm th}$ phase. Thus, there are $\varphi$ such equations relating the $2 + \varphi\,\sigma$ variables $\big(T, p, \{\mu_a^{(j)}\}\big)$, meaning that only $2 + \varphi\,(\sigma - 1)$ of them may be chosen as independent. This, then, is the dimension of 'thermodynamic space' containing a maximal number of intensive variables:

$d_{\rm TD}(\sigma, \varphi) = 2 + \varphi\,(\sigma - 1)$ .  (1.452)
To completely specify the state of our system, we of course introduce a single extensive variable, such as the total volume $V$. Note that the total particle number $N = \sum_{a=1}^{\sigma} N_a$ may not be conserved in the presence of chemical reactions!
Now suppose we have equilibrium among $\varphi$ phases. We have implicitly assumed thermal and mechanical equilibrium among all the phases, meaning that $p$ and $T$ are constant. Chemical equilibrium applies on a species-by-species basis. This means

$\mu_a^{(j)} = \mu_a^{(j')}$  (1.453)

where $j, j' \in \{1, \ldots, \varphi\}$. This gives $\sigma\,(\varphi - 1)$ independent equations¹⁸. Thus, we can have phase equilibrium among the $\varphi$ phases of $\sigma$ species over a region of dimension

$d_{\rm PE}(\sigma, \varphi) = 2 + \varphi\,(\sigma - 1) - \sigma\,(\varphi - 1) = 2 + \sigma - \varphi$ .  (1.454)

¹⁸Set $j = 1$ and let $j'$ range over the $\varphi - 1$ values $2, \ldots, \varphi$.
Figure 1.36: Equation of state for a substance which expands upon freezing, projected to the (T, v) and (p, v) and (p, T) planes.
Since $d_{\rm PE} \geq 0$, we must have $\varphi \leq \sigma + 2$. Thus, with two species ($\sigma = 2$), we could have at most four phase coexistence.
If the various species can undergo $\rho$ distinct chemical reactions of the form

$\zeta_1^{(r)} A_1 + \zeta_2^{(r)} A_2 + \cdots + \zeta_\sigma^{(r)} A_\sigma = 0$ ,  (1.455)

where $A_a$ is the chemical formula for species $a$, and $\zeta_a^{(r)}$ is the stoichiometric coefficient for the $a^{\rm th}$ species in the $r^{\rm th}$ reaction, with $r = 1, \ldots, \rho$, then we have an additional $\rho$ constraints of the form

$\sum_{a=1}^{\sigma} \zeta_a^{(r)}\,\mu_a^{(j)} = 0$ .  (1.456)

Therefore,

$d_{\rm PE}(\sigma, \varphi, \rho) = 2 + \sigma - \varphi - \rho$ .  (1.457)
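The counting is simple enough to encode as a sanity check; a minimal sketch of the phase rule:

```python
# Gibbs phase rule: dimension of the region of phase equilibrium for
# sigma species, phi coexisting phases, and rho independent reactions.
def d_PE(sigma, phi, rho=0):
    return 2 + sigma - phi - rho

# single-component (sigma = 1) system, no reactions:
assert d_PE(1, 2) == 1    # two-phase coexistence: a curve in the (T, p) plane
assert d_PE(1, 3) == 0    # triple point: an isolated point
assert d_PE(1, 4) < 0     # four-phase coexistence is impossible
print("phase rule checks pass")
```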
One might ask what value of $j$ we are to use in eqn. 1.456, or whether we in fact have $\varphi$ such equations for each $r$. The answer is that eqn. 1.453 guarantees that the chemical potential of species $a$ is the same in all the phases, hence it doesn't matter what value one chooses for $j$ in eqn. 1.456.
Let us assume that no reactions take place, i.e. $\rho = 0$, so the total number of particles $\sum_{b=1}^{\sigma} N_b$ is conserved. Instead of choosing $\big(T, p, \mu_1^{(j)}, \ldots, \mu_{\sigma-1}^{(j)}\big)$ as $d_{\rm TD}$ intensive variables, we could have chosen $\big(T, p, \mu_1^{(j)}, \ldots, x_{\sigma-1}^{(j)}\big)$, where $x_a = N_a/N$ is the concentration of species $a$.
Why do phase diagrams in the $(p, v)$ and $(T, v)$ planes look different from those in the $(p, T)$ plane?¹⁹ For example, fig. 1.36 shows projections of the p-v-T surface of a typical single component substance onto the $(T, v)$, $(p, v)$, and $(p, T)$ planes. Coexistence takes place along curves in the $(p, T)$ plane, but in extended two-dimensional regions in the $(T, v)$ and $(p, v)$ planes. The reason that $p$ and $T$ are special is that temperature, pressure, and chemical potential must be equal throughout an equilibrium phase if it is truly in thermal, mechanical, and chemical equilibrium. This is not the case for an intensive variable such as specific volume $v = N_{\rm A} V/N$ or chemical concentration $x_a = N_a/N$.
1.13.6 Binary solutions

Consider a binary solution, and write the Gibbs free energy $G(T, p, N_{\rm A}, N_{\rm B})$ as

$G(T, p, N_{\rm A}, N_{\rm B}) = N_{\rm A}\,\mu^0_{\rm A}(T, p) + N_{\rm B}\,\mu^0_{\rm B}(T, p) + N_{\rm A} k_{\rm B} T \ln\!\Big(\frac{N_{\rm A}}{N_{\rm A} + N_{\rm B}}\Big) + N_{\rm B} k_{\rm B} T \ln\!\Big(\frac{N_{\rm B}}{N_{\rm A} + N_{\rm B}}\Big) + \lambda\,\frac{N_{\rm A} N_{\rm B}}{N_{\rm A} + N_{\rm B}}$ .  (1.458)
The first four terms on the RHS represent the free energy of the individual component fluids and the entropy of mixing. The last term is an interaction contribution. With $\lambda > 0$, the interaction term prefers that the system be either fully A or fully B. The entropy contribution prefers a mixture, so there is a competition. What is the stable thermodynamic state?

It is useful to write the Gibbs free energy per particle, $g(T, p, x) = G/(N_{\rm A} + N_{\rm B})$, in terms of $T$, $p$, and the concentration $x \equiv x_{\rm B} = N_{\rm B}/(N_{\rm A} + N_{\rm B})$ of species B²⁰. Then
$g(T, p, x) = (1 - x)\,\mu^0_{\rm A} + x\,\mu^0_{\rm B} + k_{\rm B} T\,\big[x \ln x + (1 - x)\ln(1 - x)\big] + \lambda\,x(1 - x)$ .  (1.459)
In order for the system to be stable against phase separation into relatively A-rich and B-rich regions, we must have that $g(T, p, x)$ be a convex function of $x$. Our first check should be for a local instability, i.e. spinodal decomposition. We have

$\frac{\partial g}{\partial x} = \mu^0_{\rm B} - \mu^0_{\rm A} + k_{\rm B} T \ln\!\Big(\frac{x}{1 - x}\Big) + \lambda\,(1 - 2x)$  (1.460)

and

$\frac{\partial^2 g}{\partial x^2} = \frac{k_{\rm B} T}{x} + \frac{k_{\rm B} T}{1 - x} - 2\lambda$ .  (1.461)
The spinodal is given by the solution to the equation $\frac{\partial^2 g}{\partial x^2} = 0$, which is

$T^*(x) = \frac{2\lambda}{k_{\rm B}}\,x(1 - x)$ .  (1.462)

Since $x(1 - x)$ achieves its maximum value of $\frac{1}{4}$ at $x = \frac{1}{2}$, we have $T^* \leq \lambda/2k_{\rm B}$.
¹⁹The same can be said for multicomponent systems: the phase diagram in the $(T, x)$ plane at constant $p$ looks different from the phase diagram in the $(T, \mu)$ plane at constant $p$.
²⁰Note that $x_{\rm A} = 1 - x$ is the concentration of species A.
Figure 1.37: Gibbs free energy per particle for a binary solution as a function of concentration $x = x_{\rm B}$ of the B species (pure A at the left end $x = 0$; pure B at the right end $x = 1$), in units of the interaction parameter $\lambda$. Dark red curve: $T = 0.65\,\lambda/k_{\rm B} > T_{\rm c}$; green curve: $T = \lambda/2k_{\rm B} = T_{\rm c}$; blue curve: $T = 0.40\,\lambda/k_{\rm B} < T_{\rm c}$. We have chosen $\mu^0_{\rm A} = 0.60\,\lambda - 0.50\,k_{\rm B}T$ and $\mu^0_{\rm B} = 0.50\,\lambda - 0.50\,k_{\rm B}T$. Note that the free energy $g(T, p, x)$ is not convex in $x$ for $T < T_{\rm c}$, indicating an instability and necessitating a Maxwell construction.
In fig. 1.37 we sketch the free energy $g(T, p, x)$ versus $x$ for three representative temperatures. For $T > \lambda/2k_{\rm B}$, the free energy is everywhere convex in $x$. When $T < \lambda/2k_{\rm B}$, the free energy resembles the blue curve in fig. 1.37, and the system is unstable to phase separation. The two phases are said to be immiscible, or, equivalently, there exists a solubility gap. To determine the coexistence curve, we perform a Maxwell construction, writing

$\frac{g(x_2) - g(x_1)}{x_2 - x_1} = \frac{\partial g}{\partial x}\bigg|_{x_1} = \frac{\partial g}{\partial x}\bigg|_{x_2}$ .  (1.463)
Here, x
1
and x
2
are the boundaries of the two phase region. These equations admit a
symmetry of x 1 x, hence we can set x = x
1
and x
2
= 1 x. We nd
g(1 x) g(x) = (1 2x)
_

0
B

0
A
_
, (1.464)
and invoking eqns. 1.463 and 1.460 we obtain the solution
T
coex
(x) =

k
B

1 2x
ln
_
1x
x
_ . (1.465)
The phase diagram for the binary system is shown in fig. 1.38. For T < T*(x), the system is unstable, and spinodal decomposition occurs. For T*(x) < T < T_coex(x), the system is metastable, just like the van der Waals gas in its corresponding regime. Real binary solutions behave qualitatively like the model discussed here, although the coexistence curve is generally not symmetric under x ↔ 1 - x, and the single phase region extends down to low temperatures near x = 0 and x = 1.
1.13. PHASE TRANSITIONS AND PHASE EQUILIBRIA 89
Figure 1.38: Phase diagram for the binary system. The black curve is the coexistence curve, and the dark red curve is the spinodal. A-rich material is to the left and B-rich to the right.
It is instructive to consider the phase diagram in the (T, μ) plane. We define the chemical potential shifts,

Δμ_A ≡ μ_A - μ^0_A = k_B T ln(1 - x) + λ x²   (1.466)
Δμ_B ≡ μ_B - μ^0_B = k_B T ln x + λ (1 - x)² ,   (1.467)

and their sum and difference, Δμ_± ≡ Δμ_A ± Δμ_B .   (1.468)
From the Gibbs-Duhem relation, we know that we can write μ_B as a function of T, p, and μ_A. Alternately, we could write Δμ_± in terms of T, p, and Δμ_∓, so we can choose which among Δμ_+ and Δμ_- we wish to use in our phase diagram. The results are plotted in fig. 1.39. It is perhaps easiest to understand the phase diagram in the (T, Δμ_-) plane. At low temperatures, below T = T_c = λ/2k_B, there is a first order phase transition at Δμ_- = 0. For T < T_c and Δμ_- = 0⁺, i.e. infinitesimally positive, the system is in the A-rich phase, but for Δμ_- = 0⁻, i.e. infinitesimally negative, it is B-rich.
Figure 1.39: Upper panels: chemical potential shifts Δμ_± = Δμ_A ± Δμ_B versus concentration x = x_B. The dashed line is the spinodal, and the dot-dashed line the coexistence boundary. Temperatures range from T = 0 (dark blue) to T = 0.6 λ/k_B (red) in units of 0.1 λ/k_B. Lower panels: phase diagram in the (T, Δμ_±) planes. The black dot is the critical point.
The concentration x = x_B changes discontinuously across the phase boundary. The critical point lies at (T, Δμ_-) = (λ/2k_B, 0).
What happens if λ < 0? In this case, both the entropy and the interaction energy prefer a mixed phase, and there is no instability to phase separation. The two fluids are said to be completely miscible. An example would be benzene, C₆H₆, and toluene, C₇H₈ (C₆H₅CH₃).
At higher temperatures, near the liquid-gas transition, however, we again have an instability toward phase separation. Let us write the Gibbs free energy per particle g(T, p, x) for the liquid and vapor phases of a miscible binary fluid:

g^L(T, p, x) = (1 - x) μ^L_A(T, p) + x μ^L_B(T, p) + k_B T [ x ln x + (1 - x) ln(1 - x) ] + λ^L_AB x(1 - x)   (1.469)
g^V(T, p, x) = (1 - x) μ^V_A(T, p) + x μ^V_B(T, p) + k_B T [ x ln x + (1 - x) ln(1 - x) ] + λ^V_AB x(1 - x) .   (1.470)

Figure 1.40: Gibbs free energy per particle g for a miscible binary solution for temperatures T ∈ (T*_A, T*_B). For temperatures in this range, the system is unstable toward phase separation, and a Maxwell construction is necessary.
We assume λ^L_AB < 0 and λ^V_AB ≈ 0. We also assume that the pure A fluid boils at T = T*_A(p) and the pure B fluid boils at T = T*_B(p), with T*_A < T*_B. Then we may write

μ^L_A(T, p) = μ^L_A(T*_A, p) - (T - T*_A) s^L_A + . . .   (1.471)
μ^V_A(T, p) = μ^V_A(T*_A, p) - (T - T*_A) s^V_A + . . .   (1.472)

for fluid A, and

μ^L_B(T, p) = μ^L_B(T*_B, p) - (T - T*_B) s^L_B + . . .   (1.473)
μ^V_B(T, p) = μ^V_B(T*_B, p) - (T - T*_B) s^V_B + . . .   (1.474)
Figure 1.41: Phase diagram for a mixture of two similar liquids in the vicinity of boiling,
showing a distillation sequence in (x, T) space.
for fluid B. The fact that A boils at T*_A and B at T*_B means that the respective liquid and vapor phases are in equilibrium at those temperatures:

μ^L_A(T*_A, p) = μ^V_A(T*_A, p)   (1.475)
μ^L_B(T*_B, p) = μ^V_B(T*_B, p) .   (1.476)

Note that we have used (∂μ/∂T)_{p,N} = -(∂S/∂N)_{T,p} ≡ -s(T, p). We assume s^V_A > s^L_A and s^V_B > s^L_B, i.e. the vapor has greater entropy than the liquid at any given temperature and pressure.
For the purposes of analysis, it is convenient to assume s^L_{A,B} ≈ 0. This leads to the energy curves in fig. 1.40. The dimensionless parameters used in obtaining this figure were:

μ^L_A(T*_A) = μ^V_A(T*_A) = -2.0 ,  T*_A = 3.0 ,  s^L_A = 0.0 ,  s^V_A = 0.7 ,  λ^L_AB = -1.0
μ^L_B(T*_B) = μ^V_B(T*_B) = -3.0 ,  T*_B = 6.0 ,  s^L_B = 0.0 ,  s^V_B = 0.4 ,  λ^V_AB = 0.0   (1.477)
The resulting phase diagram is depicted in fig. 1.41.
According to the Gibbs phase rule, with σ = 2 species, two-phase equilibrium (φ = 2) occurs along a subspace of dimension d_PE = 2 + σ - φ = 2. Thus, if we fix the pressure p and the concentration x = x_B, liquid-gas equilibrium occurs at a particular temperature T*, known as the boiling point. Since the liquid and the vapor with which it is in equilibrium at T* may have different composition, i.e. different values of x, one may distill the mixture to separate the two pure substances, as follows. First, given a liquid mixture of A and B, we bring it to boiling, as shown in fig. 1.41. The vapor is at a different concentration x than the liquid (a lower value of x if the boiling point of pure A is less than that of pure B, as shown in the figure). If we collect the vapor, the remaining fluid is at a higher value of x. The collected vapor is then captured and condensed, forming a liquid at the lower x value. This is then brought to a boil, and the resulting vapor is drawn off and condensed, etc. The result is a purified A state. The remaining liquid is then at a higher B concentration. By repeated boiling and condensation, A and B can be separated.
Figure 1.42: Phase diagrams for azeotropes.
For many liquid mixtures, the boiling point curve is as shown in fig. 1.42. Such cases are called azeotropes. In an azeotrope, the individual components A and B cannot be separated by distillation. Rather, the end product of the distillation process is either pure A plus azeotrope or pure B plus azeotrope, where the azeotrope is the mixture at the extremum of the boiling curve, where the composition of the liquid and the vapor with which it is in equilibrium are the same, and equal to x*.
1.13.7 The van der Waals system
We've already met the van der Waals equation of state,

( p + a/v² )( v - b ) = RT ,   (1.478)

and we found (see eqn. 1.290) that the molar internal energy is

E(T, v) = (1/2) f RT - a/v .   (1.479)
It is convenient to express p, v, and T in terms of p_c, v_c, and T_c from eqn. 1.347:

p_c = a/27b² ,  v_c = 3b ,  T_c = 8a/27bR .   (1.480)

We also can express energies in units of p_c v_c = a/9b = (3/8) RT_c, and entropy in units of p_c v_c/T_c = (3/8) R. Writing p ≡ p/p_c, v ≡ v/v_c, T ≡ T/T_c, and e ≡ E/p_c v_c, etc., for the dimensionless quantities, we have

(8/3) T = ( p + 3/v² )( v - 1/3 )   (1.481)
e = (4/3) f T - 3/v .   (1.482)
Taking the differentials of these equations, we find

(8/3) dT = ( p - 3/v² + 2/v³ ) dv + ( v - 1/3 ) dp   (1.483)
de = (4/3) f dT + (3/v²) dv   (1.484)
T ds = de + p dv = (4/3) f dT + ( p + 3/v² ) dv .   (1.485)

From these equations we may derive the various thermodynamic response functions.
For example, setting dp = 0 we obtain the thermal expansion coefficient

α_p = (1/v) (∂v/∂T)_p = (8/3v) ( p - 3/v² + 2/v³ )^{-1} .   (1.486)

Setting dT = 0, we find the isothermal compressibility,

κ_T = -(1/v) (∂v/∂p)_T = (1/v) ( v - 1/3 ) ( p - 3/v² + 2/v³ )^{-1} .   (1.487)

And of course we have

c_V = (∂e/∂T)_v = (4/3) f .   (1.488)
Setting ds = 0 we obtain the adiabatic relation

0 = (4/3) f dT + ( p + 3/v² ) dv .   (1.489)

From this we derive

c_p = T (∂s/∂T)_p = (4f/3) + ( p + 3/v² ) (∂v/∂T)_p
    = (4f/3) + (8/3) ( p + 3/v² ) ( p - 3/v² + 2/v³ )^{-1} .   (1.490)
Using eqn. 1.489, we then invoke eqn. 1.483 to obtain

0 = [ (1 + 2/f) p + (2/f - 1)(3/v²) + 2/v³ ] dv + ( v - 1/3 ) dp .   (1.491)

Writing γ ≡ 1 + 2/f, as for the ideal gas, we may then read off the adiabatic compressibility

κ_S = -(1/v) (∂v/∂p)_s = (1/v) ( v - 1/3 ) [ γp - (2 - γ)(3/v²) + 2/v³ ]^{-1} .   (1.492)
One can now verify the identities

κ_T / κ_S = c_p / c_V   (1.493)

and

c_p - c_V = T v α_p² / κ_T .   (1.494)

Note that to restore physical units, the dimensionless α_p, κ_{T,S}, and c_{p,V} above must be multiplied by 1/T_c, 1/p_c, and p_c v_c/T_c = (3/8) R, respectively.
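These identities are a useful consistency check on the algebra. The sketch below (my own code, not from the text) implements eqns. 1.486-1.492 in reduced units and verifies eqns. 1.493 and 1.494 at an arbitrary supercritical point:

```python
def vdw_response(T, v, f=3):
    """Dimensionless vdW response functions, eqns 1.486-1.492."""
    p = (8.0*T/3.0)/(v - 1.0/3.0) - 3.0/v**2
    D = p - 3.0/v**2 + 2.0/v**3            # common denominator
    alpha_p = (8.0/(3.0*v))/D              # eqn 1.486
    kappa_T = ((v - 1.0/3.0)/v)/D          # eqn 1.487
    cV = 4.0*f/3.0                         # eqn 1.488
    cp = cV + (8.0/3.0)*(p + 3.0/v**2)/D   # eqn 1.490
    gamma = 1.0 + 2.0/f
    kappa_S = ((v - 1.0/3.0)/v)/(gamma*p - (2.0 - gamma)*3.0/v**2 + 2.0/v**3)  # eqn 1.492
    return p, alpha_p, kappa_T, kappa_S, cV, cp

p, a, kT, kS, cV, cp = vdw_response(1.2, 1.5)
```

Both identities hold exactly for these closed forms, so the numerical agreement is at machine precision.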
The fact that the thermodynamics of different gases, i.e. with different van der Waals parameters a and b, can be expressed in terms of the same universal functions is known as the law of corresponding states.
Note that the isothermal compressibility κ_T diverges at p = p*(v), where

p*(v) = 3/v² - 2/v³ .   (1.495)
This divergence indicates a thermodynamic instability. To understand it better, let us compute the dimensionless molar free energy f(T, v). First, we compute the entropy

s(T, v) = ∫^T dT′ c_V / T′ = (4/3) f ln T + s₀(v) .   (1.496)

We then write f = e - Ts, and demanding that p = -(∂f/∂v)_T, we fix s₀(v) = (8/3) ln( v - 1/3 ). Thus,

f(T, v) = (4/3) f T ( 1 - ln T ) - 3/v - (8/3) T ln( v - 1/3 ) + f₀ ,   (1.497)

where f₀ is independent of T and v.
We know that under equilibrium conditions, f is driven to a minimum by spontaneous processes. Now suppose that (∂²f/∂v²)_T < 0 over some range of v at a given (dimensionless) temperature T. This would mean that one mole of the system at volume v and temperature T could lower its energy by rearranging into two half-moles, with respective volumes v ± δv, each at temperature T. The total volume and temperature thus remain fixed, but the free energy changes by an amount Δf = (1/2) (∂²f/∂v²)_T (δv)² < 0. This means that the system is
Figure 1.43: Molar free energy f(T, v) of the van der Waals system for three representative temperatures. For T < 1, the system is unstable with respect to phase separation. Upper panel: T = 0.85, with dot-dashed black line showing the Maxwell construction connecting molar volumes v_{1,2} on opposite sides of the coexistence curve. Lower panel: series of free energy curves with temperatures T = 1.4 (dark red), T = 1.0 (green), T = 0.80 (blue), T = 0.60 (pale blue), and T = 0.40 (black).
unstable: it can lower its energy by dividing up into two subsystems, each with different densities (i.e. molar volumes). Note that the onset of instability occurs when

(∂²f/∂v²)_T = -(∂p/∂v)_T = 1/( v κ_T ) = 0 ,   (1.498)

which is to say when κ_T = ∞. As we saw, this occurs at p = p*(v), given in eqn. 1.495.
However, this condition, (∂²f/∂v²)_T < 0, is in fact too strong. That is, the system can be unstable even at molar volumes where (∂²f/∂v²)_T > 0. The reason is shown graphically in fig. 1.43. At the fixed temperature T, for any molar volume v between v₁ and v₂, the system can lower its free energy by phase separating into regions of different molar volumes. In general we can write

v = (1 - x) v₁ + x v₂ ,   (1.499)
Figure 1.44: Isotherms for the van der Waals system. Black curves: p(T, v) for T = 1.05, T = 1.00 (dashed dark red curve), T = 0.96, T = 0.91 (thick black curve), and T = 0.86. The solid red curve marks the spinodal boundary p*(v), which is the locus of points along which κ_T = ∞. The coexistence curve is shown in blue. Inside the blue curve, the isotherms p(T, v) must be replaced by a Maxwell construction. The spinodal and coexistence curves coincide at the critical point, p = v = T = 1.
so v = v₁ when x = 0 and v = v₂ when x = 1. The free energy upon phase separation is simply

f = (1 - x) f₁ + x f₂ ,   (1.500)

where f_j = f(v_j, T). This function is given by the straight black line connecting the points at volumes v₁ and v₂ in fig. 1.43.
The two equations which give us v₁ and v₂ are

∂f/∂v |_{v₁,T} = ∂f/∂v |_{v₂,T}   (1.501)

and

f(T, v₂) - f(T, v₁) = ( v₂ - v₁ ) ∂f/∂v |_{v₁,T} .   (1.502)
In terms of the pressure, p = -(∂f/∂v)_T, these equations are equivalent to

p(T, v₁) = p(T, v₂)   (1.503)

∫_{v₁}^{v₂} dv p(T, v) = ( v₂ - v₁ ) p(T, v₁) .   (1.504)

This procedure is known as the Maxwell construction. The situation is depicted graphically in fig. 1.44. The red curve p = p*(v) is called the spinodal. Below this curve, the system is unstable to infinitesimal fluctuations in density, and it will spontaneously separate into two phases, a process known as spinodal decomposition. The blue curve, called the coexistence curve, marks the instability boundary for nucleation. In a nucleation process, an energy barrier must be overcome in order to achieve the lower free energy state. There is no energy barrier for spinodal decomposition; it is a spontaneous process.
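The equal-area construction of eqns. 1.503 and 1.504 is straightforward to carry out numerically: pick a trial pressure, locate the liquid and gas roots of the isotherm, and adjust the pressure until the area difference vanishes. A rough sketch in reduced units (my own code, not from the text; the bracketing values are ad hoc):

```python
def p_vdw(T, v):
    """Reduced vdW isotherm p(T, v), from eqn 1.481."""
    return (8.0*T/3.0)/(v - 1.0/3.0) - 3.0/v**2

def bisect(f, a, b):
    """Simple bisection; assumes f(a) and f(b) have opposite signs."""
    fa = f(a)
    for _ in range(100):
        m = 0.5*(a + b)
        if f(m)*fa > 0:
            a, fa = m, f(m)
        else:
            b = m
    return 0.5*(a + b)

def maxwell(T):
    """Coexistence pressure p0 and volumes v1 < v2 for T < 1 via eqns 1.503-1.504."""
    n = 20000
    vs = [0.4 + 9.6*i/n for i in range(n + 1)]
    ps = [p_vdw(T, v) for v in vs]
    i = 0
    while ps[i] > ps[i + 1]:        # walk down the liquid branch
        i += 1
    i_min = i
    while ps[i] < ps[i + 1]:        # walk up the unstable branch
        i += 1
    i_max = i
    def area(p0):
        v1 = bisect(lambda v: p_vdw(T, v) - p0, 0.3334, vs[i_min])
        v2 = bisect(lambda v: p_vdw(T, v) - p0, vs[i_max], 10.0)
        m = 2000
        dv = (v2 - v1)/m
        integral = sum(p_vdw(T, v1 + (k + 0.5)*dv) for k in range(m))*dv
        return integral - p0*(v2 - v1), v1, v2
    lo, hi = ps[i_min], ps[i_max]
    for _ in range(60):
        p0 = 0.5*(lo + hi)
        A, v1, v2 = area(p0)
        if A > 0:
            lo = p0
        else:
            hi = p0
    return p0, v1, v2

p0, v1, v2 = maxwell(0.9)
```

At T = 0.9 this yields a coexistence pressure of roughly p ≈ 0.647, with a liquid volume near v₁ ≈ 0.6 and a gas volume near v₂ ≈ 2.35.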
We can make some analytic progress by expanding about the critical point, writing

p = 1 + π ,  T = 1 + t ,  v = 1 + ε .   (1.505)

Expanding the equation of state, we find

π = 4t - 6tε - (3/2) ε³ + 9tε² + . . . .   (1.506)
Note that for the critical isotherm, i.e. for t = 0, we obtain π = -(3/2) ε³. For t > 0 the isotherms are monotonic, but for t < 0 we must invoke the Maxwell construction, which says

π · ( ε₂ - ε₁ ) = ∫_{ε₁}^{ε₂} dε π(t, ε)
 = 4t ( ε₂ - ε₁ ) - 3t ( ε₂² - ε₁² ) - (3/8)( ε₂⁴ - ε₁⁴ ) + 3t ( ε₂³ - ε₁³ ) + . . . .   (1.507)
The overpressure π is given by

π = 4t - 6tε₁ - (3/2) ε₁³ + 9tε₁² + . . .   (1.508)
π = 4t - 6tε₂ - (3/2) ε₂³ + 9tε₂² + . . . .   (1.509)
Adding and subtracting these two equations yields

π = 4t - 3t ( ε₁ + ε₂ ) - (3/4)( ε₁³ + ε₂³ ) + (9/2) t ( ε₁² + ε₂² ) + . . .   (1.510)

and

0 = 2t ( ε₂ - ε₁ ) + (1/2)( ε₂³ - ε₁³ ) - 3t ( ε₂² - ε₁² ) + . . . .   (1.511)
Substituting eqn. 1.510 into eqn. 1.507, we find

ε₁ + ε₂ = 4t + . . . .   (1.512)

Invoking this in the difference equation 1.511, we obtain

( ε₂ - ε₁ )² - 24t ( ε₂ - ε₁ ) + 16t ( 1 + 3t ) = 0 .   (1.513)

Thus,

ε₁(t) = -2√(-t) - 4t   (1.514)
ε₂(t) = +2√(-t) + 8t .   (1.515)
Suppose we follow along an isotherm starting from the high molar volume (gas) phase. If T > 1, the volume v decreases continuously as the pressure p increases^{21}. If T < 1, then at the instant the isotherm first intersects the blue curve, there is a discontinuous change in the molar volume from high (gas) to low (liquid). This discontinuous change is the hallmark of a first order phase transition. Note that the volume discontinuity is Δv = ε₂(t) - ε₁(t) ≈ 4(1 - T)^{1/2}. This is an example of critical behavior, in which the order parameter Δ, which in this case may be taken to be the difference Δ = v_gas - v_liquid, behaves as a power law in |T - T_c|, where T_c = 1 is the critical temperature (in dimensionless units). In this case, we have Δ(T) ∝ (1 - T)^β_+, where β = 1/2 is the exponent, and where (1 - T)_+ is defined to be 1 - T if T < 1 and 0 otherwise. The inverse isothermal compressibility,

κ_T^{-1} = -v (∂p/∂v)_T = -(1 + ε)(∂π/∂ε)_t = (1 + ε)[ 6t + (9/2) ε² - 18tε + . . . ] ,   (1.516)

vanishes as one approaches the critical point along the coexistence curve ε = ±2√(-t); hence κ_T ≈ (1/12)(1 - T)^{-1}.
^{21} In the limiting case of p → ∞, the molar volume approaches v → 1/3, i.e. v → b after restoring dimensions. This is the close packed limit.
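The expansion 1.506 can be checked directly against the full reduced equation of state. A small sketch (my own code, not from the text):

```python
def pi_exact(t, eps):
    """pi = p - 1, with p = 1 + pi, T = 1 + t, v = 1 + eps (eqn 1.505)."""
    v = 1.0 + eps
    return (8.0*(1.0 + t)/3.0)/(v - 1.0/3.0) - 3.0/v**2 - 1.0

def pi_series(t, eps):
    """Truncated expansion of eqn 1.506."""
    return 4*t - 6*t*eps - 1.5*eps**3 + 9*t*eps**2

# the discrepancy is of quartic order in the small quantities t, eps
err = abs(pi_exact(0.01, 0.05) - pi_series(0.01, 0.05))
```

On the critical isotherm (t = 0) the exact overpressure reduces to π ≈ -(3/2)ε³ for small ε, as stated above.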
1.14 Appendix I : Integrating factors
Suppose we have an inexact differential

dW = A_i dx_i .   (1.517)

Here I am adopting the Einstein convention where we sum over repeated indices unless otherwise explicitly stated; A_i dx_i = Σ_i A_i dx_i. An integrating factor e^{L(x)} is a function which, when divided into dW, yields an exact differential:

dU = e^{-L} dW = (∂U/∂x_i) dx_i .   (1.518)

Clearly we must have

∂²U/∂x_i ∂x_j = ∂( e^{-L} A_j )/∂x_i = ∂( e^{-L} A_i )/∂x_j .   (1.519)

Applying the Leibniz rule and then multiplying by e^L yields

∂A_j/∂x_i - A_j ∂L/∂x_i = ∂A_i/∂x_j - A_i ∂L/∂x_j .   (1.520)
If there are K independent variables x_1, . . . , x_K, then there are (1/2)K(K-1) independent equations of the above form, one for each distinct (i, j) pair. These equations can be written compactly as

c_{ijk} ∂L/∂x_k = F_{ij} ,   (1.521)

where

c_{ijk} = A_j δ_{ik} - A_i δ_{jk}   (1.522)
F_{ij} = ∂A_j/∂x_i - ∂A_i/∂x_j .   (1.523)

Note that F_{ij} is antisymmetric, and resembles a field strength tensor, and that c_{ijk} = -c_{jik} is antisymmetric in the first two indices (but is not totally antisymmetric in all three).
Can we solve these (1/2)K(K-1) coupled equations to find an integrating factor L? In general the answer is no. However, when K = 2 we can always find an integrating factor. To see why, let's call x ≡ x_1 and y ≡ x_2. Consider now the ODE

dy/dx = - A_x(x, y) / A_y(x, y) .   (1.524)
This equation can be integrated to yield a one-parameter set of integral curves, indexed by an initial condition. The equation for these curves may be written as U_c(x, y) = 0, where c labels the curves. Then along each curve we have

0 = dU_c/dx = ∂U_c/∂x + (∂U_c/∂y)(dy/dx) = ∂U_c/∂x - (A_x/A_y)(∂U_c/∂y) .   (1.525)

Thus,

(∂U_c/∂x) A_y = (∂U_c/∂y) A_x ≡ e^{-L} A_x A_y .   (1.526)

This equation defines the integrating factor L:

L = -ln( (1/A_x)(∂U_c/∂x) ) = -ln( (1/A_y)(∂U_c/∂y) ) .   (1.527)

We now have that

A_x = e^{L} ∂U_c/∂x ,  A_y = e^{L} ∂U_c/∂y ,   (1.528)

and hence

e^{-L} dW = (∂U_c/∂x) dx + (∂U_c/∂y) dy = dU_c .   (1.529)
1.15 Appendix II : Legendre Transformations
A convex function of a single variable f(x) is one for which f″(x) > 0 everywhere. The Legendre transform of a convex function f(x) is a function g(p) defined as follows. Let p be a real number, and consider the line y = px, as shown in fig. 1.45. We define the point x(p) as the value of x for which the difference F(x, p) = px - f(x) is greatest. Then define g(p) = F( x(p), p ).^{22} The value x(p) is unique if f(x) is convex, since x(p) is determined by the equation

f′( x(p) ) = p .   (1.530)

Note that from p = f′( x(p) ) we have, according to the chain rule,

d/dp f′( x(p) ) = f″( x(p) ) x′(p) = 1  ⟹  x′(p) = [ f″( x(p) ) ]^{-1} .   (1.531)
From this, we can prove that g(p) is itself convex:

g′(p) = d/dp [ p x(p) - f( x(p) ) ] = p x′(p) + x(p) - f′( x(p) ) x′(p) = x(p) ,   (1.532)

hence

g″(p) = x′(p) = [ f″( x(p) ) ]^{-1} > 0 .   (1.533)

^{22} Note that g(p) may be a negative number, if the line y = px lies everywhere below f(x).
Figure 1.45: Construction for the Legendre transformation of a function f(x).
In higher dimensions, the generalization of the definition f″(x) > 0 is that a function F(x_1, . . . , x_n) is convex if the matrix of second derivatives, called the Hessian,

H_{ij}(x) = ∂²F / ∂x_i ∂x_j ,   (1.534)

is positive definite. That is, all the eigenvalues of H_{ij}(x) must be positive for every x. We then define the Legendre transform G(p) as

G(p) = p · x - F(x)   (1.535)

where

p = ∇F .   (1.536)

Note that

dG = x · dp + p · dx - ∇F · dx = x · dp ,   (1.537)

which establishes that G is a function of p and that

∂G/∂p_j = x_j .   (1.538)

Note also that the Legendre transformation is self dual, which is to say that the Legendre transform of G(p) is F(x): F → G → F under successive Legendre transformations.
We can also define a partial Legendre transformation as follows. Consider a function of q variables F(x, y), where x = {x_1, . . . , x_m} and y = {y_1, . . . , y_n}, with q = m + n. Define p = {p_1, . . . , p_m}, and

G(p, y) = p · x - F(x, y) ,   (1.539)
where

p_a = ∂F/∂x_a   (a = 1, . . . , m) .   (1.540)

These equations are then to be inverted to yield

x_a = x_a(p, y) = ∂G/∂p_a .   (1.541)

Note that

p_a = ∂F/∂x_a ( x(p, y), y ) .   (1.542)

Thus, from the chain rule,

δ_{ab} = ∂p_a/∂p_b = (∂²F/∂x_a ∂x_c)(∂x_c/∂p_b) = (∂²F/∂x_a ∂x_c)(∂²G/∂p_c ∂p_b) ,   (1.543)

which says

∂²G/∂p_a ∂p_b = ∂x_a/∂p_b = K^{-1}_{ab} ,   (1.544)

where the m × m partial Hessian is

∂²F/∂x_a ∂x_b = ∂p_a/∂x_b = K_{ab} .   (1.545)

Note that K_{ab} = K_{ba} is symmetric. And with respect to the y coordinates,

∂²G/∂y_μ ∂y_ν = -∂²F/∂y_μ ∂y_ν = -L_{μν} ,   (1.546)

where

L_{μν} = ∂²F/∂y_μ ∂y_ν   (1.547)

is the partial Hessian in the y coordinates. Now it is easy to see that if the full q × q Hessian matrix H_{ij} is positive definite, then any submatrix such as K_{ab} or L_{μν} must also be positive definite. In this case, the partial Legendre transform is convex in p_1, . . . , p_m and concave in y_1, . . . , y_n.
1.16 Appendix III : Useful Mathematical Relations
Consider a set of n independent variables x_1, . . . , x_n, which can be thought of as a point in n-dimensional space. Let y_1, . . . , y_n and z_1, . . . , z_n be other choices of coordinates. Then

∂x_i/∂z_k = (∂x_i/∂y_j)(∂y_j/∂z_k) .   (1.548)

Note that this entails a matrix multiplication: A_{ik} = B_{ij} C_{jk}, where A_{ik} = ∂x_i/∂z_k, B_{ij} = ∂x_i/∂y_j, and C_{jk} = ∂y_j/∂z_k. We define the determinant

det( ∂x_i/∂z_k ) ≡ ∂(x_1, . . . , x_n) / ∂(z_1, . . . , z_n) .   (1.549)

Such a determinant is called a Jacobian. Now if A = BC, then det(A) = det(B) det(C). Thus,

∂(x_1, . . . , x_n)/∂(z_1, . . . , z_n) = [ ∂(x_1, . . . , x_n)/∂(y_1, . . . , y_n) ] · [ ∂(y_1, . . . , y_n)/∂(z_1, . . . , z_n) ] .   (1.550)

Recall also that

∂x_i/∂x_k = δ_{ik} .   (1.551)
Consider the case n = 2. We have

∂(x, y)/∂(u, v) = det ( (∂x/∂u)_v  (∂x/∂v)_u
                        (∂y/∂u)_v  (∂y/∂v)_u )
 = (∂x/∂u)_v (∂y/∂v)_u - (∂x/∂v)_u (∂y/∂u)_v .   (1.552)

We also have

[ ∂(x, y)/∂(u, v) ] · [ ∂(u, v)/∂(r, s) ] = ∂(x, y)/∂(r, s) .   (1.553)

From this simple mathematics follow several very useful results.
1) First, write

∂(x, y)/∂(u, v) = [ ∂(u, v)/∂(x, y) ]^{-1} .

Now let y = v:

∂(x, y)/∂(u, y) = (∂x/∂u)_y = 1 / (∂u/∂x)_y .

Thus,

(∂x/∂u)_y = 1 / (∂u/∂x)_y .   (1.554)
2) Second, we have

∂(x, y)/∂(u, y) = (∂x/∂u)_y = [ ∂(x, y)/∂(x, u) ] · [ ∂(x, u)/∂(u, y) ] = -(∂y/∂u)_x (∂x/∂y)_u .

We therefore conclude

(∂x/∂y)_u (∂y/∂u)_x (∂u/∂x)_y = -1 .   (1.555)

Invoking eqn. 1.554, we can recast this as

(∂x/∂y)_u (∂y/∂u)_x = -(∂x/∂u)_y .   (1.556)
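Eqn. 1.555 is the familiar "triple product" rule, and it is easily verified numerically, e.g. for the ideal gas p = T/v in units R = 1 (a sketch, not from the text):

```python
def p_of(T, v):    # ideal-gas equation of state, units R = 1
    return T/v

def v_of(T, p):
    return T/p

def T_of(p, v):
    return p*v

h = 1e-6
T0, v0 = 2.0, 3.0
p0 = p_of(T0, v0)
dp_dv = (p_of(T0, v0 + h) - p_of(T0, v0 - h))/(2*h)   # (dp/dv)_T
dv_dT = (v_of(T0 + h, p0) - v_of(T0 - h, p0))/(2*h)   # (dv/dT)_p
dT_dp = (T_of(p0 + h, v0) - T_of(p0 - h, v0))/(2*h)   # (dT/dp)_v
product = dp_dv*dv_dT*dT_dp    # should equal -1 by eqn 1.555
```

Note the minus sign: the naive "cancellation" of the three derivatives would wrongly suggest +1, which is exactly the trap eqn. 1.555 guards against.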
3) Third, we have

∂(x, v)/∂(u, v) = [ ∂(x, v)/∂(y, v) ] · [ ∂(y, v)/∂(u, v) ] ,

which says

(∂x/∂u)_v = (∂x/∂y)_v (∂y/∂u)_v .   (1.557)

This is simply the chain rule of partial differentiation.
4) Fourth, we have

∂(x, y)/∂(u, y) = [ ∂(x, y)/∂(u, v) ] · [ ∂(u, v)/∂(u, y) ]
 = (∂x/∂u)_v (∂y/∂v)_u (∂v/∂y)_u - (∂x/∂v)_u (∂y/∂u)_v (∂v/∂y)_u ,

which says

(∂x/∂u)_y = (∂x/∂u)_v - (∂x/∂y)_u (∂y/∂u)_v .   (1.558)
5) Suppose we have a function E(y, v) and we write

dE = x dy + u dv ,   (1.559)

that is,

x = (∂E/∂y)_v ≡ E_y ,  u = (∂E/∂v)_y ≡ E_v .   (1.560)

Writing

dx = E_{yy} dy + E_{yv} dv   (1.561)
du = E_{vy} dy + E_{vv} dv ,   (1.562)

and demanding dv = 0 yields

(∂x/∂u)_v = E_{yy} / E_{vy} .   (1.563)
Note that E_{yv} = E_{vy}. From the equation du = 0 we also derive

(∂y/∂v)_u = -E_{vv} / E_{vy} .   (1.564)

Next, we use eqn. 1.562 with du = 0 to eliminate dy in favor of dv, and then substitute into eqn. 1.561. This yields

(∂x/∂v)_u = E_{yv} - E_{yy} E_{vv} / E_{vy} .   (1.565)

Finally, eqn. 1.562 with dv = 0 yields

(∂y/∂u)_v = 1 / E_{vy} .   (1.566)
Combining the results of eqns. 1.563, 1.564, 1.565, and 1.566, we have

∂(x, y)/∂(u, v) = (∂x/∂u)_v (∂y/∂v)_u - (∂x/∂v)_u (∂y/∂u)_v
 = ( E_{yy}/E_{vy} )( -E_{vv}/E_{vy} ) - ( E_{yv} - E_{yy} E_{vv}/E_{vy} )( 1/E_{vy} ) = -1 .   (1.567)

Thus, since dE = T dS - p dV corresponds to x = T, y = S, u = -p, v = V, and the minus sign in u = -p flips the sign of the Jacobian,

∂(T, S)/∂(p, V) = 1 .   (1.568)

Nota bene: It is important to understand what other quantities are kept constant, otherwise we can run into trouble. For example, it would seem that eqn. 1.567 would also yield

∂(μ, N)/∂(p, V) = 1 .   (1.569)

But then we should have

∂(T, S)/∂(μ, N) = [ ∂(T, S)/∂(p, V) ] · [ ∂(p, V)/∂(μ, N) ] = 1   (WRONG!)

when according to eqn. 1.567 it should be -1. What has gone wrong?
The problem is that we have not properly specified what else is being held constant. For example, if we add (μ, N) to the mix, we should write

∂(T, S, N)/∂(p, V, N) = ∂(p, V, S)/∂(μ, N, S) = ∂(N, μ, p)/∂(T, S, p) = 1 .   (1.570)

If we are careful, then the general result

∂(T, S, N)/∂(y, X, N) = 1 ,   (1.571)

where (y, X) = (p, V) or (H_α, M_α) or (E_α, P_α), can be quite handy, especially when used in conjunction with eqn. 1.550. For example, we have

(∂S/∂V)_{T,N} = ∂(T, S, N)/∂(T, V, N) = [ ∂(T, S, N)/∂(p, V, N) ] · [ ∂(p, V, N)/∂(T, V, N) ] = (∂p/∂T)_{V,N} ,   (1.572)

where the first factor is unity by eqn. 1.571. This is one of the Maxwell relations derived from the exactness of dF. Some other examples:

(∂V/∂S)_{p,N} = ∂(V, p, N)/∂(S, p, N) = [ ∂(V, p, N)/∂(S, T, N) ] · [ ∂(S, T, N)/∂(S, p, N) ] = (∂T/∂p)_{S,N}   (1.573)

(∂S/∂N)_{T,p} = ∂(S, T, p)/∂(N, T, p) = [ ∂(S, T, p)/∂(μ, N, p) ] · [ ∂(μ, N, p)/∂(N, T, p) ] = -(∂μ/∂T)_{p,N} .   (1.574)
Note that due to the alternating nature of the determinant (it is antisymmetric under interchange of any two rows or columns), we have

∂(x, y, z)/∂(u, v, w) = -∂(y, x, z)/∂(u, v, w) = +∂(y, x, z)/∂(w, v, u) = . . . .   (1.575)
In general, it is usually advisable to eliminate S from a Jacobian. If we have a Jacobian involving T, S, and N, we can write

∂(T, S, N)/∂(□, □, N) = [ ∂(T, S, N)/∂(p, V, N) ] · [ ∂(p, V, N)/∂(□, □, N) ] = ∂(p, V, N)/∂(□, □, N) ,   (1.576)

where each □ is a distinct arbitrary state variable other than N.
If our Jacobian involves S, V, and N, we write

∂(S, V, N)/∂(□, □, N) = [ ∂(S, V, N)/∂(T, V, N) ] · [ ∂(T, V, N)/∂(□, □, N) ] = (C_V/T) · ∂(T, V, N)/∂(□, □, N) .   (1.577)

If our Jacobian involves S, p, and N, we write

∂(S, p, N)/∂(□, □, N) = [ ∂(S, p, N)/∂(T, p, N) ] · [ ∂(T, p, N)/∂(□, □, N) ] = (C_p/T) · ∂(T, p, N)/∂(□, □, N) .   (1.578)
For example,

(∂T/∂p)_{S,N} = ∂(T, S, N)/∂(p, S, N) = [ ∂(T, S, N)/∂(p, V, N) ] · [ ∂(p, V, N)/∂(p, T, N) ] · [ ∂(p, T, N)/∂(p, S, N) ] = (T/C_p)(∂V/∂T)_{p,N}   (1.579)

(∂V/∂p)_{S,N} = ∂(V, S, N)/∂(p, S, N) = [ ∂(V, S, N)/∂(V, T, N) ] · [ ∂(V, T, N)/∂(p, T, N) ] · [ ∂(p, T, N)/∂(p, S, N) ] = (C_V/C_p)(∂V/∂p)_{T,N} .   (1.580)
Chapter 2
Ergodicity and the Approach to
Equilibrium
2.1 Equilibrium
Recall that a thermodynamic system is one containing an enormously large number of constituent particles, a typical large number being Avogadro's number, N_A = 6.02 × 10²³. Nevertheless, in equilibrium, such a system is characterized by a relatively small number of thermodynamic state variables. Thus, while a complete description of a (classical) system would require us to account for O(10²³) evolving degrees of freedom, with respect to the physical quantities in which we are interested, the details of the initial conditions are effectively forgotten over some microscopic time scale τ, called the collision time, and over some microscopic distance scale, ℓ, called the mean free path^{1}. The equilibrium state is time-independent.
2.2 The Master Equation
Relaxation to equilibrium is often modeled with something called the master equation. Let P_i(t) be the probability that the system is in a quantum or classical state i at time t. Then write

dP_i/dt = Σ_j [ W_{ji} P_j - W_{ij} P_i ] .   (2.1)

Here, W_{ij} is the rate at which i makes a transition to j. Note that we can write this equation as

dP_i/dt = -Σ_j Γ_{ij} P_j ,   (2.2)

^{1} Exceptions involve quantities which are conserved by collisions, such as overall particle number, momentum, and energy. These quantities relax to equilibrium in a special way called hydrodynamics.
where

Γ_{ij} = -W_{ji}        if i ≠ j
Γ_{ij} = Σ′_k W_{jk}    if i = j ,   (2.3)

where the prime on the sum indicates that k = j is to be excluded. The constraints on the W_{ij} are that W_{ij} ≥ 0 for all i, j, and we may take W_{ii} ≡ 0 (no sum on i). Fermi's Golden Rule of quantum mechanics says that

W_{ji} = (2π/ℏ) |⟨ i | V̂ | j ⟩|² ρ(E_i) ,   (2.4)

where Ĥ₀ | i ⟩ = E_i | i ⟩, V̂ is an additional potential which leads to transitions, and ρ(E_i) is the density of final states at energy E_i.
If the transition rates W_{ij} are themselves time-independent, then we may formally write

P_i(t) = ( e^{-Γt} )_{ij} P_j(0) .   (2.5)

Here we have used the Einstein summation convention in which repeated indices are summed over (in this case, the j index). Note that

Σ_i Γ_{ij} = 0 ,   (2.6)

which says that the total probability Σ_i P_i is conserved:

d/dt Σ_i P_i = -Σ_{i,j} Γ_{ij} P_j = -Σ_j ( Σ_i Γ_{ij} ) P_j = 0 .   (2.7)
Suppose we have a time-independent solution to the master equation, P^{eq}_i. Then we must have

Γ_{ij} P^{eq}_j = 0  ⟹  P^{eq}_j W_{ji} = P^{eq}_i W_{ij} .   (2.8)

This is called the condition of detailed balance. Assuming W_{ij} ≠ 0 and P^{eq}_j ≠ 0, we can divide to obtain

W_{ji} / W_{ij} = P^{eq}_i / P^{eq}_j .   (2.9)
2.2.1 Example: radioactive decay
Consider a group of atoms, some of which are in an excited state which can undergo nuclear
decay. Let P
n
(t) be the probability that n atoms are excited at some time t. We then model
the decay dynamics by
W
nm
=
_

_
0 if m n
n if m = n 1
0 if m < n 1 .
(2.10)
2.2. THE MASTER EQUATION 111
Here, is the decay rate of an individual atom, which can be determined from quantum
mechanics. The master equation then tells us
dP
n
dt
= (n + 1) P
n+1
n P
n
. (2.11)
The interpretation here is as follows: let

n
_
denote a state in which n atoms are excited.
Then P
n
(t) =

(t) [ n)

2
. Then P
n
(t) will increase due to spontaneous transitions from
[ n+1 ) to [ n), and will decrease due to spontaneous transitions from [ n) to [ n1 ).
The average number of particles in the system is
N(t) =

n=0
nP
n
(t) . (2.12)
Note that
dN
dt
=

n=0
n
_
(n + 1) P
n+1
n P
n
_
=

n=0
_
n(n 1) P
n
n
2
P
n
_
=

n=0
nP
n
= N . (2.13)
Thus,
N(t) = N(0) e
t
. (2.14)
The relaxation time is =
1
, and the equilibrium distribution is
P
eq
n
=
n,0
. (2.15)
Note that this satises detailed balance.
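The decay example is simple enough to integrate directly. The sketch below (my own code, not from the text) Euler-integrates eqn. 2.11 from P_n(0) = δ_{n,20} with γ = 1, and checks that the mean obeys N(t) = N(0) e^{-γt} and that total probability is conserved:

```python
import math

def evolve(N0=20, gamma=1.0, t_final=1.0, dt=1e-4):
    """Euler integration of the decay master equation, eqn 2.11,
    starting from P_n(0) = delta_{n, N0}."""
    P = [0.0]*(N0 + 1)
    P[N0] = 1.0
    for _ in range(int(round(t_final/dt))):
        dP = []
        for n in range(N0 + 1):
            gain = gamma*(n + 1)*P[n + 1] if n < N0 else 0.0
            dP.append(gain - gamma*n*P[n])
        P = [P[n] + dt*dP[n] for n in range(N0 + 1)]
    return P

P = evolve()
N_avg = sum(n*prob for n, prob in enumerate(P))   # compare with 20 e^{-1}
```

Probability conservation is exact here because the gain and loss terms telescope, mirroring Σ_i Γ_{ij} = 0 in the general formalism.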
We can go a bit farther here. Let us define

P(z, t) ≡ Σ_{n=0}^∞ z^n P_n(t) .   (2.16)

This is sometimes called a generating function. Then

∂P/∂t = γ Σ_{n=0}^∞ z^n [ (n + 1) P_{n+1} - n P_n ] = γ ∂P/∂z - γz ∂P/∂z .   (2.17)

Thus,

(1/γ) ∂P/∂t - (1 - z) ∂P/∂z = 0 .   (2.18)

We now see that any function f(ξ) satisfies the above equation, where ξ = γt - ln(1 - z). Thus, we can write

P(z, t) = f( γt - ln(1 - z) ) .   (2.19)

Setting t = 0 we have P(z, 0) = f( -ln(1 - z) ), and inverting this result we obtain f(u) = P(1 - e^{-u}, 0), i.e.

P(z, t) = P( 1 + (z - 1) e^{-γt} , 0 ) .   (2.20)

The total probability is P(z = 1, t) = Σ_{n=0}^∞ P_n, which clearly is conserved: P(1, t) = P(1, 0). The average particle number is

N(t) = Σ_{n=0}^∞ n P_n(t) = ∂P/∂z |_{z=1} = e^{-γt} ∂_z P(z, 0) |_{z=1} = N(0) e^{-γt} .   (2.21)
2.2.2 Decomposition of Γ_{ij}
The matrix Γ_{ij} is real but not necessarily symmetric. For such a matrix, the left eigenvectors φ^α_i and the right eigenvectors ψ^β_j are in general different:

Σ_i φ^α_i Γ_{ij} = λ_α φ^α_j   (2.22)
Σ_j Γ_{ij} ψ^β_j = λ_β ψ^β_i .   (2.23)

Note that the eigenvalue equation for the right eigenvectors is Γψ = λψ, while that for the left eigenvectors is Γ^t φ = λφ. The characteristic polynomial is the same in both cases:

F(λ) ≡ det( λ - Γ ) = det( λ - Γ^t ) ,   (2.24)

which means that the left and right eigenvalues are the same. Note also that [ F(λ) ]* = F(λ*), hence the eigenvalues are either real or appear in complex conjugate pairs. Multiplying the eigenvector equation for φ^α on the right by ψ^β_j and summing over j, and multiplying the eigenvector equation for ψ^β on the left by φ^α_i and summing over i, and subtracting the two results yields

( λ_α - λ_β ) ⟨ φ^α | ψ^β ⟩ = 0 ,   (2.25)

where the inner product is

⟨ φ | ψ ⟩ = Σ_i φ_i ψ_i .   (2.26)

We can now demand

⟨ φ^α | ψ^β ⟩ = δ_{αβ} ,   (2.27)

in which case we can write

Γ = Σ_α λ_α | ψ^α ⟩⟨ φ^α |  ⟺  Γ_{ij} = Σ_α λ_α ψ^α_i φ^α_j .   (2.28)
We note that φ = (1, 1, . . . , 1) is a left eigenvector with eigenvalue λ = 0, since Σ_i Γ_{ij} = 0. We do not know a priori the corresponding right eigenvector, which depends on other details of Γ_{ij}. Now let's expand P_i(t) in the right eigenvectors of Γ, writing

P_i(t) = Σ_α C_α(t) ψ^α_i .   (2.29)

Then

dP_i/dt = Σ_α (dC_α/dt) ψ^α_i = -Γ_{ij} P_j = -Σ_α C_α Γ_{ij} ψ^α_j = -Σ_α C_α λ_α ψ^α_i .   (2.30)

This allows us to write

dC_α/dt = -λ_α C_α  ⟹  C_α(t) = C_α(0) e^{-λ_α t} .   (2.31)

Hence, we can write

P_i(t) = Σ_α C_α(0) e^{-λ_α t} ψ^α_i .   (2.32)

It is now easy to see that Re(λ_α) ≥ 0 for all α, or else the probabilities will become negative. For suppose Re(λ_α) < 0 for some α. Then as t → ∞, the sum in eqn. 2.32 will be dominated by the term for which λ_α has the largest negative real part; all other contributions will be subleading. But we must have Σ_i ψ^α_i = 0, since | ψ^α ⟩ must be orthogonal to the left eigenvector φ^{α=0} = (1, 1, . . . , 1). Therefore, at least one component of ψ^α_i (i.e. for some value of i) must have a negative real part, which means a negative probability!^{2} We conclude that P_i(t) → P^{eq}_i as t → ∞, relaxing to the λ = 0 right eigenvector, with Re(λ_α) ≥ 0 for all α.
2.3 Boltzmann's H-theorem
Suppose for the moment that Γ is a symmetric matrix, i.e. Γ_{ij} = Γ_{ji}. Then construct the function

H(t) = Σ_i P_i(t) ln P_i(t) .   (2.33)
^{2} Since the probability P_i(t) is real, if the eigenvalue with the smallest (i.e. largest negative) real part is complex, there will be a corresponding complex conjugate eigenvalue, and summing over all eigenvectors will result in a real value for P_i(t).
Then

\[
\frac{dH}{dt} = \sum_i \frac{dP_i}{dt}\,\big(1 + \ln P_i\big) = \sum_i \frac{dP_i}{dt}\,\ln P_i
= -\sum_{i,j} \Gamma_{ij}\,P_j\,\ln P_i = \sum_{i,j} \Gamma_{ij}\,P_j\,\big(\ln P_j - \ln P_i\big) \ , \tag{2.34}
\]

where we have used $\sum_i \Gamma_{ij} = 0$. Now switch $i \leftrightarrow j$ in the above sum and add the terms to get

\[
\frac{dH}{dt} = \tfrac{1}{2} \sum_{i,j} \Gamma_{ij}\,\big(P_i - P_j\big)\,\big(\ln P_i - \ln P_j\big) \ . \tag{2.35}
\]

Note that the $i = j$ term does not contribute to the sum. For $i \ne j$ we have $\Gamma_{ij} = -W_{ji} \le 0$, and using the result

\[
(x - y)\,(\ln x - \ln y) \ge 0 \ , \tag{2.36}
\]

we conclude

\[
\frac{dH}{dt} \le 0 \ . \tag{2.37}
\]

In equilibrium, $P^{\rm eq}_i$ is a constant, independent of $i$. We write

\[
P^{\rm eq}_i = \frac{1}{\Omega} \ , \qquad \Omega = \sum_i 1 \qquad \Longrightarrow \qquad H = -\ln\Omega \ . \tag{2.38}
\]
If $\Gamma_{ij} \ne \Gamma_{ji}$, we can still prove a version of the H-theorem. Define a new symmetric matrix

\[
\overline{W}_{ij} \equiv P^{\rm eq}_i\,W_{ij} = P^{\rm eq}_j\,W_{ji} = \overline{W}_{ji} \ , \tag{2.39}
\]

and the generalized H-function,

\[
H(t) \equiv \sum_i P_i(t)\,\ln\!\left(\frac{P_i(t)}{P^{\rm eq}_i}\right) \ . \tag{2.40}
\]

Then

\[
\frac{dH}{dt} = -\tfrac{1}{2} \sum_{i,j} \overline{W}_{ij}
\left(\frac{P_i}{P^{\rm eq}_i} - \frac{P_j}{P^{\rm eq}_j}\right)
\left[\ln\!\left(\frac{P_i}{P^{\rm eq}_i}\right) - \ln\!\left(\frac{P_j}{P^{\rm eq}_j}\right)\right] \le 0 \ . \tag{2.41}
\]
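The monotonic decrease of $H$ can be checked numerically. In the sketch below (Python; the symmetric rate matrix is a hypothetical random example), the small-step Euler map is a doubly stochastic matrix, so $H$ is rigorously non-increasing at each step, and the distribution relaxes to the uniform one of eqn. 2.38:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 6

# Hypothetical symmetric transition rates W_ij = W_ji.
W = rng.random((N, N))
W = 0.5 * (W + W.T)
np.fill_diagonal(W, 0.0)
Gamma = -W + np.diag(W.sum(axis=0))   # symmetric; rows and columns sum to zero

P = rng.random(N)
P /= P.sum()
dt = 1e-3

def H(P):
    return np.sum(P * np.log(P))      # eqn 2.33

H_vals = []
for _ in range(50_000):
    H_vals.append(H(P))
    # one Euler step; (1 - dt*Gamma) is doubly stochastic for small dt,
    # so this step can never increase H
    P = P - dt * (Gamma @ P)

# dH/dt <= 0: H is non-increasing along the evolution (eqn 2.37)
assert all(h2 <= h1 + 1e-12 for h1, h2 in zip(H_vals, H_vals[1:]))
# Equilibrium is uniform, so H -> -ln(Omega) with Omega = N (eqn 2.38)
assert abs(H_vals[-1] - (-np.log(N))) < 1e-6
```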
2.4 Hamiltonian Evolution
The master equation provides us with a semi-phenomenological description of a dynamical system's relaxation to equilibrium. It explicitly breaks time reversal symmetry. Yet the microscopic laws of Nature are (approximately) time-reversal symmetric. How can a system which obeys Hamilton's equations of motion come to equilibrium?
Let's start our investigation by reviewing the basics of Hamiltonian dynamics. Recall the Lagrangian $L = L(q, \dot q, t) = T - V$. The Euler-Lagrange equations of motion for the action $S\big[q(t)\big] = \int\! dt\, L$ are

\[
\dot p_\sigma = \frac{d}{dt}\!\left(\frac{\partial L}{\partial \dot q_\sigma}\right) = \frac{\partial L}{\partial q_\sigma} \ , \tag{2.42}
\]

where $p_\sigma$ is the canonical momentum conjugate to the generalized coordinate $q_\sigma$:

\[
p_\sigma = \frac{\partial L}{\partial \dot q_\sigma} \ . \tag{2.43}
\]

The Hamiltonian, $H(q, p)$, is obtained by a Legendre transformation,

\[
H(q, p) = \sum_{\sigma=1}^{r} p_\sigma\,\dot q_\sigma - L \ . \tag{2.44}
\]

Note that

\[
dH = \sum_{\sigma=1}^{r} \left( p_\sigma\,d\dot q_\sigma + \dot q_\sigma\,dp_\sigma - \frac{\partial L}{\partial q_\sigma}\,dq_\sigma - \frac{\partial L}{\partial \dot q_\sigma}\,d\dot q_\sigma \right) - \frac{\partial L}{\partial t}\,dt
= \sum_{\sigma=1}^{r} \left( \dot q_\sigma\,dp_\sigma - \frac{\partial L}{\partial q_\sigma}\,dq_\sigma \right) - \frac{\partial L}{\partial t}\,dt \ . \tag{2.45}
\]

Thus, we obtain Hamilton's equations of motion,

\[
\frac{\partial H}{\partial p_\sigma} = \dot q_\sigma \ , \qquad
\frac{\partial H}{\partial q_\sigma} = -\frac{\partial L}{\partial q_\sigma} = -\dot p_\sigma \ , \tag{2.46}
\]

and

\[
\frac{dH}{dt} = \frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t} \ . \tag{2.47}
\]

Define the rank $2r$ vector $\varphi$ by its components,

\[
\varphi_i = \begin{cases} q_i & \text{if } 1 \le i \le r \\ p_{i-r} & \text{if } r < i \le 2r \ . \end{cases} \tag{2.48}
\]

Then we may write Hamilton's equations compactly as

\[
\dot\varphi_i = J_{ij}\,\frac{\partial H}{\partial \varphi_j} \ , \tag{2.49}
\]

where

\[
J = \begin{pmatrix} 0_{r\times r} & 1_{r\times r} \\ -1_{r\times r} & 0_{r\times r} \end{pmatrix} \tag{2.50}
\]

is a rank $2r$ matrix. Note that $J^{\rm t} = -J$, i.e. $J$ is antisymmetric, and that $J^2 = -1_{2r\times 2r}$.
2.5 Evolution of Phase Space Volumes
Consider a general dynamical system,

\[
\frac{d\varphi}{dt} = V(\varphi) \ , \tag{2.51}
\]

where $\varphi(t)$ is a point in an $n$-dimensional phase space. Consider now a compact$^3$ region $\mathcal{R}_0$ in phase space, and consider its evolution under the dynamics. That is, $\mathcal{R}_0$ consists of a set of points $\big\{\varphi \,\big|\, \varphi \in \mathcal{R}_0\big\}$, and if we regard each $\varphi \in \mathcal{R}_0$ as an initial condition, we can define the time-dependent set $\mathcal{R}(t)$ as the set of points $\varphi(t)$ that were in $\mathcal{R}_0$ at time $t = 0$:

\[
\mathcal{R}(t) = \big\{ \varphi(t) \,\big|\, \varphi(0) \in \mathcal{R}_0 \big\} \ . \tag{2.52}
\]

Now consider the volume $\Omega(t)$ of the set $\mathcal{R}(t)$. We have

\[
\Omega(t) = \int\limits_{\mathcal{R}(t)}\! d\mu \ , \tag{2.53}
\]

where

\[
d\mu = d\varphi_1\, d\varphi_2 \cdots d\varphi_n \ , \tag{2.54}
\]

for an $n$-dimensional phase space. We then have

\[
\Omega(t + dt) = \int\limits_{\mathcal{R}(t+dt)}\!\! d\mu' = \int\limits_{\mathcal{R}(t)}\! d\mu\; \left|\frac{\partial \varphi_i(t+dt)}{\partial \varphi_j(t)}\right| \ , \tag{2.55}
\]

where

\[
\left|\frac{\partial \varphi_i(t+dt)}{\partial \varphi_j(t)}\right| \equiv \frac{\partial\big(\varphi'_1, \dots, \varphi'_n\big)}{\partial\big(\varphi_1, \dots, \varphi_n\big)} \tag{2.56}
\]

is a determinant, which is the Jacobian of the transformation from the set of coordinates $\big\{\varphi_i = \varphi_i(t)\big\}$ to the coordinates $\big\{\varphi'_i = \varphi_i(t+dt)\big\}$. But according to the dynamics, we have

\[
\varphi_i(t+dt) = \varphi_i(t) + V_i\big(\varphi(t)\big)\,dt + \mathcal{O}(dt^2) \tag{2.57}
\]

and therefore

\[
\frac{\partial \varphi_i(t+dt)}{\partial \varphi_j(t)} = \delta_{ij} + \frac{\partial V_i}{\partial \varphi_j}\,dt + \mathcal{O}(dt^2) \ . \tag{2.58}
\]

We now make use of the equality

\[
\ln \det M = {\rm Tr}\,\ln M \ , \tag{2.59}
\]

for any matrix $M$, which gives us$^4$, for small $\varepsilon$,

\[
\det\big(1 + \varepsilon A\big) = \exp\,{\rm Tr}\,\ln\big(1 + \varepsilon A\big)
= 1 + \varepsilon\,{\rm Tr}\,A + \tfrac{1}{2}\,\varepsilon^2 \Big[\big({\rm Tr}\,A\big)^2 - {\rm Tr}\,(A^2)\Big] + \dots \tag{2.60}
\]

$^3$ 'Compact' in the parlance of mathematical analysis means 'closed and bounded'.
$^4$ The equality $\ln\det M = {\rm Tr}\,\ln M$ is most easily proven by bringing the matrix to diagonal form via a similarity transformation, and proving the equality for diagonal matrices.
Thus,

\[
\Omega(t+dt) = \Omega(t) + \int\limits_{\mathcal{R}(t)}\! d\mu\; \nabla\!\cdot V\; dt + \mathcal{O}(dt^2) \ , \tag{2.61}
\]

which says

\[
\frac{d\Omega}{dt} = \int\limits_{\mathcal{R}(t)}\! d\mu\; \nabla\!\cdot V = \oint\limits_{\partial\mathcal{R}(t)}\!\! dS\; \hat n \cdot V \ . \tag{2.62}
\]

Here, the divergence is the phase space divergence,

\[
\nabla\!\cdot V = \sum_{i=1}^{n} \frac{\partial V_i}{\partial \varphi_i} \ , \tag{2.63}
\]

and we have used Stokes' theorem to convert the volume integral of the divergence to a surface integral of $\hat n \cdot V$, where $\hat n$ is the surface normal and $dS$ is the differential element of surface area, and $\partial\mathcal{R}$ denotes the boundary of the region $\mathcal{R}$. We see that if $\nabla\!\cdot V = 0$ everywhere in phase space, then $\Omega(t)$ is a constant, and phase space volumes are preserved by the evolution of the system.
For an alternative derivation, consider a function $\varrho(\varphi, t)$ which is defined to be the density of some collection of points in phase space at phase space position $\varphi$ and time $t$. This density must satisfy the continuity equation,

\[
\frac{\partial \varrho}{\partial t} + \nabla\!\cdot(\varrho\,V) = 0 \ . \tag{2.64}
\]

The continuity equation says that 'nobody gets lost'. If we integrate it over a region of phase space $\mathcal{R}$, we have

\[
\frac{d}{dt} \int\limits_{\mathcal{R}}\! d\mu\; \varrho = -\int\limits_{\mathcal{R}}\! d\mu\; \nabla\!\cdot(\varrho\,V) = -\oint\limits_{\partial\mathcal{R}}\!\! dS\; \hat n \cdot (\varrho\,V) \ . \tag{2.65}
\]

It is perhaps helpful to think of $\varrho$ as a charge density, in which case $J = \varrho\,V$ is the current density. The above equation then says

\[
\frac{dQ_{\mathcal{R}}}{dt} = -\oint\limits_{\partial\mathcal{R}}\!\! dS\; \hat n \cdot J \ , \tag{2.66}
\]

where $Q_{\mathcal{R}}$ is the total charge contained inside the region $\mathcal{R}$. In other words, the rate of increase or decrease of the charge within the region $\mathcal{R}$ is equal to the total integrated current flowing in or out of $\mathcal{R}$ at its boundary.
The Leibniz rule lets us write the continuity equation as

\[
\frac{\partial \varrho}{\partial t} + V\!\cdot\!\nabla\varrho + \varrho\,\nabla\!\cdot V = 0 \ . \tag{2.67}
\]

But now suppose that the phase flow is divergenceless, i.e. $\nabla\!\cdot V = 0$. Then we have

\[
\frac{D\varrho}{Dt} \equiv \left(\frac{\partial}{\partial t} + V\!\cdot\!\nabla\right)\varrho = 0 \ . \tag{2.68}
\]
Figure 2.1: Time evolution of two immiscible fluids. The local density remains constant.
The combination inside the brackets above is known as the convective derivative. It tells us the total rate of change of $\varrho$ for an observer moving with the phase flow. That is,

\[
\frac{d}{dt}\,\varrho\big(\varphi(t), t\big) = \sum_i \frac{\partial \varrho}{\partial \varphi_i}\,\frac{d\varphi_i}{dt} + \frac{\partial \varrho}{\partial t}
= \sum_{i=1}^{n} V_i\,\frac{\partial \varrho}{\partial \varphi_i} + \frac{\partial \varrho}{\partial t} = \frac{D\varrho}{Dt} \ . \tag{2.69}
\]

If $D\varrho/Dt = 0$, the local density remains the same during the evolution of the system. If we consider the characteristic function

\[
\varrho(\varphi, t=0) = \begin{cases} 1 & \text{if } \varphi \in \mathcal{R}_0 \\ 0 & \text{otherwise} \end{cases} \tag{2.70}
\]

then the vanishing of the convective derivative means that the image of the set $\mathcal{R}_0$ under time evolution will always have the same volume.
Hamiltonian evolution in classical mechanics is volume preserving. The equations of motion are

\[
\dot q_i = +\frac{\partial H}{\partial p_i} \ , \qquad \dot p_i = -\frac{\partial H}{\partial q_i} \ . \tag{2.71}
\]

A point in phase space is specified by $r$ positions $q_i$ and $r$ momenta $p_i$, hence the dimension of phase space is $n = 2r$:

\[
\varphi = \begin{pmatrix} q \\ p \end{pmatrix} \ , \qquad
V = \begin{pmatrix} \dot q \\ \dot p \end{pmatrix} = \begin{pmatrix} \partial H/\partial p \\ -\,\partial H/\partial q \end{pmatrix} \ . \tag{2.72}
\]
Hamilton's equations of motion guarantee that the phase space flow is divergenceless:

\[
\nabla\!\cdot V = \sum_{i=1}^{r} \left\{ \frac{\partial \dot q_i}{\partial q_i} + \frac{\partial \dot p_i}{\partial p_i} \right\}
= \sum_{i=1}^{r} \left\{ \frac{\partial}{\partial q_i}\!\left(\frac{\partial H}{\partial p_i}\right) + \frac{\partial}{\partial p_i}\!\left(-\frac{\partial H}{\partial q_i}\right) \right\} = 0 \ . \tag{2.73}
\]

Thus, we have that the convective derivative vanishes, viz.

\[
\frac{D\varrho}{Dt} \equiv \frac{\partial \varrho}{\partial t} + V\!\cdot\!\nabla\varrho = 0 \ , \tag{2.74}
\]

for any distribution $\varrho(\varphi, t)$ on phase space. Thus, the value of the density $\varrho(\varphi(t), t)$ is constant, which tells us that the phase flow is incompressible. In particular, phase space volumes are preserved.
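Volume preservation can be tested numerically. A symplectic (leapfrog) integration step is itself an exactly area-preserving map of the $(q, p)$ plane, so its Jacobian determinant should equal unity, mirroring eqn. 2.73. The sketch below (Python; the quartic oscillator $H = p^2/2 + q^4/4$ is a hypothetical example) checks this by finite differences:

```python
import numpy as np

# Hypothetical Hamiltonian H = p^2/2 + q^4/4
def grad_V(q):            # dH/dq
    return q**3

def leapfrog(q, p, dt):   # one kick-drift-kick step of Hamilton's equations
    p = p - 0.5 * dt * grad_V(q)
    q = q + dt * p
    p = p - 0.5 * dt * grad_V(q)
    return q, p

# Numerical Jacobian d(q', p')/d(q, p) of the one-step map at a sample point
def jacobian(q, p, dt, eps=1e-6):
    J = np.zeros((2, 2))
    for col, (dq, dp) in enumerate([(eps, 0.0), (0.0, eps)]):
        qp, pp = leapfrog(q + dq, p + dp, dt)
        qm, pm = leapfrog(q - dq, p - dp, dt)
        J[0, col] = (qp - qm) / (2 * eps)
        J[1, col] = (pp - pm) / (2 * eps)
    return J

J = jacobian(q=0.7, p=-0.3, dt=0.1)
# det J = 1: the flow map preserves phase space area (Liouville)
assert abs(np.linalg.det(J) - 1.0) < 1e-8
```

A plain forward-Euler step, by contrast, has $\det J \ne 1$ and slowly inflates phase space areas; this is why symplectic integrators are preferred for Hamiltonian dynamics.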
2.5.1 Liouville's Equation
Let $\varrho(\varphi) = \varrho(q, p)$ be a distribution on phase space. Assuming the evolution is Hamiltonian, we can write

\[
\frac{\partial \varrho}{\partial t} = -V\!\cdot\!\nabla\varrho
= -\sum_{k=1}^{r} \left( \dot q_k\,\frac{\partial}{\partial q_k} + \dot p_k\,\frac{\partial}{\partial p_k} \right) \varrho
= -i\hat L\,\varrho \ , \tag{2.75}
\]

where $\hat L$ is a differential operator known as the Liouvillian:

\[
\hat L = -i \sum_{k=1}^{r} \left\{ \frac{\partial H}{\partial p_k}\,\frac{\partial}{\partial q_k} - \frac{\partial H}{\partial q_k}\,\frac{\partial}{\partial p_k} \right\} \ . \tag{2.76}
\]

Eqn. 2.75, known as Liouville's equation, bears an obvious resemblance to the Schrödinger equation from quantum mechanics.
Suppose that $\Lambda_a(\varphi)$ is conserved by the dynamics of the system. Typical conserved quantities include the components of the total linear momentum (if there is translational invariance), the components of the total angular momentum (if there is rotational invariance), and the Hamiltonian itself (if the Lagrangian is not explicitly time-dependent). Now consider a distribution $\varrho(\varphi, t) = \varrho(\Lambda_1, \Lambda_2, \dots, \Lambda_k)$ which is a function only of these various conserved quantities. Then from the chain rule, we have

\[
V\!\cdot\!\nabla\varrho = \sum_a \frac{\partial \varrho}{\partial \Lambda_a}\; V\!\cdot\!\nabla\Lambda_a = 0 \ , \tag{2.77}
\]

since for each $a$ we have

\[
\frac{d\Lambda_a}{dt} = \sum_{\sigma=1}^{r} \left( \frac{\partial \Lambda_a}{\partial q_\sigma}\,\dot q_\sigma + \frac{\partial \Lambda_a}{\partial p_\sigma}\,\dot p_\sigma \right) = V\!\cdot\!\nabla\Lambda_a = 0 \ . \tag{2.78}
\]

We conclude that any distribution $\varrho(\varphi, t) = \varrho(\Lambda_1, \Lambda_2, \dots, \Lambda_k)$ which is a function solely of conserved dynamical quantities is a stationary solution to Liouville's equation.

Clearly the microcanonical distribution,

\[
\varrho_E(\varphi) = \frac{\delta\big(E - H(\varphi)\big)}{D(E)}
= \frac{\delta\big(E - H(\varphi)\big)}{\int\! d\mu\; \delta\big(E - H(\varphi)\big)} \ , \tag{2.79}
\]

is a fixed point solution of Liouville's equation.
2.6 Irreversibility and Poincaré Recurrence

The dynamics of the master equation describe an approach to equilibrium. These dynamics are irreversible: $dH/dt \le 0$. However, the microscopic laws of physics are (almost) time-reversal invariant$^5$, so how can we understand the emergence of irreversibility? Furthermore, any dynamics which are deterministic and volume-preserving in a finite phase space exhibit the phenomenon of Poincaré recurrence, which guarantees that phase space trajectories are arbitrarily close to periodic if one waits long enough.
2.6.1 Poincaré recurrence theorem

The proof of the recurrence theorem is simple. Let $g_\tau$ be the '$\tau$-advance' mapping which evolves points in phase space according to Hamilton's equations. Assume that $g_\tau$ is invertible and volume-preserving, as is the case for Hamiltonian flow. Further assume that phase space volume is finite. Since the energy is preserved in the case of time-independent Hamiltonians, we simply ask that the volume of phase space at fixed total energy $E$ be finite, i.e.

\[
\int\! d\mu\; \delta\big(E - H(q, p)\big) < \infty \ , \tag{2.80}
\]

where $d\mu = dq\,dp$ is the phase space uniform integration measure.
Theorem: In any finite neighborhood $\mathcal{R}_0$ of phase space there exists a point $\varphi_0$ which will return to $\mathcal{R}_0$ after $m$ applications of $g_\tau$, where $m$ is finite.

Proof: Assume the theorem fails; we will show this assumption results in a contradiction. Consider the set $\Upsilon$ formed from the union of all sets $g^k_\tau\,\mathcal{R}_0$ for all $k$:

\[
\Upsilon = \bigcup_{k=0}^{\infty} g^k_\tau\,\mathcal{R}_0 \ . \tag{2.81}
\]

$^5$ Actually, the microscopic laws of physics are not time-reversal invariant, but rather are invariant under the product $PCT$, where $P$ is parity, $C$ is charge conjugation, and $T$ is time reversal.
Figure 2.2: Successive images of a set $\mathcal{R}_0$ under the $\tau$-advance mapping $g_\tau$, projected onto a two-dimensional phase plane. The Poincaré recurrence theorem guarantees that if phase space has finite volume, and $g_\tau$ is invertible and volume preserving, then for any set $\mathcal{R}_0$ there exists an integer $m$ such that $\mathcal{R}_0 \cap g^m_\tau\,\mathcal{R}_0 \ne \emptyset$.
We assume that the set $\big\{ g^k_\tau\,\mathcal{R}_0 \,\big|\, k \in \mathbb{Z}^+ \big\}$ is disjoint. The volume of a union of disjoint sets is the sum of the individual volumes. Thus,

\[
{\rm vol}(\Upsilon) = \sum_{k=0}^{\infty} {\rm vol}\big(g^k_\tau\,\mathcal{R}_0\big)
= {\rm vol}\big(\mathcal{R}_0\big) \cdot \sum_{k=0}^{\infty} 1 = \infty \ , \tag{2.82}
\]

since ${\rm vol}\big(g^k_\tau\,\mathcal{R}_0\big) = {\rm vol}\big(\mathcal{R}_0\big)$ from volume preservation. But clearly $\Upsilon$ is a subset of the entire phase space, hence we have a contradiction, because by assumption phase space is of finite volume.
Thus, the assumption that the set $\big\{ g^k_\tau\,\mathcal{R}_0 \,\big|\, k \in \mathbb{Z}^+ \big\}$ is disjoint fails. This means that there exists some pair of integers $k$ and $l$, with $k \ne l$, such that $g^k_\tau\,\mathcal{R}_0 \cap g^l_\tau\,\mathcal{R}_0 \ne \emptyset$. Without loss of generality we may assume $k < l$. Apply the inverse $g^{-1}_\tau$ to this relation $k$ times to get $g^{l-k}_\tau\,\mathcal{R}_0 \cap \mathcal{R}_0 \ne \emptyset$. Now choose any point $\varphi_1 \in g^m_\tau\,\mathcal{R}_0 \cap \mathcal{R}_0$, where $m = l - k$, and define $\varphi_0 = g^{-m}_\tau\,\varphi_1$. Then by construction both $\varphi_0$ and $g^m_\tau\,\varphi_0$ lie within $\mathcal{R}_0$ and the theorem is proven.
Poincaré recurrence has remarkable implications. Consider a bottle of perfume which is opened in an otherwise evacuated room, as depicted in fig. 2.3. The perfume molecules evolve according to Hamiltonian evolution. The positions are bounded because physical space is finite. The momenta are bounded because the total energy is conserved, hence no single particle can have a momentum such that $T(p) > E_{\rm TOT}$, where $T(p)$ is the single particle kinetic energy function$^6$. Thus, phase space, however large, is still bounded. Hamiltonian evolution, as we have seen, is invertible and volume preserving, therefore the system is recurrent. All the molecules must eventually return to the bottle. What's more, they all must return with momenta arbitrarily close to their initial momenta! In this case, we could define the region $\mathcal{R}_0$ as

\[
\mathcal{R}_0 = \Big\{ (q_1, \dots, q_r, p_1, \dots, p_r) \ \Big|\ |q_i - q^0_i| \le \Delta q \ \ \text{and} \ \ |p_j - p^0_j| \le \Delta p \ \ \forall\, i, j \Big\} \ , \tag{2.83}
\]

which specifies a hypercube in phase space centered about the point $(q^0, p^0)$.

Figure 2.3: Poincaré recurrence guarantees that if we remove the cap from a bottle of perfume in an otherwise evacuated room, all the perfume molecules will eventually return to the bottle!
Each of the three central assumptions (finite phase space, invertibility, and volume preservation) is crucial. If any one of these assumptions does not hold, the proof fails. Obviously if phase space is infinite the flow needn't be recurrent since it can keep moving off in a particular direction. Consider next a volume-preserving map which is not invertible. An example might be a mapping $f : \mathbb{R} \to \mathbb{R}$ which takes any real number to its fractional part. Thus, $f(\pi) = 0.14159265\ldots$. Let us restrict our attention to intervals of width less than unity. Clearly $f$ is then volume preserving. The action of $f$ on the interval $[2, 3)$ is to map it to the interval $[0, 1)$. But $[0, 1)$ remains fixed under the action of $f$, so no point within the interval $[2, 3)$ will ever return under repeated iterations of $f$. Thus, $f$ does not exhibit Poincaré recurrence.

Consider next the case of the damped harmonic oscillator. In this case, phase space volumes contract. For a one-dimensional oscillator obeying $\ddot x + 2\beta\dot x + \omega_0^2\,x = 0$ one has $\nabla\!\cdot V = -2\beta < 0$, since $\beta > 0$ for physical damping. Thus the convective derivative is $D_t\varrho = -\varrho\,(\nabla\!\cdot V) = 2\beta\varrho$, which says that the density increases exponentially in the comoving frame, as $\varrho(t) = e^{2\beta t}\,\varrho(0)$. Thus, phase space volumes collapse: $\Omega(t) = e^{-2\beta t}\,\Omega(0)$, and are not preserved by the dynamics. The proof of recurrence therefore fails. In this case, it is possible

$^6$ In the nonrelativistic limit, $T = p^2/2m$. For relativistic particles, we have $T = (p^2 c^2 + m^2 c^4)^{1/2} - mc^2$.
for the set $\Upsilon$ to be of finite volume, even if it is the union of an infinite number of sets $g^k_\tau\,\mathcal{R}_0$, because the volumes of these component sets themselves decrease exponentially, as ${\rm vol}(g^n_\tau\,\mathcal{R}_0) = e^{-2n\beta\tau}\,{\rm vol}(\mathcal{R}_0)$. A damped pendulum, released from rest at some small angle $\theta_0$, will not return arbitrarily close to these initial conditions.
2.7 Kac Ring Model
The implications of the Poincaré recurrence theorem are surprising, even shocking. If one takes a bottle of perfume in a sealed, evacuated room and opens it, the perfume molecules will diffuse throughout the room. The recurrence theorem guarantees that after some finite time $T$ all the molecules will go back inside the bottle (and arbitrarily close to their initial velocities as well). The hitch is that this could take a very long time, e.g. much, much longer than the age of the Universe.

On less absurd time scales, we know that most systems come to thermodynamic equilibrium. But how can a system both exhibit equilibration and Poincaré recurrence? The two concepts seem utterly incompatible!

A beautifully simple model due to Kac shows how a recurrent system can exhibit the phenomenon of equilibration. Consider a ring with $N$ sites. On each site, place a 'spin' which can be in one of two states: up or down. Along the $N$ links of the system, $F$ of them contain 'flippers'. The configuration of the flippers is set at the outset and never changes. The dynamics of the system are as follows: during each time step, every spin moves clockwise a distance of one lattice spacing. Spins which pass through flippers reverse their orientation: up becomes down, and down becomes up.

The 'phase space' for this system consists of $2^N$ discrete configurations. Since each configuration maps onto a unique image under the evolution of the system, phase space volume is preserved. The evolution is invertible; the inverse is obtained simply by rotating the spins counterclockwise. Figure 2.4 depicts an example configuration for the system, and its first iteration under the dynamics.
Suppose the flippers were not fixed, but moved about randomly. In this case, we could focus on a single spin and determine its configuration probabilistically. Let $p_n$ be the probability that a given spin is in the up configuration at time $n$. The probability that it is up at time $(n+1)$ is then

\[
p_{n+1} = (1 - x)\,p_n + x\,(1 - p_n) \ , \tag{2.84}
\]

where $x = F/N$ is the fraction of flippers in the system. In words: a spin will be up at time $(n+1)$ if it was up at time $n$ and did not pass through a flipper, or if it was down at time $n$ and did pass through a flipper. If the flipper locations are randomized at each time step, then the probability of flipping is simply $x = F/N$. Equation 2.84 can be solved immediately:

\[
p_n = \tfrac{1}{2} + (1 - 2x)^n\,\big(p_0 - \tfrac{1}{2}\big) \ , \tag{2.85}
\]
Figure 2.4: Left: A configuration of the Kac ring with $N = 16$ sites and $F = 4$ flippers. The flippers, which live on the links, are represented by blue dots. Right: The ring system after one time step. Evolution proceeds by clockwise rotation. Spins passing through flippers are flipped.
which decays exponentially to the equilibrium value of $p_{\rm eq} = \tfrac{1}{2}$ with time scale

\[
\tau(x) = -\frac{1}{\ln|1 - 2x|} \ . \tag{2.86}
\]

We identify $\tau(x)$ as the microscopic relaxation time over which local equilibrium is established. If we define the magnetization $m \equiv (N_\uparrow - N_\downarrow)/N$, then $m = 2p - 1$, so $m_n = (1 - 2x)^n\,m_0$. The equilibrium magnetization is $m_{\rm eq} = 0$. Note that for $\tfrac{1}{2} < x < 1$ the magnetization reverses sign each time step, as well as decreasing exponentially in magnitude.
The assumption that leads to equation 2.84 is called the Stosszahlansatz$^7$, a long German word meaning, approximately, 'assumption on the counting of hits'. The resulting dynamics are irreversible: the magnetization inexorably decays to zero. However, the Kac ring model is purely deterministic, and the Stosszahlansatz can at best be an approximation to the true dynamics. Clearly the Stosszahlansatz fails to account for correlations such as the following: if spin $i$ is flipped at time $n$, then spin $i + 1$ will have been flipped at time $n - 1$. Also if spin $i$ is flipped at time $n$, then it also will be flipped at time $n + N$. Indeed, since the dynamics of the Kac ring model are invertible and volume preserving, it must exhibit Poincaré recurrence. We see this most vividly in figs. 2.5 and 2.6.

The model is trivial to simulate. The results of such a simulation are shown in figure 2.5 for a ring of $N = 1000$ sites, with $F = 100$ and $F = 24$ flippers. Note how the magnetization decays and fluctuates about the equilibrium value $m_{\rm eq} = 0$, but that after $N$ iterations $m$

$^7$ Unfortunately, many important physicists were German and we have to put up with a legacy of long German words like Gedankenexperiment, Zitterbewegung, Bremsstrahlung, Stosszahlansatz, Kartoffelsalat, etc.
Figure 2.5: Two simulations of the Kac ring model, each with $N = 1000$ sites and with $F = 100$ flippers (top panel) and $F = 24$ flippers (bottom panel). The red line shows the magnetization as a function of time, starting from an initial configuration in which 90% of the spins are up. The blue line shows the prediction of the Stosszahlansatz, which yields an exponentially decaying magnetization with time constant $\tau$.

recovers its initial value: $m_N = m_0$. The recurrence time for this system is simply $N$ if $F$ is even, and $2N$ if $F$ is odd, since every spin will then have flipped an even number of times.

In figure 2.6 we plot two other simulations. The top panel shows what happens when $x > \tfrac{1}{2}$, so that the magnetization wants to reverse its sign with every iteration. The bottom panel shows a simulation for a larger ring, with $N = 25000$ sites. Note that the fluctuations in $m$ about equilibrium are smaller than in the cases with $N = 1000$ sites. Why?
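Since the model is trivial to simulate, here is a minimal sketch (Python; the function name and parameters are illustrative) that reproduces both features discussed above: the short-time Stosszahlansatz decay and the exact recurrence after $N$ steps when $F$ is even:

```python
import numpy as np

def kac_ring(N, F, steps, frac_up=0.9, seed=0):
    """Simulate the Kac ring: N spins on a ring, F fixed flippers on the links.
    Returns the magnetization m_n = (N_up - N_down)/N at each time step."""
    rng = np.random.default_rng(seed)
    spins = np.where(rng.random(N) < frac_up, 1, -1)      # +1 = up, -1 = down
    flipper = np.zeros(N, dtype=int)
    flipper[rng.choice(N, size=F, replace=False)] = 1     # fixed flipper configuration
    m = [spins.mean()]
    for _ in range(steps):
        # each spin advances one site clockwise; crossing a flipper reverses it
        spins = np.roll(spins * (1 - 2 * flipper), 1)
        m.append(spins.mean())
    return np.array(m)

m = kac_ring(N=1000, F=100, steps=2000)

# Poincare recurrence: after N steps every spin has crossed all F flippers once,
# and F = 100 is even, so the initial state (hence m) recurs exactly
assert np.isclose(m[1000], m[0])
# Stosszahlansatz decay: |m_n| shrinks roughly as |1 - 2x|^n at early times
assert abs(m[10]) < 0.5 * abs(m[0])
```

Plotting $m_n$ against the prediction $(1 - 2x)^n m_0$ reproduces the qualitative behavior of fig. 2.5.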
2.8 Remarks on Ergodic Theory
A mechanical system evolves according to Hamilton's equations of motion. We have seen how such a system is recurrent in the sense of Poincaré.

There is a level beyond recurrence called ergodicity. In an ergodic system, time averages over intervals $[0, T]$ with $T \to \infty$ may be replaced by phase space averages. The time
Figure 2.6: Simulations of the Kac ring model. Top: $N = 1000$ sites with $F = 900$ flippers. The flipper density $x = F/N$ is greater than $\tfrac{1}{2}$, so the magnetization reverses sign every time step. Only 100 iterations are shown, and the blue curve depicts the absolute value of the magnetization within the Stosszahlansatz. Bottom: $N = 25{,}000$ sites with $F = 1000$ flippers. Note that the fluctuations about the equilibrium magnetization $m = 0$ are much smaller than in the $N = 1000$ site simulations.
average of a function $f(\varphi)$ is defined as

\[
\big\langle f(\varphi) \big\rangle_T = \lim_{T\to\infty} \frac{1}{T} \int\limits_0^T\! dt\; f\big(\varphi(t)\big) \ . \tag{2.87}
\]

For a Hamiltonian system, the phase space average of the same function is defined by

\[
\big\langle f(\varphi) \big\rangle_S = \int\! d\mu\; f(\varphi)\,\delta\big(E - H(\varphi)\big) \bigg/ \int\! d\mu\; \delta\big(E - H(\varphi)\big) \ , \tag{2.88}
\]

where $H(\varphi) = H(q, p)$ is the Hamiltonian, and where $\delta(x)$ is the Dirac $\delta$-function. Thus,

\[
\text{ergodicity} \quad \Longleftrightarrow \quad \big\langle f(\varphi) \big\rangle_T = \big\langle f(\varphi) \big\rangle_S \ , \tag{2.89}
\]

for all smooth functions $f(\varphi)$ for which $\big\langle f(\varphi) \big\rangle_S$ exists and is finite. Note that we do not average over all of phase space. Rather, we average only over a hypersurface along which $H(\varphi) = E$ is fixed, i.e. over one of the level sets of the Hamiltonian function. This is because the dynamics preserves the energy. Ergodicity means that almost all points will, upon Hamiltonian evolution, move in such a way as to eventually pass through every finite neighborhood on the energy surface, and will spend equal time in equal regions of phase space.
Let $\chi_{\mathcal{R}}(\varphi)$ be the characteristic function of a region $\mathcal{R}$:

\[
\chi_{\mathcal{R}}(\varphi) = \begin{cases} 1 & \text{if } \varphi \in \mathcal{R} \\ 0 & \text{otherwise,} \end{cases} \tag{2.90}
\]

where $H(\varphi) = E$ for all $\varphi \in \mathcal{R}$. Then

\[
\big\langle \chi_{\mathcal{R}}(\varphi) \big\rangle_T = \lim_{T\to\infty} \left( \frac{\text{time spent in } \mathcal{R}}{T} \right) \ . \tag{2.91}
\]

If the system is ergodic, then

\[
\big\langle \chi_{\mathcal{R}}(\varphi) \big\rangle_T = P(\mathcal{R}) = \frac{D_{\mathcal{R}}(E)}{D(E)} \ , \tag{2.92}
\]

where $P(\mathcal{R})$ is the a priori probability to find $\varphi \in \mathcal{R}$, based solely on the relative volumes of $\mathcal{R}$ and of the entire phase space. The latter is given by

\[
D(E) = \int\! d\mu\; \delta\big(E - H(\varphi)\big) \ , \tag{2.93}
\]

which is the surface area of phase space at energy $E$, while

\[
D_{\mathcal{R}}(E) = \int\limits_{\mathcal{R}}\! d\mu\; \delta\big(E - H(\varphi)\big) \tag{2.94}
\]

is the surface area of phase space at energy $E$ contained in $\mathcal{R}$.
Note that

\[
D(E) \equiv \int\! d\mu\; \delta\big(E - H(\varphi)\big) = \oint\limits_{S_E}\! \frac{dS}{|\nabla H|} \tag{2.95}
\]
\[
\hphantom{D(E)} = \frac{d}{dE} \int\! d\mu\; \Theta\big(E - H(\varphi)\big) = \frac{d\Omega(E)}{dE} \ . \tag{2.96}
\]

Here, $dS$ is the differential surface element, $S_E$ is the constant-$H$ hypersurface $H(\varphi) = E$, and $\Omega(E)$ is the volume of phase space over which $H(\varphi) < E$. Note also that we may write

\[
d\mu = dE\; d\Sigma_E \ , \tag{2.97}
\]

where

\[
d\Sigma_E = \frac{dS}{|\nabla H|}\bigg|_{H(\varphi)=E} \tag{2.98}
\]

is the invariant surface element.
Figure 2.7: Constant phase space velocity at an irrational angle over a toroidal phase space
is ergodic, but not mixing. A circle remains a circle, and a blob remains a blob.
2.8.1 The microcanonical ensemble

The distribution,

\[
\varrho_E(\varphi) = \frac{\delta\big(E - H(\varphi)\big)}{D(E)}
= \frac{\delta\big(E - H(\varphi)\big)}{\int\! d\mu\; \delta\big(E - H(\varphi)\big)} \ , \tag{2.99}
\]

defines the microcanonical ensemble ($\mu$CE) of Gibbs.

We could also write

\[
\big\langle f(\varphi) \big\rangle_S = \frac{1}{D(E)} \oint\limits_{S_E}\! d\Sigma_E\; f(\varphi) \ , \tag{2.100}
\]

integrating over the hypersurface $S_E$ rather than the entire phase space.
2.8.2 Ergodicity and mixing

Just because a system is ergodic, it doesn't necessarily mean that $\varrho(\varphi, t) \to \varrho^{\rm eq}(\varphi)$, for consider the following motion on the toroidal space $\big\{ \varphi = (q, p) \,\big|\, 0 \le q < 1\,,\ 0 \le p < 1 \big\}$, where we identify opposite edges, i.e. we impose periodic boundary conditions. We also take $q$ and $p$ to be dimensionless, for simplicity of notation. Let the dynamics be given by

\[
\dot q = 1 \ , \qquad \dot p = \alpha \ . \tag{2.101}
\]

The solution is

\[
q(t) = q_0 + t \ , \qquad p(t) = p_0 + \alpha t \ , \tag{2.102}
\]

hence the phase curves are given by

\[
p = p_0 + \alpha\,(q - q_0) \ . \tag{2.103}
\]

Now consider the average of some function $f(q, p)$. We can write $f(q, p)$ in terms of its Fourier transform,

\[
f(q, p) = \sum_{m,n} \hat f_{mn}\; e^{2\pi i (mq + np)} \ . \tag{2.104}
\]
Figure 2.8: The baker's transformation is a successive stretching, cutting, and restacking.
We have, then,

\[
f\big(q(t), p(t)\big) = \sum_{m,n} \hat f_{mn}\; e^{2\pi i (m q_0 + n p_0)}\; e^{2\pi i (m + \alpha n) t} \ . \tag{2.105}
\]

We can now perform the time average of $f$:

\[
\big\langle f(q, p) \big\rangle_T = \hat f_{00} + \lim_{T\to\infty} \frac{1}{T} \sum_{(m,n)\ne(0,0)} \hat f_{mn}\; e^{2\pi i (m q_0 + n p_0)}\; \frac{e^{2\pi i (m + \alpha n) T} - 1}{2\pi i\,(m + \alpha n)} = \hat f_{00} \tag{2.106}
\]

if $\alpha$ is irrational. Clearly,

\[
\big\langle f(q, p) \big\rangle_S = \int\limits_0^1\! dq \int\limits_0^1\! dp\; f(q, p) = \hat f_{00} = \big\langle f(q, p) \big\rangle_T \ , \tag{2.107}
\]

so the system is ergodic.
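This ergodicity is easy to see numerically: along a single trajectory with irrational winding, the time average of a smooth observable converges to its phase space average $\hat f_{00}$. The sketch below (Python; the test function and parameters are hypothetical choices) illustrates eqn. 2.107:

```python
import numpy as np

# Flow on the unit torus: q(t) = q0 + t, p(t) = p0 + alpha*t (mod 1), alpha irrational.
alpha = np.sqrt(2.0)
q0, p0 = 0.3, 0.7

def f(q, p):
    # a smooth test observable whose phase space average over the torus is exactly 1
    return 1.0 + np.cos(2 * np.pi * q) * np.sin(2 * np.pi * p)

# Time average along one trajectory, approximated by a fine Riemann sum
t = np.linspace(0.0, 20_000.0, 2_000_000)
time_avg = f((q0 + t) % 1.0, (p0 + alpha * t) % 1.0).mean()

# Ergodicity: the time average converges to the phase space average f_00 = 1
assert abs(time_avg - 1.0) < 1e-2
```

Were $\alpha$ rational, the trajectory would close on itself and the time average would retain a dependence on $(q_0, p_0)$.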
The situation is depicted in fig. 2.7. If we start with the characteristic function of a disc,

\[
\varrho(q, p, t=0) = \Theta\big(a^2 - (q - q_0)^2 - (p - p_0)^2\big) \ , \tag{2.108}
\]

then it remains the characteristic function of a disc:

\[
\varrho(q, p, t) = \Theta\big(a^2 - (q - q_0 - t)^2 - (p - p_0 - \alpha t)^2\big) \ . \tag{2.109}
\]
A stronger condition one could impose is the following. Let $A$ and $B$ be subsets of $S_E$. Define the measure

\[
\mu(A) = \int\! d\Sigma_E\; \chi_A(\varphi) \bigg/ \int\! d\Sigma_E = \frac{D_A(E)}{D(E)} \ , \tag{2.110}
\]
Figure 2.9: The multiply iterated baker's transformation. The set $A$ covers half the phase space and its area is preserved under the map. Initially, the fraction of $B$ covered by $A$ is zero. After many iterations, the fraction of $B$ covered by $g^n A$ approaches $\tfrac{1}{2}$.
where $\chi_A(\varphi)$ is the characteristic function of $A$. The measure of a set $A$ is the fraction of the energy surface $S_E$ covered by $A$. This means $\mu(S_E) = 1$, since $S_E$ is the entire phase space at energy $E$. Now let $g$ be a volume-preserving map on phase space. Given two measurable sets $A$ and $B$, we say that a system is mixing if

\[
\text{mixing} \quad \Longleftrightarrow \quad \lim_{n\to\infty} \mu\big(g^n A \cap B\big) = \mu(A)\,\mu(B) \ . \tag{2.111}
\]

In other words, the fraction of $B$ covered by the $n^{\rm th}$ iterate of $A$, i.e. $g^n A$, is, as $n \to \infty$, simply the fraction of $S_E$ covered by $A$. The iterated map $g^n$ distorts the region $A$ so severely that it eventually spreads out 'evenly' over the entire energy hypersurface. Of course by 'evenly' we mean 'with respect to any finite length scale', because at the very smallest scales, the phase space density is still locally constant as one evolves with the dynamics.
Mixing means that

\[
\big\langle f(\varphi) \big\rangle = \int\! d\mu\; \varrho(\varphi, t)\,f(\varphi)
\;\xrightarrow[\ t\to\infty\ ]{}\;
\int\! d\mu\; f(\varphi)\,\delta\big(E - H(\varphi)\big) \bigg/ \int\! d\mu\; \delta\big(E - H(\varphi)\big)
\equiv \frac{{\rm Tr}\,\Big[ f(\varphi)\,\delta\big(E - H(\varphi)\big) \Big]}{{\rm Tr}\,\Big[ \delta\big(E - H(\varphi)\big) \Big]} \ . \tag{2.112}
\]

Physically, we can imagine regions of phase space being successively stretched and folded. During the stretching process, the volume is preserved, so the successive stretch and fold operations map phase space back onto itself.
Figure 2.10: The Arnold cat map applied to an image of $150 \times 150$ pixels. After 300 iterations, the image repeats itself. (Source: Wikipedia)
An example of a mixing system is the baker's transformation, depicted in fig. 2.8. The baker map is defined by

\[
g(q, p) = \begin{cases} \big(2q\,,\ \tfrac{1}{2}p\big) & \text{if } 0 \le q < \tfrac{1}{2} \\[4pt] \big(2q - 1\,,\ \tfrac{1}{2}p + \tfrac{1}{2}\big) & \text{if } \tfrac{1}{2} \le q < 1 \ . \end{cases} \tag{2.113}
\]

Note that $g$ is invertible and volume-preserving. The baker's transformation consists of an initial stretch in which $q$ is expanded by a factor of two and $p$ is contracted by a factor of two, which preserves the total volume. The system is then mapped back onto the original area by cutting and restacking, which we can call a 'fold'. The inverse transformation is accomplished by stretching first in the vertical ($p$) direction and squashing in the horizontal ($q$) direction, followed by a slicing and restacking. Explicitly,

\[
g^{-1}(q, p) = \begin{cases} \big(\tfrac{1}{2}q\,,\ 2p\big) & \text{if } 0 \le p < \tfrac{1}{2} \\[4pt] \big(\tfrac{1}{2}q + \tfrac{1}{2}\,,\ 2p - 1\big) & \text{if } \tfrac{1}{2} \le p < 1 \ . \end{cases} \tag{2.114}
\]
Another example of a mixing system is Arnold's 'cat map'$^8$

\[
g(q, p) = \big(\,[q + p]\,,\ [q + 2p]\,\big) \ , \tag{2.115}
\]

where $[x]$ denotes the fractional part of $x$. One can write this in matrix form as

\[
\begin{pmatrix} q' \\ p' \end{pmatrix} = \overbrace{\begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}}^{M} \begin{pmatrix} q \\ p \end{pmatrix} \ \ {\rm mod}\ \mathbb{Z}^2 \ . \tag{2.116}
\]

$^8$ The cat map gets its name from its initial application, by Arnold, to the image of a cat's face.
Figure 2.11: The hierarchy of dynamical systems.
The matrix $M$ is very special because it has integer entries and its determinant is $\det M = 1$. This means that the inverse also has integer entries. The inverse transformation is then

\[
\begin{pmatrix} q \\ p \end{pmatrix} = \overbrace{\begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix}}^{M^{-1}} \begin{pmatrix} q' \\ p' \end{pmatrix} \ \ {\rm mod}\ \mathbb{Z}^2 \ . \tag{2.117}
\]
Now for something cool. Suppose that our image consists of a set of discrete points located at $(n_1/k\,,\ n_2/k)$, where the denominator $k \in \mathbb{Z}$ is fixed, and where $n_1$ and $n_2$ range over the set $\{1, \dots, k\}$. Clearly $g$ and its inverse preserve this set, since the entries of $M$ and $M^{-1}$ are integers. If there are two possibilities for each pixel (say off and on, or black and white), then there are $2^{(k^2)}$ possible images, and the cat map will map us invertibly from one image to another. Therefore it must exhibit Poincaré recurrence! This phenomenon is demonstrated vividly in fig. 2.10, which shows a $k = 150$ pixel (square) image of a cat subjected to the iterated cat map. The image is stretched and folded with each successive application of the cat map, but after 300 iterations the image is restored! How can this be if the cat map is mixing? The point is that only the discrete set of points $(n_1/k\,,\ n_2/k)$ is periodic. Points with different denominators will exhibit a different periodicity, and points with irrational coordinates will in general never return to their exact initial conditions, although recurrence says they will come arbitrarily close, given enough iterations. The baker's transformation is also different in this respect, since the denominator of the $p$ coordinate is doubled upon each successive iteration.
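The recurrence time of a pixelated image can be computed directly: every $k \times k$ image repeats as soon as $M^m \equiv 1 \ ({\rm mod}\ k)$, i.e. after the multiplicative order of $M$ modulo $k$. The sketch below (Python; using the matrix convention of eqn. 2.116) computes this order and recovers the 300-step recurrence of the $150 \times 150$ image in fig. 2.10:

```python
import numpy as np

def cat_map_period(k):
    """Smallest m >= 1 with M^m = identity mod k, where M = [[1, 1], [1, 2]].
    Every k x k pixel image recurs under the cat map with this period."""
    M = np.array([[1, 1], [1, 2]], dtype=np.int64)
    A = M % k
    m = 1
    while not np.array_equal(A, np.eye(2, dtype=np.int64)):
        A = (A @ M) % k
        m += 1
    return m

# The 150 x 150 cat image of fig. 2.10 recurs after 300 iterations
assert cat_map_period(150) == 300
```

Different denominators $k$ give wildly varying periods, with no simple monotonic dependence on $k$.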
The student should now contemplate the hierarchy of dynamical systems depicted in fig. 2.11, understanding the characteristic features of each successive refinement$^9$.

$^9$ There is something beyond mixing, called a K-system. A K-system has positive Kolmogorov-Sinai entropy. For such a system, closed orbits separate exponentially in time, and consequently the Liouvillian $\hat L$ has a Lebesgue spectrum with denumerably infinite multiplicity.
Chapter 3

Statistical Ensembles

3.1 References

F. Reif, Fundamentals of Statistical and Thermal Physics (McGraw-Hill, 1987)
This has been perhaps the most popular undergraduate text since it first appeared in 1967, and with good reason.

A. H. Carter, Classical and Statistical Thermodynamics (Benjamin Cummings, 2000)
A very relaxed treatment appropriate for undergraduate physics majors.

D. V. Schroeder, An Introduction to Thermal Physics (Addison-Wesley, 2000)
This is the best undergraduate thermodynamics book I've come across, but only 40% of the book treats statistical mechanics.

C. Kittel, Elementary Statistical Physics (Dover, 2004)
Remarkably crisp, though dated, this text is organized as a series of brief discussions of key concepts and examples. Published by Dover, so you can't beat the price.

M. Kardar, Statistical Physics of Particles (Cambridge, 2007)
A superb modern text, with many insightful presentations of key concepts.

M. Plischke and B. Bergersen, Equilibrium Statistical Physics (3rd edition, World Scientific, 2006)
An excellent graduate level text. Less insightful than Kardar but still a good modern treatment of the subject. Good discussion of mean field theory.

E. M. Lifshitz and L. P. Pitaevskii, Statistical Physics (part I, 3rd edition, Pergamon, 1980)
This is volume 5 in the famous Landau and Lifshitz Course of Theoretical Physics. Though dated, it still contains a wealth of information and physical insight.
3.2 Probability

Consider a system whose possible configurations $|\,n\,\rangle$ can be labeled by a discrete variable $n \in \mathcal{C}$, where $\mathcal{C}$ is the set of possible configurations. The total number of possible configurations, which is to say the order of the set $\mathcal{C}$, may be finite or infinite. Next, consider an ensemble of such systems, and let $P_n$ denote the probability that a given random element from that ensemble is in the state (configuration) $|\,n\,\rangle$. The collection $\{P_n\}$ forms a discrete probability distribution. We assume that the distribution is normalized, meaning

\[
\sum_{n \in \mathcal{C}} P_n = 1 \ . \tag{3.1}
\]

Now let $A_n$ be a quantity which takes values depending on $n$. The average of $A$ is given by

\[
\langle A \rangle = \sum_{n \in \mathcal{C}} P_n\,A_n \ . \tag{3.2}
\]

Typically, $\mathcal{C}$ is the set of integers ($\mathbb{Z}$) or some subset thereof, but it could be any countable set. As an example, consider the throw of a single six-sided die. Then $P_n = \tfrac{1}{6}$ for each $n \in \{1, \dots, 6\}$. Let $A_n = 0$ if $n$ is even and $1$ if $n$ is odd. Then we find $\langle A \rangle = \tfrac{1}{2}$, i.e. on average half the throws of the die will result in an even number.
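The die example above translates into a one-line computation. The sketch below (Python, using exact rational arithmetic) evaluates eqn. 3.2 for this distribution and observable:

```python
from fractions import Fraction

# Fair six-sided die, and the observable A_n = 1 if n is odd, 0 if n is even
P = {n: Fraction(1, 6) for n in range(1, 7)}
A = {n: n % 2 for n in range(1, 7)}

avg = sum(P[n] * A[n] for n in P)   # <A> = sum_n P_n A_n  (eqn 3.2)

assert sum(P.values()) == 1         # normalization (eqn 3.1)
assert avg == Fraction(1, 2)        # <A> = 1/2, as computed in the text
```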
It may be that the system's configurations are described by several discrete variables
{n₁, n₂, n₃, . . .}. We can combine these into a vector n and then we write P_n for the
discrete distribution, with ∑_n P_n = 1.
Another possibility is that the system's configurations are parameterized by a collection of
continuous variables, φ = {φ₁, . . . , φ_n}. We write φ ∈ Ω, where Ω is the phase space (or
configuration space) of the system. Let dμ be a measure on this space. In general, we can
write

    dμ = W(φ₁, . . . , φ_n) dφ₁ dφ₂ · · · dφ_n .    (3.3)

The phase space measure used in classical statistical mechanics gives equal weight W to
equal phase space volumes:

    dμ = C ∏_{σ=1}^{r} dq_σ dp_σ ,    (3.4)

where C is a constant we shall discuss later on below¹.

Any continuous probability distribution P(φ) is normalized according to

    ∫_Ω dμ P(φ) = 1 .    (3.5)

¹ Such a measure is invariant with respect to canonical transformations, which are the broad class of
transformations among coordinates and momenta which leave Hamilton's equations of motion invariant,
and which preserve phase space volumes under Hamiltonian evolution. For this reason dμ is called an
invariant phase space measure. See the discussion in the appendix, §3.17.
The average of a function A(φ) on configuration space is then

    ⟨A⟩ = ∫_Ω dμ P(φ) A(φ) .    (3.6)

For example, consider the Gaussian distribution

    P(x) = (2πσ²)^{−1/2} e^{−(x−μ)²/2σ²} .    (3.7)

From the result²

    ∫_{−∞}^{∞} dx e^{−αx²} e^{−βx} = √(π/α) e^{β²/4α} ,    (3.8)

we see that P(x) is normalized. One can then compute

    ⟨x⟩ = μ    (3.9)
    ⟨x²⟩ − ⟨x⟩² = σ² .    (3.10)

We call μ the mean and σ the standard deviation of the distribution, eqn. 3.7.

The quantity P(φ) is called the distribution or probability density. One has

    P(φ) dμ = probability that configuration lies within volume dμ centered at φ .

For example, consider the probability density P = 1 normalized on the interval x ∈ [0, 1].
The probability that some x chosen at random will be exactly ½, say, is infinitesimal: one
would have to specify each of the infinitely many digits of x. However, we can say that
x ∈ [0.45, 0.55] with probability 1/10.
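Equations 3.7–3.10 are easy to confirm numerically. Below is a sketch of ours (not from the notes), using a simple Riemann sum over a wide grid; the values of μ and σ are arbitrary test choices:

```python
import numpy as np

mu, sigma = 1.7, 0.6
x = np.linspace(mu - 12 * sigma, mu + 12 * sigma, 200_001)
dx = x[1] - x[0]
P = np.exp(-(x - mu)**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

norm = P.sum() * dx                      # normalization, cf. eqns. 3.5 and 3.8
mean = (x * P).sum() * dx                # eqn. 3.9: should be mu
var = (x * x * P).sum() * dx - mean**2   # eqn. 3.10: should be sigma^2
print(norm, mean, var)
```

The tails beyond 12σ are utterly negligible, so the finite grid reproduces the infinite integrals to near machine precision.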
If x is distributed according to P₁(x), then the probability distribution on the product space
(x₁, x₂) is simply the product of the distributions:

    P₂(x₁, x₂) = P₁(x₁) P₁(x₂) .    (3.11)

Suppose we have a function φ(x₁, . . . , x_N). How is it distributed? Let P(φ) be the
distribution for φ. We then have

    P(φ) = ∫dx₁ · · · ∫dx_N P_N(x₁, . . . , x_N) δ(φ(x₁, . . . , x_N) − φ)    (3.12)
         = ∫dx₁ · · · ∫dx_N P₁(x₁) · · · P₁(x_N) δ(φ(x₁, . . . , x_N) − φ) ,    (3.13)

where the second line is appropriate if the x_j are themselves distributed independently.
Note that

    ∫_{−∞}^{∞} dφ P(φ) = 1 ,    (3.14)

so P(φ) is itself normalized.

² Memorize this!
3.2.1 Central limit theorem

In particular, consider the distribution function of the sum

    X = ∑_{i=1}^{N} x_i .    (3.15)

We will be particularly interested in the case where N is large. For general N, though, we
have

    P(X) = ∫dx₁ · · · ∫dx_N P₁(x₁) · · · P₁(x_N) δ(x₁ + x₂ + . . . + x_N − X) .    (3.16)

It is convenient to compute the Fourier transform of P(X):

    P̂(k) = ∫dX P(X) e^{−ikX}    (3.17)
         = ∫dX ∫dx₁ · · · ∫dx_N P₁(x₁) · · · P₁(x_N) δ(x₁ + . . . + x_N − X) e^{−ikX}
         = [P̂₁(k)]^N ,    (3.18)

where

    P̂₁(k) = ∫dx P₁(x) e^{−ikx}    (3.19)

is the Fourier transform of the single variable distribution P₁(x). The distribution P(X)
is a convolution of the individual P₁(x_i) distributions. We have therefore proven that the
Fourier transform of a convolution is the product of the Fourier transforms.
OK, now we can write for P̂₁(k)

    P̂₁(k) = ∫dx P₁(x) (1 − ikx − ½ k²x² + (i/6) k³x³ + . . .)
           = 1 − ik⟨x⟩ − ½ k²⟨x²⟩ + (i/6) k³⟨x³⟩ + . . .    (3.20)

Thus,

    ln P̂₁(k) = −iμk − ½ σ²k² + (i/6) γ³k³ + . . . ,    (3.21)

where

    μ = ⟨x⟩    (3.22)
    σ² = ⟨x²⟩ − ⟨x⟩²    (3.23)
    γ³ = ⟨x³⟩ − 3⟨x²⟩⟨x⟩ + 2⟨x⟩³    (3.24)
We can now write

    [P̂₁(k)]^N = e^{−iNμk} e^{−Nσ²k²/2} e^{iNγ³k³/6} · · ·    (3.25)

Now for the inverse transform. In computing P(X), we will expand the term e^{iNγ³k³/6} and
all subsequent terms in the above product as a power series in k. We then have

    P(X) = ∫_{−∞}^{∞} (dk/2π) e^{ik(X−Nμ)} e^{−Nσ²k²/2} {1 + (i/6) Nγ³k³ + . . .}    (3.26)
         = (1 − (1/6) Nγ³ ∂³/∂X³ + . . .) (2πNσ²)^{−1/2} e^{−(X−Nμ)²/2Nσ²}
         → (2πNσ²)^{−1/2} e^{−(X−Nμ)²/2Nσ²}    (N → ∞) .    (3.27)

In going from the second line to the third, we have used the fact that we can write X =
Nμ + √N ξ, in which case N ∂³/∂X³ = N^{−1/2} ∂³/∂ξ³, which gives a subleading contribution
which vanishes in the N → ∞ limit. We have just proven the central limit theorem: in the
limit N → ∞, the distribution of a sum of N independent random variables x_i is a Gaussian
with mean Nμ and standard deviation √N σ. Our only assumptions are that the mean μ
and standard deviation σ exist for the distribution P₁(x). Note that P₁(x) itself need not
be a Gaussian; it could be a very peculiar distribution indeed, but so long as its first and
second moments exist, where the k-th moment is simply ⟨x^k⟩, the distribution of the sum
X = ∑_{i=1}^{N} x_i is a Gaussian.
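The theorem is easy to see numerically. A minimal Monte Carlo sketch (ours, assuming numpy; the x_i are uniform on [0, 1], for which μ = 1/2 and σ² = 1/12):

```python
import numpy as np

rng = np.random.default_rng(0)
N, members = 100, 50_000                   # N terms per sum; many ensemble members

# X = sum_i x_i (eqn. 3.15) for each member of the ensemble
X = rng.random((members, N)).sum(axis=1)

mu, sigma2 = 0.5, 1.0 / 12.0
print(X.mean(), N * mu)        # sample mean is close to N*mu = 50
print(X.var(), N * sigma2)     # sample variance is close to N*sigma^2
```

A histogram of X plotted against the Gaussian of eqn. 3.27 would show the agreement directly; here we only check the first two moments.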
3.2.2 Multidimensional Gaussian integral

Consider the multivariable Gaussian distribution,

    P(x) ≡ (det A / (2π)^n)^{1/2} exp(−½ x_i A_{ij} x_j) ,    (3.28)

where A is a positive definite matrix of rank n. A mathematical result which is extremely
important throughout physics is the following:

    Z(b) = (det A / (2π)^n)^{1/2} ∫_{−∞}^{∞} dx₁ · · · ∫_{−∞}^{∞} dx_n exp(−½ x_i A_{ij} x_j + b_i x_i)
         = exp(½ b_i A^{−1}_{ij} b_j) .    (3.29)

Here, the vector b = (b₁, . . . , b_n) is identified as a source. Since Z(0) = 1, we have that
the distribution P(x) is normalized. Now consider averages of the form

    ⟨x_{j₁} · · · x_{j_{2k}}⟩ = ∫d^n x P(x) x_{j₁} · · · x_{j_{2k}} = ∂^{2k}Z(b) / ∂b_{j₁} · · · ∂b_{j_{2k}} |_{b=0}
        = ∑_{contractions} A^{−1}_{j_{σ(1)} j_{σ(2)}} · · · A^{−1}_{j_{σ(2k−1)} j_{σ(2k)}} .    (3.30)
The sum in the last term is over all contractions of the indices j₁, . . . , j_{2k}. A contraction
is an arrangement of the 2k indices into k pairs. There are C_{2k} = (2k)!/2^k k! possible such
contractions. To obtain this result for C_{2k}, we start with the first index and then find a mate
among the remaining 2k − 1 indices. Then we choose the next unpaired index and find a
mate among the remaining 2k − 3 indices. Proceeding in this manner, we have

    C_{2k} = (2k − 1) · (2k − 3) · · · 3 · 1 = (2k)! / 2^k k! .    (3.31)

Equivalently, we can take all possible permutations of the 2k indices, and then divide by
2^k k! since permutation within a given pair results in the same contraction and permutation
among the k pairs results in the same contraction. For example, for k = 2, we have C₄ = 3,
and

    ⟨x_{j₁} x_{j₂} x_{j₃} x_{j₄}⟩ = A^{−1}_{j₁j₂} A^{−1}_{j₃j₄} + A^{−1}_{j₁j₃} A^{−1}_{j₂j₄} + A^{−1}_{j₁j₄} A^{−1}_{j₂j₃} .    (3.32)
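Both the counting and the k = 2 case can be checked numerically. The sketch below is ours: the 2×2 matrix A and the random seed are arbitrary choices, and eqn. 3.32 is verified by Monte Carlo, using the fact that A⁻¹ is the covariance matrix of the distribution 3.28:

```python
import math
import numpy as np

def n_contractions(k):
    # C_2k = (2k)!/(2^k k!), eqn. 3.31
    return math.factorial(2 * k) // (2**k * math.factorial(k))

def double_factorial_odd(k):
    # (2k-1)(2k-3)...3*1
    out = 1
    for m in range(2 * k - 1, 0, -2):
        out *= m
    return out

# Wick's theorem, eqn. 3.32, with indices (j1, j2, j3, j4) = (0, 1, 0, 1):
A = np.array([[2.0, 0.5], [0.5, 1.0]])    # positive definite (our choice)
Ainv = np.linalg.inv(A)                   # covariance matrix of P(x)
rng = np.random.default_rng(1)
x = rng.multivariate_normal(np.zeros(2), Ainv, size=1_000_000)
lhs = np.mean(x[:, 0] * x[:, 1] * x[:, 0] * x[:, 1])
rhs = 2 * Ainv[0, 1]**2 + Ainv[0, 0] * Ainv[1, 1]   # the three pairings
print(n_contractions(2), lhs, rhs)        # 3 contractions; lhs close to rhs
```

The agreement of `lhs` and `rhs` is only statistical (sampling error of order 10⁻³ here), but it illustrates eqn. 3.30 concretely.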
3.3 Microcanonical Ensemble (μCE)

We have seen how in an ergodic dynamical system, time averages can be replaced by phase
space averages:

    ergodicity ⟺ ⟨f(φ)⟩_T = ⟨f(φ)⟩_S ,    (3.33)

where

    ⟨f(φ)⟩_T = lim_{T→∞} (1/T) ∫₀^T dt f(φ(t)) ,    (3.34)

and

    ⟨f(φ)⟩_S = ∫dμ f(φ) δ(E − H(φ)) / ∫dμ δ(E − H(φ)) .    (3.35)

Here H(φ) = H(q, p) is the Hamiltonian, and where δ(x) is the Dirac δ-function. Thus,
averages are taken over a constant energy hypersurface which is a subset of the entire phase
space.

We've also seen how any phase space distribution ρ(Λ₁, . . . , Λ_k) which is a function of
conserved quantities Λ_a(φ) is automatically a stationary (time-independent) solution to
Liouville's equation. Note that the microcanonical distribution,

    ρ_E(φ) = δ(E − H(φ)) / ∫dμ δ(E − H(φ)) ,    (3.36)

is of this form, since H(φ) is conserved by the dynamics. Linear and angular momentum
conservation generally are broken by elastic scattering off the walls of the sample.

So averages in the microcanonical ensemble are computed by evaluating the ratio

    ⟨A⟩ = Tr A δ(E − H) / Tr δ(E − H) ,    (3.37)
where H = H(q, p) is the Hamiltonian, and where 'Tr' means 'trace', which entails an
integration over all phase space:

    Tr A(q, p) ≡ (1/N!) ∏_{i=1}^{N} ∫ (d^d p_i d^d q_i / (2πℏ)^d) A(q, p) .    (3.38)

Here N is the total number of particles and d is the dimension of physical space in which
each particle moves. The factor of 1/N!, which cancels in the ratio between numerator and
denominator, is present for indistinguishable particles. The normalization factor (2πℏ)^{−Nd}
renders the trace dimensionless. Again, this cancels between numerator and denominator.
These factors may then seem arbitrary in the definition of the trace, but we'll see how
they in fact are required from quantum mechanical considerations. So we now adopt the
following metric for classical phase space integration:

    dμ = (1/N!) ∏_{i=1}^{N} (d^d p_i d^d q_i / (2πℏ)^d) .    (3.39)
3.3.1 Density of states

The denominator,

    D(E) = Tr δ(E − H) ,    (3.40)

is called the density of states. It has dimensions of inverse energy, such that

    D(E) ΔE = ∫_E^{E+ΔE} dE′ ∫dμ δ(E′ − H) = ∫_{E<H<E+ΔE} dμ    (3.41)
            = # of states with energies between E and E + ΔE .

Let us now compute D(E) for the nonrelativistic ideal gas. The Hamiltonian is

    H(q, p) = ∑_{i=1}^{N} p_i² / 2m .    (3.42)

We assume that the gas is enclosed in a region of volume V, and we'll do a purely classical
calculation, neglecting discreteness of its quantum spectrum. We must compute

    D(E) = (1/N!) ∫ ∏_{i=1}^{N} (d^d p_i d^d q_i / (2πℏ)^d) δ(E − ∑_i p_i²/2m) .    (3.43)

We'll do this calculation in two ways. First, let's rescale p_i^α ≡ √(2mE) u_i^α. We then have

    D(E) = (V^N/N!) (√(2mE)/h)^{Nd} (1/E) ∫ d^M u δ(u₁² + u₂² + . . . + u_M² − 1) .    (3.44)
Here we have written u = (u₁, u₂, . . . , u_M) as an M-dimensional vector, with M = Nd. We've
also used the rule δ(Ex) = E^{−1} δ(x) for δ-functions. We can now write

    d^M u = u^{M−1} du dΩ_M ,    (3.45)

where dΩ_M is the M-dimensional differential solid angle. We now have our answer:³

    D(E) = (V^N/N!) (√(2m)/h)^{Nd} E^{(Nd/2)−1} · ½ Ω_{Nd} .    (3.46)

What remains is for us to compute Ω_M, the total solid angle in M dimensions. We do this
by a nifty mathematical trick. Consider the integral

    I_M = ∫ d^M u e^{−u²} = Ω_M ∫₀^∞ du u^{M−1} e^{−u²}
        = ½ Ω_M ∫₀^∞ ds s^{(M/2)−1} e^{−s} = ½ Ω_M Γ(M/2) ,    (3.47)
where s = u², and where

    Γ(z) = ∫₀^∞ dt t^{z−1} e^{−t}    (3.48)

is the Gamma function, which satisfies z Γ(z) = Γ(z + 1).⁴ On the other hand, we can
compute I_M in Cartesian coordinates, writing

    I_M = (∫_{−∞}^{∞} du₁ e^{−u₁²})^M = (√π)^M .    (3.49)

Therefore

    Ω_M = 2π^{M/2} / Γ(M/2) .    (3.50)

We thereby obtain Ω₂ = 2π, Ω₃ = 4π, Ω₄ = 2π², etc., the first two of which are familiar.
Our final result, then, is

    D(E, V, N) = (V^N/N!) (m/2πℏ²)^{Nd/2} E^{(Nd/2)−1} / Γ(Nd/2) .    (3.51)

³ The factor of ½ preceding Ω_M in eqn. 3.46 appears because δ(u² − 1) = ½ δ(u − 1) + ½ δ(u + 1). Since
u = |u| ≥ 0, the second term can be dropped.

⁴ Note that for integer argument, Γ(k) = (k − 1)!
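Eqn. 3.50 is simple to evaluate; this little check (ours, not from the notes) reproduces the familiar values quoted above:

```python
import math

def omega(M):
    # Total solid angle in M dimensions, eqn. 3.50
    return 2.0 * math.pi**(M / 2) / math.gamma(M / 2)

print(omega(2), 2 * math.pi)      # Omega_2 = 2 pi  (circumference of unit circle)
print(omega(3), 4 * math.pi)      # Omega_3 = 4 pi  (area of unit sphere)
print(omega(4), 2 * math.pi**2)   # Omega_4 = 2 pi^2
```

Note that `math.gamma` handles half-integer arguments, which is exactly what eqn. 3.51 needs when Nd is odd.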
Figure 3.1: Complex integration contours C for the inverse Laplace transform L^{−1}[Z(β)] =
D(E). When the product dN is odd, there is a branch cut along the negative Re β axis.

Here we have emphasized that the density of states is a function of E, V, and N. Using
Stirling's approximation,

    ln N! = N ln N − N + ½ ln N + ½ ln(2π) + O(N^{−1}) ,    (3.52)

we may define the statistical entropy,

    S(E, V, N) ≡ k_B ln D(E, V, N) = N k_B φ(E/N, V/N) + O(ln N) ,    (3.53)

where

    φ(E/N, V/N) = (d/2) ln(E/N) + ln(V/N) + (d/2) ln(m/dπℏ²) + (1 + ½ d) .    (3.54)

Recall k_B = 1.3806503 × 10^{−16} erg/K is Boltzmann's constant.
The second way to calculate D(E) is to first compute its Laplace transform, Z(β):

    Z(β) = L[D(E)] ≡ ∫₀^∞ dE e^{−βE} D(E) = Tr e^{−βH} .    (3.55)

The inverse Laplace transform is then

    D(E) = L^{−1}[Z(β)] ≡ ∫_{c−i∞}^{c+i∞} (dβ/2πi) e^{βE} Z(β) ,    (3.56)

where c is such that the integration contour is to the right of any singularities of Z(β) in
the complex β-plane. We then have

    Z(β) = (1/N!) ∏_{i=1}^{N} ∫ (d^d x_i d^d p_i / (2πℏ)^d) e^{−βp_i²/2m}
         = (V^N/N!) (∫_{−∞}^{∞} (dp/2πℏ) e^{−βp²/2m})^{Nd}
         = (V^N/N!) (m/2πℏ²)^{Nd/2} β^{−Nd/2} .    (3.57)

The inverse Laplace transform is then

    D(E) = (V^N/N!) (m/2πℏ²)^{Nd/2} ∮_C (dβ/2πi) e^{βE} β^{−Nd/2}
         = (V^N/N!) (m/2πℏ²)^{Nd/2} E^{(Nd/2)−1} / Γ(Nd/2) ,    (3.58)

exactly as before. The integration contour for the inverse Laplace transform is extended in
an infinite semicircle in the left half β-plane. When Nd is even, the function β^{−Nd/2} has a
pole of order Nd/2 at the origin. When Nd is odd, there is a branch cut extending
along the negative Re β axis, and the integration contour must avoid the cut, as shown in
fig. 3.1.

For a general system, the Laplace transform, Z(β) = L[D(E)], also is called the partition
function. We shall again meet up with Z(β) when we discuss the ordinary canonical
ensemble.
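The Laplace-transform pair underlying eqns. 3.55–3.58, namely that the energy dependence D(E) ∝ E^{a−1}/Γ(a) (with a = Nd/2) transforms to β^{−a}, can be confirmed numerically. A sketch of ours; the values of a and β and the grid are arbitrary test choices:

```python
import math
import numpy as np

a, beta = 7.5, 1.3                      # a plays the role of Nd/2
E = np.linspace(0.0, 120.0, 1_200_001)
dE = E[1] - E[0]
D = E**(a - 1) / math.gamma(a)          # energy dependence of D(E), eqns. 3.51/3.58

Z = (D * np.exp(-beta * E)).sum() * dE  # Z(beta) = L[D(E)], eqn. 3.55
print(Z, beta**(-a))                    # the two agree
```

The cutoff at E = 120 is harmless since the integrand is suppressed by e^{−βE}.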
3.3.2 Arbitrariness in the definition of S(E)

Note that D(E) has dimensions of inverse energy, so one might ask how we are to take the
logarithm of a dimensionful quantity in eqn. 3.53. We must introduce an energy scale, such
as ΔE in eqn. 3.41, and define D̃(E; ΔE) = D(E) ΔE and S(E; ΔE) ≡ k_B ln D̃(E; ΔE).
The definition of statistical entropy then involves the arbitrary parameter ΔE, however this
only affects S(E) in an additive way. That is,

    S(E, V, N; ΔE₁) = S(E, V, N; ΔE₂) + k_B ln(ΔE₁/ΔE₂) .    (3.59)

Note that the difference between the two definitions of S depends only on the ratio ΔE₁/ΔE₂,
and is independent of E, V, and N.
Figure 3.2: A system S in contact with a 'world' W. The union of the two, U = W ∪ S,
is called the 'universe'.

3.3.3 Ultra-relativistic ideal gas

Consider an ultrarelativistic ideal gas, with single particle dispersion ε(p) = cp. We then
have

    Z(β) = (V^N/N!) (Ω_d^N / h^{Nd}) (∫₀^∞ dp p^{d−1} e^{−βcp})^N
         = (V^N/N!) (Γ(d) Ω_d / c^d h^d β^d)^N .    (3.60)

The statistical entropy is S(E, V, N) = k_B ln D(E, V, N) = N k_B φ(E/N, V/N), with

    φ(E/N, V/N) = d ln(E/N) + ln(V/N) + ln(Ω_d Γ(d) / (dhc)^d) + (d + 1) .    (3.61)
3.4 The Quantum Mechanical Trace

Thus far our understanding of ergodicity is rooted in the dynamics of classical mechanics.
A Hamiltonian flow which is ergodic is one in which time averages can be replaced by phase
space averages using the microcanonical ensemble. What happens, though, if our system is
quantum mechanical, as all systems ultimately are?

3.4.1 The density matrix

First, let us consider that our system S will in general be in contact with a world W. We
call the union of S and W the universe, U = W ∪ S. Let | N ⟩ denote a quantum mechanical
state of W, and let | n ⟩ denote a quantum mechanical state of S. Then the most general
wavefunction we can write is of the form

    | Ψ ⟩ = ∑_{N,n} Ψ_{N,n} | N ⟩ ⊗ | n ⟩ .    (3.62)
Now let us compute the expectation value of some operator Â which acts as the identity
within W, meaning ⟨ N | Â | N′ ⟩ = Â δ_{NN′}, where Â is the 'reduced' operator which acts
within S alone. We then have

    ⟨ Ψ | Â | Ψ ⟩ = ∑_{N,N′} ∑_{n,n′} Ψ*_{N,n} Ψ_{N′,n′} δ_{NN′} ⟨ n | Â | n′ ⟩ = Tr (ρ̂ Â) ,    (3.63)

where

    ρ̂ = ∑_N ∑_{n,n′} Ψ*_{N,n} Ψ_{N,n′} | n′ ⟩⟨ n |    (3.64)

is the density matrix. The time-dependence of ρ̂ is easily found:

    ρ̂(t) = ∑_N ∑_{n,n′} Ψ*_{N,n} Ψ_{N,n′} | n′(t) ⟩⟨ n(t) | = e^{−iĤt/ℏ} ρ̂ e^{+iĤt/ℏ} ,    (3.65)

where Ĥ is the Hamiltonian for the system S. Thus, we find

    iℏ ∂ρ̂/∂t = [Ĥ, ρ̂] .    (3.66)

Note that the density matrix evolves according to a slightly different equation than an
operator in the Heisenberg picture, for which

    Â(t) = e^{+iĤt/ℏ} Â e^{−iĤt/ℏ}  ⟹  iℏ ∂Â/∂t = [Â, Ĥ] = −[Ĥ, Â] .    (3.67)
For Hamiltonian systems, we found that the phase space distribution ρ(q, p, t) evolved
according to the Liouville equation,

    i ∂ρ/∂t = L ρ ,    (3.68)

where the Liouvillian L is the differential operator

    L = −i ∑_{j=1}^{Nd} ((∂H/∂p_j) ∂/∂q_j − (∂H/∂q_j) ∂/∂p_j) .    (3.69)

Accordingly, any distribution ρ(Λ₁, . . . , Λ_k) which is a function of constants of the motion
Λ_a(q, p) is a stationary solution to the Liouville equation: ∂_t ρ(Λ₁, . . . , Λ_k) = 0. Similarly,
any quantum mechanical density matrix which commutes with the Hamiltonian is a
stationary solution to eqn. 3.66. The corresponding microcanonical distribution is

    ρ̂_E = δ(E − Ĥ) .    (3.70)
Figure 3.3: Averaging the quantum mechanical discrete density of states yields a continuous
curve.

3.4.2 Averaging the DOS

If our quantum mechanical system is placed in a finite volume, the energy levels will be
discrete, rather than continuous, and the density of states (DOS) will be of the form

    D(E) = Tr δ(E − Ĥ) = ∑_l δ(E − E_l) ,    (3.71)

where E_l are the eigenvalues of the Hamiltonian Ĥ. In the thermodynamic limit, V → ∞,
and the discrete spectrum of kinetic energies remains discrete for all finite V but must
approach the continuum result. To recover the continuum result, we average the DOS over
a window of width ΔE:

    D̄(E) = (1/ΔE) ∫_E^{E+ΔE} dE′ D(E′) .    (3.72)

If we take the limit ΔE → 0 but with ΔE ≫ δE, where δE is the spacing between successive
quantized levels, we recover a smooth function, as shown in fig. 3.3. We will in general
drop the bar and refer to this function as D(E). Note that δE ~ 1/D(E) = e^{−Nφ(ε,v)} is
(typically) exponentially small in the size of the system, hence if we took ΔE ∝ V^{−1}, which
vanishes in the thermodynamic limit, there are still exponentially many energy levels within
an interval of width ΔE.
3.4.3 Coherent states

The quantum-classical correspondence is elucidated with the use of coherent states. Recall
that the one-dimensional harmonic oscillator Hamiltonian may be written

    Ĥ₀ = p²/2m + ½ m ω₀² q² = ℏω₀ (a†a + ½) ,    (3.73)
where a and a† are ladder operators satisfying [a, a†] = 1, which can be taken to be

    a = ℓ ∂/∂q + q/2ℓ ,    a† = −ℓ ∂/∂q + q/2ℓ ,    (3.74)

with ℓ ≡ √(ℏ/2mω₀). Note that

    q = ℓ (a + a†) ,    p = (ℏ/2iℓ) (a − a†) .    (3.75)

The ground state satisfies a ψ₀(q) = 0, which yields

    ψ₀(q) = (2πℓ²)^{−1/4} e^{−q²/4ℓ²} .    (3.76)

The normalized coherent state | z ⟩ is defined as

    | z ⟩ = e^{−½|z|²} e^{z a†} | 0 ⟩ = e^{−½|z|²} ∑_{n=0}^{∞} (z^n/√n!) | n ⟩ .    (3.77)
The overlap of coherent states is given by

    ⟨ z₁ | z₂ ⟩ = e^{−½|z₁|²} e^{−½|z₂|²} e^{z̄₁ z₂} ,    (3.78)

hence different coherent states are not orthogonal. Despite this nonorthogonality, the
coherent states allow a simple resolution of the identity,

    1 = ∫ (d²z/2πi) | z ⟩⟨ z | ;    d²z/2πi ≡ (d Re z  d Im z)/π ,    (3.79)

which is straightforward to establish.
which is straightforward to establish.
To gain some physical intuition about the coherent states, dene
z
Q
2
+
iP

(3.80)
and write [ z ) [ Q, P ). One nds (exercise!)

Q,P
(q) = q [ z ) = (2
2
)
1/4
e
iPQ/2
e
iPq/
e
(qQ)
2
/4
2
, (3.81)
hence the coherent state
Q,P
(q) is a wavepacket Gaussianly localized about q = Q, but
oscillating with average momentum P.
For example, we can compute

Q, P

Q, P
_
=

(a +a

z
_
= 2 Re z = Q (3.82)

Q, P

Q, P
_
=


2i
(a a

z
_
=

Imz = P (3.83)
as well as

    ⟨ Q, P | q² | Q, P ⟩ = ⟨ z | ℓ² (a + a†)² | z ⟩ = Q² + ℓ²    (3.84)
    ⟨ Q, P | p² | Q, P ⟩ = ⟨ z | −(ℏ²/4ℓ²)(a − a†)² | z ⟩ = P² + ℏ²/4ℓ² .    (3.85)

Thus, the root mean square fluctuations in the coherent state | Q, P ⟩ are

    Δq = ℓ = √(ℏ/2mω₀) ,    Δp = ℏ/2ℓ = √(mω₀ℏ/2) ,    (3.86)

and Δq · Δp = ½ ℏ. Thus we learn that the coherent state ψ_{Q,P}(q) is localized in phase
space, i.e. in both position and momentum. If we have a general operator Â(q, p), we can
then write

    ⟨ Q, P | Â(q, p) | Q, P ⟩ = A(Q, P) + O(ℏ) ,    (3.87)

where A(Q, P) is formed from Â(q, p) by replacing q → Q and p → P.

Since

    d²z/2πi ≡ (d Re z  d Im z)/π = dQ dP/2πℏ ,    (3.88)

we can write the trace using coherent states as

    Tr Â = (1/2πℏ) ∫_{−∞}^{∞} dQ ∫_{−∞}^{∞} dP ⟨ Q, P | Â | Q, P ⟩ .    (3.89)

We now can understand the origin of the factor 2πℏ in the denominator of each (q_i, p_i)
integral over classical phase space in eqn. 3.38.

Note that ω₀ is arbitrary in our discussion. By increasing ω₀, the states become more
localized in q and more plane wave like in p. However, so long as ω₀ is finite, the width of
the coherent state in each direction is proportional to ℏ^{1/2}, and thus vanishes in the classical
limit.
3.5 Thermal Equilibrium

Consider two systems in thermal contact, as depicted in fig. 3.4. The two subsystems #1
and #2 are free to exchange energy, but their respective volumes and particle numbers
remain fixed. We assume the contact is made over a surface, and that the energy associated
with that surface is negligible when compared with the bulk energies E₁ and E₂. Let the
total energy be E = E₁ + E₂. Then the density of states D(E) for the combined system is

    D(E) = ∫dE₁ D₁(E₁) D₂(E − E₁) .    (3.90)

Figure 3.4: Two systems in thermal contact.

The probability density for system #1 to have energy E₁ is then

    P₁(E₁) = D₁(E₁) D₂(E − E₁) / D(E) .    (3.91)

Note that P₁(E₁) is normalized: ∫dE₁ P₁(E₁) = 1. We now ask: what is the most probable
value of E₁? We find out by differentiating P₁(E₁) with respect to E₁ and setting the result
to zero. This requires

    0 = (1/P₁(E₁)) dP₁(E₁)/dE₁ = (∂/∂E₁) ln P₁(E₁)
      = (∂/∂E₁) ln D₁(E₁) + (∂/∂E₁) ln D₂(E − E₁) .    (3.92)
Thus, we conclude that the maximally likely partition of energy between systems #1 and
#2 is realized when

    ∂S₁/∂E₁ = ∂S₂/∂E₂ .    (3.93)

This guarantees that

    S(E, E₁) = S₁(E₁) + S₂(E − E₁)    (3.94)

is a maximum with respect to the energy E₁, at fixed total energy E.

The temperature T is defined as

    1/T = (∂S/∂E)_{V,N} ,    (3.95)

a result familiar from thermodynamics. The difference is now we have a more rigorous
definition of the entropy. When the total entropy S is maximized, we have that T₁ = T₂.
Once again, two systems in thermal contact that can exchange energy will in equilibrium
have equal temperatures.

According to eqns. 3.54 and 3.61, the entropies of nonrelativistic and ultrarelativistic ideal
gases in d space dimensions are given by

    S_NR = ½ Nd k_B ln(E/N) + N k_B ln(V/N) + const.    (3.96)
    S_UR = Nd k_B ln(E/N) + N k_B ln(V/N) + const.    (3.97)
Invoking eqn. 3.95, we then have

    E_NR = ½ Nd k_B T ,    E_UR = Nd k_B T .    (3.98)

We saw that the probability distribution P₁(E₁) is maximized when T₁ = T₂, but how sharp
is the peak in the distribution? Let us write E₁ = E₁* + ΔE₁, where E₁* is the solution to
eqn. 3.92. We then have

    ln P₁(E₁* + ΔE₁) = ln P₁(E₁*) + (1/2k_B) (∂²S₁/∂E₁²)|_{E₁*} (ΔE₁)²
                       + (1/2k_B) (∂²S₂/∂E₂²)|_{E₂*} (ΔE₁)² + . . . ,    (3.99)

where E₂* = E − E₁*. We must now evaluate

    ∂²S/∂E² = (∂/∂E)(1/T) = −(1/T²) (∂T/∂E)_{V,N} = −1/(T² C_V) ,    (3.100)

where C_V = (∂E/∂T)_{V,N} is the heat capacity. Thus,

    P₁ = P₁* e^{−(ΔE₁)²/2 k_B T² C̄_V} ,    (3.101)

where

    C̄_V = C_{V,1} C_{V,2} / (C_{V,1} + C_{V,2}) .    (3.102)

The distribution is therefore a Gaussian, and the fluctuations in ΔE₁ can now be computed:

    ⟨(ΔE₁)²⟩ = k_B T² C̄_V  ⟹  (ΔE₁)_RMS = k_B T √(C̄_V/k_B) .    (3.103)

Now, assuming both systems #1 and #2 are thermodynamically large, we note that C̄_V is
extensive, scaling as the overall size. Therefore the RMS fluctuations in ΔE₁ are proportional
to the square root of the system size, whereas E₁ itself is extensive. Thus, the ratio
(ΔE₁)_RMS / E₁ ∝ V^{−1/2} scales as the inverse square root of the volume. The distribution
P₁(E₁) is thus extremely sharp.
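The sharp maximum is easy to exhibit numerically. With power-law densities of states D_i(E_i) ∝ E_i^{a_i−1} (a_i playing the role of N_i d/2), eqn. 3.91 gives P₁(E₁) ∝ E₁^{a₁−1}(E − E₁)^{a₂−1}, and the condition 3.92 puts the peak at E₁* = (a₁ − 1)E/(a₁ + a₂ − 2). A sketch of ours with toy values of a₁ and a₂:

```python
import numpy as np

a1, a2, E = 300.0, 500.0, 1.0
E1 = np.linspace(1e-6, E - 1e-6, 1_000_001)

# log P_1(E_1) up to an E_1-independent constant, cf. eqn. 3.91
logP = (a1 - 1) * np.log(E1) + (a2 - 1) * np.log(E - E1)

E1_star = E1[np.argmax(logP)]
print(E1_star, (a1 - 1) * E / (a1 + a2 - 2))   # peak matches eqn. 3.92
```

Plotting `logP` would also show how narrow the peak is already for a few hundred degrees of freedom, in line with the V^{−1/2} scaling above.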
3.6 Ordinary Canonical Ensemble (OCE)

Consider a system S in contact with a world W, and let their union U = W ∪ S be called
the 'universe'. The situation is depicted in fig. 3.2. The volume V_S and particle number
N_S of the system are held fixed, but the energy is allowed to fluctuate by exchange with
the world W. We are interested in the limit N_S → ∞, N_W → ∞, with N_S ≪ N_W, with
similar relations holding for the respective volumes and energies. We now ask what is the
probability that S is in a state | n ⟩ with energy E_n. This is given by the ratio

    P_n = lim_{ΔE→0} D_W(E_U − E_n) ΔE / D_U(E_U) ΔE    (3.104)
        = (# of states accessible to W given that E_S = E_n) / (total # of states in U) .

Then

    ln P_n = ln D_W(E_U − E_n) − ln D_U(E_U)
           = ln D_W(E_U) − ln D_U(E_U) − E_n ∂ln D_W(E)/∂E |_{E=E_U} + . . .    (3.105)
           ≡ −α − βE_n .    (3.106)

The constant β is given by

    β = ∂ln D_W(E)/∂E |_{E=E_U} = 1/k_B T .    (3.107)

Thus, we find P_n = e^{−α} e^{−βE_n}. The constant α is fixed by the requirement that ∑_n P_n = 1:

    P_n = (1/Z) e^{−βE_n} ,    Z(T, V, N) = ∑_n e^{−βE_n} = Tr e^{−βĤ} .    (3.108)

We've already met Z(β) in eqn. 3.55; it is the Laplace transform of the density of states.
It is also called the partition function of the system S. Quantum mechanically, we can write
the ordinary canonical density matrix as

    ρ̂ = e^{−βĤ} / Tr e^{−βĤ} .    (3.109)

Note that [ρ̂, Ĥ] = 0, hence the ordinary canonical distribution is a stationary solution
to the evolution equation for the density matrix. Note that the OCE is specified by three
parameters: T, V, and N.
3.6.1 Averages within the OCE

To compute averages within the OCE,

    ⟨Â⟩ = Tr (ρ̂ Â) = ∑_n ⟨ n | Â | n ⟩ e^{−βE_n} / ∑_n e^{−βE_n} ,    (3.110)

where we have conveniently taken the trace in a basis of energy eigenstates.

3.6.2 Entropy and free energy

The Boltzmann entropy is defined by

    S = −k_B Tr (ρ̂ ln ρ̂) = −k_B ∑_n P_n ln P_n .    (3.111)
The Boltzmann entropy and the statistical entropy S = k_B ln D(E) are identical in the
thermodynamic limit.

We define the Helmholtz free energy F(T, V, N) as

    F(T, V, N) = −k_B T ln Z(T, V, N) ,    (3.112)

hence

    P_n = e^{βF} e^{−βE_n} ,    ln P_n = βF − βE_n .    (3.113)

Therefore the entropy is

    S = −k_B ∑_n P_n (βF − βE_n) = −F/T + ⟨Ĥ⟩/T ,    (3.114)

which is to say

    F = E − TS ,    (3.115)

where

    E = ∑_n P_n E_n = Tr Ĥ e^{−βĤ} / Tr e^{−βĤ}    (3.116)

is the average energy. We also see that

    Z = Tr e^{−βĤ} = ∑_n e^{−βE_n}  ⟹  E = ∑_n E_n e^{−βE_n} / ∑_n e^{−βE_n}
      = −(∂/∂β) ln Z = (∂/∂β)(βF) .    (3.117)
3.6.3 Fluctuations in the OCE

In the OCE, the energy is not fixed. It therefore fluctuates about its average value E = ⟨Ĥ⟩.
Note that

    −∂E/∂β = k_B T² ∂E/∂T = ∂²ln Z/∂β²
           = Tr Ĥ² e^{−βĤ} / Tr e^{−βĤ} − (Tr Ĥ e^{−βĤ} / Tr e^{−βĤ})²
           = ⟨Ĥ²⟩ − ⟨Ĥ⟩² .    (3.118)

Thus, the heat capacity is related to the fluctuations in the energy, just as we saw at the
end of §3.5:

    C_V = (∂E/∂T)_{V,N} = (1/k_B T²) (⟨Ĥ²⟩ − ⟨Ĥ⟩²)    (3.119)

For the nonrelativistic ideal gas, we found C_V = (d/2) N k_B, hence the ratio of RMS fluctuations
in the energy to the energy itself is

    √⟨(ΔĤ)²⟩ / ⟨Ĥ⟩ = √(k_B T² C_V) / ((d/2) N k_B T) = √(2/Nd) ,    (3.120)
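Relation 3.119 can be tested on the simplest possible system. The sketch below (ours; units with k_B = 1 and a two-level splitting ε = 1 are arbitrary choices) compares C_V = ∂E/∂T, evaluated by finite differences, against the variance formula:

```python
import math

eps = 1.0     # two-level system with E_n in {0, eps}; units with k_B = 1

def avg_E(T):
    w = math.exp(-eps / T)
    return eps * w / (1.0 + w)          # E = <H>

def avg_E2(T):
    w = math.exp(-eps / T)
    return eps**2 * w / (1.0 + w)       # <H^2>

T, dT = 0.4, 1e-6
C_V = (avg_E(T + dT) - avg_E(T - dT)) / (2 * dT)    # C_V = dE/dT
fluct = (avg_E2(T) - avg_E(T)**2) / T**2            # eqn. 3.119 with k_B = 1
print(C_V, fluct)   # the two agree
```

This is the Schottky heat capacity; the same check works at any temperature.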
Figure 3.5: Microscopic, statistical interpretation of the First Law of Thermodynamics.

and the ratio of the RMS fluctuations to the mean value vanishes in the thermodynamic
limit.

The full distribution function for the energy is

    P(ℰ) = ⟨δ(ℰ − Ĥ)⟩ = Tr δ(ℰ − Ĥ) e^{−βĤ} / Tr e^{−βĤ} = (1/Z) D(ℰ) e^{−βℰ} .    (3.121)

Thus,

    P(ℰ) = e^{β[F − ℰ + TS(ℰ)]} ,    (3.122)

where S(ℰ) = k_B ln D(ℰ) is the statistical entropy. Let's write ℰ = E + δE, where E = ⟨Ĥ⟩
is the average energy. We have

    S(E + δE) = S(E) + δE/T − (δE)²/2T²C_V + . . .    (3.123)

Thus,

    P(ℰ) = 𝒩 exp(−(δE)²/2 k_B T² C_V) ,    (3.124)

where 𝒩 is a normalization constant. Recall ∫dℰ P(ℰ) = 1. Once again, we see that the
distribution is a Gaussian centered at ⟨ℰ⟩ = E, and of width (Δℰ)_RMS = √(k_B T² C_V).
3.6.4 Thermodynamics revisited

The average energy within the OCE is

    E = ∑_n E_n P_n ,    (3.125)

and therefore

    dE = ∑_n E_n dP_n + ∑_n P_n dE_n    (3.126)
       = đQ − đW ,    (3.127)

where

    đW = −∑_n P_n dE_n    (3.128)
    đQ = ∑_n E_n dP_n .    (3.129)

Finally, from P_n = Z^{−1} e^{−E_n/k_B T}, we can write

    E_n = −k_B T ln Z − k_B T ln P_n ,    (3.130)

with which we obtain

    đQ = ∑_n E_n dP_n = −k_B T ln Z ∑_n dP_n − k_B T ∑_n ln P_n dP_n
       = T d(−k_B ∑_n P_n ln P_n) = T dS .    (3.131)

Note also that

    đW = ∑_i F_i dX_i    (3.132)
       = −∑_{i,n} P_n ⟨ n | ∂Ĥ/∂X_i | n ⟩ dX_i = −∑_i (∑_n P_n ∂E_n/∂X_i) dX_i ,    (3.133)

so the generalized force F_i conjugate to the generalized displacement dX_i is

    F_i = −∑_n P_n ∂E_n/∂X_i = −⟨∂Ĥ/∂X_i⟩ .    (3.134)
This is the force acting on the system⁵. In the chapter on thermodynamics, we defined the
generalized force conjugate to X_i as y_i ≡ F_i.

Thus we see from eqn. 3.127 that there are two ways that the average energy can change;
these are depicted in the sketch of fig. 3.5. Starting from a set of energy levels {E_n}
and probabilities {P_n}, we can shift the energies to {E_n′}. The resulting change in energy
(ΔE)_I = −W is identified with the work done on the system. We could also modify the
probabilities to {P_n′} without changing the energies. The energy change in this case is
the heat absorbed by the system: (ΔE)_II = Q. This provides us with a statistical and
microscopic interpretation of the First Law of Thermodynamics.
3.6.5 Generalized susceptibilities

Suppose our Hamiltonian is of the form

    Ĥ = Ĥ(λ) = Ĥ₀ − λ Q̂ ,    (3.135)

where λ is an intensive parameter, such as magnetic field. Then

    Z(λ) = Tr e^{−β(Ĥ₀ − λQ̂)}    (3.136)

and

    (1/Z) ∂Z/∂λ = (1/Z) Tr (βQ̂ e^{−βĤ(λ)}) = β ⟨Q̂⟩ .    (3.137)

But then from Z = e^{−βF} we have

    Q(λ, T) = ⟨Q̂⟩ = −(∂F/∂λ)_T .    (3.138)

Note that Q is an extensive quantity. We can now define the susceptibility χ as

    χ = (1/V) ∂Q/∂λ = −(1/V) ∂²F/∂λ² .    (3.139)

The volume factor in the denominator ensures that χ is intensive.

It is important to realize that we have assumed here that [Ĥ₀, Q̂] = 0, i.e. the 'bare'
Hamiltonian Ĥ₀ and the operator Q̂ commute. If they do not commute, then the response
functions must be computed within a proper quantum mechanical formalism, which we shall
not discuss here.

Note also that we can imagine an entire family of observables {Q̂_k} satisfying [Q̂_k, Q̂_{k′}] = 0
and [Ĥ₀, Q̂_k] = 0, for all k and k′. Then for the Hamiltonian

    Ĥ(λ⃗) = Ĥ₀ − ∑_k λ_k Q̂_k ,    (3.140)

⁵ In deriving eqn. 3.133, we have used the so-called Feynman-Hellmann theorem of quantum mechanics:
d⟨ n | Ĥ | n ⟩ = ⟨ n | dĤ | n ⟩, if | n ⟩ is an energy eigenstate.
we have that

    Q_k(λ⃗, T) = ⟨Q̂_k⟩ = −(∂F/∂λ_k)_{T, N_a, λ_{k′≠k}}    (3.141)

and we may define an entire matrix of susceptibilities,

    χ_{kl} = (1/V) ∂Q_k/∂λ_l = −(1/V) ∂²F/∂λ_k ∂λ_l .    (3.142)
3.7 Grand Canonical Ensemble (GCE)

Consider once again the situation depicted in fig. 3.2, where a system S is in contact with a
world W, their union U = W ∪ S being called the 'universe'. We assume that the system's
volume V_S is fixed, but otherwise it is allowed to exchange energy and particle number with
W. Hence, the system's energy E_S and particle number N_S will fluctuate. We ask what is
the probability that S is in a state | n ⟩ with energy E_n and particle number N_n. This is
given by the ratio

    P_n = lim_{ΔE→0} lim_{ΔN→0} D_W(E_U − E_n, N_U − N_n) ΔE ΔN / D_U(E_U, N_U) ΔE ΔN    (3.143)
        = (# of states accessible to W given that E_S = E_n and N_S = N_n) / (total # of states in U) .

Then

    ln P_n = ln D_W(E_U − E_n, N_U − N_n) − ln D_U(E_U, N_U)
           = ln D_W(E_U, N_U) − ln D_U(E_U, N_U)
             − E_n ∂ln D_W(E, N)/∂E |_{E=E_U, N=N_U} − N_n ∂ln D_W(E, N)/∂N |_{E=E_U, N=N_U} + . . .    (3.144)
           ≡ −α − βE_n + βμN_n .    (3.145)

The constants β and μ are given by

    β = ∂ln D_W(E, N)/∂E |_{E=E_U, N=N_U} = 1/k_B T    (3.146)
    μ = −k_B T ∂ln D_W(E, N)/∂N |_{E=E_U, N=N_U} .    (3.147)

The quantity μ has dimensions of energy and is called the chemical potential. Nota bene:
Some texts define the 'grand canonical Hamiltonian' K̂ as

    K̂ ≡ Ĥ − μN̂ .    (3.148)
Thus, P_n = e^{−α} e^{−β(E_n − μN_n)}. Once again, the constant α is fixed by the requirement that
∑_n P_n = 1:

    P_n = (1/Ξ) e^{−β(E_n − μN_n)} ,    Ξ(β, V, μ) = ∑_n e^{−β(E_n − μN_n)} = Tr e^{−β(Ĥ − μN̂)} = Tr e^{−βK̂} .    (3.149)

Thus, the quantum mechanical grand canonical density matrix is given by

    ρ̂ = e^{−βK̂} / Tr e^{−βK̂} .    (3.150)

Note that [ρ̂, K̂] = 0.

The quantity Ξ(T, V, μ) is called the grand partition function. It stands in relation to a
corresponding free energy in the usual way:

    Ξ(T, V, μ) ≡ e^{−βΩ(T,V,μ)}  ⟺  Ω = −k_B T ln Ξ ,    (3.151)

where Ω(T, V, μ) is the grand potential, also known as the Landau free energy. The
dimensionless quantity z ≡ e^{βμ} is called the fugacity.

If [Ĥ, N̂] = 0, the grand potential may be expressed as a sum over contributions from each
N sector, viz.

    Ξ(T, V, μ) = ∑_N e^{βμN} Z(T, V, N) .    (3.152)

When there is more than one species, we have several chemical potentials μ_a, and accordingly
we define

    K̂ = Ĥ − ∑_a μ_a N̂_a ,    (3.153)

with Ξ = Tr e^{−βK̂} as before.
3.7.1 Entropy

In the GCE, the Boltzmann entropy is

    S = −k_B ∑_n P_n ln P_n = −k_B ∑_n P_n (βΩ − βE_n + βμN_n)
      = −Ω/T + ⟨Ĥ⟩/T − μ⟨N̂⟩/T ,    (3.154)

which says

    Ω = E − TS − μN ,    (3.155)

where

    E = ∑_n E_n P_n = Tr (ρ̂ Ĥ)    (3.156)
    N = ∑_n N_n P_n = Tr (ρ̂ N̂) .    (3.157)

This is consistent with the result from thermodynamics that G = E − TS + pV = μN.
3.7.2 Gibbs-Duhem relation

Since Ω(T, V, μ) is an extensive quantity, we must be able to write Ω = V ω(T, μ). We
identify the function ω(T, μ) as the negative of the pressure:

    ∂Ω/∂V = −(k_B T/Ξ) (∂Ξ/∂V)_{T,μ} = (1/Ξ) ∑_n (∂E_n/∂V) e^{−β(E_n − μN_n)}
          = (∂E/∂V)_{T,μ} = −p(T, μ) .    (3.158)

Therefore,

    Ω = −pV ,    p = p(T, μ)  (equation of state) .    (3.159)
3.7.3 Generalized susceptibilities in the GCE

We can appropriate the results from §3.6.5 and apply them, mutatis mutandis, to the GCE.
Suppose we have a family of observables {Q̂_k} satisfying [Q̂_k, Q̂_{k′}] = 0 and [Ĥ₀, Q̂_k] = 0
and [N̂_a, Q̂_k] = 0 for all k, k′, and a. Then for the grand canonical Hamiltonian

    K̂(λ⃗) = Ĥ₀ − ∑_a μ_a N̂_a − ∑_k λ_k Q̂_k ,    (3.160)

we have that

    Q_k(λ⃗, T) = ⟨Q̂_k⟩ = −(∂Ω/∂λ_k)_{T, μ_a, λ_{k′≠k}}    (3.161)

and we may define the matrix of generalized susceptibilities,

    χ_{kl} = (1/V) ∂Q_k/∂λ_l = −(1/V) ∂²Ω/∂λ_k ∂λ_l .    (3.162)
3.7.4 Fluctuations in the GCE
Both energy and particle number fluctuate in the GCE. Let us compute the fluctuations in particle number. We have
\[
N = \langle \hat N \rangle = \frac{\mathrm{Tr}\, \hat N\, e^{-\beta(\hat H - \mu \hat N)}}{\mathrm{Tr}\, e^{-\beta(\hat H - \mu \hat N)}} = \frac{1}{\beta} \frac{\partial}{\partial \mu} \ln \Xi \ . \tag{3.163}
\]
Therefore,
\[
\frac{1}{\beta} \frac{\partial N}{\partial \mu} = \frac{\mathrm{Tr}\, \hat N^2\, e^{-\beta(\hat H - \mu \hat N)}}{\mathrm{Tr}\, e^{-\beta(\hat H - \mu \hat N)}} - \left( \frac{\mathrm{Tr}\, \hat N\, e^{-\beta(\hat H - \mu \hat N)}}{\mathrm{Tr}\, e^{-\beta(\hat H - \mu \hat N)}} \right)^{\!2}
= \big\langle \hat N^2 \big\rangle - \big\langle \hat N \big\rangle^2 \ . \tag{3.164}
\]
Note now that
\[
\frac{\big\langle \hat N^2 \big\rangle - \big\langle \hat N \big\rangle^2}{\big\langle \hat N \big\rangle^2} = \frac{k_{\rm B} T}{N^2} \left( \frac{\partial N}{\partial \mu} \right)_{\!T,V} = \frac{k_{\rm B} T}{V}\, \kappa_T \ , \tag{3.165}
\]
where \kappa_T is the isothermal compressibility. Note:
\[
\left( \frac{\partial N}{\partial \mu} \right)_{\!T,V} = \frac{\partial(N, T, V)}{\partial(\mu, T, V)}
= \frac{\partial(N, T, V)}{\partial(N, T, p)} \cdot \frac{\partial(N, T, p)}{\partial(V, T, p)} \cdot \frac{\partial(V, T, p)}{\partial(N, T, \mu)} \cdot \frac{\partial(N, T, \mu)}{\partial(\mu, T, V)}
= -\frac{N^2}{V^2} \left( \frac{\partial V}{\partial p} \right)_{\!T,N} = \frac{N^2}{V}\, \kappa_T \ . \tag{3.166}
\]
Thus,
\[
\frac{(\Delta N)_{\rm RMS}}{N} = \sqrt{\frac{k_{\rm B} T\, \kappa_T}{V}} \ , \tag{3.167}
\]
which again scales as V^{-1/2}.
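The identity of eqn. 3.164, relating the number variance to a \mu-derivative, is easy to corroborate numerically. Here is a minimal Python sketch (not from the text): a toy grand canonical system of five single-particle levels with occupancies 0 or 1, where the level energies, \beta, and \mu are arbitrary illustrative choices.

```python
import math

# Toy GCE: 5 levels, each occupied (1) or empty (0); states are bitstrings.
# Verify eqn. 3.164: <N^2> - <N>^2 = (1/beta) dN/dmu  (finite difference).
levels = [0.0, 0.3, 0.7, 1.1, 1.6]   # arbitrary illustrative energies
beta = 2.0

def moments(mu):
    Z = N1 = N2 = 0.0
    for s in range(2**len(levels)):
        occ = [(s >> i) & 1 for i in range(len(levels))]
        E = sum(o*e for o, e in zip(occ, levels))
        N = sum(occ)
        w = math.exp(-beta*(E - mu*N))
        Z += w; N1 += N*w; N2 += N*N*w
    return N1/Z, N2/Z

mu = 0.5
N, Nsq = moments(mu)
var = Nsq - N**2
dmu = 1e-5
dN_dmu = (moments(mu + dmu)[0] - moments(mu - dmu)[0])/(2*dmu)
print(var, dN_dmu/beta)   # the two quantities agree
```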
3.8 Gibbs Ensemble
Now let the system's particle number N_{\rm S} be fixed, but let it exchange energy and volume with the world W. Mutatis mutandis, we have
\[
P_n = \lim_{\Delta E \to 0} \lim_{\Delta V \to 0} \frac{D_{\rm W}(E_{\rm U} - E_n,\, V_{\rm U} - V_n)\; \Delta E\, \Delta V}{D_{\rm U}(E_{\rm U},\, V_{\rm U})\; \Delta E\, \Delta V} \ . \tag{3.168}
\]
Then
\[
\begin{aligned}
\ln P_n &= \ln D_{\rm W}(E_{\rm U} - E_n,\, V_{\rm U} - V_n) - \ln D_{\rm U}(E_{\rm U}, V_{\rm U}) \\
&= \ln D_{\rm W}(E_{\rm U}, V_{\rm U}) - \ln D_{\rm U}(E_{\rm U}, V_{\rm U})
- E_n\, \frac{\partial \ln D_{\rm W}(E, V)}{\partial E} \bigg|_{E = E_{\rm U},\, V = V_{\rm U}}
- V_n\, \frac{\partial \ln D_{\rm W}(E, V)}{\partial V} \bigg|_{E = E_{\rm U},\, V = V_{\rm U}} + \ldots
\end{aligned} \tag{3.169}
\]
Thus,
\[
\ln P_n = -\beta E_n - \beta p\, V_n + \text{const.} \tag{3.170}
\]
The constants \beta and p are given by
\[
\beta = \frac{\partial \ln D_{\rm W}(E, V)}{\partial E} \bigg|_{E = E_{\rm U},\, V = V_{\rm U}} = \frac{1}{k_{\rm B} T} \tag{3.171}
\]
\[
p = k_{\rm B} T\, \frac{\partial \ln D_{\rm W}(E, V)}{\partial V} \bigg|_{E = E_{\rm U},\, V = V_{\rm U}} \ . \tag{3.172}
\]
The corresponding partition function is
\[
Y(T, p, N) = \mathrm{Tr}\, e^{-\beta(\hat H + pV)} = \beta p \int_0^\infty \! dV\; e^{-\beta p V}\, Z(T, V, N) \equiv e^{-\beta G(T, p, N)} \ . \tag{3.173}
\]
The factor of \beta p multiplying the integral on the RHS guarantees that the partition function Y is dimensionless.
3.9 Statistical Ensembles from Maximum Entropy
The basic principle: maximize the entropy,
\[
S = -k_{\rm B} \sum_n P_n \ln P_n \ . \tag{3.174}
\]
3.9.1 CE
We maximize S subject to the single constraint
\[
C = \sum_n P_n - 1 = 0 \ . \tag{3.175}
\]
We implement the constraint C = 0 with a Lagrange multiplier, \bar\lambda \equiv k_{\rm B} \lambda, writing
\[
S^* = S - k_{\rm B} \lambda\, C \ , \tag{3.176}
\]
and freely extremizing over the distribution \{P_n\} and the Lagrange multiplier \lambda. Thus,
\[
\delta S^* = \delta S - k_{\rm B} \lambda \sum_n \delta P_n - k_{\rm B} \Big( \sum_n P_n - 1 \Big)\, \delta\lambda
= -k_{\rm B} \sum_n \big[ \ln P_n + 1 + \lambda \big]\, \delta P_n - k_{\rm B}\, C\, \delta\lambda \equiv 0 \ . \tag{3.177}
\]
We conclude that C = 0 and that
\[
\ln P_n = -\big( 1 + \lambda \big) \ , \tag{3.178}
\]
and we fix \lambda by the normalization condition \sum_n P_n = 1. This gives
\[
P_n = \frac{1}{\Omega} \ , \qquad \Omega = \sum_n \Theta(E + \Delta E - E_n)\; \Theta(E_n - E) \ . \tag{3.179}
\]
Note that \Omega is the number of states with energies between E and E + \Delta E.
3.9.2 OCE
We maximize S subject to the two constraints
\[
C_1 = \sum_n P_n - 1 = 0 \ , \qquad C_2 = \sum_n E_n P_n - E = 0 \ . \tag{3.180}
\]
We now have two Lagrange multipliers. We write
\[
S^* = S - k_{\rm B} \sum_{j=1}^{2} \lambda_j\, C_j \ , \tag{3.181}
\]
and we freely extremize over the \{P_n\} and the \lambda_j. We therefore have
\[
\delta S^* = \delta S - k_{\rm B} \sum_n \big[ \lambda_1 + \lambda_2 E_n \big]\, \delta P_n - k_{\rm B} \sum_{j=1}^{2} C_j\, \delta\lambda_j
= -k_{\rm B} \sum_n \big[ \ln P_n + 1 + \lambda_1 + \lambda_2 E_n \big]\, \delta P_n - k_{\rm B} \sum_{j=1}^{2} C_j\, \delta\lambda_j \equiv 0 \ . \tag{3.182}
\]
Thus, C_1 = C_2 = 0 and
\[
\ln P_n = -\big( 1 + \lambda_1 + \lambda_2 E_n \big) \ . \tag{3.183}
\]
We define \lambda_2 \equiv \beta and we fix \lambda_1 by normalization. This yields
\[
P_n = \frac{1}{Z}\, e^{-\beta E_n} \ , \qquad Z = \sum_n e^{-\beta E_n} \ . \tag{3.184}
\]
3.9.3 GCE
We maximize S subject to the three constraints
\[
C_1 = \sum_n P_n - 1 = 0 \ , \quad C_2 = \sum_n E_n P_n - E = 0 \ , \quad C_3 = \sum_n N_n P_n - N = 0 \ . \tag{3.185}
\]
We now have three Lagrange multipliers. We write
\[
S^* = S - k_{\rm B} \sum_{j=1}^{3} \lambda_j\, C_j \ , \tag{3.186}
\]
and hence
\[
\delta S^* = \delta S - k_{\rm B} \sum_n \big[ \lambda_1 + \lambda_2 E_n + \lambda_3 N_n \big]\, \delta P_n - k_{\rm B} \sum_{j=1}^{3} C_j\, \delta\lambda_j
= -k_{\rm B} \sum_n \big[ \ln P_n + 1 + \lambda_1 + \lambda_2 E_n + \lambda_3 N_n \big]\, \delta P_n - k_{\rm B} \sum_{j=1}^{3} C_j\, \delta\lambda_j \equiv 0 \ . \tag{3.187}
\]
Thus, C_1 = C_2 = C_3 = 0 and
\[
\ln P_n = -\big( 1 + \lambda_1 + \lambda_2 E_n + \lambda_3 N_n \big) \ . \tag{3.188}
\]
We define \lambda_2 \equiv \beta and \lambda_3 \equiv -\beta\mu, and we fix \lambda_1 by normalization. This yields
\[
P_n = \frac{1}{\Xi}\, e^{-\beta(E_n - \mu N_n)} \ , \qquad \Xi = \sum_n e^{-\beta(E_n - \mu N_n)} \ . \tag{3.189}
\]
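The variational construction above can be probed numerically. The following Python sketch (not from the text; the five-level spectrum and \beta are arbitrary illustrative values) checks the OCE result of eqn. 3.184: perturbing the Boltzmann distribution within the constraint surface (fixed normalization and fixed mean energy) can only lower the entropy.

```python
import numpy as np

# Maximum-entropy check: P_n = e^{-beta E_n}/Z should have the largest entropy
# among all distributions with the same normalization and mean energy.
rng = np.random.default_rng(1)
E = np.array([0.0, 0.7, 1.3, 2.1, 3.0])   # arbitrary level energies
beta = 0.8
P = np.exp(-beta*E); P /= P.sum()

def entropy(Q):
    return -np.sum(Q*np.log(Q))

# Directions v preserving both constraints: sum v = 0 and sum E v = 0.
A = np.vstack([np.ones_like(E), E])
_, _, Vt = np.linalg.svd(A)
null = Vt[2:]                  # basis of the constraint-preserving directions

S0 = entropy(P)
ok = True
for _ in range(200):
    v = rng.normal(size=null.shape[0]) @ null
    ok &= entropy(P + 1e-3*v) <= S0 + 1e-12
print(S0, ok)                  # every constrained perturbation lowers S
```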
3.10 Ideal Gas Statistical Mechanics
The ordinary canonical partition function for the ideal gas was computed in eqn. 3.57. We found
\[
Z(T, V, N) = \frac{1}{N!} \prod_{i=1}^{N} \int \! \frac{d^d\!x_i\, d^d\!p_i}{(2\pi\hbar)^d}\; e^{-\beta p_i^2/2m}
= \frac{V^N}{N!} \left( \int_{-\infty}^{\infty} \! \frac{dp}{2\pi\hbar}\; e^{-\beta p^2/2m} \right)^{\!Nd}
= \frac{1}{N!} \left( \frac{V}{\lambda_T^d} \right)^{\!N} \ , \tag{3.190}
\]
where \lambda_T is the thermal wavelength:
\[
\lambda_T = \sqrt{2\pi\hbar^2/m k_{\rm B} T} \ . \tag{3.191}
\]
The physical interpretation of \lambda_T is that it is the de Broglie wavelength for a particle of mass m which has a kinetic energy of k_{\rm B} T.
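To get a feel for the scale of \lambda_T, here is a quick numerical sketch (not from the text). The constants are standard CODATA values, and helium-4 at room temperature is an arbitrary illustrative choice.

```python
import math

hbar = 1.054571817e-34   # J s
kB = 1.380649e-23        # J / K
m_He = 6.6464731e-27     # kg, helium-4 atom (illustrative example)

def lambda_T(m, T):
    """Thermal de Broglie wavelength, eqn. 3.191."""
    return math.sqrt(2*math.pi*hbar**2/(m*kB*T))

print(lambda_T(m_He, 300.0))   # roughly half an angstrom at room temperature
```

Since this is much smaller than the typical interparticle spacing of a gas at ambient conditions, the classical (Maxwell-Boltzmann) treatment is appropriate there.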
In the GCE, we have
\[
\Xi(T, V, \mu) = \sum_{N=0}^{\infty} e^{\beta\mu N}\, Z(T, V, N)
= \sum_{N=0}^{\infty} \frac{1}{N!} \left( \frac{V e^{\mu/k_{\rm B} T}}{\lambda_T^d} \right)^{\!N}
= \exp\!\left( \frac{V e^{\mu/k_{\rm B} T}}{\lambda_T^d} \right) \ . \tag{3.192}
\]
From \Xi = e^{-\Omega/k_{\rm B} T}, we have the grand potential is
\[
\Omega(T, V, \mu) = -V k_{\rm B} T\; e^{\mu/k_{\rm B} T}\, \lambda_T^{-d} \ . \tag{3.193}
\]
Since \Omega = -pV (see §3.7.2), we have
\[
p(T, \mu) = k_{\rm B} T\, \lambda_T^{-d}\; e^{\mu/k_{\rm B} T} \ . \tag{3.194}
\]
The number density can also be calculated:
\[
n = \frac{N}{V} = -\frac{1}{V} \left( \frac{\partial \Omega}{\partial \mu} \right)_{\!T,V} = \lambda_T^{-d}\; e^{\mu/k_{\rm B} T} \ . \tag{3.195}
\]
Combined, the last two equations recapitulate the ideal gas law, pV = N k_{\rm B} T.
3.10.1 Maxwell velocity distribution
The distribution function for momenta is given by
\[
g(\mathbf{p}) = \bigg\langle \frac{1}{N} \sum_{i=1}^{N} \delta(\mathbf{p}_i - \mathbf{p}) \bigg\rangle \ . \tag{3.196}
\]
Note that g(\mathbf{p}) = \big\langle \delta(\mathbf{p}_i - \mathbf{p}) \big\rangle is the same for every particle, independent of its label i. We compute the average \langle \hat A \rangle = \mathrm{Tr}\big( \hat A\, e^{-\beta \hat H} \big) / \mathrm{Tr}\, e^{-\beta \hat H}. Setting i = 1, all the integrals other than that over \mathbf{p}_1 divide out between numerator and denominator. We then have
\[
g(\mathbf{p}) = \frac{\int \! d^3\!p_1\; \delta(\mathbf{p}_1 - \mathbf{p})\; e^{-\beta p_1^2/2m}}{\int \! d^3\!p_1\; e^{-\beta p_1^2/2m}}
= (2\pi m k_{\rm B} T)^{-3/2}\; e^{-p^2/2m k_{\rm B} T} \ . \tag{3.197}
\]
Textbooks commonly refer to the velocity distribution f(\mathbf{v}), which is related to g(\mathbf{p}) by
\[
f(\mathbf{v})\, d^3\!v = g(\mathbf{p})\, d^3\!p \ . \tag{3.198}
\]
Hence,
\[
f(\mathbf{v}) = \left( \frac{m}{2\pi k_{\rm B} T} \right)^{\!3/2} e^{-m v^2/2 k_{\rm B} T} \ . \tag{3.199}
\]
This is known as the Maxwell velocity distribution. Note that the distributions are normalized, viz.
\[
\int \! d^3\!p\; g(\mathbf{p}) = \int \! d^3\!v\; f(\mathbf{v}) = 1 \ . \tag{3.200}
\]
If we are only interested in averaging functions of v = |\mathbf{v}| which are isotropic, then we can define the Maxwell speed distribution, \tilde f(v), as
\[
\tilde f(v) = 4\pi v^2 f(v) = 4\pi \left( \frac{m}{2\pi k_{\rm B} T} \right)^{\!3/2} v^2\; e^{-m v^2/2 k_{\rm B} T} \ . \tag{3.201}
\]
Note that \tilde f(v) is normalized according to
\[
\int_0^\infty \! dv\; \tilde f(v) = 1 \ . \tag{3.202}
\]
It is convenient to represent v in units of v_0 = \sqrt{k_{\rm B} T/m}, in which case
\[
\tilde f(v) = \frac{1}{v_0}\, \varphi(v/v_0) \ , \qquad \varphi(s) = \sqrt{\tfrac{2}{\pi}}\; s^2\, e^{-s^2/2} \ . \tag{3.203}
\]
The distribution \varphi(s) is shown in fig. 3.6. Computing averages, we have
\[
C_k \equiv \langle s^k \rangle = \int_0^\infty \! ds\; s^k \varphi(s) = 2^{k/2} \cdot \frac{2}{\sqrt{\pi}}\; \Gamma\!\left( \frac{3}{2} + \frac{k}{2} \right) \ . \tag{3.204}
\]
Thus, C_0 = 1, C_1 = \sqrt{8/\pi}, C_2 = 3, etc. The speed averages are
\[
\big\langle v^k \big\rangle = C_k \left( \frac{k_{\rm B} T}{m} \right)^{\!k/2} \ . \tag{3.205}
\]
Note that the average velocity is \langle \mathbf{v} \rangle = 0, but the average speed is \langle v \rangle = \sqrt{8 k_{\rm B} T/\pi m}. The speed distribution is plotted in fig. 3.6.
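The moment formula of eqn. 3.204 is easy to verify by direct numerical integration of \varphi(s); a short Python sketch (not from the text, grid parameters chosen for convenience):

```python
import math
import numpy as np

# Compare C_k = 2^{k/2} (2/sqrt(pi)) Gamma(3/2 + k/2)  (eqn. 3.204) with a
# direct trapezoid-rule integration of phi(s) = sqrt(2/pi) s^2 exp(-s^2/2).
s, h = np.linspace(0.0, 20.0, 200001, retstep=True)
phi = np.sqrt(2.0/np.pi) * s**2 * np.exp(-s**2/2)

def C_numeric(k):
    f = s**k * phi
    return h*(f.sum() - 0.5*f[0] - 0.5*f[-1])   # trapezoid rule

def C_exact(k):
    return 2**(k/2) * (2/math.sqrt(math.pi)) * math.gamma(1.5 + k/2)

for k in range(5):
    print(k, C_numeric(k), C_exact(k))   # C_0 = 1, C_1 = sqrt(8/pi), C_2 = 3, ...
```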
3.10.2 Equipartition
The Hamiltonian for ballistic (i.e. massive nonrelativistic) particles is quadratic in the individual components of each momentum \mathbf{p}_i. There are other cases in which a classical degree of freedom appears quadratically in \hat H as well. For example, an individual normal mode of a system of coupled oscillators has the Lagrangian
\[
L = \tfrac{1}{2} \dot\xi^2 - \tfrac{1}{2} \omega_0^2\, \xi^2 \ , \tag{3.206}
\]
where the dimensions of \xi are [\xi] = M^{1/2} L by convention. The Hamiltonian for this normal mode is then
\[
\mathcal{H} = \frac{p^2}{2} + \tfrac{1}{2} \omega_0^2\, \xi^2 \ , \tag{3.207}
\]
Figure 3.6: Maxwell distribution of speeds \varphi(v/v_0). The most probable speed is v_{\rm MAX} = \sqrt{2}\, v_0. The average speed is v_{\rm AVG} = \sqrt{8/\pi}\, v_0. The RMS speed is v_{\rm RMS} = \sqrt{3}\, v_0.
from which we see that both the kinetic as well as potential energy terms enter quadratically into the Hamiltonian. The classical rotational kinetic energy is also quadratic in the angular momentum components.
Let us compute the contribution of a single quadratic degree of freedom in \hat H to the partition function. We'll call this degree of freedom \xi (it may be a position or momentum or angular momentum) and we'll write its contribution to \hat H as
\[
\hat h_\xi = \tfrac{1}{2} K \xi^2 \ , \tag{3.208}
\]
where K is some constant. Integrating over \xi yields the following factor in the partition function:
\[
\int_{-\infty}^{\infty} \! d\xi\; e^{-\beta K \xi^2/2} = \left( \frac{2\pi}{K\beta} \right)^{\!1/2} \ . \tag{3.209}
\]
The contribution to the Helmholtz free energy is then
\[
\Delta F_\xi = \tfrac{1}{2} k_{\rm B} T \ln\!\left( \frac{K}{2\pi k_{\rm B} T} \right) \ , \tag{3.210}
\]
and therefore the contribution to the internal energy E is
\[
\Delta E_\xi = \frac{\partial}{\partial \beta} \big( \beta\, \Delta F_\xi \big) = \frac{1}{2\beta} = \tfrac{1}{2} k_{\rm B} T \ . \tag{3.211}
\]
We have thus derived what is commonly called the equipartition theorem of classical statistical mechanics:
To each degree of freedom which enters the Hamiltonian quadratically is associated a contribution \tfrac{1}{2} k_{\rm B} T to the internal energy of the system. This results in a concomitant contribution of \tfrac{1}{2} k_{\rm B} to the heat capacity.
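The theorem invites a quick Monte Carlo corroboration (not from the text). For \hat h_\xi = \tfrac{1}{2} K \xi^2, the Boltzmann weight is a Gaussian in \xi, so it can be sampled directly; the stiffness and temperature below are arbitrary illustrative values, in units with k_{\rm B} = 1.

```python
import numpy as np

# Equipartition check: <K xi^2 / 2> = kB T / 2, independent of the stiffness K.
rng = np.random.default_rng(0)
T, K = 1.7, 4.2                       # arbitrary illustrative values, kB = 1
xi = rng.normal(0.0, np.sqrt(T/K), size=2_000_000)  # Boltzmann weight is Gaussian
print(np.mean(0.5*K*xi**2), 0.5*T)   # the two agree to sampling accuracy
```

Changing K rescales the typical excursion of \xi but leaves the average energy untouched, which is the content of equipartition.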
We now see why the internal energy of a classical ideal gas with f degrees of freedom per molecule is E = \tfrac{1}{2} f N k_{\rm B} T, and C_V = \tfrac{1}{2} f N k_{\rm B}. This result also has applications in the theory of solids. The atoms in a solid possess kinetic energy due to their motion, and potential energy due to the spring-like interatomic potentials which tend to keep the atoms in their preferred crystalline positions. Thus, for a three-dimensional crystal, there are six quadratic degrees of freedom (three positions and three momenta) per atom, and the classical energy should be E = 3 N k_{\rm B} T, and the heat capacity C_V = 3 N k_{\rm B}. As we shall see, quantum mechanics modifies this result considerably at temperatures below the highest normal mode (i.e. phonon) frequency, but the high temperature limit is given by the classical value C_V = 3\nu R (where \nu = N/N_{\rm A} is the number of moles) derived here, known as the Dulong-Petit limit.
3.11 Selected Examples
3.11.1 Spins in an external magnetic eld
Consider a system of \mathcal{N} spins \sigma_j, each of which can be either up (\sigma = +1) or down (\sigma = -1). The Hamiltonian for this system is
\[
\hat H = -\mu_0 H \sum_{j=1}^{\mathcal{N}} \sigma_j \ , \tag{3.212}
\]
where H is the external magnetic field, and \mu_0 is the magnetic moment per particle. We treat this system within the ordinary canonical ensemble. The partition function is
\[
Z = \sum_{\{\sigma_j\}} e^{-\beta \hat H} = \zeta^{\mathcal{N}} \ , \tag{3.213}
\]
where \zeta is the single particle partition function:
\[
\zeta = \sum_{\sigma = \pm 1} e^{\mu_0 H \sigma/k_{\rm B} T} = 2 \cosh\!\left( \frac{\mu_0 H}{k_{\rm B} T} \right) \ . \tag{3.214}
\]
The Helmholtz free energy is then
\[
F(T, H, \mathcal{N}) = -k_{\rm B} T \ln Z = -\mathcal{N} k_{\rm B} T \ln\!\left[ 2 \cosh\!\left( \frac{\mu_0 H}{k_{\rm B} T} \right) \right] \ . \tag{3.215}
\]
The magnetization is
\[
M = -\left( \frac{\partial F}{\partial H} \right)_{\!T,\mathcal{N}} = \mathcal{N} \mu_0 \tanh\!\left( \frac{\mu_0 H}{k_{\rm B} T} \right) \ . \tag{3.216}
\]
The energy is
\[
E = \frac{\partial}{\partial \beta} \big( \beta F \big) = -\mathcal{N} \mu_0 H \tanh\!\left( \frac{\mu_0 H}{k_{\rm B} T} \right) \ . \tag{3.217}
\]
Hence, E = -HM, which we already knew, from the form of \hat H itself.
Each spin here is independent. The probability that a given spin has polarization \sigma is
\[
P_\sigma = \frac{e^{\beta \mu_0 H \sigma}}{e^{\beta \mu_0 H} + e^{-\beta \mu_0 H}} \ . \tag{3.218}
\]
The total probability is unity, and the average polarization is a weighted average of \sigma = +1 and \sigma = -1 contributions:
\[
P_\uparrow + P_\downarrow = 1 \ , \qquad \langle \sigma \rangle = P_\uparrow - P_\downarrow = \tanh\!\left( \frac{\mu_0 H}{k_{\rm B} T} \right) \ . \tag{3.219}
\]
At low temperatures T \ll \mu_0 H/k_{\rm B}, we have P_\uparrow \approx 1 - e^{-2\mu_0 H/k_{\rm B} T}. At high temperatures T \gg \mu_0 H/k_{\rm B}, the two polarizations are equally likely, and P_\sigma \approx \tfrac{1}{2} \Big( 1 + \frac{\sigma \mu_0 H}{k_{\rm B} T} \Big).
The isothermal magnetic susceptibility is defined as
\[
\chi_T = \frac{1}{\mathcal{N}} \left( \frac{\partial M}{\partial H} \right)_{\!T} = \frac{\mu_0^2}{k_{\rm B} T}\; \mathrm{sech}^2\!\left( \frac{\mu_0 H}{k_{\rm B} T} \right) \ . \tag{3.220}
\]
(Typically this is computed per unit volume rather than per particle.) At H = 0, we have \chi_T = \mu_0^2/k_{\rm B} T, which is known as the Curie law.
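The Curie law can be corroborated with a finite difference on the magnetization of eqn. 3.216. A minimal Python sketch (not from the text; per-spin quantities, units \mu_0 = k_{\rm B} = 1, and an arbitrary temperature):

```python
import math

# Zero-field susceptibility by finite difference on m(H) = tanh(H/T),
# compared with the Curie law chi_T = 1/T (eqns. 3.216, 3.220 with mu0 = kB = 1).
def m(H, T):
    return math.tanh(H/T)   # magnetization per spin

T = 0.37                    # arbitrary illustrative temperature
dH = 1e-6
chi_fd = (m(dH, T) - m(-dH, T))/(2*dH)
print(chi_fd, 1.0/T)        # agree: Curie law
```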
Aside
The energy E = -HM here is not the same quantity we discussed in our study of thermodynamics. In fact, the thermodynamic energy for this problem vanishes! Here is why. To avoid confusion, we'll need to invoke a new symbol for the thermodynamic energy, \mathcal{E}. Recall that the thermodynamic energy \mathcal{E} is a function of extensive quantities, meaning \mathcal{E} = \mathcal{E}(S, M, \mathcal{N}). It is obtained from the free energy F(T, H, \mathcal{N}) by a double Legendre transform:
\[
\mathcal{E}(S, M, \mathcal{N}) = F(T, H, \mathcal{N}) + TS + HM \ . \tag{3.221}
\]
Now from eqn. 3.215 we derive the entropy
\[
S = -\frac{\partial F}{\partial T} = \mathcal{N} k_{\rm B} \ln\!\left[ 2 \cosh\!\left( \frac{\mu_0 H}{k_{\rm B} T} \right) \right] - \mathcal{N}\, \frac{\mu_0 H}{T} \tanh\!\left( \frac{\mu_0 H}{k_{\rm B} T} \right) \ . \tag{3.222}
\]
Thus, using eqns. 3.215 and 3.216, we obtain \mathcal{E}(S, M, \mathcal{N}) = 0.
The potential confusion here arises from our use of the expression F(T, H, \mathcal{N}). In thermodynamics, it is the Gibbs free energy G(T, p, N) which is a double Legendre transform of the energy: G = \mathcal{E} - TS + pV. By analogy, with magnetic systems we should perhaps write G = \mathcal{E} - TS - HM, but in keeping with many textbooks we shall use the symbol F and refer to it as the Helmholtz free energy. The quantity we've called E in eqn. 3.217 is in fact E = \mathcal{E} - HM, which means \mathcal{E} = 0. The energy \mathcal{E}(S, M, \mathcal{N}) vanishes here because the spins are noninteracting.
3.11.2 Negative temperature (!)
Consider again a system of \mathcal{N} spins, each of which can be either up (+) or down (-). Let N_\sigma be the number of sites with spin \sigma, where \sigma = \pm 1. Clearly N_+ + N_- = \mathcal{N}. We now treat this system within the microcanonical ensemble.
The energy of the system is
\[
E = -HM \ , \tag{3.223}
\]
where H is an external magnetic field, and M = (N_+ - N_-)\, \mu_0 is the total magnetization. We now compute S(E) within the microcanonical ensemble. The number of ways of arranging the system with N_+ up spins is
\[
\Omega = \binom{\mathcal{N}}{N_+} \ , \tag{3.224}
\]
hence the entropy is
\[
S = k_{\rm B} \ln \Omega = -\mathcal{N} k_{\rm B} \big[ x \ln x + (1-x) \ln(1-x) \big] \tag{3.225}
\]
in the thermodynamic limit: \mathcal{N} \to \infty, N_+ \to \infty, with x = N_+/\mathcal{N} constant. Now the magnetization is M = (N_+ - N_-)\, \mu_0 = (2N_+ - \mathcal{N})\, \mu_0, hence if we define the maximum energy E_0 \equiv \mathcal{N} \mu_0 H, then
\[
\frac{E}{E_0} = -\frac{M}{\mathcal{N} \mu_0} = 1 - 2x \quad \Longrightarrow \quad x = \frac{E_0 - E}{2 E_0} \ . \tag{3.226}
\]
We therefore have
\[
S(E, \mathcal{N}) = -\mathcal{N} k_{\rm B} \left[ \left( \frac{E_0 - E}{2 E_0} \right) \ln\!\left( \frac{E_0 - E}{2 E_0} \right) + \left( \frac{E_0 + E}{2 E_0} \right) \ln\!\left( \frac{E_0 + E}{2 E_0} \right) \right] \ . \tag{3.227}
\]
We now have
\[
\frac{1}{T} = \left( \frac{\partial S}{\partial E} \right)_{\!\mathcal{N}} = \frac{\partial S}{\partial x}\, \frac{\partial x}{\partial E} = \frac{\mathcal{N} k_{\rm B}}{2 E_0} \ln\!\left( \frac{E_0 - E}{E_0 + E} \right) \ . \tag{3.228}
\]
We see that the temperature is positive for -E_0 \le E < 0 and is negative for 0 < E \le E_0.
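The sign change in 1/T at E = 0 is easy to exhibit numerically from eqn. 3.227. A short Python sketch (not from the text; reduced units E_0 = 1 and \mathcal{N} k_{\rm B} = 1):

```python
import numpy as np

# S(E) for the spin system (eqn. 3.227) with E0 = 1, N kB = 1; the slope
# dS/dE = 1/T is positive for E < 0 and negative for E > 0.
def S(E):
    x = (1.0 - E)/2.0               # x = (E0 - E)/2E0
    return -(x*np.log(x) + (1 - x)*np.log(1 - x))

def invT(E, dE=1e-6):
    return (S(E + dE) - S(E - dE))/(2*dE)

print(invT(-0.5), invT(0.5))        # first positive, second negative
```

The finite-difference slope also matches the closed form (1/2) ln[(1 − E)/(1 + E)] of eqn. 3.228 in these units.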
What has gone wrong? The answer is that nothing has gone wrong; all our calculations are perfectly correct. This system does exhibit the possibility of negative temperature. It is, however, unphysical in that we have neglected kinetic degrees of freedom, which result in an entropy function S(E, \mathcal{N}) which is an increasing function of energy. In this system, S(E, \mathcal{N}) achieves a maximum of S_{\rm max} = \mathcal{N} k_{\rm B} \ln 2 at E = 0 (i.e. x = \tfrac{1}{2}), and then turns over and starts decreasing. In fact, our results are completely consistent with eqn. 3.217: the energy E is an odd function of temperature. Positive energy requires negative temperature! Another example of this peculiarity is provided in the appendix in §3.16.2.
Figure 3.7: When entropy decreases with increasing energy, the temperature is negative. Typically, kinetic degrees of freedom prevent this peculiarity from manifesting in physical systems.
3.11.3 Adsorption
PROBLEM: A surface containing \mathcal{N} adsorption sites is in equilibrium with a monatomic ideal gas. Atoms adsorbed on the surface have an energy -\Delta and no kinetic energy. Each adsorption site can accommodate at most one atom. Calculate the fraction f of occupied adsorption sites as a function of the gas density n, the temperature T, the binding energy \Delta, and physical constants.
The grand partition function for the surface is
\[
\Xi_{\rm s} = e^{-\Omega_{\rm s}/k_{\rm B} T} = \Big( 1 + e^{\Delta/k_{\rm B} T}\, e^{\mu/k_{\rm B} T} \Big)^{\!\mathcal{N}} \ . \tag{3.229}
\]
The fraction of occupied sites is
\[
f = \frac{\langle \hat N_{\rm s} \rangle}{\mathcal{N}} = -\frac{1}{\mathcal{N}} \frac{\partial \Omega_{\rm s}}{\partial \mu} = \frac{e^{\mu/k_{\rm B} T}}{e^{-\Delta/k_{\rm B} T} + e^{\mu/k_{\rm B} T}} \ . \tag{3.230}
\]
Since the surface is in equilibrium with the gas, its fugacity z = \exp(\mu/k_{\rm B} T) and temperature T are the same as in the gas.
SOLUTION: For a monatomic ideal gas, the single particle partition function is \zeta = V \lambda_T^{-3}, where \lambda_T = \sqrt{2\pi\hbar^2/m k_{\rm B} T} is the thermal wavelength. Thus, the grand partition function, for indistinguishable particles, is
\[
\Xi_{\rm g} = \exp\!\Big( V \lambda_T^{-3}\; e^{\mu/k_{\rm B} T} \Big) \ . \tag{3.231}
\]
The gas density is
\[
n = \frac{\langle \hat N_{\rm g} \rangle}{V} = \frac{k_{\rm B} T}{V}\, \frac{\partial \ln \Xi_{\rm g}}{\partial \mu} = \lambda_T^{-3}\; e^{\mu/k_{\rm B} T} \ . \tag{3.232}
\]
We can now solve for the fugacity: z = e^{\mu/k_{\rm B} T} = n \lambda_T^3. Thus, the fraction of occupied adsorption sites is
\[
f = \frac{n \lambda_T^3}{n \lambda_T^3 + \exp(-\Delta/k_{\rm B} T)} \ . \tag{3.233}
\]
Interestingly, the solution for f involves the constant \hbar.
It is always advisable to check that the solution makes sense in various limits. First of all, if the gas density tends to zero at fixed T and \Delta, we have f \to 0. On the other hand, if n \to \infty we have f \to 1, which also makes sense. At fixed n and T, if the binding energy \Delta \to +\infty, then once again f \to 1 since every adsorption site wants to be occupied. Conversely, taking \Delta \to -\infty results in f \to 0, since the energetic cost of adsorption is infinitely high.
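These limit checks take only a few lines to run numerically. A Python sketch (not from the text; reduced units \lambda_T = 1, k_{\rm B} = 1, and arbitrary illustrative parameter values):

```python
import math

# Occupied fraction f = n lam^3 / (n lam^3 + exp(-Delta/kB T))  (eqn. 3.233),
# in reduced units lam = 1, kB = 1. Check the limiting behaviors.
def f(n, Delta, T):
    return n / (n + math.exp(-Delta/T))

Delta, T = 1.0, 0.5          # arbitrary illustrative values
print(f(1e-9, Delta, T))     # -> 0 as n -> 0
print(f(1e9, Delta, T))      # -> 1 as n -> infinity
print(f(1.0, 50.0, T))       # -> 1 for strong binding, Delta >> kB T
```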
3.11.4 Elasticity of wool
Wool consists of interlocking protein molecules which can stretch into an elongated configuration, but reversibly so. This feature gives wool its very useful elasticity. Let us model a chain of these proteins by assuming they can exist in one of two states, which we will call A and B, with energies \varepsilon_{\rm A} and \varepsilon_{\rm B} and lengths \ell_{\rm A} and \ell_{\rm B}. The situation is depicted in fig. 3.8. We model these conformational degrees of freedom by a spin variable \sigma = \pm 1 for each molecule, where \sigma = +1 in the A state and \sigma = -1 in the B state. Suppose the chain is placed under a tension \tau. We then have
\[
\hat H = \sum_{j=1}^{\mathcal{N}} \Big[ \tfrac{1}{2} \big( \varepsilon_{\rm A} + \varepsilon_{\rm B} \big) + \tfrac{1}{2} \big( \varepsilon_{\rm A} - \varepsilon_{\rm B} \big)\, \sigma_j \Big] - \tau \hat L \ , \tag{3.234}
\]
where the length is
\[
\hat L = \sum_{j=1}^{\mathcal{N}} \Big[ \tfrac{1}{2} \big( \ell_{\rm A} + \ell_{\rm B} \big) + \tfrac{1}{2} \big( \ell_{\rm A} - \ell_{\rm B} \big)\, \sigma_j \Big] \ . \tag{3.235}
\]
Thus, we can write
\[
\hat H = \sum_{j=1}^{\mathcal{N}} \Big[ \tfrac{1}{2} \big( \tilde\varepsilon_{\rm A} + \tilde\varepsilon_{\rm B} \big) + \tfrac{1}{2} \big( \tilde\varepsilon_{\rm A} - \tilde\varepsilon_{\rm B} \big)\, \sigma_j \Big] \ , \tag{3.236}
\]
where
\[
\tilde\varepsilon_{\rm A} = \varepsilon_{\rm A} - \tau \ell_{\rm A} \ , \qquad \tilde\varepsilon_{\rm B} = \varepsilon_{\rm B} - \tau \ell_{\rm B} \ . \tag{3.237}
\]
Once again, we have a set of \mathcal{N} noninteracting spins. The partition function is Z = \zeta^{\mathcal{N}}, where \zeta is the single monomer partition function,
\[
\zeta = \mathrm{Tr}\, e^{-\beta \hat h} = e^{-\beta \tilde\varepsilon_{\rm A}} + e^{-\beta \tilde\varepsilon_{\rm B}} \ , \tag{3.238}
\]
where
\[
\hat h = \tfrac{1}{2} \big( \tilde\varepsilon_{\rm A} + \tilde\varepsilon_{\rm B} \big) + \tfrac{1}{2} \big( \tilde\varepsilon_{\rm A} - \tilde\varepsilon_{\rm B} \big)\, \sigma \ , \tag{3.239}
\]
Figure 3.8: The monomers in wool are modeled as existing in one of two states. The low energy undeformed state is A, and the higher energy deformed state is B. Applying tension induces more monomers to enter the B state.
Figure 3.9: Upper panel: length L(\tau, T) for k_{\rm B} T/\Delta\varepsilon = 0.01 (blue), 0.1 (green), 0.5 (dark red), and 1.0 (red). Bottom panel: dimensionless force constant k/\mathcal{N}(\Delta\ell)^2 versus temperature.
is the single spin Hamiltonian. It is convenient to define the differences
\[
\Delta\varepsilon = \varepsilon_{\rm B} - \varepsilon_{\rm A} \ , \qquad \Delta\ell = \ell_{\rm B} - \ell_{\rm A} \ , \qquad \Delta\tilde\varepsilon = \tilde\varepsilon_{\rm B} - \tilde\varepsilon_{\rm A} = \Delta\varepsilon - \tau\, \Delta\ell \ , \tag{3.240, 3.241, 3.242}
\]
in which case the partition function Z is
\[
Z(T, \mathcal{N}) = e^{-\beta \tilde\varepsilon_{\rm A} \mathcal{N}} \Big( 1 + e^{-\beta\, \Delta\tilde\varepsilon} \Big)^{\!\mathcal{N}} \tag{3.243}
\]
\[
F(T, \mathcal{N}) = \mathcal{N} \tilde\varepsilon_{\rm A} - \mathcal{N} k_{\rm B} T \ln\!\Big( 1 + e^{-\Delta\tilde\varepsilon/k_{\rm B} T} \Big) \ . \tag{3.244}
\]
The average length is
\[
L = \langle \hat L \rangle = -\frac{\partial F}{\partial \tau} = \mathcal{N} \ell_{\rm A} + \frac{\mathcal{N}\, \Delta\ell}{e^{(\Delta\varepsilon - \tau \Delta\ell)/k_{\rm B} T} + 1} \ . \tag{3.245, 3.246}
\]
Note that
\[
k^{-1} = \left. \frac{\partial L}{\partial \tau} \right|_{\tau = 0} = \mathcal{N}\, \frac{(\Delta\ell)^2}{k_{\rm B} T}\; \frac{e^{\Delta\varepsilon/k_{\rm B} T}}{\big( e^{\Delta\varepsilon/k_{\rm B} T} + 1 \big)^2} \ , \tag{3.247}
\]
where k is the effective spring force constant for weak applied tension. The results are shown in fig. 3.9.
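The closed form for the compliance in eqn. 3.247 can be checked against a finite difference on the length of eqn. 3.246. A Python sketch (not from the text; reduced units \ell_{\rm A} = 0, \Delta\ell = \Delta\varepsilon = \mathcal{N} = k_{\rm B} = 1, arbitrary temperature):

```python
import math

# Average length L(tau) (eqn. 3.246) and zero-tension compliance 1/k (eqn. 3.247),
# in reduced units lA = 0, dl = de = N = kB = 1.
def L(tau, T):
    return 1.0/(math.exp((1.0 - tau)/T) + 1.0)

T = 0.3                                  # arbitrary illustrative temperature
dtau = 1e-6
k_inv_fd = (L(dtau, T) - L(-dtau, T))/(2*dtau)
x = math.exp(1.0/T)
k_inv = (1.0/T) * x/(x + 1.0)**2         # closed form, eqn. 3.247
print(k_inv_fd, k_inv)                   # agree
```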
3.11.5 Noninteracting spin dimers
Consider a system of noninteracting spin dimers as depicted in fig. 3.10. Each dimer contains two spins, and is described by the Hamiltonian
\[
\hat H_{\rm dimer} = -J \sigma_1 \sigma_2 - \mu_0 H \big( \sigma_1 + \sigma_2 \big) \ . \tag{3.248}
\]
Here, J is an interaction energy between the spins which comprise the dimer. If J > 0 the interaction is ferromagnetic, which prefers that the spins are aligned. That is, the lowest energy states are |\!\uparrow\uparrow\rangle and |\!\downarrow\downarrow\rangle. If J < 0 the interaction is antiferromagnetic, which prefers that spins be anti-aligned: |\!\uparrow\downarrow\rangle and |\!\downarrow\uparrow\rangle.⁶
Suppose there are N_{\rm d} dimers. Then the OCE partition function is Z = \zeta^{N_{\rm d}}, where \zeta(T, H) is the single dimer partition function. To obtain \zeta(T, H), we sum over the four possible states of the two spins, obtaining
\[
\zeta = \mathrm{Tr}\, e^{-\hat H_{\rm dimer}/k_{\rm B} T} = 2\, e^{-J/k_{\rm B} T} + 2\, e^{J/k_{\rm B} T} \cosh\!\left( \frac{2\mu_0 H}{k_{\rm B} T} \right) \ . \tag{3.249}
\]
Thus, the free energy is
\[
F(T, H, N_{\rm d}) = -N_{\rm d} k_{\rm B} T \ln 2 - N_{\rm d} k_{\rm B} T \ln\!\left[ e^{-J/k_{\rm B} T} + e^{J/k_{\rm B} T} \cosh\!\left( \frac{2\mu_0 H}{k_{\rm B} T} \right) \right] \ . \tag{3.250}
\]
The magnetization is
\[
M = -\left( \frac{\partial F}{\partial H} \right)_{\!T, N_{\rm d}} = 2 N_{\rm d}\, \mu_0 \cdot \frac{e^{J/k_{\rm B} T} \sinh\!\big( \frac{2\mu_0 H}{k_{\rm B} T} \big)}{e^{-J/k_{\rm B} T} + e^{J/k_{\rm B} T} \cosh\!\big( \frac{2\mu_0 H}{k_{\rm B} T} \big)} \ . \tag{3.251}
\]
It is instructive to consider the zero field isothermal susceptibility per spin,
\[
\chi_T = \frac{1}{2 N_{\rm d}} \left. \frac{\partial M}{\partial H} \right|_{H = 0} = \frac{\mu_0^2}{k_{\rm B} T} \cdot \frac{2\, e^{J/k_{\rm B} T}}{e^{J/k_{\rm B} T} + e^{-J/k_{\rm B} T}} \ . \tag{3.252}
\]
The quantity \mu_0^2/k_{\rm B} T is simply the Curie susceptibility for noninteracting classical spins. Note that we correctly recover the Curie result when J = 0, since then the individual spins comprising each dimer are in fact noninteracting.

⁶ Nota bene: we are concerned with classical spin configurations only; there is no superposition of states allowed in this model!

Figure 3.10: A model of noninteracting spin dimers on a lattice. Each red dot represents a classical spin for which \sigma_j = \pm 1.

For the ferromagnetic case, if J \gg k_{\rm B} T, then we obtain
\[
\chi_T\, \big( J \gg k_{\rm B} T \big) \simeq \frac{2\mu_0^2}{k_{\rm B} T} \ . \tag{3.253}
\]
This has the following simple interpretation. When J \gg k_{\rm B} T, the spins of each dimer are effectively locked in parallel. Thus, each dimer has an effective magnetic moment \mu_{\rm eff} = 2\mu_0. On the other hand, there are only half as many dimers as there are spins, so the resulting Curie susceptibility per spin is \tfrac{1}{2} \times (2\mu_0)^2/k_{\rm B} T.
When J \ll -k_{\rm B} T, the spins of each dimer are effectively locked in one of the two antiparallel configurations. We then have
\[
\chi_T\, \big( J \ll -k_{\rm B} T \big) \simeq \frac{2\mu_0^2}{k_{\rm B} T}\; e^{-2|J|/k_{\rm B} T} \ . \tag{3.254}
\]
In this case, the individual dimers have essentially zero magnetic moment.
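All three regimes of eqn. 3.252 can be verified in a few lines. A Python sketch (not from the text; per-spin susceptibility in units \mu_0 = k_{\rm B} = 1, with arbitrary illustrative couplings):

```python
import math

# chi_T per spin for a dimer (eqn. 3.252), units mu0 = kB = 1:
# chi = (1/T) * 2 e^{J/T} / (e^{J/T} + e^{-J/T}).
def chi(J, T):
    return (1.0/T) * 2*math.exp(J/T) / (math.exp(J/T) + math.exp(-J/T))

T = 1.0
print(chi(0.0, T))                        # = 1/T: bare Curie law (eqn. 3.220)
print(chi(8.0, T))                        # -> 2/T: locked ferromagnetic pairs
print(chi(-8.0, T), 2*math.exp(-16.0)/T)  # AF case: suppressed as e^{-2|J|/T}
```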
3.12 Quantum Statistics and the Boltzmann Limit
Consider a system composed of N noninteracting particles. The Hamiltonian is
\[
\hat H = \sum_{j=1}^{N} \hat h_j \ . \tag{3.255}
\]
The single particle Hamiltonian \hat h has eigenstates |\alpha\rangle with corresponding energy eigenvalues \varepsilon_\alpha. What is the partition function? Is it
\[
Z \stackrel{?}{=} \sum_{\alpha_1} \cdots \sum_{\alpha_N} e^{-\beta(\varepsilon_{\alpha_1} + \varepsilon_{\alpha_2} + \ldots + \varepsilon_{\alpha_N})} = \zeta^N \ , \tag{3.256}
\]
where \zeta is the single particle partition function,
\[
\zeta = \sum_\alpha e^{-\beta \varepsilon_\alpha} \ . \tag{3.257}
\]
For systems where the individual particles are distinguishable, such as spins on a lattice which have fixed positions, this is indeed correct. But for particles free to move in a gas, this equation is wrong. The reason is that for indistinguishable particles the many particle quantum mechanical states are specified by a collection of occupation numbers \{n_\alpha\}, which tell us how many particles are in the single-particle state |\alpha\rangle. The energy is
\[
E = \sum_\alpha n_\alpha\, \varepsilon_\alpha \tag{3.258}
\]
and the total number of particles is
\[
N = \sum_\alpha n_\alpha \ . \tag{3.259}
\]
That is, each collection of occupation numbers \{n_\alpha\} labels a unique many particle state \big| \{n_\alpha\} \big\rangle. In the product \zeta^N, the collection \{n_\alpha\} occurs many times. We have therefore overcounted the contribution to Z_N due to this state. By what factor have we overcounted? It is easy to see that the overcounting factor is
\[
\text{degree of overcounting} = \frac{N!}{\prod_\alpha n_\alpha!} \ ,
\]
which is the number of ways we can rearrange the labels \alpha_j to arrive at the same collection \{n_\alpha\}.
This follows from the multinomial theorem,
\[
\left( \sum_{\alpha=1}^{K} x_\alpha \right)^{\!N} = \sum_{n_1} \sum_{n_2} \cdots \sum_{n_K} \frac{N!}{n_1!\, n_2! \cdots n_K!}\; x_1^{n_1} x_2^{n_2} \cdots x_K^{n_K}\; \delta_{N,\, n_1 + \ldots + n_K} \ . \tag{3.260}
\]
Thus, the correct expression for Z_N is
\[
Z_N = \sum_{\alpha_1} \cdots \sum_{\alpha_N} \left( \frac{\prod_\alpha n_\alpha!}{N!} \right) e^{-\beta(\varepsilon_{\alpha_1} + \varepsilon_{\alpha_2} + \ldots + \varepsilon_{\alpha_N})} \ , \tag{3.261}
\]
where n_\alpha counts how many of the labels \alpha_1, \ldots, \alpha_N are equal to \alpha.
When we study quantum statistics, we shall learn how to handle these constrained sums. For now it suffices to note that in the high temperature limit, almost all the n_\alpha are either 0 or 1, hence
\[
Z_N \approx \frac{\zeta^N}{N!} \ . \tag{3.262}
\]
This is the classical Maxwell-Boltzmann limit of quantum statistical mechanics. We now see the origin of the 1/N! term which was so important in the thermodynamics of entropy of mixing.
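The quality of the Maxwell-Boltzmann approximation in eqn. 3.262 can be tested by brute force for small N, comparing \zeta^N/N! against the exact sum over occupation-number states. A Python sketch (not from the text; the level spacing and temperatures are arbitrary illustrative choices, and the exact sum here is the bosonic one, with each occupation configuration counted once):

```python
import math
from itertools import combinations_with_replacement

# Exact Z_N (one term per occupation-number state) vs. the Maxwell-Boltzmann
# approximation zeta^N / N!  (eqn. 3.262), for N = 2 particles on K levels.
K, N = 100, 2
eps = [0.05*a for a in range(K)]      # illustrative equally spaced levels

def compare(beta):
    zeta = sum(math.exp(-beta*e) for e in eps)
    Z_exact = sum(math.exp(-beta*sum(c))
                  for c in combinations_with_replacement(eps, N))
    return Z_exact / (zeta**N / math.factorial(N))

print(compare(5.0))    # low T: ratio noticeably above 1 (double occupancy matters)
print(compare(0.1))    # high T: ratio close to 1, as claimed
```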
3.13 Statistical Mechanics of Molecular Gases
The states of a noninteracting atom or molecule are labeled by its total momentum \mathbf{p} and its internal quantum numbers, which we will simply write with a collective index \alpha, specifying rotational, vibrational, and electronic degrees of freedom. The single particle Hamiltonian is then
\[
\hat h = \frac{\mathbf{p}^2}{2m} + \hat h_{\rm int} \ , \tag{3.263}
\]
with
\[
\hat h\, \big| \mathbf{k}, \alpha \big\rangle = \left( \frac{\hbar^2 k^2}{2m} + \varepsilon_\alpha \right) \big| \mathbf{k}, \alpha \big\rangle \ . \tag{3.264}
\]
The partition function is
\[
\zeta = \mathrm{Tr}\, e^{-\beta \hat h} = \sum_{\mathbf{p}} e^{-\beta p^2/2m} \sum_j g_j\, e^{-\beta \varepsilon_j} \ . \tag{3.265}
\]
Here we have replaced the internal label \alpha with a label j of energy eigenvalues, with g_j being the degeneracy of the internal state with energy \varepsilon_j. To do the \mathbf{p} sum, we quantize in a box of dimensions L_1 \times L_2 \times \cdots \times L_d, using periodic boundary conditions. Then
\[
\mathbf{p} = \left( \frac{2\pi\hbar n_1}{L_1},\; \frac{2\pi\hbar n_2}{L_2},\; \ldots,\; \frac{2\pi\hbar n_d}{L_d} \right) \ , \tag{3.266}
\]
where each n_i is an integer. Since the differences between neighboring quantized \mathbf{p} vectors are very tiny, we can replace the sum over \mathbf{p} by an integral:
\[
\sum_{\mathbf{p}} \longrightarrow \int \frac{d^d\!p}{\Delta p_1 \cdots \Delta p_d} \ , \tag{3.267}
\]
where the volume in momentum space of an elementary rectangle is
\[
\Delta p_1 \cdots \Delta p_d = \frac{(2\pi\hbar)^d}{L_1 \cdots L_d} = \frac{(2\pi\hbar)^d}{V} \ . \tag{3.268}
\]
Thus,
\[
\zeta = V \! \int \! \frac{d^d\!p}{(2\pi\hbar)^d}\; e^{-p^2/2m k_{\rm B} T} \sum_j g_j\, e^{-\varepsilon_j/k_{\rm B} T} = V \lambda_T^{-d}\, \xi(T) \ , \tag{3.269}
\]
where
\[
\xi(T) = \sum_j g_j\, e^{-\varepsilon_j/k_{\rm B} T} \ . \tag{3.270}
\]
Here, \xi(T) is the internal coordinate partition function. The full N-particle ordinary canonical partition function is then
\[
Z_N = \frac{1}{N!} \left( \frac{V}{\lambda_T^d} \right)^{\!N} \xi^N(T) \ . \tag{3.271}
\]
Using Stirling's approximation, we find the Helmholtz free energy F = -k_{\rm B} T \ln Z is
\[
F(T, V, N) = -N k_{\rm B} T \left[ \ln\!\left( \frac{V}{N \lambda_T^d} \right) + 1 + \ln \xi(T) \right] \tag{3.272}
\]
\[
\phantom{F(T, V, N)} = -N k_{\rm B} T \left[ \ln\!\left( \frac{V}{N \lambda_T^d} \right) + 1 \right] + N \varphi(T) \ , \tag{3.273}
\]
where
\[
\varphi(T) = -k_{\rm B} T \ln \xi(T) \tag{3.274}
\]
is the internal coordinate contribution to the single particle free energy.
compute the partition function in the Gibbs (T, p, N) ensemble:
Y (T, p, N) = e
G(T,p,N)
= p

_
0
dV e
pV
Z(T, V, N) (3.275)
=
_
k
B
T
p
d
T
_
N

N
(T) . (3.276)
Thus,
\[
\mu(T, p) = \frac{G(T, p, N)}{N} = k_{\rm B} T \ln\!\left( \frac{p\, \lambda_T^d}{k_{\rm B} T} \right) - k_{\rm B} T \ln \xi(T) \tag{3.277}
\]
\[
\phantom{\mu(T, p)} = k_{\rm B} T \ln\!\left( \frac{p\, \lambda_T^d}{k_{\rm B} T} \right) + \varphi(T) \ . \tag{3.278}
\]
3.13.1 Ideal gas law
Since the internal coordinate contribution to the free energy is volume-independent, we have
\[
V = \left( \frac{\partial G}{\partial p} \right)_{\!T,N} = \frac{N k_{\rm B} T}{p} \ , \tag{3.279}
\]
and the ideal gas law applies. The entropy is
\[
S = -\left( \frac{\partial G}{\partial T} \right)_{\!p,N} = N k_{\rm B} \left[ \ln\!\left( \frac{k_{\rm B} T}{p\, \lambda_T^d} \right) + 1 + \tfrac{1}{2} d \right] - N \varphi'(T) \ , \tag{3.280}
\]
and therefore the heat capacity is
\[
C_p = T \left( \frac{\partial S}{\partial T} \right)_{\!p,N} = \left( \tfrac{1}{2} d + 1 \right) N k_{\rm B} - N T\, \varphi''(T) \tag{3.281}
\]
\[
C_V = T \left( \frac{\partial S}{\partial T} \right)_{\!V,N} = \tfrac{1}{2} d\, N k_{\rm B} - N T\, \varphi''(T) \ . \tag{3.282}
\]
Thus, any temperature variation in C_p must be due to the internal degrees of freedom.
3.13.2 The internal coordinate partition function
At energy scales of interest we can separate the internal degrees of freedom into distinct classes, writing
\[
\hat h_{\rm int} = \hat h_{\rm rot} + \hat h_{\rm vib} + \hat h_{\rm elec} \tag{3.283}
\]
as a sum over internal Hamiltonians governing rotational, vibrational, and electronic degrees of freedom. Then
\[
\xi_{\rm int} = \xi_{\rm rot} \cdot \xi_{\rm vib} \cdot \xi_{\rm elec} \ . \tag{3.284}
\]
Associated with each class of excitation is a characteristic temperature \Theta. Rotational and vibrational temperatures of a few common molecules are listed in table 3.1.
3.13.3 Rotations
Consider a class of molecules which can be approximated as an axisymmetric top. The rotational Hamiltonian is then
\[
\hat h_{\rm rot} = \frac{L_a^2 + L_b^2}{2 I_1} + \frac{L_c^2}{2 I_3} = \frac{\hbar^2 L(L+1)}{2 I_1} + \left( \frac{1}{2 I_3} - \frac{1}{2 I_1} \right) L_c^2 \ , \tag{3.285}
\]
where \hat n_{a,b,c}(t) are the principal axes, with \hat n_c the symmetry axis, and L_{a,b,c} are the components of the angular momentum vector \mathbf{L} about these instantaneous body-fixed principal axes. The components of \mathbf{L} along space-fixed axes x, y, z are written as L_{x,y,z}. Note that
\[
\big[ L_\mu,\, L_c \big] = n_c^\alpha \big[ L_\mu,\, L_\alpha \big] + \big[ L_\mu,\, n_c^\alpha \big]\, L_\alpha
= i\hbar\, \epsilon_{\mu\alpha\beta}\, n_c^\alpha L_\beta + i\hbar\, \epsilon_{\mu\alpha\beta}\, n_c^\beta L_\alpha = 0 \ , \tag{3.286}
\]
which is equivalent to the statement that L_c = \hat n_c \cdot \mathbf{L} is a rotational scalar. We can therefore simultaneously specify the eigenvalues of \{\mathbf{L}^2, L_z, L_c\}, which form a complete set of commuting observables (CSCO)⁷. The eigenvalues of L_z are m\hbar with m \in \{-L, \ldots, L\}, while those of L_c are k\hbar with k \in \{-L, \ldots, L\}. There is a (2L+1)-fold degeneracy associated with the L_z quantum number.
We assume the molecule is prolate, so that I_3 < I_1. We can then define two temperature scales,
\[
\Theta = \frac{\hbar^2}{2 I_1 k_{\rm B}} \ , \qquad \widetilde\Theta = \frac{\hbar^2}{2 I_3 k_{\rm B}} \ . \tag{3.287}
\]
Prolateness then means \widetilde\Theta > \Theta. We conclude that the rotational partition function for an axisymmetric molecule is given by
\[
\xi_{\rm rot}(T) = \sum_{L=0}^{\infty} (2L+1)\; e^{-L(L+1)\, \Theta/T} \sum_{k=-L}^{L} e^{-k^2 (\widetilde\Theta - \Theta)/T} \ . \tag{3.288}
\]
molecule    Θ_rot (K)           Θ_vib (K)
H2          85.4                6100
N2          2.86                3340
H2O         13.7, 21.0, 39.4    2290, 5180, 5400
Table 3.1: Some rotational and vibrational temperatures of common molecules.
In diatomic molecules, I_3 is extremely small, and \widetilde\Theta \gg T at all relevant temperatures. Only the k = 0 term contributes to the partition sum, and we have
\[
\xi_{\rm rot}(T) = \sum_{L=0}^{\infty} (2L+1)\; e^{-L(L+1)\, \Theta/T} \ . \tag{3.289}
\]
When T \ll \Theta, only the first few terms contribute, and
\[
\xi_{\rm rot}(T) = 1 + 3\, e^{-2\Theta/T} + 5\, e^{-6\Theta/T} + \ldots \tag{3.290}
\]
In the high temperature limit, we have a slowly varying summand. The Euler-MacLaurin summation formula may be used to evaluate such a series:
\[
\sum_{k=0}^{n} F_k = \int_0^n \! dk\; F(k) + \tfrac{1}{2} \big[ F(0) + F(n) \big] + \sum_{j=1}^{\infty} \frac{B_{2j}}{(2j)!} \Big[ F^{(2j-1)}(n) - F^{(2j-1)}(0) \Big] \ , \tag{3.291}
\]
where B_j is the j^{\rm th} Bernoulli number, with
\[
B_0 = 1 \ , \quad B_1 = -\tfrac{1}{2} \ , \quad B_2 = \tfrac{1}{6} \ , \quad B_4 = -\tfrac{1}{30} \ , \quad B_6 = \tfrac{1}{42} \ . \tag{3.292}
\]
Thus,
\[
\sum_{k=0}^{\infty} F_k = \int_0^\infty \! dk\; F(k) + \tfrac{1}{2} F(0) - \tfrac{1}{12} F'(0) + \tfrac{1}{720} F'''(0) + \ldots \tag{3.293}
\]
and, applying this to F(L) = (2L+1)\, e^{-L(L+1)\Theta/T},
\[
\xi_{\rm rot} = \int_0^\infty \! dL\; (2L+1)\; e^{-L(L+1)\Theta/T} + \ldots = \frac{T}{\Theta} + \frac{1}{3} + \frac{1}{15}\, \frac{\Theta}{T} + \frac{4}{315} \left( \frac{\Theta}{T} \right)^{\!2} + \ldots \ . \tag{3.294}
\]
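The high-temperature expansion of eqn. 3.294 can be compared directly against the partition sum of eqn. 3.289. A Python sketch (not from the text; the sampled values of T/\Theta are arbitrary):

```python
import math

# Direct sum for xi_rot (eqn. 3.289) versus the high-temperature expansion
# xi_rot ~ T/Theta + 1/3 + (1/15)(Theta/T) + (4/315)(Theta/T)^2  (eqn. 3.294).
def xi_direct(t):                     # t = T / Theta
    return sum((2*L + 1)*math.exp(-L*(L + 1)/t) for L in range(2000))

def xi_series(t):
    return t + 1.0/3.0 + 1.0/(15.0*t) + 4.0/(315.0*t**2)

for t in (2.0, 5.0, 20.0):
    print(t, xi_direct(t), xi_series(t))   # close agreement already at t ~ 2
```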
Recall that \varphi(T) = -k_{\rm B} T \ln \xi(T). We conclude that \varphi_{\rm rot}(T) \approx -3 k_{\rm B} T\, e^{-2\Theta/T} for T \ll \Theta, and \varphi_{\rm rot}(T) \approx -k_{\rm B} T \ln(T/\Theta) for T \gg \Theta. We have seen that the internal coordinate contribution to the heat capacity is \Delta C_V = -N T\, \varphi''(T). For diatomic molecules, then, this contribution is exponentially suppressed for T \ll \Theta, while for high temperatures we have \Delta C_V = N k_{\rm B}. One says that the rotational excitations are frozen out at temperatures much below \Theta. Including the first few terms, we have

⁷ Note that while we cannot simultaneously specify the eigenvalues of two components of \mathbf{L} along axes fixed in space, we can simultaneously specify the components of \mathbf{L} along one axis fixed in space and one axis rotating with the body. See Landau and Lifshitz, Quantum Mechanics, §103.
much below . Including the rst few terms, we have
C
V
(T ) = 12 Nk
B
_

T
_
2
e
2/T
+. . . (3.295)
C
V
(T ) = Nk
B
_
1 +
1
45
_

T
_
2
+
16
945
_

T
_
3
+. . .
_
. (3.296)
Note that C_V overshoots its limiting value of N k_{\rm B} and asymptotically approaches it from above.
Special care must be taken in the case of homonuclear diatomic molecules, for then only even or odd L states are allowed, depending on the total nuclear spin. This is discussed below in §3.13.6.
For polyatomic molecules, the moments of inertia generally are large enough that the molecule's rotations can be considered classically. We then have
\[
\varepsilon(L_a, L_b, L_c) = \frac{L_a^2}{2 I_1} + \frac{L_b^2}{2 I_2} + \frac{L_c^2}{2 I_3} \ . \tag{3.297}
\]
We then have
\[
\xi_{\rm rot}(T) = \frac{1}{g_{\rm rot}} \int \frac{dL_a\, dL_b\, dL_c\; d\phi\, d\theta\, d\psi}{(2\pi\hbar)^3}\; e^{-\varepsilon(L_a, L_b, L_c)/k_{\rm B} T} \ , \tag{3.298}
\]
where (\phi, \theta, \psi) are the Euler angles. Recall \phi \in [0, 2\pi], \theta \in [0, \pi], and \psi \in [0, 2\pi]. The factor g_{\rm rot} accounts for physically indistinguishable orientations of the molecule brought about by rotations, which can happen when more than one of the nuclei is the same. We then have
\[
\xi_{\rm rot}(T) = \left( \frac{2 k_{\rm B} T}{\hbar^2} \right)^{\!3/2} \sqrt{\pi\, I_1 I_2 I_3} \ . \tag{3.299}
\]
This leads to C_V = \tfrac{3}{2} N k_{\rm B}.
3.13.4 Vibrations
Vibrational frequencies are often given in units of inverse wavelength, such as cm⁻¹, called a wavenumber. To convert to a temperature scale T^*, we write k_{\rm B} T^* = h\nu = hc/\lambda, hence T^* = (hc/k_{\rm B})\, \lambda^{-1}, and we multiply by
\[
\frac{hc}{k_{\rm B}} = 1.436\ \mathrm{K \cdot cm} \ . \tag{3.300}
\]
For example, infrared absorption (\lambda^{-1} \approx 50\ \mathrm{cm^{-1}} to 10^4\ \mathrm{cm^{-1}}) reveals that the asymmetric stretch mode of the H_2O molecule has a vibrational frequency of \nu = 3756\ \mathrm{cm^{-1}}. The corresponding temperature scale is T^* = 5394\ \mathrm{K}.
Vibrations are normal modes of oscillations. A single normal mode Hamiltonian is of the form
\[
\hat h = \frac{p^2}{2m} + \tfrac{1}{2} m \omega^2 q^2 = \hbar\omega \left( a^\dagger a + \tfrac{1}{2} \right) \ . \tag{3.301}
\]
In general there are many vibrational modes, hence many normal mode frequencies \omega_\alpha. We then must sum over all of them, resulting in
\[
\xi_{\rm vib} = \prod_\alpha \xi_{\rm vib}^{(\alpha)} \ . \tag{3.302}
\]
For each such normal mode, the contribution is
\[
\xi = \sum_{n=0}^{\infty} e^{-(n + \frac{1}{2}) \hbar\omega/k_{\rm B} T} = e^{-\hbar\omega/2 k_{\rm B} T} \sum_{n=0}^{\infty} \Big( e^{-\hbar\omega/k_{\rm B} T} \Big)^{\!n}
= \frac{e^{-\hbar\omega/2 k_{\rm B} T}}{1 - e^{-\hbar\omega/k_{\rm B} T}} = \frac{1}{2 \sinh(\Theta/2T)} \ , \tag{3.303}
\]
where \Theta = \hbar\omega/k_{\rm B}. Then
\[
\varphi = k_{\rm B} T \ln\!\big[ 2 \sinh(\Theta/2T) \big] = \tfrac{1}{2} k_{\rm B} \Theta + k_{\rm B} T \ln\!\big( 1 - e^{-\Theta/T} \big) \ . \tag{3.304}
\]
The contribution to the heat capacity is
\[
C_V = N k_{\rm B} \left( \frac{\Theta}{T} \right)^{\!2} \frac{e^{\Theta/T}}{(e^{\Theta/T} - 1)^2} \tag{3.305}
\]
\[
\phantom{C_V} = \begin{cases} N k_{\rm B}\, (\Theta/T)^2 \exp(-\Theta/T) & (T \to 0) \\[1ex] N k_{\rm B} & (T \to \infty) \end{cases} \tag{3.306}
\]
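Both limits of eqn. 3.306 are visible in a two-line numerical sketch (not from the text; per-molecule heat capacity in units of k_{\rm B}, as a function of the reduced temperature t = T/\Theta):

```python
import math

# Heat capacity of one vibrational mode (eqn. 3.305), per molecule, units of kB.
def c_vib(t):                 # t = T / Theta
    x = 1.0/t
    return x*x*math.exp(x)/(math.exp(x) - 1.0)**2

print(c_vib(0.05))            # ~ 0: mode frozen out for T << Theta
print(c_vib(50.0))            # ~ 1: classical equipartition value kB
```

The crossover between the two limits occurs around T \sim \Theta, which for typical molecular vibrations (table 3.1) lies far above room temperature.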
3.13.5 Two-level systems : Schottky anomaly
Consider now a two-level system, with energies \varepsilon_0 and \varepsilon_1. We define \Delta \equiv \varepsilon_1 - \varepsilon_0 and assume without loss of generality that \Delta > 0. The partition function is
\[
\zeta = e^{-\beta \varepsilon_0} + e^{-\beta \varepsilon_1} = e^{-\beta \varepsilon_0} \big( 1 + e^{-\beta \Delta} \big) \ . \tag{3.307}
\]
The free energy is
\[
f = -k_{\rm B} T \ln \zeta = \varepsilon_0 - k_{\rm B} T \ln\!\big( 1 + e^{-\Delta/k_{\rm B} T} \big) \ . \tag{3.308}
\]
The entropy for a given two level system is then
\[
s = -\frac{\partial f}{\partial T} = k_{\rm B} \ln\!\big( 1 + e^{-\Delta/k_{\rm B} T} \big) + \frac{\Delta}{T} \cdot \frac{1}{e^{\Delta/k_{\rm B} T} + 1} \ , \tag{3.309}
\]
and the heat capacity is c = T\, (\partial s/\partial T), i.e.
\[
c(T) = \frac{\Delta^2}{k_{\rm B} T^2} \cdot \frac{e^{\Delta/k_{\rm B} T}}{\big( e^{\Delta/k_{\rm B} T} + 1 \big)^2} \ . \tag{3.310}
\]
Figure 3.11: Heat capacity per molecule as a function of temperature for (a) heteronuclear diatomic gases, (b) a single vibrational mode, and (c) a single two-level system.
Thus,
\[
c\,(T \ll \Delta/k_{\rm B}) = \frac{\Delta^2}{k_{\rm B} T^2}\; e^{-\Delta/k_{\rm B} T} \tag{3.311}
\]
\[
c\,(T \gg \Delta/k_{\rm B}) = \frac{\Delta^2}{4\, k_{\rm B} T^2} \ . \tag{3.312}
\]
We find that c(T) has a characteristic peak at T^* \approx 0.42\, \Delta/k_{\rm B}. The heat capacity vanishes in both the low temperature and high temperature limits. At low temperatures, the gap to the excited state is much greater than k_{\rm B} T, and it is not possible to populate it and store energy. At high temperatures, both ground state and excited state are equally populated, and once again there is no way to store energy.
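The location of the Schottky peak quoted above is easy to recover numerically from eqn. 3.310. A Python sketch (not from the text; a simple grid scan in the reduced temperature t = k_{\rm B} T/\Delta):

```python
import math

# Schottky heat capacity c = kB x^2 e^x / (e^x + 1)^2 with x = Delta / kB T
# (eqn. 3.310); locate its peak in t = kB T / Delta by a grid scan.
def c(t):                         # heat capacity in units of kB
    x = 1.0/t
    return x*x*math.exp(x)/(math.exp(x) + 1.0)**2

ts = [0.001*i for i in range(100, 1000)]
t_peak = max(ts, key=c)
print(t_peak)                     # ~ 0.42, as quoted in the text
```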
If we have a distribution of independent two-level systems, the heat capacity of such a system is a sum over the individual Schottky functions:
\[
C(T) = \sum_i \tilde c\,(\Delta_i/k_{\rm B} T) = \mathcal{N} \int_0^\infty \! d\Delta\; P(\Delta)\; \tilde c\,(\Delta/k_{\rm B} T) \ , \tag{3.313}
\]
where \tilde c(x) = k_{\rm B}\, x^2\, e^x/(e^x + 1)^2 and where P(\Delta) is the normalized distribution function, with
\[
\int_0^\infty \! d\Delta\; P(\Delta) = 1 \ . \tag{3.314}
\]
^ is the total number of two level systems. If P()
r
for 0, then the low
temperature heat capacity behaves as C(T) T
1+r
. Many amorphous or glassy systems
contain such a distribution of two level systems, with r 0 for glasses, leading to a linear
low-temperature heat capacity. The origin of these two-level systems is not always so clear
but is generally believed to be associated with local atomic configurations for which there are two low-lying states which are close in energy. The paradigmatic example is the mixed crystalline solid (KBr)_{1-x}(KCN)_x, which over the range 0.1 \lesssim x \lesssim 0.6 forms an 'orientational glass' at low temperatures. The two level systems are associated with different orientations of the cyanide (CN) dipoles.
3.13.6 Electronic and nuclear excitations
For a monatomic gas, the internal coordinate partition function arises due to electronic and nuclear degrees of freedom. Let's first consider the electronic degrees of freedom. We assume that k_B T is small compared with energy differences between successive electronic shells. The atomic ground state is then computed by filling up the hydrogenic orbitals until all the electrons are used up. If the atomic number is a 'magic number' (A = 2 (He), 10 (Ne), 18 (Ar), 36 (Kr), 54 (Xe), etc.) then the atom has all shells filled and L = 0 and S = 0. Otherwise the last shell is partially filled and one or both of L and S will be nonzero. The atomic ground state configuration ^{2S+1}L_J is then determined by Hund's rules:
1. The LS multiplet with the largest S has the lowest energy.
2. If the largest value of S is associated with several multiplets, the multiplet with the largest L has the lowest energy.
3. If an incomplete shell is not more than half-filled, then the lowest energy state has J = |L - S|. If the shell is more than half-filled, then J = L + S.
The last of Hund's rules distinguishes between the (2S+1)(2L+1) states which result upon fixing S and L as per rules #1 and #2. It arises due to the atomic spin-orbit coupling, whose effective Hamiltonian may be written \hat H = \Lambda\, \mathbf{L}\cdot\mathbf{S}, where \Lambda is the Russell-Saunders coupling. If the last shell is less than or equal to half-filled, then \Lambda > 0 and the ground state has J = |L - S|. If the last shell is more than half-filled, the coupling is inverted, i.e. \Lambda < 0, and the ground state has J = L + S.^8
The electronic contribution to \xi is then
\xi_{elec} = \sum_{J=|L-S|}^{L+S} (2J+1)\, e^{-\Delta\varepsilon(L,S,J)/k_B T} (3.315)
where
\Delta\varepsilon(L,S,J) = \tfrac{1}{2}\,\Lambda\,\big[ J(J+1) - L(L+1) - S(S+1) \big] . (3.316)
At high temperatures, k_B T is larger than the energy difference between the different J multiplets, and we have \xi_{elec} \simeq (2L+1)(2S+1)\, e^{-\beta\varepsilon_0}, where \varepsilon_0 is the ground state energy. At low temperatures, a particular value of J is selected, namely that determined by Hund's third rule, and we have \xi_{elec} \simeq (2J+1)\, e^{-\beta\varepsilon_0}. If, in addition, there is a nonzero nuclear spin I, then we also must include a factor \xi_{nuc} = (2I+1), neglecting the small hyperfine splittings due to the coupling of nuclear and electronic angular momenta.
^8 See e.g. §72 of Landau and Lifshitz, Quantum Mechanics, which, in my humble estimation, is the greatest physics book ever written.
For heteronuclear diatomic molecules, i.e. molecules composed from two different atomic nuclei, the internal partition function simply receives a factor of \xi_{elec} \cdot \xi^{(1)}_{nuc} \cdot \xi^{(2)}_{nuc}, where the first term is a sum over molecular electronic states, and the second two terms arise from the spin degeneracies of the two nuclei. For homonuclear diatomic molecules, the exchange of nuclear centers is a symmetry operation, and does not represent a distinct quantum state. To correctly count the electronic states, we first assume that the total electronic spin is S = 0. This is generally a very safe assumption. Exchange symmetry now puts restrictions on the possible values of the molecular angular momentum L, depending on the total nuclear angular momentum I_{tot}. If I_{tot} is even, then the molecular angular momentum L must also be even. If the total nuclear angular momentum is odd, then L must be odd. This is so because the molecular ground state configuration is {}^1\Sigma^+_g.^9
The total number of nuclear states for the molecule is (2I+1)^2, of which some are even under nuclear exchange, and some are odd. The number of even states, corresponding to even total nuclear angular momentum, is written as g_g, where the subscript conventionally stands for the (mercifully short) German word gerade, meaning even. The number of odd (Ger. ungerade) states is written g_u. Table 3.2 gives the values of g_{g,u} corresponding to half-odd-integer I and integer I.
The final answer for the rotational component of the internal molecular partition function is then
\xi_{rot}(T) = g_g\, \zeta_g + g_u\, \zeta_u , (3.317)
where
\zeta_g = \sum_{L\ \mathrm{even}} (2L+1)\, e^{-L(L+1)\,\Theta/T} (3.318)
\zeta_u = \sum_{L\ \mathrm{odd}} (2L+1)\, e^{-L(L+1)\,\Theta/T} . (3.319)
For hydrogen, the molecules with the larger nuclear statistical weight are called orthohydrogen and those with the smaller statistical weight are called parahydrogen. For H_2, we have I = \frac{1}{2}, hence the ortho state has g_u = 3 and the para state has g_g = 1. In D_2, we have I = 1 and the ortho state has g_g = 6 while the para state has g_u = 3. In equilibrium, the ratio of ortho to para states is then
\frac{N^{ortho}_{H_2}}{N^{para}_{H_2}} = \frac{g_u\, \zeta_u}{g_g\, \zeta_g} = \frac{3\, \zeta_u}{\zeta_g} , \qquad \frac{N^{ortho}_{D_2}}{N^{para}_{D_2}} = \frac{g_g\, \zeta_g}{g_u\, \zeta_u} = \frac{2\, \zeta_g}{\zeta_u} . (3.320)
^9 See Landau and Lifshitz, Quantum Mechanics, §86.
2I+1      g_g              g_u
odd       I(2I+1)          (I+1)(2I+1)
even      (I+1)(2I+1)      I(2I+1)
Table 3.2: Number of even (g_g) and odd (g_u) total nuclear angular momentum states for a homonuclear diatomic molecule. I is the ground state nuclear spin.
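The equilibrium ortho/para ratio of eqn. 3.320 can be evaluated numerically from the sums in eqns. 3.318 and 3.319; a sketch follows, where the rotational temperature \Theta \approx 85.4 K for H_2 is an assumed input value, not taken from the text:

```python
import numpy as np

def zeta(theta_over_T, parity, Lmax=200):
    """Rotational sums zeta_g (parity=0, even L) and zeta_u (parity=1, odd L),
    eqs. 3.318-3.319, truncated at Lmax (amply converged here)."""
    L = np.arange(parity, Lmax, 2)
    return np.sum((2*L + 1) * np.exp(-L*(L + 1)*theta_over_T))

def ortho_para_H2(T, theta=85.4):
    """Equilibrium N_ortho/N_para for H2 (eq. 3.320); theta in kelvin
    is an assumed value for the rotational temperature of H2."""
    return 3.0 * zeta(theta/T, 1) / zeta(theta/T, 0)

print(ortho_para_H2(1000.0))   # high T: approaches the 3:1 statistical ratio
print(ortho_para_H2(20.0))     # low T: para (L = 0) wins out
```

At high temperature \zeta_g \approx \zeta_u, so the ratio tends to g_u/g_g = 3; at low temperature only the L = 0 (para) state survives.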
3.14 Dissociation of Molecular Hydrogen
Consider the reaction
H \rightleftharpoons p^+ + e^- . (3.321)
In equilibrium, we have
\mu_H = \mu_p + \mu_e . (3.322)
What is the relationship between the temperature T and the fraction x of hydrogen which is dissociated?
Let us assume a fraction x of the hydrogen is dissociated. Then the densities of H, p, and e are
n_H = (1-x)\, n , \qquad n_p = x\, n , \qquad n_e = x\, n . (3.323)
The partition function for each species is
\zeta = \frac{g^N}{N!} \left(\frac{V}{\lambda_T^3}\right)^{\!N} e^{-N \varepsilon_{int}/k_B T} , (3.324)
where g is the degeneracy and \varepsilon_{int} the internal energy for a given species. We have \varepsilon_{int} = 0 for p and e, and \varepsilon_{int} = -\Delta for H, where \Delta = e^2/2a_B = 13.6 eV is the binding energy of hydrogen. Neglecting hyperfine splittings^{10}, we have g_H = 4, while g_e = g_p = 2 because each has spin S = \frac{1}{2}. Thus, the associated grand potentials are
\Omega_H = -g_H\, V k_B T\, \lambda_{T,H}^{-3}\, e^{(\mu_H + \Delta)/k_B T} (3.325)
\Omega_p = -g_p\, V k_B T\, \lambda_{T,p}^{-3}\, e^{\mu_p/k_B T} (3.326)
\Omega_e = -g_e\, V k_B T\, \lambda_{T,e}^{-3}\, e^{\mu_e/k_B T} , (3.327)
where
\lambda_{T,a} = \sqrt{\frac{2\pi\hbar^2}{m_a k_B T}} (3.328)
for species a. The corresponding number densities are
n = -\frac{1}{V}\left(\frac{\partial \Omega}{\partial \mu}\right)_{\!T,V} = g\, \lambda_T^{-3}\, e^{(\mu - \varepsilon_{int})/k_B T} , (3.329)
^{10} The hyperfine splitting in hydrogen is on the order of (m_e/m_p)\, \alpha^4\, m_e c^2 \sim 10^{-6} eV, which is on the order of 0.01 K. Here \alpha = e^2/\hbar c is the fine structure constant.
and the fugacity z = e^{\mu/k_B T} of a given species is given by
z = g^{-1}\, n\, \lambda_T^3\, e^{\varepsilon_{int}/k_B T} . (3.330)
We now invoke \mu_H = \mu_p + \mu_e, which says z_H = z_p\, z_e, or
g_H^{-1}\, n_H\, \lambda_{T,H}^3\, e^{-\Delta/k_B T} = \Big( g_p^{-1}\, n_p\, \lambda_{T,p}^3 \Big)\Big( g_e^{-1}\, n_e\, \lambda_{T,e}^3 \Big) , (3.331)
which yields
\left(\frac{x^2}{1-x}\right) n\, \tilde\lambda_T^3 = e^{-\Delta/k_B T} , (3.332)
where \tilde\lambda_T = \sqrt{2\pi\hbar^2/\tilde m\, k_B T}, with \tilde m = m_p m_e/m_H \approx m_e. Note that
\tilde\lambda_T = a_B\, \sqrt{\frac{4\pi\, m_H}{m_p}}\, \sqrt{\frac{\Delta}{k_B T}} , (3.333)
where a_B = 0.529 Å is the Bohr radius. Thus, we have
\nu \left(\frac{x^2}{1-x}\right) (4\pi)^{3/2} = \left(\frac{T}{T_0}\right)^{\!3/2} e^{-T_0/T} , (3.334)
where T_0 = \Delta/k_B = 1.578 \times 10^5 K and \nu = n a_B^3. Consider for example a temperature T = 3000 K, for which T_0/T = 52.6, and assume that x = \frac{1}{2}. We then find \nu = 1.69 \times 10^{-27}, corresponding to a density of n = 1.14 \times 10^{-2} cm^{-3}. At this temperature, the fraction of hydrogen atoms in their first excited (2s) state is x' \sim e^{-T_0/2T} = 3.8 \times 10^{-12}. This is quite striking: half the hydrogen atoms are completely dissociated, which requires an energy of \Delta, yet the number in their first excited state, requiring energy \frac{1}{2}\Delta, is twelve orders of magnitude smaller. The student should reflect on why this can be the case.
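The quoted numbers are easy to reproduce from eqn. 3.334; a minimal sketch, using only the values of T_0 and a_B given in the text:

```python
import numpy as np

T0 = 1.578e5            # Delta/k_B in kelvin (from the text)
a_B = 0.529e-8          # Bohr radius in cm (from the text)

def nu_of_T(T, x=0.5):
    """Solve eq. 3.334 for nu = n a_B^3 at dissociated fraction x."""
    rhs = (T/T0)**1.5 * np.exp(-T0/T)
    return rhs / ((x**2/(1.0 - x)) * (4.0*np.pi)**1.5)

nu = nu_of_T(3000.0)
n = nu / a_B**3          # number density in cm^-3
print(nu, n)             # ~1.7e-27 and ~1.1e-2 cm^-3
```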
3.15 Lee-Yang Theory
How can statistical mechanics describe phase transitions? This question was addressed in some beautiful mathematical analysis by Lee and Yang^{11}. Consider the grand partition function \Xi,
\Xi(T,V,z) = \sum_{N=0}^\infty z^N\, \frac{Q_N(T,V)}{\lambda_T^{dN}} . (3.335)
Suppose further that these classical particles have hard cores. Then for any finite volume, there must be some maximum number N_V such that Q_N(T,V) vanishes for N > N_V. This is because if N > N_V at least two spheres must overlap, in which case the potential energy is infinite. The theoretical maximum packing density for hard spheres is achieved for a
^{11} See C. N. Yang and T. D. Lee, Phys. Rev. 87, 404 (1952) and ibid. p. 410.
hexagonal close packed (HCP) lattice^{12}, for which f_{HCP} = \frac{\pi}{3\sqrt{2}} = 0.74048. If the spheres have radius r_0, then N_V = V/4\sqrt{2}\, r_0^3 is the maximum particle number.
Figure 3.12: In the thermodynamic limit, the grand partition function can develop a singularity at positive real fugacity z. The set of discrete zeros fuses into a branch cut.
Thus, if V itself is finite, then \Xi(T,V,z) is a finite degree polynomial in z, and may be factorized as
\Xi(T,V,z) = \sum_{N=0}^{N_V} z^N\, \frac{Q_N(T,V)}{\lambda_T^{dN}} = \prod_{k=1}^{N_V} \left(1 - \frac{z}{z_k}\right) , (3.336)
where z_k(T,V) is one of the N_V zeros of the grand partition function. Note that the O(z^0) term is fixed to be unity. Note also that since the configuration integrals Q_N(T,V) are all positive, \Xi(z) is an increasing function along the positive real z axis. In addition, since the coefficients of z^N in the polynomial \Xi(z) are all real, \Xi(z) = 0 implies \Xi(\bar z) = \bar\Xi(z) = 0, so the zeros of \Xi(z) are either real and negative or else come in complex conjugate pairs.
For finite N_V, the situation is roughly as depicted in the left panel of fig. 3.12, with a set of N_V zeros arranged in complex conjugate pairs (or negative real values). The zeros aren't necessarily distributed along a circle as shown in the figure, though. They could be anywhere, so long as they are symmetrically distributed about the Re(z) axis, and no zeros occur for z real and nonnegative.
^{12} See e.g. http://en.wikipedia.org/wiki/Close-packing . For randomly close-packed hard spheres, one finds, from numerical simulations, f_{RCP} = 0.644.
Lee and Yang proved the existence of the limits
\frac{p}{k_B T} = \lim_{V\to\infty} \frac{1}{V}\, \ln \Xi(T,V,z) (3.337)
n = \lim_{V\to\infty} z\, \frac{\partial}{\partial z} \left[ \frac{1}{V}\, \ln \Xi(T,V,z) \right] , (3.338)
and notably the result
n = z\, \frac{\partial}{\partial z} \left( \frac{p}{k_B T} \right) , (3.339)
which amounts to the commutativity of the thermodynamic limit V \to \infty with the differential operator z\, \partial/\partial z. In particular, p(T,z) is a smooth function of z in regions free of roots. If the roots do coalesce and pinch the positive real axis, then the density n can be discontinuous, as in a first order phase transition, or a higher derivative \partial^j p/\partial n^j can be discontinuous or divergent, as in a second order phase transition.
3.15.1 Electrostatic analogy
There is a beautiful analogy to the theory of two-dimensional electrostatics. We write
\frac{p}{k_B T} = \frac{1}{V} \sum_{k=1}^{N_V} \ln\!\left(1 - \frac{z}{z_k}\right) = \sum_{k=1}^{N_V} \Big[ \phi(z - z_k) - \phi(0 - z_k) \Big] , (3.340)
where
\phi(z) = \frac{1}{V}\, \ln(z) (3.341)
is the complex potential due to a line charge of linear density V^{-1} located at the origin. The number density is then
n = z\, \frac{\partial}{\partial z} \left( \frac{p}{k_B T} \right) = z\, \frac{\partial}{\partial z} \sum_{k=1}^{N_V} \phi(z - z_k) , (3.342)
to be evaluated for physical values of z, i.e. z \in \mathbb{R}_+. Since \phi(z) is analytic,
\frac{\partial \phi}{\partial \bar z} = \frac{1}{2}\, \frac{\partial \phi}{\partial x} + \frac{i}{2}\, \frac{\partial \phi}{\partial y} = 0 . (3.343)
If we decompose the complex potential \phi = \phi_1 + i\phi_2 into real and imaginary parts, the condition of analyticity is recast as the Cauchy-Riemann equations,
\frac{\partial \phi_1}{\partial x} = \frac{\partial \phi_2}{\partial y} , \qquad \frac{\partial \phi_1}{\partial y} = -\frac{\partial \phi_2}{\partial x} . (3.344)
Thus,
\frac{\partial \phi}{\partial z} = \frac{1}{2}\, \frac{\partial \phi}{\partial x} - \frac{i}{2}\, \frac{\partial \phi}{\partial y} = \frac{1}{2}\left( \frac{\partial \phi_1}{\partial x} + \frac{\partial \phi_2}{\partial y} \right) + \frac{i}{2}\left( \frac{\partial \phi_2}{\partial x} - \frac{\partial \phi_1}{\partial y} \right) = \frac{\partial \phi_1}{\partial x} - i\, \frac{\partial \phi_1}{\partial y} = E_x - i E_y , (3.345)
where E = \nabla \phi_1 is the electric field. Suppose, then, that as V \to \infty a continuous charge distribution develops, which crosses the positive real z axis at a point x \in \mathbb{R}_+. Then
\frac{n_+ - n_-}{x} = E_x(x^+) - E_x(x^-) = 4\pi\sigma(x) , (3.346)
where \sigma is the linear charge density (assuming logarithmic two-dimensional potentials), or the two-dimensional charge density (if we extend the distribution along a third axis).
3.15.2 Example
As an example, consider the function
\Xi(z) = \frac{(1+z)^M\, (1 - z^M)}{1 - z} (3.347)
= (1+z)^M \left( 1 + z + z^2 + \ldots + z^{M-1} \right) . (3.348)
The (2M-1) degree polynomial has an M^{th} order zero at z = -1 and (M-1) simple zeros at z = e^{2\pi i k/M}, where k \in \{1, \ldots, M-1\}. Since M serves as the maximum particle number N_V, we may assume that V = M v_0, and the V \to \infty limit may be taken as M \to \infty. We then have
\frac{p}{k_B T} = \lim_{V\to\infty} \frac{1}{V}\, \ln \Xi(z) = \frac{1}{v_0} \lim_{M\to\infty} \frac{1}{M}\, \ln \Xi(z) = \frac{1}{v_0} \lim_{M\to\infty} \frac{1}{M} \Big[ M \ln(1+z) + \ln\big(1 - z^M\big) - \ln(1-z) \Big] . (3.349)
The limit depends on whether |z| > 1 or |z| < 1, and we obtain
\frac{p\, v_0}{k_B T} = \begin{cases} \ln(1+z) & \text{if } |z| < 1 \\ \ln(1+z) + \ln z & \text{if } |z| > 1 . \end{cases} (3.350)
Figure 3.13: Fugacity z and p\,v_0/k_B T versus dimensionless specific volume v/v_0 for the example problem discussed in the text.
Thus,
n = z\, \frac{\partial}{\partial z} \left( \frac{p}{k_B T} \right) = \begin{cases} \dfrac{1}{v_0} \cdot \dfrac{z}{1+z} & \text{if } |z| < 1 \\[6pt] \dfrac{1}{v_0} \cdot \left( \dfrac{z}{1+z} + 1 \right) & \text{if } |z| > 1 . \end{cases} (3.351)
If we solve for z(v), where v = n^{-1}, we find
z = \begin{cases} \dfrac{v_0}{v - v_0} & \text{if } v > 2 v_0 \\[6pt] \dfrac{v_0 - v}{2v - v_0} & \text{if } \tfrac{1}{2} v_0 < v < \tfrac{2}{3} v_0 . \end{cases} (3.352)
We then obtain the equation of state,
\frac{p\, v_0}{k_B T} = \begin{cases} \ln\!\left( \dfrac{v}{v - v_0} \right) & \text{if } v > 2 v_0 \\[6pt] \ln 2 & \text{if } \tfrac{2}{3} v_0 < v < 2 v_0 \\[6pt] \ln\!\left( \dfrac{v\, (v_0 - v)}{(2v - v_0)^2} \right) & \text{if } \tfrac{1}{2} v_0 < v < \tfrac{2}{3} v_0 . \end{cases} (3.353)
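Convergence of the finite-M pressure to the thermodynamic limit of eqn. 3.350 is easy to verify on either side of the unit circle; a sketch with v_0 = 1:

```python
import numpy as np

def p_finite(z, M):
    """(1/M) ln Xi for Xi(z) = (1+z)^M (1 + z + ... + z^(M-1)), eqs. 3.347-3.348."""
    s = np.sum(np.power(float(z), np.arange(M)))   # geometric series, summed directly
    return np.log1p(z) + np.log(s) / M

def p_limit(z):
    """Thermodynamic limit M -> infinity, eq. 3.350 (p v0 / k_B T)."""
    return np.log1p(z) if z < 1 else np.log1p(z) + np.log(z)

for z in (0.5, 1.5):
    print(z, p_finite(z, 400), p_limit(z))
```

The finite-size correction is O(1/M), consistent with the branch cut forming only as M \to \infty.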
3.16 Appendix I : Additional Examples
3.16.1 Three state system
Consider a spin-1 particle where \sigma = -1, 0, +1. We model this with the single particle Hamiltonian
\hat h = -\mu_0 H\, \sigma + \Delta\, (1 - \sigma^2) . (3.354)
We can also interpret this as describing a spin if \sigma = \pm 1 and a vacancy if \sigma = 0. The parameter \Delta then represents the vacancy formation energy. The single particle partition function is
\zeta = \mathrm{Tr}\; e^{-\beta \hat h} = e^{-\beta\Delta} + 2\cosh(\beta\mu_0 H) . (3.355)
With \mathcal{N} distinguishable noninteracting spins (e.g. at different sites in a crystalline lattice), we have Z = \zeta^{\mathcal{N}} and
F \equiv \mathcal{N} f = -k_B T \ln Z = -\mathcal{N} k_B T \ln\!\Big[ e^{-\beta\Delta} + 2\cosh(\beta\mu_0 H) \Big] , (3.356)
where f = -k_B T \ln\zeta is the free energy of a single particle. Note that
n_V = 1 - \sigma^2 = \frac{\partial \hat h}{\partial \Delta} (3.357)
m = \mu_0\, \sigma = -\frac{\partial \hat h}{\partial H} (3.358)
are the vacancy number and magnetization, respectively. Thus,
n_V = \big\langle n_V \big\rangle = \frac{\partial f}{\partial \Delta} = \frac{e^{-\Delta/k_B T}}{e^{-\Delta/k_B T} + 2\cosh(\mu_0 H/k_B T)} (3.359)
and
m = \big\langle m \big\rangle = -\frac{\partial f}{\partial H} = \frac{2\mu_0 \sinh(\mu_0 H/k_B T)}{e^{-\Delta/k_B T} + 2\cosh(\mu_0 H/k_B T)} . (3.360)
At weak fields we can compute
\chi_T = \frac{\partial m}{\partial H}\bigg|_{H=0} = \frac{\mu_0^2}{k_B T} \cdot \frac{2}{2 + e^{-\Delta/k_B T}} . (3.361)
We thus obtain a modified Curie law. At temperatures T \ll \Delta/k_B, the vacancies are frozen out and we recover the usual Curie behavior. At high temperatures, where T \gg \Delta/k_B, the low temperature result is reduced by a factor of \frac{2}{3}, which accounts for the fact that one third of the time the particle is in a nonmagnetic state with \sigma = 0.
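Both limits of the modified Curie law in eqn. 3.361 can be checked by differentiating eqn. 3.360 numerically; a sketch with k_B = \mu_0 = \Delta = 1:

```python
import numpy as np

def m_avg(H, T, mu0=1.0, Delta=1.0):
    """Magnetization per site, eq. 3.360, with k_B = 1."""
    return 2*mu0*np.sinh(mu0*H/T) / (np.exp(-Delta/T) + 2*np.cosh(mu0*H/T))

def chi(T, h=1e-6):
    """Zero-field susceptibility via a symmetric finite difference."""
    return (m_avg(h, T) - m_avg(-h, T)) / (2*h)

# chi*T -> mu0^2 = 1 for T << Delta (Curie), and -> 2/3 for T >> Delta
print(chi(0.02)*0.02, chi(50.0)*50.0)
```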
3.16.2 Spins and vacancies on a surface
PROBLEM: A collection of spin-\frac{1}{2} particles is confined to a surface with N sites. For each site, let \sigma = 0 if there is a vacancy, \sigma = +1 if there is a particle present with spin up, and \sigma = -1 if there is a particle present with spin down. The particles are non-interacting, and the energy for each site is given by \varepsilon = -W\sigma^2, where W < 0 is the binding energy.
(a) Let Q = N_\uparrow + N_\downarrow be the number of spins, and N_0 be the number of vacancies. The surface magnetization is M = N_\uparrow - N_\downarrow. Compute, in the microcanonical ensemble, the statistical entropy S(Q,M).
(b) Let q = Q/N and m = M/N be the dimensionless particle density and magnetization density, respectively. Assuming that we are in the thermodynamic limit, where N, Q, and M all tend to infinity, but with q and m finite, find the temperature T(q,m). Recall Stirling's formula
\ln(N!) = N \ln N - N + O(\ln N) .
(c) Show explicitly that T can be negative for this system. What does negative T mean? What physical degrees of freedom have been left out that would avoid this strange property?
SOLUTION: There is a constraint on N_\uparrow, N_0, and N_\downarrow:
N_\uparrow + N_0 + N_\downarrow = Q + N_0 = N .
The total energy of the system is E = -WQ.
(a) The number of states available to the system is
\Omega = \frac{N!}{N_\uparrow!\; N_0!\; N_\downarrow!} .
Fixing Q and M, along with the above constraint, is enough to completely determine \{N_\uparrow, N_0, N_\downarrow\}:
N_\uparrow = \tfrac{1}{2}(Q+M) , \qquad N_0 = N - Q , \qquad N_\downarrow = \tfrac{1}{2}(Q-M) ,
whence
\Omega(Q,M) = \frac{N!}{\big[\tfrac{1}{2}(Q+M)\big]!\; \big[\tfrac{1}{2}(Q-M)\big]!\; (N-Q)!} .
The statistical entropy is S = k_B \ln\Omega:
S(Q,M) = k_B \ln(N!) - k_B \ln\!\Big[\big(\tfrac{1}{2}(Q+M)\big)!\Big] - k_B \ln\!\Big[\big(\tfrac{1}{2}(Q-M)\big)!\Big] - k_B \ln\!\big[(N-Q)!\big] .
(b) Now we invoke Stirling's rule,
\ln(N!) = N \ln N - N + O(\ln N) ,
to obtain
\ln \Omega(Q,M) = N \ln N - N - \tfrac{1}{2}(Q+M) \ln\!\big[\tfrac{1}{2}(Q+M)\big] + \tfrac{1}{2}(Q+M) - \tfrac{1}{2}(Q-M) \ln\!\big[\tfrac{1}{2}(Q-M)\big] + \tfrac{1}{2}(Q-M) - (N-Q) \ln(N-Q) + (N-Q)
= N \ln N - \tfrac{1}{2}\, Q \ln\!\big[\tfrac{1}{4}(Q^2 - M^2)\big] - \tfrac{1}{2}\, M \ln\!\left(\frac{Q+M}{Q-M}\right) - (N-Q) \ln(N-Q)
= -N q \ln\!\Big[\tfrac{1}{2}\sqrt{q^2 - m^2}\Big] - \tfrac{1}{2}\, N m \ln\!\left(\frac{q+m}{q-m}\right) - N (1-q) \ln(1-q) ,
where Q = Nq and M = Nm. Note that the entropy S = k_B \ln\Omega is extensive. The statistical entropy per site is thus
s(q,m) = -k_B\, q \ln\!\Big[\tfrac{1}{2}\sqrt{q^2 - m^2}\Big] - \tfrac{1}{2}\, k_B\, m \ln\!\left(\frac{q+m}{q-m}\right) - k_B (1-q) \ln(1-q) .
The temperature is obtained from the relation
\frac{1}{T} = \left(\frac{\partial S}{\partial E}\right)_{\!M} = -\frac{1}{W} \left(\frac{\partial s}{\partial q}\right)_{\!m} = -\frac{k_B}{W} \ln(1-q) + \frac{k_B}{W} \ln\!\Big[\tfrac{1}{2}\sqrt{q^2 - m^2}\Big] .
Thus,
T = \frac{-W/k_B}{\ln\!\Big[ 2(1-q)\big/\sqrt{q^2 - m^2}\, \Big]} .
(c) We have 0 \le q \le 1 and -q \le m \le q, so T is real (thank heavens!). But it is easy to choose q, m such that T < 0. For example, when m = 0 we have T = -W/\big[k_B \ln(2q^{-1} - 2)\big], and since W < 0 this gives T < 0 for all q \in \big(\tfrac{2}{3}, 1\big). The reason for this strange state of affairs is that the entropy S is bounded, and is not a monotonically increasing function of the energy E (or the dimensionless quantity Q). The entropy is maximized for N_\uparrow = N_0 = N_\downarrow = \tfrac{1}{3} N, which says m = 0 and q = \tfrac{2}{3}. Increasing q beyond this point (with m = 0 fixed) starts to reduce the entropy, and hence (\partial S/\partial E) < 0 in this range, which immediately gives T < 0. What we've left out are kinetic degrees of freedom, such as vibrations and rotations, whose energies are unbounded, and which result in an increasing S(E) function.
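The entropy maximum at q = 2/3 and the sign change of T can be verified directly from the formulas of part (b); a sketch (k_B = 1, W = -1):

```python
import numpy as np

def s(q, m=0.0):
    """Entropy per site in units of k_B, from part (b); needs |m| < q < 1."""
    out = -q*np.log(0.5*np.sqrt(q*q - m*m)) - (1.0 - q)*np.log(1.0 - q)
    if m != 0.0:
        out -= 0.5*m*np.log((q + m)/(q - m))
    return out

def T_of_q(q, m=0.0, W=-1.0):
    """T = -(W/k_B) / ln[2(1-q)/sqrt(q^2 - m^2)] from part (b), with W < 0."""
    return -W / np.log(2.0*(1.0 - q)/np.sqrt(q*q - m*m))

print(s(0.5), s(2.0/3.0), s(0.9))   # entropy is maximal at q = 2/3 (s = ln 3)
print(T_of_q(0.5), T_of_q(0.9))     # T > 0 below q = 2/3, T < 0 above
```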
3.16.3 Fluctuating interface
Consider an interface between two dissimilar fluids. In equilibrium, in a uniform gravitational field, the denser fluid is on the bottom. Let z = z(x,y) be the height of the interface between the fluids, relative to equilibrium. The potential energy is a sum of gravitational and surface tension terms, with
U_{grav} = \int\! d^2x \int_0^{z}\! dz'\; \rho\, g\, z' (3.362)
U_{surf} = \int\! d^2x\; \tfrac{1}{2}\, \sigma\, (\nabla z)^2 . (3.363)
We won't need the kinetic energy in our calculations, but we can include it just for completeness. It isn't so clear how to model it a priori, so we will assume a rather general form
T = \int\! d^2x \int\! d^2x'\; \tfrac{1}{2}\, \mu(x, x')\; \frac{\partial z(x,t)}{\partial t}\; \frac{\partial z(x',t)}{\partial t} . (3.364)
We assume that the (x,y) plane is a rectangle of dimensions L_x \times L_y. We also assume \mu(x, x') = \mu\big(|x - x'|\big). We can then Fourier transform
z(x) = \big(L_x L_y\big)^{-1/2} \sum_k z_k\, e^{i k \cdot x} , (3.365)
where the wavevectors k are quantized according to
k = \frac{2\pi n_x}{L_x}\, \hat x + \frac{2\pi n_y}{L_y}\, \hat y , (3.366)
with integer n_x and n_y, if we impose periodic boundary conditions (for calculational convenience). The Lagrangian is then
L = \frac{1}{2} \sum_k \Big[ \mu_k\, |\dot z_k|^2 - \big(\rho g + \sigma k^2\big)\, |z_k|^2 \Big] , (3.367)
where
\mu_k = \int\! d^2x\; \mu\big(|x|\big)\, e^{-i k \cdot x} . (3.368)
Since z(x,t) is real, we have the relation z^*_k = z_{-k}, therefore the Fourier coefficients at k and -k are not independent. The canonical momenta are given by
p_k = \frac{\partial L}{\partial \dot z^*_k} = \mu_k\, \dot z_k , \qquad p^*_k = \frac{\partial L}{\partial \dot z_k} = \mu_k\, \dot z^*_k . (3.369)
The Hamiltonian is then
\hat H = {\sum_k}' \Big( p_k\, \dot z^*_k + p^*_k\, \dot z_k \Big) - L (3.370)
= {\sum_k}' \left[ \frac{|p_k|^2}{\mu_k} + \big(\rho g + \sigma k^2\big)\, |z_k|^2 \right] , (3.371)
where the prime on the k sum indicates that only one of the pair \{k, -k\} is to be included, for each k.
We may now compute the ordinary canonical partition function:
Z = {\prod_k}' \int \frac{d^2p_k\; d^2z_k}{(2\pi\hbar)^2}\; e^{-|p_k|^2/\mu_k k_B T}\; e^{-(\rho g + \sigma k^2)\, |z_k|^2/k_B T} = {\prod_k}' \left(\frac{k_B T}{2\hbar}\right)^{\!2} \left(\frac{\mu_k}{\rho g + \sigma k^2}\right) . (3.372)
Thus,
F = -k_B T \sum_k \ln\!\left(\frac{k_B T}{2\hbar\omega_k}\right) , (3.373)
where^{13}
\omega_k = \left(\frac{\rho g + \sigma k^2}{\mu_k}\right)^{\!1/2} (3.374)
is the normal mode frequency for surface oscillations at wavevector k. For deep water waves, it is appropriate to take \mu_k = \rho\,/|k|, where \rho = \rho_L - \rho_G \approx \rho_L is the difference between the densities of water and air.
It is now easy to compute the thermal average
\big\langle |z_k|^2 \big\rangle = \int\! d^2z_k\; |z_k|^2\; e^{-(\rho g + \sigma k^2)\, |z_k|^2/k_B T} \bigg/ \int\! d^2z_k\; e^{-(\rho g + \sigma k^2)\, |z_k|^2/k_B T} (3.375)
= \frac{k_B T}{\rho g + \sigma k^2} . (3.376)
Note that this result does not depend on \mu_k, i.e. on our choice of kinetic energy. One defines the correlation function
C(x) \equiv \big\langle z(x)\, z(0) \big\rangle = \frac{1}{L_x L_y} \sum_k \big\langle |z_k|^2 \big\rangle\, e^{i k \cdot x} = \int\! \frac{d^2k}{(2\pi)^2} \left(\frac{k_B T}{\rho g + \sigma k^2}\right) e^{i k \cdot x}
= \frac{k_B T}{4\pi\sigma} \int_0^\infty\! dq\; \frac{e^{i q |x|}}{\sqrt{q^2 + \xi^{-2}}} = \frac{k_B T}{4\pi\sigma}\; K_0\big(|x|/\xi\big) , (3.377)
where \xi = \sqrt{\sigma/g\rho} is the correlation length, and where K_0(z) is the Bessel function of imaginary argument. The asymptotic behavior of K_0(z) for small z is K_0(z) \sim \ln(2/z), whereas for large z one has K_0(z) \sim (\pi/2z)^{1/2}\, e^{-z}. We see that on large length scales the correlations decay exponentially, but on small length scales they diverge. This divergence is due to the improper energetics we have assigned to short wavelength fluctuations of the interface. Roughly, it can be cured by imposing a cutoff on the integral, or by insisting that the shortest distance scale is a molecular diameter.
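The two asymptotic forms of K_0 quoted above can be checked numerically; the sketch below computes K_0 from the standard integral representation K_0(z) = \int_0^\infty e^{-z\cosh t}\, dt (a textbook identity, not given in the notes):

```python
import numpy as np

def K0(z, tmax=40.0, n=200001):
    """Modified Bessel function K_0(z) via the integral representation
    K_0(z) = int_0^infinity exp(-z cosh t) dt, by the trapezoid rule."""
    t = np.linspace(0.0, tmax, n)
    y = np.exp(-z*np.cosh(t))
    return (t[1] - t[0]) * (y.sum() - 0.5*(y[0] + y[-1]))

# short distance: K_0(z) ~ ln(2/z) (log divergence of C(x));
# long distance:  K_0(z) ~ sqrt(pi/2z) e^{-z} (exponential decay)
print(K0(1e-3), np.log(2.0/1e-3))
print(K0(20.0), np.sqrt(np.pi/40.0)*np.exp(-20.0))
```

The small-z comparison agrees only up to the Euler constant, since ln(2/z) is itself just the leading logarithm.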
3.17 Appendix II : Canonical Transformations in Hamiltonian Mechanics
The Euler-Lagrange equations of motion of classical mechanics are invariant under a redefinition of generalized coordinates,
Q_\sigma = Q_\sigma(q_1, \ldots, q_r, t) , (3.378)
called a point transformation. That is, if we express the new Lagrangian in terms of the new coordinates and their time derivatives, viz.
\tilde L\big(Q, \dot Q, t\big) = L\Big( q(Q,t)\, ,\; \dot q(Q, \dot Q, t)\, ,\; t \Big) , (3.379)
^{13} Note that there is no prime on the k sum for F, as we have divided the logarithm of Z by two and replaced the half sum by the whole sum.
then the equations of motion remain of the form
\frac{\partial \tilde L}{\partial Q_\sigma} = \frac{d}{dt} \left( \frac{\partial \tilde L}{\partial \dot Q_\sigma} \right) . (3.380)
Hamilton's equations,
\dot q_\sigma = \frac{\partial H}{\partial p_\sigma} , \qquad \dot p_\sigma = -\frac{\partial H}{\partial q_\sigma} (3.381)
are invariant under a much broader class of transformations which mix all the q_\sigma's and p_\sigma's, called canonical transformations. The general form for a canonical transformation is
q_\sigma = q_\sigma\big( Q_1, \ldots, Q_r\, ,\; P_1, \ldots, P_r\, ,\; t \big) (3.382)
p_\sigma = p_\sigma\big( Q_1, \ldots, Q_r\, ,\; P_1, \ldots, P_r\, ,\; t \big) , (3.383)
with \sigma \in \{1, \ldots, r\}. We may also write
\Xi_i = \Xi_i\big( \xi_1, \ldots, \xi_{2r}\, ,\; t \big) , (3.384)
with i \in \{1, \ldots, 2r\}. Here we have
\xi_i = \begin{cases} q_i & \text{if } 1 \le i \le r \\ p_{i-r} & \text{if } r < i \le 2r \end{cases} , \qquad \Xi_i = \begin{cases} Q_i & \text{if } 1 \le i \le r \\ P_{i-r} & \text{if } r < i \le 2r . \end{cases} (3.385)
The transformed Hamiltonian is \tilde H(Q, P, t).
What sorts of transformations are allowed? Well, if Hamilton's equations are to remain invariant, then
\dot Q_\sigma = \frac{\partial \tilde H}{\partial P_\sigma} , \qquad \dot P_\sigma = -\frac{\partial \tilde H}{\partial Q_\sigma} , (3.386)
which gives
\frac{\partial \dot Q_\sigma}{\partial Q_\sigma} + \frac{\partial \dot P_\sigma}{\partial P_\sigma} = 0 = \frac{\partial \dot \Xi_i}{\partial \Xi_i} . (3.387)
I.e. the flow remains incompressible in the new (Q,P) variables. We will also require that phase space volumes are preserved by the transformation, i.e.
\det\!\left( \frac{\partial \Xi_i}{\partial \xi_j} \right) = \left| \frac{\partial(Q,P)}{\partial(q,p)} \right| = 1 . (3.388)
This last condition guarantees the invariance of the phase space measure
d\mu = h^{-r} \prod_{\sigma=1}^r dq_\sigma\, dp_\sigma , (3.389)
where h in the normalization prefactor is Planck's constant.
Chapter 4
Noninteracting Quantum Systems
4.1 References
F. Reif, Fundamentals of Statistical and Thermal Physics (McGraw-Hill, 1987)
This has been perhaps the most popular undergraduate text since it first appeared in 1967, and with good reason.
A. H. Carter, Classical and Statistical Thermodynamics (Benjamin Cummings, 2000)
A very relaxed treatment appropriate for undergraduate physics majors.
D. V. Schroeder, An Introduction to Thermal Physics (Addison-Wesley, 2000)
This is the best undergraduate thermodynamics book I've come across, but only 40% of the book treats statistical mechanics.
C. Kittel, Elementary Statistical Physics (Dover, 2004)
Remarkably crisp, though dated, this text is organized as a series of brief discussions of key concepts and examples. Published by Dover, so you can't beat the price.
R. K. Pathria, Statistical Mechanics (2nd edition, Butterworth-Heinemann, 1996)
This popular graduate level text contains many detailed derivations which are helpful for the student.
M. Plischke and B. Bergersen, Equilibrium Statistical Physics (3rd edition, World Scientific, 2006)
An excellent graduate level text. Less insightful than Kardar but still a good modern treatment of the subject. Good discussion of mean field theory.
E. M. Lifshitz and L. P. Pitaevskii, Statistical Physics (part I, 3rd edition, Pergamon, 1980)
This is volume 5 in the famous Landau and Lifshitz Course of Theoretical Physics. Though dated, it still contains a wealth of information and physical insight.
4.2 Grand Canonical Ensemble for Quantum Systems
A noninteracting many-particle quantum Hamiltonian may be written as
\hat H = \sum_\alpha \hat n_\alpha\, \varepsilon_\alpha , (4.1)
where \hat n_\alpha is the number of particles in the quantum state \alpha with energy \varepsilon_\alpha. This form is called the second quantized representation of the Hamiltonian. The number eigenbasis is therefore also an energy eigenbasis. Any eigenstate of \hat H may be labeled by the integer eigenvalues of the \hat n_\alpha number operators, and written as \big| n_1, n_2, \ldots \big\rangle. We then have
\hat n_\alpha\, \big| \vec n\, \big\rangle = n_\alpha\, \big| \vec n\, \big\rangle (4.2)
and
\hat H\, \big| \vec n\, \big\rangle = \sum_\alpha n_\alpha \varepsilon_\alpha\, \big| \vec n\, \big\rangle . (4.3)
The eigenvalues n_\alpha take on different possible values depending on whether the constituent particles are bosons or fermions, viz.
bosons : n_\alpha \in \big\{ 0, 1, 2, 3, \ldots \big\} (4.4)
fermions : n_\alpha \in \big\{ 0, 1 \big\} . (4.5)
In other words, for bosons, the occupation numbers are nonnegative integers. For fermions, the occupation numbers are either 0 or 1 due to the Pauli principle, which says that at most one fermion can occupy any single particle quantum state. There is no Pauli principle for bosons.
The N-particle partition function Z_N is then
Z_N = \sum_{\{n_\alpha\}} \delta_{N,\, \sum_\alpha n_\alpha}\; e^{-\beta \sum_\alpha n_\alpha \varepsilon_\alpha} , (4.6)
where the sum is over all allowed values of the set \{n_\alpha\}, which depends on the statistics of the particles. Bosons satisfy Bose-Einstein (BE) statistics, in which n_\alpha \in \{0, 1, 2, \ldots\}. Fermions satisfy Fermi-Dirac (FD) statistics, in which n_\alpha \in \{0, 1\}.
The OCE partition sum is difficult to perform, owing to the constraint \sum_\alpha n_\alpha = N on the total number of particles. This constraint is relaxed in the GCE, where
\Xi = \sum_N e^{\beta\mu N}\, Z_N = \prod_\alpha \left[ \sum_{n_\alpha} e^{-\beta(\varepsilon_\alpha - \mu)\, n_\alpha} \right] . (4.7)
Note that the grand partition function takes the form of a product over contributions from the individual single particle states.
We now perform the single particle sums:
\sum_{n=0}^\infty e^{-\beta(\varepsilon - \mu)\, n} = \frac{1}{1 - e^{-\beta(\varepsilon - \mu)}} \qquad \text{(bosons)} (4.8)
\sum_{n=0}^1 e^{-\beta(\varepsilon - \mu)\, n} = 1 + e^{-\beta(\varepsilon - \mu)} \qquad \text{(fermions)} . (4.9)
Therefore we have
\Xi_{BE} = \prod_\alpha \frac{1}{1 - e^{(\mu - \varepsilon_\alpha)/k_B T}} (4.10)
\Omega_{BE} = k_B T \sum_\alpha \ln\!\Big( 1 - e^{(\mu - \varepsilon_\alpha)/k_B T} \Big) (4.11)
and
\Xi_{FD} = \prod_\alpha \Big( 1 + e^{(\mu - \varepsilon_\alpha)/k_B T} \Big) (4.12)
\Omega_{FD} = -k_B T \sum_\alpha \ln\!\Big( 1 + e^{(\mu - \varepsilon_\alpha)/k_B T} \Big) . (4.13)
We can combine these expressions into one, writing
\Omega(T,V,\mu) = \pm k_B T \sum_\alpha \ln\!\Big( 1 \mp e^{(\mu - \varepsilon_\alpha)/k_B T} \Big) , (4.14)
where we take the upper sign for Bose-Einstein statistics and the lower sign for Fermi-Dirac statistics. Note that the average occupancy of single particle state \alpha is
\big\langle \hat n_\alpha \big\rangle = \frac{\partial \Omega}{\partial \varepsilon_\alpha} = \frac{1}{e^{(\varepsilon_\alpha - \mu)/k_B T} \mp 1} , (4.15)
and the total particle number is then
N(T,V,\mu) = \sum_\alpha \frac{1}{e^{(\varepsilon_\alpha - \mu)/k_B T} \mp 1} . (4.16)
We will henceforth write n_\alpha(\mu, T) = \langle \hat n_\alpha \rangle for the thermodynamic average of this occupancy.
4.2.1 Maxwell-Boltzmann limit
Note also that if n_\alpha(\mu, T) \ll 1 then \varepsilon_\alpha - \mu \gg k_B T, and
\Omega_{MB} = -k_B T \sum_\alpha e^{(\mu - \varepsilon_\alpha)/k_B T} . (4.17)
This is the Maxwell-Boltzmann limit of quantum statistical mechanics. The occupation number average is then
\big\langle \hat n_\alpha \big\rangle = e^{(\mu - \varepsilon_\alpha)/k_B T} (4.18)
in this limit.
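The convergence of Bose-Einstein and Fermi-Dirac occupancies to the Maxwell-Boltzmann limit when (\varepsilon - \mu) \gg k_B T is easy to see numerically; a minimal sketch:

```python
import numpy as np

def n_avg(x, kind):
    """Mean occupancy versus x = (eps - mu)/k_B T: Bose-Einstein or
    Fermi-Dirac (eq. 4.15), or the Maxwell-Boltzmann limit (eq. 4.18)."""
    if kind == 'BE':
        return 1.0/np.expm1(x)
    if kind == 'FD':
        return 1.0/(np.exp(x) + 1.0)
    return np.exp(-x)                  # 'MB'

# nondegenerate limit x >> 1: all three statistics agree;
# at x ~ 1 bosons are enhanced and fermions suppressed relative to MB
print(n_avg(10.0, 'BE'), n_avg(10.0, 'FD'), n_avg(10.0, 'MB'))
print(n_avg(1.0, 'BE'), n_avg(1.0, 'MB'), n_avg(1.0, 'FD'))
```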
4.2.2 Single particle density of states
The single particle density of states per unit volume g(\varepsilon) is defined as
g(\varepsilon) = \frac{1}{V} \sum_\alpha \delta(\varepsilon - \varepsilon_\alpha) . (4.19)
We can then write
\Omega(T,V,\mu) = \pm V k_B T \int_{-\infty}^\infty\! d\varepsilon\; g(\varepsilon)\, \ln\!\Big( 1 \mp e^{(\mu - \varepsilon)/k_B T} \Big) . (4.20)
For particles with a dispersion \varepsilon(k), with p = \hbar k, we have
g(\varepsilon) = g \int\! \frac{d^dk}{(2\pi)^d}\; \delta\big(\varepsilon - \varepsilon(k)\big) (4.21)
= \frac{g\, S_d}{(2\pi)^d}\; \frac{k^{d-1}}{d\varepsilon/dk} , (4.22)
where g = 2S+1 is the spin degeneracy and S_d = 2\pi^{d/2}/\Gamma(d/2) is the area of the unit sphere in d dimensions. Thus, we have
g(\varepsilon) = \frac{g\, S_d}{(2\pi)^d}\; \frac{k^{d-1}}{d\varepsilon/dk} = \begin{cases} \dfrac{g}{\pi}\, \dfrac{dk}{d\varepsilon} & d = 1 \\[4pt] \dfrac{g}{2\pi}\, k\, \dfrac{dk}{d\varepsilon} & d = 2 \\[4pt] \dfrac{g}{2\pi^2}\, k^2\, \dfrac{dk}{d\varepsilon} & d = 3 . \end{cases} (4.23)
In order to obtain g(\varepsilon) as a function of the energy one must invert the dispersion relation \varepsilon = \varepsilon(k) to obtain k = k(\varepsilon).
Note that we can equivalently write
g(\varepsilon)\, d\varepsilon = g\, \frac{d^dk}{(2\pi)^d} = \frac{g\, S_d}{(2\pi)^d}\; k^{d-1}\, dk (4.24)
to derive g(\varepsilon).
to derive g().
For a spin-S particle with ballistic dispersion (k) =
2
k
2
/2m, we have
g() =
2S+1
(d/2)
_
m
2
2
_
d/2

d
2
1
() , (4.25)
where () is the step function, which takes the value 0 for < 0 and 1 for 0. The
appearance of () simply says that all the single particle energy eigenvalues are nonnega-
tive. Note that we are assuming a box of volume V but we are ignoring the quantization of
kinetic energy, and assuming that the dierence between successive quantized single particle
4.3. QUANTUM IDEAL GASES : LOW DENSITY EXPANSIONS 199
energy eigenvalues is negligible so that g() can be replaced by the average in the above
expression. Note that
n(, T, ) =
1
e
()/k
B
T
1
. (4.26)
This result holds true independent of the form of g(). The average total number of particles
is then
N(T, V, ) = V

d g()
1
e
()/k
B
T
1
, (4.27)
which does depend on g().
4.3 Quantum Ideal Gases : Low Density Expansions
From eqn. 4.27, we have that the number density n = N/V is
n(T,z) = \int_{-\infty}^\infty\! d\varepsilon\; \frac{g(\varepsilon)}{z^{-1}\, e^{\varepsilon/k_B T} \mp 1} , (4.28)
where z = \exp(\mu/k_B T) is the fugacity. From \Omega = -pV and our expression above for \Omega(T,V,\mu), we have the equation of state
p(T,z) = \mp k_B T \int_{-\infty}^\infty\! d\varepsilon\; g(\varepsilon)\, \ln\!\Big( 1 \mp z\, e^{-\varepsilon/k_B T} \Big) . (4.29)
We define the integrated density of states H(\varepsilon) as
H(\varepsilon) \equiv \int_{-\infty}^\varepsilon\! d\varepsilon'\; g(\varepsilon') . (4.30)
Assuming a bounded spectrum, we have H(\varepsilon) = 0 for \varepsilon < \varepsilon_0, for some finite \varepsilon_0. For an ideal gas of spin-S particles, the integrated DOS is
H(\varepsilon) = \frac{2S+1}{\Gamma\!\big(1 + \frac{d}{2}\big)} \left( \frac{m}{2\pi\hbar^2} \right)^{\!d/2} \varepsilon^{d/2}\; \Theta(\varepsilon) . (4.31)
The pressure p(T,\mu) is thus given by
p(T,z) = \mp k_B T \int_{-\infty}^\infty\! d\varepsilon\; H'(\varepsilon)\, \ln\!\Big( 1 \mp z\, e^{-\varepsilon/k_B T} \Big) . (4.32)
Integrating by parts, we have^1
p(T,z) = \int_{-\infty}^\infty\! d\varepsilon\; \frac{H(\varepsilon)}{z^{-1}\, e^{\varepsilon/k_B T} \mp 1} . (4.33)
This last result can also be derived from eqn. 4.32 from the Gibbs-Duhem relation,
d\mu = -s\, dT + v\, dp \quad \Longrightarrow \quad n = v^{-1} = \left( \frac{\partial p}{\partial \mu} \right)_{\!T} = \frac{z}{k_B T} \left( \frac{\partial p}{\partial z} \right)_{\!T} . (4.34)
We now expand in powers of z, writing
\frac{1}{z^{-1}\, e^{\varepsilon/k_B T} \mp 1} = \frac{z\, e^{-\varepsilon/k_B T}}{1 \mp z\, e^{-\varepsilon/k_B T}} = z\, e^{-\varepsilon/k_B T} \pm z^2\, e^{-2\varepsilon/k_B T} + z^3\, e^{-3\varepsilon/k_B T} \pm \ldots = \sum_{j=1}^\infty (\pm 1)^{j-1}\, z^j\, e^{-j\varepsilon/k_B T} . (4.35)
We then have
n(T,z) = \sum_{j=1}^\infty (\pm 1)^{j-1}\, z^j\, C_j(T) (4.36)
p(T,z) = \sum_{j=1}^\infty (\pm 1)^{j-1}\, z^j\, D_j(T) , (4.37)
where the expansion coefficients are the following integral transforms of g(\varepsilon) and H(\varepsilon):
C_j(T) = \int_{-\infty}^\infty\! d\varepsilon\; g(\varepsilon)\, e^{-j\varepsilon/k_B T} (4.38)
D_j(T) = \int_{-\infty}^\infty\! d\varepsilon\; H(\varepsilon)\, e^{-j\varepsilon/k_B T} . (4.39)
The expansion coefficients C_j(T) all have dimensions of number density, and the coefficients D_j(T) all have dimensions of pressure. Note that we can integrate the first of these equations by parts, using g(\varepsilon) = H'(\varepsilon), to obtain
C_j(T) = \int_{-\infty}^\infty\! d\varepsilon\; \frac{d}{d\varepsilon}\big[ H(\varepsilon) \big]\, e^{-j\varepsilon/k_B T} = \frac{j}{k_B T} \int_{-\infty}^\infty\! d\varepsilon\; H(\varepsilon)\, e^{-j\varepsilon/k_B T} = \frac{j}{k_B T}\, D_j(T) . (4.40)
Thus, we can write
D_j(T) = \frac{1}{j}\, k_B T\, C_j(T) (4.41)
^1 As always, the integration by parts generates a total derivative term which is to be evaluated at the endpoints \varepsilon = \pm\infty. In our case, this term vanishes at the lower limit \varepsilon = -\infty because H(\varepsilon) is identically zero for \varepsilon < \varepsilon_0, and it vanishes at the upper limit because of the behavior of e^{-\varepsilon/k_B T}.
and
p(T,z) = k_B T \sum_{j=1}^\infty (\pm 1)^{j-1}\, j^{-1}\, z^j\, C_j(T) . (4.42)
4.3.1 Virial expansion of the equation of state
Eqns. 4.36 and 4.42 express n(T,z) and p(T,z) as power series in the fugacity z, with T-dependent coefficients. In principle, we can eliminate z using eqn. 4.36, writing z = z(T,n) as a power series in the number density n, and substitute this into eqn. 4.37 to obtain an equation of state p = p(T,n) of the form
p(T,n) = n\, k_B T \Big( 1 + B_2(T)\, n + B_3(T)\, n^2 + \ldots \Big) . (4.43)
Note that the low density limit n \to 0 yields the ideal gas law independent of the density of states g(\varepsilon). This follows from expanding n(T,z) and p(T,z) to lowest order in z, yielding n = C_1 z + O(z^2) and p = k_B T\, C_1 z + O(z^2). Dividing the second of these equations by the first yields p = n\, k_B T + O(n^2), which is the ideal gas law. Note that z = n/C_1 + O(n^2) can formally be written as a power series in n.
Unfortunately, there is no general analytic expression for the virial coefficients B_j(T) in terms of the expansion coefficients C_j(T). The only way is to grind things out order by order in our expansions. Let's roll up our sleeves and see how this is done. We start by formally writing z(T,n) as a power series in the density n with T-dependent coefficients A_j(T):
z = A_1\, n + A_2\, n^2 + A_3\, n^3 + \ldots . (4.44)
We then insert this into the series for n(T, z):

$$n = C_1\, z - C_2\, z^2 + C_3\, z^3 + \ldots$$
$$= C_1 \big( A_1 n + A_2 n^2 + A_3 n^3 + \ldots \big) - C_2 \big( A_1 n + A_2 n^2 + A_3 n^3 + \ldots \big)^2 + C_3 \big( A_1 n + A_2 n^2 + A_3 n^3 + \ldots \big)^3 + \ldots\,. \qquad (4.45)$$
Let's expand the RHS to order n³. Collecting terms, we have

$$n = C_1 A_1\, n + \big( C_1 A_2 - C_2 A_1^2 \big)\, n^2 + \big( C_1 A_3 - 2 C_2 A_1 A_2 + C_3 A_1^3 \big)\, n^3 + \ldots \qquad (4.46)$$

In order for this equation to be true we require that the coefficient of n on the RHS be unity, and that the coefficients of n^j for all j > 1 must vanish. Thus,
$$C_1 A_1 = 1 \qquad (4.47)$$
$$C_1 A_2 - C_2 A_1^2 = 0 \qquad (4.48)$$
$$C_1 A_3 - 2 C_2 A_1 A_2 + C_3 A_1^3 = 0\,. \qquad (4.49)$$

The first of these yields A₁:

$$A_1 = \frac{1}{C_1}\,. \qquad (4.50)$$
We now insert this into the second equation to obtain A₂:

$$A_2 = \frac{C_2}{C_1^3}\,. \qquad (4.51)$$
Next, insert the expressions for A₁ and A₂ into the third equation to obtain A₃:

$$A_3 = \frac{2 C_2^2}{C_1^5} - \frac{C_3}{C_1^4}\,. \qquad (4.52)$$
This procedure rapidly gets tedious! And we're only halfway done. We still must express p in terms of n:
$$\frac{p}{k_B T} = C_1 \big( A_1 n + A_2 n^2 + A_3 n^3 + \ldots \big) - \tfrac{1}{2}\, C_2 \big( A_1 n + A_2 n^2 + A_3 n^3 + \ldots \big)^2 + \tfrac{1}{3}\, C_3 \big( A_1 n + A_2 n^2 + A_3 n^3 + \ldots \big)^3 + \ldots \qquad (4.53)$$
$$= C_1 A_1\, n + \big( C_1 A_2 - \tfrac{1}{2}\, C_2 A_1^2 \big)\, n^2 + \big( C_1 A_3 - C_2 A_1 A_2 + \tfrac{1}{3}\, C_3 A_1^3 \big)\, n^3 + \ldots \qquad (4.54)$$
$$\equiv n + B_2\, n^2 + B_3\, n^3 + \ldots \qquad (4.55)$$
We can now write

$$B_2 = C_1 A_2 - \tfrac{1}{2}\, C_2 A_1^2 = \frac{C_2}{2 C_1^2} \qquad (4.56)$$

$$B_3 = C_1 A_3 - C_2 A_1 A_2 + \tfrac{1}{3}\, C_3 A_1^3 = \frac{C_2^2}{C_1^4} - \frac{2\, C_3}{3\, C_1^3}\,. \qquad (4.57)$$
It is easy to derive the general result that B_j^F = (−1)^{j−1} B_j^B, where the superscripts denote Fermi (F) or Bose (B) statistics.
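The order-by-order inversion above is mechanical, so it is natural to hand it to a computer algebra system. Below is a minimal sketch (my own, not from the text) using sympy, which reproduces eqns. 4.50–4.52 and 4.56–4.57 for the fermionic sign convention used in the derivation:

```python
import sympy as sp

C1, C2, C3, n = sp.symbols('C1 C2 C3 n', positive=True)
A1, A2, A3 = sp.symbols('A1 A2 A3')

# z(n) as a power series, eqn 4.44, truncated at order n^3
z = A1*n + A2*n**2 + A3*n**3

# n = C1 z - C2 z^2 + C3 z^3 (fermionic signs), expanded in powers of n
rhs = sp.expand(C1*z - C2*z**2 + C3*z**3)
eqs = [sp.Eq(rhs.coeff(n, 1), 1),   # coefficient of n must be unity
       sp.Eq(rhs.coeff(n, 2), 0),   # higher coefficients must vanish
       sp.Eq(rhs.coeff(n, 3), 0)]
sol = sp.solve(eqs, [A1, A2, A3], dict=True)[0]

# p/k_BT = C1 z - C2 z^2/2 + C3 z^3/3; read off B2 and B3 (eqn 4.55)
p = sp.expand((C1*z - C2*z**2/2 + C3*z**3/3).subs(sol))
B2 = sp.simplify(p.coeff(n, 2))   # should equal C2/(2 C1^2)
B3 = sp.simplify(p.coeff(n, 3))   # should equal C2^2/C1^4 - 2 C3/(3 C1^3)
print("B2 =", B2)
print("B3 =", B3)
```

The same machinery extends straightforwardly to higher orders by lengthening the truncated series.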
We remark that the equation of state for classical (and quantum) interacting systems also can be expanded in terms of virial coefficients. Consider, for example, the van der Waals equation of state,

$$\left( p + \frac{a N^2}{V^2} \right) \big( V - N b \big) = N k_B T\,. \qquad (4.58)$$
This may be recast as

$$p = \frac{n k_B T}{1 - b n} - a\, n^2 = n k_B T + \big( b\, k_B T - a \big)\, n^2 + k_B T\, b^2\, n^3 + k_B T\, b^3\, n^4 + \ldots\,, \qquad (4.59)$$
where n = N/V. Thus, comparing with the definition in eqn. 4.43, the van der Waals system has B₂ = b − a/k_B T and B_k = b^{k−1} for all k ≥ 3.
4.3.2 Ballistic dispersion
For the ballistic dispersion ε(p) = p²/2m we computed the density of states in eqn. 4.25. We have

$$g(\varepsilon) = \frac{g_S}{\lambda_T^d\, \Gamma(d/2)}\, \frac{1}{k_B T} \left( \frac{\varepsilon}{k_B T} \right)^{\! \frac{d}{2} - 1} \Theta(\varepsilon)\,, \qquad (4.60)$$

where g_S = (2S+1) is the spin degeneracy. Therefore

$$C_j(T) = \frac{g_S}{\lambda_T^d\, \Gamma(d/2)} \int_0^\infty dt\, t^{\frac{d}{2} - 1}\, e^{-jt} = g_S\, \lambda_T^{-d}\, j^{-d/2}\,. \qquad (4.61)$$
We then have

$$B_2(T) = \mp\, 2^{-\left( \frac{d}{2} + 1 \right)}\, g_S^{-1}\, \lambda_T^d \qquad (4.62)$$

$$B_3(T) = \left( 2^{-(d+1)} - 3^{-\left( \frac{d}{2} + 1 \right)} \right) 2\, g_S^{-2}\, \lambda_T^{2d}\,, \qquad (4.63)$$

with the upper sign in eqn. 4.62 for bosons and the lower sign for fermions; B₃(T) is the same for either statistics.
Note that B
2
(T) is negative for bosons and positive for fermions. This is because bosons
have a tendency to bunch and under certain circumstances may exhibit a phenomenon
known as Bose-Einstein condensation (BEC). Fermions, on the other hand, obey the Pauli
principle, which results in an extra positive correction to the pressure in the low density
limit.
We may also write

$$n(T, z) = g_S\, \lambda_T^{-d} \sum_{j=1}^\infty (\pm 1)^{j-1}\, \frac{z^j}{j^{d/2}} \qquad (4.64)$$
$$= \pm\, g_S\, \lambda_T^{-d}\, \zeta_{d/2}(\pm z) \qquad (4.65)$$

and

$$p(T, z) = g_S\, k_B T\, \lambda_T^{-d} \sum_{j=1}^\infty (\pm 1)^{j-1}\, \frac{z^j}{j^{1 + \frac{d}{2}}} \qquad (4.66)$$
$$= \pm\, g_S\, k_B T\, \lambda_T^{-d}\, \zeta_{\frac{d}{2}+1}(\pm z)\,, \qquad (4.67)$$

with the upper sign for bosons and the lower sign for fermions, and where

$$\zeta_q(z) \equiv \sum_{n=1}^\infty \frac{z^n}{n^q} \qquad (4.68)$$

is the generalized Riemann ζ-function². Note that ζ_q(z) obeys a recursion relation in its index, viz.

$$z\, \frac{\partial}{\partial z}\, \zeta_q(z) = \zeta_{q-1}(z)\,, \qquad (4.69)$$
² Several texts, such as Pathria and Reichl, write g_q(z) for ζ_q(z). I adopt the latter notation since we are already using the symbol g for the density of states function g(ε) and for the internal degeneracy g.
and that

$$\zeta_q(1) = \sum_{n=1}^\infty \frac{1}{n^q} = \zeta(q)\,. \qquad (4.70)$$
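The series in eqn. 4.68 is easy to evaluate numerically. A quick sketch (plain Python partial sums, my own check and not from the text) verifies both eqn. 4.70 and the recursion 4.69 at a sample point, using the closed form ζ₁(z) = −ln(1 − z):

```python
import math

def zeta_q(q, z, terms=100000):
    # partial sum of eqn 4.68: zeta_q(z) = sum_{n>=1} z^n / n^q
    return sum(z**n / n**q for n in range(1, terms + 1))

# eqn 4.70: zeta_q(1) approaches the Riemann zeta value; check zeta(2) = pi^2/6
print(abs(zeta_q(2, 1.0) - math.pi**2 / 6))     # small truncation error

# recursion 4.69: z d/dz zeta_2(z) = zeta_1(z) = -ln(1-z), checked at z = 1/2
z, h = 0.5, 1e-6
lhs = z * (zeta_q(2, z + h) - zeta_q(2, z - h)) / (2 * h)
print(abs(lhs + math.log(1 - z)))               # small finite-difference error
```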
4.4 Photon Statistics
There exists a certain class of particles, including photons and certain elementary excitations in solids such as phonons (i.e. lattice vibrations) and magnons (i.e. spin waves), which obey bosonic statistics but with zero chemical potential. This is because their overall number is not conserved (under typical conditions): photons can be emitted and absorbed by the atoms in the wall of a container, phonon and magnon number is likewise not conserved due to various scattering processes, etc. In such cases, the free energy attains its minimum value with respect to particle number when

$$\mu = \left( \frac{\partial F}{\partial N} \right)_{\! T, V} = 0\,. \qquad (4.71)$$

The number distribution, from eqn. 4.15, is then

$$n(\varepsilon) = \frac{1}{e^{\varepsilon / k_B T} - 1}\,. \qquad (4.72)$$
The grand potential for a system of particles with μ = 0 is

$$\Omega(T, V) = V k_B T \int d\varepsilon\, g(\varepsilon)\, \ln \big( 1 - e^{-\varepsilon / k_B T} \big)\,, \qquad (4.73)$$

where g is the internal degeneracy per particle. For example, photons in three space dimensions have two possible polarization states, hence g = 2 for photons.
Suppose the particle dispersion is ε(p) = A|p|^σ. We can compute the density of states g(ε):

$$g(\varepsilon) = g \int \frac{d^d p}{h^d}\, \delta\big( \varepsilon - A |p|^\sigma \big) = \frac{g\, \Omega_d}{h^d} \int_0^\infty dp\, p^{d-1}\, \delta\big( \varepsilon - A p^\sigma \big)$$
$$= \frac{g\, \Omega_d}{h^d\, \sigma}\, A^{-\frac{d}{\sigma}} \int_0^\infty dx\, x^{\frac{d}{\sigma} - 1}\, \delta( \varepsilon - x ) = \frac{2\, g}{\Gamma(d/2)} \left( \frac{\sqrt{\pi}}{h\, A^{1/\sigma}} \right)^{\! d} \frac{1}{\sigma}\, \varepsilon^{\frac{d}{\sigma} - 1}\, \Theta(\varepsilon)\,, \qquad (4.74)$$

where g is the internal degeneracy, due, for example, to different polarization states of the photon. We have used the result Ω_d = 2π^{d/2}/Γ(d/2) for the solid angle in d dimensions.
The step function Θ(ε) is perhaps overly formal, but it reminds us that the energy spectrum is bounded from below by ε = 0, i.e. there are no negative energy states.

For the photon, we have ε(p) = cp, hence σ = 1 and

$$g(\varepsilon) = \frac{2 g\, \pi^{d/2}}{\Gamma(d/2)}\, \frac{\varepsilon^{d-1}}{(hc)^d}\, \Theta(\varepsilon)\,. \qquad (4.75)$$
In d = 3 dimensions the degeneracy is g = 2, the number of independent polarization states. The pressure p(T) is then obtained using Ω = −pV. We have

$$p(T) = -k_B T \int d\varepsilon\, g(\varepsilon)\, \ln \big( 1 - e^{-\varepsilon / k_B T} \big)$$
$$= -\frac{2 g\, \pi^{d/2}}{\Gamma(d/2)\, (hc)^d}\, k_B T \int_0^\infty d\varepsilon\, \varepsilon^{d-1}\, \ln \big( 1 - e^{-\varepsilon / k_B T} \big)$$
$$= -\frac{2 g\, \pi^{d/2}}{\Gamma(d/2)}\, \frac{(k_B T)^{d+1}}{(hc)^d} \int_0^\infty dt\, t^{d-1}\, \ln \big( 1 - e^{-t} \big)\,. \qquad (4.76)$$
We can make some progress with the dimensionless integral:

$$\int_0^\infty dt\, t^{d-1}\, \ln \big( 1 - e^{-t} \big) = -\sum_{n=1}^\infty \frac{1}{n} \int_0^\infty dt\, t^{d-1}\, e^{-nt} = -\Gamma(d) \sum_{n=1}^\infty \frac{1}{n^{d+1}} = -\Gamma(d)\, \zeta(d+1)\,. \qquad (4.77)$$
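As a sanity check on eqn. 4.77, one can evaluate the integral numerically. For d = 3 the claim is ∫₀^∞ dt t² ln(1 − e^{−t}) = −Γ(3)ζ(4) = −π⁴/45; a crude midpoint-rule sketch (an illustration of mine, not from the text):

```python
import math

# midpoint rule for int_0^tmax t^2 ln(1 - e^{-t}) dt;
# the tail beyond tmax = 50 is exponentially small
steps, tmax = 20000, 50.0
h = tmax / steps
integral = h * sum(((i + 0.5) * h)**2 * math.log(1.0 - math.exp(-(i + 0.5) * h))
                   for i in range(steps))

print(integral)               # close to -pi^4/45
print(-math.pi**4 / 45)
```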
Finally, we invoke a result from the mathematics of the gamma function known as the doubling formula,

$$\Gamma(z) = \frac{2^{z-1}}{\sqrt{\pi}}\, \Gamma\!\left( \frac{z}{2} \right) \Gamma\!\left( \frac{z+1}{2} \right)\,. \qquad (4.78)$$
Putting it all together, we find

$$p(T) = g\, \pi^{-\frac{1}{2}(d+1)}\, \Gamma\!\left( \frac{d+1}{2} \right) \zeta(d+1)\, \frac{(k_B T)^{d+1}}{(\hbar c)^d}\,. \qquad (4.79)$$
The number density is found to be

$$n(T) = \int d\varepsilon\, \frac{g(\varepsilon)}{e^{\varepsilon / k_B T} - 1} = g\, \pi^{-\frac{1}{2}(d+1)}\, \Gamma\!\left( \frac{d+1}{2} \right) \zeta(d)\, \left( \frac{k_B T}{\hbar c} \right)^{\! d}\,. \qquad (4.80)$$
For photons in d = 3 dimensions, we have g = 2 and thus

$$n(T) = \frac{2\, \zeta(3)}{\pi^2} \left( \frac{k_B T}{\hbar c} \right)^{\! 3}\,, \qquad p(T) = \frac{2\, \zeta(4)}{\pi^2}\, \frac{(k_B T)^4}{(\hbar c)^3}\,. \qquad (4.81)$$

It turns out that ζ(4) = π⁴/90.
Note that ħc/k_B = 0.22855 cm·K, so

$$\frac{k_B T}{\hbar c} = 4.3755\; T[{\rm K}]\; {\rm cm}^{-1} \quad \Longrightarrow \quad n(T) = 20.405\; T^3 [{\rm K}^3]\; {\rm cm}^{-3}\,. \qquad (4.82)$$
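The numbers in eqn. 4.82 are easy to reproduce. A sketch of mine using SI constants (slightly more recent values than those used in the text, so the last digits differ):

```python
import math

hbar = 1.054571817e-34   # J s
c    = 2.99792458e8      # m / s
kB   = 1.380649e-23      # J / K

zeta3 = sum(1.0 / n**3 for n in range(1, 100000))   # Riemann zeta(3)

print(hbar * c / kB * 100.0)   # hbar c / k_B in cm K, close to 0.229
# photon number density coefficient of eqn 4.82, in cm^-3 K^-3
coeff = 2.0 * zeta3 / math.pi**2 * (kB / (hbar * c))**3 * 1e-6
print(coeff)                   # close to 20.3
```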
To find the entropy, we use Gibbs-Duhem:

$$d\mu = 0 = -s\, dT + v\, dp \quad \Longrightarrow \quad s = v\, \frac{dp}{dT}\,, \qquad (4.83)$$

where s is the entropy per particle and v = n^{−1} is the volume per particle. We then find

$$s(T) = (d+1)\, \frac{\zeta(d+1)}{\zeta(d)}\, k_B\,. \qquad (4.84)$$
The entropy per particle is constant. The internal energy is

$$E = -\frac{\partial \ln \Xi}{\partial \beta} = -\frac{\partial}{\partial \beta}\, \big( \beta\, p V \big) = d\, p\, V\,, \qquad (4.85)$$

and hence the energy per particle is

$$\varepsilon = \frac{E}{N} = d\, p\, v = \frac{d\, \zeta(d+1)}{\zeta(d)}\, k_B T\,. \qquad (4.86)$$
4.4.1 Classical arguments for the photon gas
A number of thermodynamic properties of the photon gas can be determined from purely
classical arguments. Here we recapitulate a few important ones.
1. Suppose our photon gas is confined to a rectangular box of dimensions L_x × L_y × L_z. Suppose further that the dimensions are all expanded by a factor λ^{1/3}, i.e. the volume is isotropically expanded by a factor of λ. The cavity modes of the electromagnetic radiation have quantized wavevectors, even within classical electromagnetic theory, given by

$$\mathbf{k} = \left( \frac{2\pi n_x}{L_x}\,,\; \frac{2\pi n_y}{L_y}\,,\; \frac{2\pi n_z}{L_z} \right)\,. \qquad (4.87)$$

Since the energy for a given mode is ε(k) = ħc|k|, we see that the energy changes by a factor λ^{−1/3} under an adiabatic volume expansion V → λV, where the distribution of different electromagnetic mode occupancies remains fixed. Thus,

$$V \left( \frac{\partial E}{\partial V} \right)_{\! S} = \left( \frac{\partial E}{\partial \lambda} \right)_{\! S} = -\frac{1}{3}\, E\,. \qquad (4.88)$$
Thus,

$$p = -\left( \frac{\partial E}{\partial V} \right)_{\! S} = \frac{E}{3V}\,, \qquad (4.89)$$

as we found in eqn. 4.85. Since E = E(T, V) is extensive, we must have p = p(T) alone.
2. Since p = p(T) alone, we have

$$\left( \frac{\partial E}{\partial V} \right)_{\! T} = \left( \frac{\partial E}{\partial V} \right)_{\! p} = 3p \qquad (4.90)$$
$$= T \left( \frac{\partial p}{\partial T} \right)_{\! V} - p\,, \qquad (4.91)$$

where the second line follows from the Maxwell relation (∂S/∂V)_T = (∂p/∂T)_V, after invoking the First Law dE = T dS − p dV. Thus,

$$T\, \frac{dp}{dT} = 4p \quad \Longrightarrow \quad p(T) = A\, T^4\,, \qquad (4.92)$$

where A is a constant. Thus, we recover the temperature dependence found microscopically in eqn. 4.79.
3. Given an energy density E/V, the differential energy flux emitted in a direction θ relative to a surface normal is

$$dj_\theta = c\, \frac{E}{V}\, \cos\theta\, \frac{d\Omega}{4\pi}\,, \qquad (4.93)$$

where dΩ is the differential solid angle. Thus, the power emitted per unit area is

$$\frac{dP}{dA} = \frac{cE}{4\pi V} \int_0^{2\pi}\! d\varphi \int_0^{\pi/2}\! d\theta\, \sin\theta\, \cos\theta = \frac{cE}{4V} = \frac{3}{4}\, c\, p(T) \equiv \sigma\, T^4\,, \qquad (4.94)$$

where σ = ¾ cA, with p(T) = A T⁴ as we found above. From quantum statistical mechanical considerations, we have

$$\sigma = \frac{\pi^2 k_B^4}{60\, c^2 \hbar^3} = 5.67 \times 10^{-8}\; \frac{\rm W}{{\rm m}^2\, {\rm K}^4}\,, \qquad (4.95)$$

which is Stefan's constant.
4.4.2 Surface temperature of the earth
We derived the result P = σT⁴A, where σ = 5.67 × 10⁻⁸ W/m²K⁴, for the power emitted by an electromagnetic black body. Let's apply this result to the earth-sun system. We'll need three lengths: the radius of the sun R_☉ = 6.96 × 10⁸ m, the radius of the earth R_e = 6.38 × 10⁶ m, and the radius of the earth's orbit a_e = 1.50 × 10¹¹ m. Let's assume
Figure 4.1: Spectral density ρ(ν, T) for blackbody radiation at three temperatures.
that the earth has achieved a steady state temperature of T_e. We balance the total power incident upon the earth with the power radiated by the earth. The power incident upon the earth is

$$P_{\rm incident} = \frac{\pi R_e^2}{4\pi a_e^2}\, \sigma T_\odot^4\, 4\pi R_\odot^2 = \frac{\pi R_e^2\, R_\odot^2}{a_e^2}\, \sigma T_\odot^4\,. \qquad (4.96)$$
The power radiated by the earth is

$$P_{\rm radiated} = \sigma T_e^4 \cdot 4\pi R_e^2\,. \qquad (4.97)$$

Setting P_incident = P_radiated, we obtain

$$T_e = \left( \frac{R_\odot}{2\, a_e} \right)^{\! 1/2} T_\odot\,. \qquad (4.98)$$
Thus, we find T_e = 0.04817 T_☉, and with T_☉ = 5780 K, we obtain T_e = 278.4 K. The mean surface temperature of the earth is T̄_e = 287 K, which is only about 10 K higher. The difference is due to the fact that the earth is not a perfect blackbody, i.e. an object which absorbs all incident radiation upon it and emits radiation according to Stefan's law. As you know, the earth's atmosphere retraps a fraction of the emitted radiation, a phenomenon known as the greenhouse effect.
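The estimate of eqn. 4.98 takes one line to evaluate:

```python
# blackbody steady-state temperature of the earth, eqn 4.98
R_sun = 6.96e8      # m, radius of the sun
a_e   = 1.50e11     # m, radius of the earth's orbit
T_sun = 5780.0      # K

T_e = (R_sun / (2.0 * a_e))**0.5 * T_sun
print(T_e)          # ~ 278.4 K
```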
4.4.3 Distribution of blackbody radiation
Recall that the frequency of an electromagnetic wave of wavevector k is ν = c/λ = ck/2π. Therefore the number of photons 𝒩(ν, T) per unit frequency in thermodynamic equilibrium is (recall there are two polarization states)

$$\mathcal{N}(\nu, T)\, d\nu = \frac{2\, V}{8\pi^3}\, \frac{d^3 k}{e^{\hbar c k / k_B T} - 1} = \frac{V}{\pi^2}\, \frac{k^2\, dk}{e^{\hbar c k / k_B T} - 1}\,. \qquad (4.99)$$

We therefore have

$$\mathcal{N}(\nu, T) = \frac{8\pi V}{c^3}\, \frac{\nu^2}{e^{h\nu / k_B T} - 1}\,. \qquad (4.100)$$
Since a photon of frequency ν carries energy hν, the energy per unit frequency ℰ(ν, T) is

$$\mathcal{E}(\nu, T) = \frac{8\pi h V}{c^3}\, \frac{\nu^3}{e^{h\nu / k_B T} - 1}\,. \qquad (4.101)$$
Note what happens if Planck's constant h vanishes, as it does in the classical limit. The denominator can then be written

$$e^{h\nu / k_B T} - 1 = \frac{h\nu}{k_B T} + \mathcal{O}(h^2) \qquad (4.102)$$

and

$$\mathcal{E}_{\rm CL}(\nu, T) = \lim_{h \to 0} \mathcal{E}(\nu, T) = V\, \frac{8\pi k_B T}{c^3}\, \nu^2\,. \qquad (4.103)$$
In classical electromagnetic theory, then, the total energy integrated over all frequencies diverges. This is known as the ultraviolet catastrophe, since the divergence comes from the large-ν part of the integral, which in the optical spectrum is the ultraviolet portion. With quantization, the Bose-Einstein factor imposes an effective ultraviolet cutoff k_B T/h on the frequency integral, and the total energy, as we found above, is finite:

$$E(T) = \int_0^\infty d\nu\, \mathcal{E}(\nu, T) = 3pV = V\, \frac{\pi^2}{15}\, \frac{(k_B T)^4}{(\hbar c)^3}\,. \qquad (4.104)$$
We can define the spectral density ρ(ν, T) of the radiation as

$$\rho(\nu, T) \equiv \frac{\mathcal{E}(\nu, T)}{E(T)} = \frac{15}{\pi^4}\, \frac{h}{k_B T}\, \frac{( h\nu / k_B T )^3}{e^{h\nu / k_B T} - 1}\,, \qquad (4.105)$$

so that ρ(ν, T) dν is the fraction of the electromagnetic energy, under equilibrium conditions, between frequencies ν and ν + dν, i.e. ∫₀^∞ dν ρ(ν, T) = 1. In fig. 4.1 we plot this for three different temperatures. The maximum occurs when s ≡ hν/k_B T satisfies

$$\frac{d}{ds} \left( \frac{s^3}{e^s - 1} \right) = 0 \quad \Longrightarrow \quad \frac{s}{1 - e^{-s}} = 3 \quad \Longrightarrow \quad s = 2.82144\,. \qquad (4.106)$$
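The transcendental condition in eqn. 4.106 is easily solved by fixed-point iteration, since the map s ↦ 3(1 − e^{−s}) is a contraction near the root:

```python
import math

# solve s = 3 (1 - exp(-s)), eqn 4.106, by fixed-point iteration
s = 3.0
for _ in range(50):
    s = 3.0 * (1.0 - math.exp(-s))
print(s)    # ~ 2.82144
```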
4.4.4 What if the sun emitted ferromagnetic spin waves?
We saw in eqn. 4.94 that the power emitted per unit surface area by a blackbody is σT⁴. The power law here follows from the ultrarelativistic dispersion ε = ħck of the photons. Suppose that we replace this dispersion with the general form ε = ε(k). Now consider a large box in equilibrium at temperature T. The energy current incident on a differential area dA of surface normal to ẑ is

$$dP = dA \int \frac{d^3 k}{(2\pi)^3}\, \Theta(\cos\theta)\, \varepsilon(k)\, \frac{1}{\hbar}\, \frac{\partial \varepsilon(k)}{\partial k_z}\, \frac{1}{e^{\varepsilon(k)/k_B T} - 1}\,. \qquad (4.107)$$
Let us assume an isotropic power law dispersion of the form ε(k) = C k^σ. Then after a straightforward calculation we obtain

$$\frac{dP}{dA} = \lambda\, T^{2 + \frac{2}{\sigma}}\,, \qquad (4.108)$$

where

$$\lambda = \Gamma\!\left( 2 + \frac{2}{\sigma} \right) \zeta\!\left( 2 + \frac{2}{\sigma} \right) \frac{g\, k_B^{2 + \frac{2}{\sigma}}\, C^{-2/\sigma}}{8\pi^2 \hbar}\,. \qquad (4.109)$$

One can check that for g = 2, C = ħc, and σ = 1 this result reduces to that of eqn. 4.95.
4.5 Lattice Vibrations : Einstein and Debye Models
Crystalline solids support propagating waves called phonons, which are quantized vibrations of the lattice. Recall that the quantum mechanical Hamiltonian for a single harmonic oscillator, Ĥ = p²/2m + ½ mω₀² q², may be written as Ĥ = ħω₀ (a†a + ½), where a and a† are ladder operators satisfying the commutation relation [a, a†] = 1.
4.5.1 One-dimensional chain
Consider the linear chain of masses and springs depicted in fig. 4.2. We assume that our system consists of N mass points on a large ring of circumference L. In equilibrium, the masses are spaced evenly by a distance b = L/N. We define u_n = x_n − nb to be the difference between the position of mass n and its equilibrium position. The Hamiltonian is

$$\hat{H} = \sum_n \left( \frac{p_n^2}{2m} + \frac{1}{2}\, \kappa\, \big( u_{n+1} - u_n + b - a \big)^2 \right)\,, \qquad (4.110)$$

where a is the unstretched length of a spring, m is the mass of each mass point, and κ is the force constant of each spring. If b ≠ a the springs are under tension in equilibrium, but this will turn out to be of no consequence for our considerations.
The classical equations of motion are

$$\dot{u}_n = \frac{\partial \hat{H}}{\partial p_n} = \frac{p_n}{m} \qquad (4.111)$$
$$\dot{p}_n = -\frac{\partial \hat{H}}{\partial u_n} = \kappa\, \big( u_{n+1} + u_{n-1} - 2 u_n \big)\,. \qquad (4.112)$$

Taking the time derivative of the first equation and substituting into the second yields

$$\ddot{u}_n = \frac{\kappa}{m}\, \big( u_{n+1} + u_{n-1} - 2 u_n \big)\,. \qquad (4.113)$$
Figure 4.2: A linear chain of masses and springs. The black circles represent the equilibrium positions of the masses. The displacement of mass n relative to its equilibrium value is u_n.
We now write

$$u_n = \frac{1}{\sqrt{N}} \sum_k \hat{u}_k\, e^{ikn}\,, \qquad (4.114)$$

where periodicity u_{N+n} = u_n requires that the k values are quantized so that e^{ikN} = 1, i.e. k = 2πj/N where j ∈ {0, 1, . . . , N−1}. The inverse of this discrete Fourier transform is

$$\hat{u}_k = \frac{1}{\sqrt{N}} \sum_n u_n\, e^{-ikn}\,. \qquad (4.115)$$
Note that û_k is in general complex, but that û*_k = û_{−k}. In terms of the û_k, the equations of motion take the form

$$\ddot{\hat{u}}_k = -\frac{2\kappa}{m}\, ( 1 - \cos k )\, \hat{u}_k\,. \qquad (4.116)$$

Thus, each û_k is a normal mode, and the normal mode frequencies are

$$\omega_k = 2 \sqrt{\frac{\kappa}{m}}\; \Big| \sin\!\big( \tfrac{1}{2} k \big) \Big|\,. \qquad (4.117)$$
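Eqn. 4.117 can be checked directly by diagonalizing the force-constant matrix of a small ring numerically. A sketch of mine with numpy, taking κ = m = 1:

```python
import numpy as np

# dynamical matrix of an N-site ring: u''_n = -(2 u_n - u_{n+1} - u_{n-1})
N = 8
M = np.zeros((N, N))
for n in range(N):
    M[n, n] = 2.0
    M[n, (n + 1) % N] = -1.0
    M[n, (n - 1) % N] = -1.0

# clip guards against a tiny negative round-off in the zero (k = 0) eigenvalue
vals = np.clip(np.linalg.eigvalsh(M), 0.0, None)
omega = np.sqrt(np.sort(vals))

# compare with omega_k = 2 |sin(k/2)|, k = 2 pi j / N  (eqn 4.117)
k = 2.0 * np.pi * np.arange(N) / N
expected = np.sort(2.0 * np.abs(np.sin(k / 2.0)))
print(np.allclose(omega, expected, atol=1e-6))   # True
```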
The density of states for this band of phonon excitations is

$$g(\omega) = \int_{-\pi}^{\pi} \frac{dk}{2\pi}\, \delta( \omega - \omega_k ) = \frac{2}{\pi}\, \big( J^2 - \omega^2 \big)^{-1/2}\, \Theta(\omega)\, \Theta(J - \omega)\,, \qquad (4.118)$$

where J = 2√(κ/m) is the phonon bandwidth. The step functions require 0 ≤ ω ≤ J; outside this range there are no phonon energy levels and the density of states accordingly vanishes.
The entire theory can be quantized, taking [p_n, u_{n'}] = −iħ δ_{nn'}. We then define

$$p_n = \frac{1}{\sqrt{N}} \sum_k \hat{p}_k\, e^{ikn}\,, \qquad \hat{p}_k = \frac{1}{\sqrt{N}} \sum_n p_n\, e^{-ikn}\,, \qquad (4.119)$$

in which case [p̂_k, û_{k'}] = −iħ δ_{kk'}. Note that û†_k = û_{−k} and p̂†_k = p̂_{−k}. We then define the ladder operator

$$a_k = \left( \frac{1}{2 m \hbar \omega_k} \right)^{\! 1/2} \hat{p}_k - i \left( \frac{m \omega_k}{2\hbar} \right)^{\! 1/2} \hat{u}_k \qquad (4.120)$$
and its Hermitean conjugate a†_k, in terms of which the Hamiltonian is

$$\hat{H} = \sum_k \hbar \omega_k \left( a^\dagger_k\, a_k + \tfrac{1}{2} \right)\,, \qquad (4.121)$$

which is a sum over independent harmonic oscillator modes. Note that the sum over k is restricted to an interval of width 2π, e.g. k ∈ [−π, π]. The state at wavevector k + 2π is identical to that at k, as we see from eqn. 4.115.
4.5.2 General theory of lattice vibrations
The most general model of a harmonic solid is described by a Hamiltonian of the form

$$\hat{H} = \sum_{\mathbf{R}, i} \frac{\mathbf{p}_i^2(\mathbf{R})}{2 M_i} + \frac{1}{2} \sum_{i,j} \sum_{\alpha, \beta} \sum_{\mathbf{R}, \mathbf{R}'} u_i^\alpha(\mathbf{R})\, \Phi_{ij}^{\alpha\beta}(\mathbf{R} - \mathbf{R}')\, u_j^\beta(\mathbf{R}')\,, \qquad (4.122)$$

where the dynamical matrix is

$$\Phi_{ij}^{\alpha\beta}(\mathbf{R} - \mathbf{R}') = \frac{\partial^2 U}{\partial u_i^\alpha(\mathbf{R})\; \partial u_j^\beta(\mathbf{R}')}\,, \qquad (4.123)$$
where U is the potential energy of interaction among all the atoms. Here we have simply expanded the potential to second order in the local displacements u_i^α(R). The lattice sites R are elements of a Bravais lattice. The indices i and j specify basis elements with respect to this lattice, and the indices α and β range over 1, . . . , d, the number of possible directions in space. The subject of crystallography is beyond the scope of these notes, but, very briefly, a Bravais lattice in d dimensions is specified by a set of d linearly independent primitive direct lattice vectors a_l, such that any point in the Bravais lattice may be written as a sum over the primitive vectors with integer coefficients: R = Σ_{l=1}^d n_l a_l. The set of all such vectors R is called the direct lattice. The direct lattice is closed under the operation of vector addition: if R and R' are points in a Bravais lattice, then so is R + R'.
A crystal is a periodic arrangement of lattice sites. The fundamental repeating unit is called the unit cell. Not every crystal is a Bravais lattice, however. Indeed, Bravais lattices are special crystals in which there is only one atom per unit cell. Consider, for example, the structure in fig. 4.3. The blue dots form a square Bravais lattice with primitive direct lattice vectors a₁ = a x̂ and a₂ = a ŷ, where a is the lattice constant, which is the distance between any neighboring pair of blue dots. The red squares and green triangles, along with the blue dots, form a basis for the crystal structure which label each sublattice. Our crystal in fig. 4.3 is formally classified as a square Bravais lattice with a three element basis. To specify an arbitrary site in the crystal, we must specify both a direct lattice vector R as well as a basis index j ∈ {1, . . . , r}, so that the location is R + η_j. The vectors η_j are the basis vectors for our crystal structure. We see that a general crystal structure consists of a repeating unit, known as a unit cell. The centers (or corners, if one prefers) of the unit cells form a Bravais lattice. Within a given unit cell, the individual sublattice sites are located at positions η_j with respect to the unit cell position R.
Figure 4.3: A crystal structure with an underlying square Bravais lattice and a three element
basis.
Upon diagonalization, the Hamiltonian of eqn. 4.122 takes the form

$$\hat{H} = \sum_{\mathbf{k}, a} \hbar \omega_a(\mathbf{k}) \left( A^\dagger_a(\mathbf{k})\, A_a(\mathbf{k}) + \tfrac{1}{2} \right)\,, \qquad (4.124)$$

where

$$\big[ A_a(\mathbf{k})\,,\; A^\dagger_b(\mathbf{k}') \big] = \delta_{ab}\, \delta_{\mathbf{k}\mathbf{k}'}\,. \qquad (4.125)$$
The eigenfrequencies are solutions to the eigenvalue equation

$$\sum_{j, \beta} \tilde{\Phi}_{ij}^{\alpha\beta}(\mathbf{k})\, e_j^{(a)\beta}(\mathbf{k}) = M_i\, \omega_a^2(\mathbf{k})\, e_i^{(a)\alpha}(\mathbf{k})\,, \qquad (4.126)$$

where

$$\tilde{\Phi}_{ij}^{\alpha\beta}(\mathbf{k}) = \sum_{\mathbf{R}} \Phi_{ij}^{\alpha\beta}(\mathbf{R})\, e^{-i \mathbf{k} \cdot \mathbf{R}}\,. \qquad (4.127)$$
Here, k lies within the first Brillouin zone, which is the unit cell of the reciprocal lattice of points G satisfying e^{iG·R} = 1 for all G and R. The reciprocal lattice is also a Bravais lattice, with primitive reciprocal lattice vectors b_l, such that any point on the reciprocal lattice may be written G = Σ_{l=1}^d m_l b_l. One also has that a_l · b_{l'} = 2π δ_{ll'}. The index a ranges from 1 to d·r and labels the mode of oscillation at wavevector k. The vector e_i^{(a)}(k) is the polarization vector for the a-th phonon branch. In solids of high symmetry, phonon modes can be classified as longitudinal or transverse excitations.

For a crystalline lattice with an r-element basis, there are then d·r phonon modes for each wavevector k lying in the first Brillouin zone. If we impose periodic boundary conditions, then the k points within the first Brillouin zone are themselves quantized, as in the d = 1 case where we found k = 2πn/N. There are N distinct k points in the first Brillouin zone, one for every direct lattice site. The total number of modes is then d·r·N, which is the total number of translational degrees of freedom in our system: rN total atoms (N unit cells each with an r atom basis) each free to vibrate in d dimensions. Of the d·r branches of phonon excitations, d of them will be acoustic modes whose frequency vanishes as k → 0. The remaining d(r − 1) branches are optical modes and oscillate at finite frequencies. Basically, in an acoustic mode, for k close to the (Brillouin) zone center k = 0, all the atoms in each unit cell move together in the same direction at any moment of time. In an optical mode, the different basis atoms move in different directions.
There is no number conservation law for phonons; they may be freely created or destroyed in anharmonic processes, where two phonons with wavevectors k and q can combine into a single phonon with wavevector k + q, and vice versa. Therefore the chemical potential for phonons is μ = 0. We define the density of states g_a(ω) for the a-th phonon mode as

$$g_a(\omega) = \frac{1}{N} \sum_{\mathbf{k}} \delta\big( \omega - \omega_a(\mathbf{k}) \big) = \mathcal{V}_0 \int_{\rm BZ} \frac{d^d k}{(2\pi)^d}\, \delta\big( \omega - \omega_a(\mathbf{k}) \big)\,, \qquad (4.128)$$
where N is the number of unit cells, 𝒱₀ is the unit cell volume of the direct lattice, and the k sum and integral are over the first Brillouin zone only. Note that ω here has dimensions of frequency. The functions g_a(ω) are normalized to unity:

$$\int_0^\infty d\omega\, g_a(\omega) = 1\,. \qquad (4.129)$$

The total phonon density of states per unit cell is given by³

$$g(\omega) = \sum_{a=1}^{d r} g_a(\omega)\,. \qquad (4.130)$$
The grand potential for the phonon gas is

$$\Omega(T, V) = -k_B T \ln \prod_{\mathbf{k}, a} \sum_{n_a(\mathbf{k}) = 0}^\infty e^{-\hbar \omega_a(\mathbf{k}) \left( n_a(\mathbf{k}) + \frac{1}{2} \right) / k_B T}$$
$$= k_B T \sum_{\mathbf{k}, a} \ln \left[ 2 \sinh \left( \frac{\hbar \omega_a(\mathbf{k})}{2 k_B T} \right) \right] = N k_B T \int_0^\infty d\omega\, g(\omega)\, \ln \left[ 2 \sinh \left( \frac{\hbar \omega}{2 k_B T} \right) \right]\,. \qquad (4.131)$$

Note that V = N 𝒱₀ since there are N unit cells, each of volume 𝒱₀. The entropy is

$$S = -\left( \frac{\partial \Omega}{\partial T} \right)_{\! V}$$

and thus the heat capacity is

$$C_V = -T\, \frac{\partial^2 \Omega}{\partial T^2} = N k_B \int_0^\infty d\omega\, g(\omega) \left( \frac{\hbar \omega}{2 k_B T} \right)^{\! 2} {\rm csch}^2 \left( \frac{\hbar \omega}{2 k_B T} \right) \qquad (4.132)$$
³ Note the dimensions of g(ω) are (frequency)⁻¹. By contrast, the dimensions of g(ε) in eqn. 4.25 are (energy)⁻¹ (volume)⁻¹. The difference lies in the factor of 𝒱₀, where 𝒱₀ is the unit cell volume.
Figure 4.4: Upper panel: phonon spectrum in elemental rhodium (Rh) at T = 297 K mea-
sured by high precision inelastic neutron scattering (INS) by A. Eichler et al., Phys. Rev. B
57, 324 (1998). Note the three acoustic branches and no optical branches, corresponding to
d = 3 and r = 1. Lower panel: phonon spectrum in gallium arsenide (GaAs) at T = 12 K,
comparing theoretical lattice-dynamical calculations with INS results of D. Strauch and B.
Dorner, J. Phys.: Condens. Matter 2, 1457 (1990). Note the three acoustic branches and
three optical branches, corresponding to d = 3 and r = 2. The Greek letters along the
x-axis indicate points of high symmetry in the Brillouin zone.
Note that as T → ∞ we have csch²(ħω/2k_B T) → (2k_B T/ħω)², and therefore

$$\lim_{T \to \infty} C_V(T) = N k_B \int_0^\infty d\omega\, g(\omega) = d\, r\, N k_B\,. \qquad (4.133)$$
This is the classical Dulong-Petit limit of ½ k_B per quadratic degree of freedom; there are rN atoms moving in d dimensions, hence d·rN positions and an equal number of momenta, resulting in a high temperature limit of C_V = d·r·N k_B.
4.5.3 Einstein and Debye models
Historically, two models of lattice vibrations have received wide attention. First is the so-called Einstein model, in which there is no dispersion to the individual phonon modes. We approximate g_a(ω) ≈ δ(ω − ω_a), in which case

$$C_V(T) = N k_B \sum_a \left( \frac{\hbar \omega_a}{2 k_B T} \right)^{\! 2} {\rm csch}^2 \left( \frac{\hbar \omega_a}{2 k_B T} \right)\,. \qquad (4.134)$$

At low temperatures, the contribution from each branch vanishes exponentially, since csch²(ħω_a/2k_B T) ≃ 4 e^{−ħω_a/k_B T} → 0. Real solids don't behave this way.
A more realistic model, due to Debye, accounts for the low-lying acoustic phonon branches. Since the acoustic phonon dispersion vanishes linearly with |k| as k → 0, there is no temperature at which the acoustic phonons freeze out exponentially, as in the case of Einstein phonons. Indeed, the Einstein model is appropriate in describing the d(r − 1) optical phonon branches, though it fails miserably for the acoustic branches.

In the vicinity of the zone center k = 0 (also called Γ in crystallographic notation) the d acoustic modes obey a linear dispersion, with ω_a(k) = c_a(k̂) k. This results in an acoustic phonon density of states in d = 3 dimensions of

$$g(\omega) = \frac{\mathcal{V}_0\, \omega^2}{2\pi^2} \sum_a \int \frac{d\hat{k}}{4\pi}\, \frac{1}{c_a^3(\hat{k})}\; \Theta(\omega_D - \omega) = \frac{3\, \mathcal{V}_0}{2\pi^2\, \bar{c}^3}\, \omega^2\, \Theta(\omega_D - \omega)\,, \qquad (4.135)$$

where c̄ is an average acoustic phonon velocity (i.e. speed of sound) defined by

$$\frac{3}{\bar{c}^3} = \sum_a \int \frac{d\hat{k}}{4\pi}\, \frac{1}{c_a^3(\hat{k})} \qquad (4.136)$$
and ω_D is a cutoff known as the Debye frequency. The cutoff is necessary because the acoustic phonon branch does not extend forever, but only to the boundaries of the Brillouin zone. Thus, ħω_D should roughly be equal to the energy of a zone boundary phonon. Alternatively, we can define ω_D by the normalization condition

$$\int_0^\infty d\omega\, g(\omega) = 3 \quad \Longrightarrow \quad \omega_D = \big( 6\pi^2 / \mathcal{V}_0 \big)^{1/3}\, \bar{c}\,. \qquad (4.137)$$

This allows us to write g(ω) = (9ω²/ω_D³) Θ(ω_D − ω).
The specific heat due to the acoustic phonons is then

$$C_V(T) = \frac{9\, N k_B}{\omega_D^3} \int_0^{\omega_D} d\omega\, \omega^2 \left( \frac{\hbar \omega}{2 k_B T} \right)^{\! 2} {\rm csch}^2 \left( \frac{\hbar \omega}{2 k_B T} \right) = 9 N k_B \left( \frac{2T}{\Theta_D} \right)^{\! 3} \phi\!\left( \frac{\Theta_D}{2T} \right)\,, \qquad (4.138)$$
Element    Ag   Al   Au    C    Cd   Cr   Cu   Fe   Mn
Θ_D (K)   225  428  165  2230  209  630  344  470  410

Element    Ni   Pb   Pt   Si   Sn   Ta   Ti    W   Zn
Θ_D (K)   450  105  240  645  200  240  420  400  327

Table 4.1: Debye temperatures for some common elements. (Source: Wikipedia)
where Θ_D = ħω_D/k_B is the Debye temperature and

$$\phi(x) = \int_0^x dt\, t^4\, {\rm csch}^2 t = \begin{cases} \tfrac{1}{3}\, x^3 & x \to 0 \\[1mm] \dfrac{\pi^4}{30} & x \to \infty\,. \end{cases} \qquad (4.139)$$
Therefore,

$$C_V(T) = \begin{cases} \dfrac{12\pi^4}{5}\, N k_B \left( \dfrac{T}{\Theta_D} \right)^{\! 3} & T \ll \Theta_D \\[2mm] 3 N k_B & T \gg \Theta_D\,. \end{cases} \qquad (4.140)$$
Thus, the heat capacity due to acoustic phonons obeys the Dulong-Petit rule in that C_V(T ≫ Θ_D) = 3N k_B, corresponding to the three acoustic degrees of freedom per unit cell. The remaining contribution of 3(r − 1)N k_B to the high temperature heat capacity comes from the optical modes not considered in the Debye model. The low temperature T³ behavior of the heat capacity of crystalline solids is a generic feature, and its detailed description is a triumph of the Debye model.
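Eqns. 4.138–4.140 are straightforward to evaluate numerically. A sketch of mine (midpoint quadrature for φ, with the upper limit capped where the integrand is negligible) recovers both limits:

```python
import math

def phi(x, steps=4000):
    # phi(x) = int_0^x dt t^4 csch^2 t   (eqn 4.139)
    # t^4 csch^2 t ~ 4 t^4 e^{-2t} is negligible beyond t ~ 40,
    # so cap the upper limit to avoid overflow in sinh
    x = min(x, 40.0)
    h = x / steps
    return h * sum(((i + 0.5) * h)**4 / math.sinh((i + 0.5) * h)**2
                   for i in range(steps))

def debye_cv(T, theta_D):
    # heat capacity per N k_B, eqn 4.138
    return 9.0 * (2.0 * T / theta_D)**3 * phi(theta_D / (2.0 * T))

print(debye_cv(10.0, 0.01))                        # ~ 3 (Dulong-Petit)
print(debye_cv(0.01, 100.0) / (0.01 / 100.0)**3)   # ~ 12 pi^4 / 5
```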
4.5.4 Melting and the Lindemann criterion
Consider a one-dimensional harmonic oscillator. We have

$$\hat{H} = \frac{p^2}{2m} + \frac{1}{2}\, m \omega_0^2\, x^2 = \hbar \omega_0 \left( a^\dagger a + \tfrac{1}{2} \right)\,, \qquad (4.141)$$

where

$$x = \sqrt{\frac{\hbar}{2 m \omega_0}}\; \big( a + a^\dagger \big)\,, \qquad p = i \sqrt{\frac{m \hbar \omega_0}{2}}\; \big( a^\dagger - a \big)\,. \qquad (4.142)$$
The RMS fluctuations of the position are then given by

$$\langle x^2 \rangle = \frac{\hbar}{2 m \omega_0}\, \big\langle ( a + a^\dagger )^2 \big\rangle = \frac{\hbar}{m \omega_0} \left( n(T) + \tfrac{1}{2} \right)\,, \qquad (4.143)$$

where n(T) = [exp(ħω₀/k_B T) − 1]⁻¹ is the Bose occupancy function.
218 CHAPTER 4. NONINTERACTING QUANTUM SYSTEMS
For a three-dimensional solid, the uctuations in the position of any given lattice site may
be expressed as a sum over contributions from all the phonon modes. Thus,
4
u
2
R
)

_
0
d g()

M
_
1
e
/k
B
T
1
+
1
2
_
=

D
_
0
d
9
2

3
D
_
k
B
T
M
2
+

2M
_
(T
>

D
)
=
9
M
2
D
_
k
B
T +
1
4

D
_
. (4.144)
Note that the fluctuations receive a purely quantum, temperature independent contribution as well as a thermal contribution. An old phenomenological theory of melting due to Lindemann asserts that crystals should melt when the RMS fluctuations of the atomic positions become greater than some critical distance, measured in units of the unit cell length 𝒱₀^{1/3}. The above expression may then be used to compute the melting temperature. For example, if we neglect the quantum fluctuations relative to the thermal ones, and we set ⟨u²_R⟩^{1/2} = x·a, where a is the lattice spacing, we obtain

$$T_{\rm melt} = \frac{x^2\, M\, k_B\, \Theta_D^2\, a^2}{9\, \hbar^2}\,. \qquad (4.145)$$

Here x ≈ 0.1 is a phenomenological parameter such that we say melting occurs when the RMS fluctuations in the ionic positions are equal to x times the lattice spacing.
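Plugging rough numbers into eqn. 4.145 gives the right order of magnitude. The sketch below uses copper (Θ_D = 344 K from table 4.1; the atomic mass of roughly 63.5 amu and lattice spacing of roughly 2.55 Å are values I am assuming for illustration, not taken from the text) with x = 0.1:

```python
hbar = 1.054571817e-34     # J s
kB   = 1.380649e-23        # J / K
amu  = 1.66053906660e-27   # kg

# illustrative numbers for copper (assumed, not from the text)
M       = 63.5 * amu       # atomic mass
a       = 2.55e-10         # m, lattice spacing
theta_D = 344.0            # K, from table 4.1
x       = 0.1              # Lindemann parameter

T_melt = x**2 * M * a**2 * kB * theta_D**2 / (9.0 * hbar**2)
print(T_melt)              # ~ 1.1e3 K (actual melting point of Cu: 1358 K)
```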
4.5.5 Goldstone bosons
The vanishing of the acoustic phonon dispersion at k = 0 is a consequence of Goldstone's theorem, which says that associated with every broken generator of a continuous symmetry there is a bosonic gapless excitation (i.e. one whose frequency vanishes in the long wavelength limit). In the case of phonons, the broken generators are the symmetries under spatial translation in the x, y, and z directions. The crystal selects a particular location for its center-of-mass, which breaks this symmetry. There are, accordingly, three gapless acoustic phonons.

Magnetic materials support another branch of elementary excitations known as spin waves, or magnons. In isotropic magnets, there is a global symmetry associated with rotations in internal spin space, described by the group SU(2). If the system spontaneously magnetizes, meaning there is long-ranged ferromagnetic order (↑↑↑↑), or long-ranged antiferromagnetic order (↑↓↑↓), then global spin rotation symmetry is broken. Typically a particular direction is chosen for the magnetic moment (or staggered moment, in the case of an antiferromagnet). Symmetry under rotations about this axis is then preserved, but rotations
⁴ This expression is not exact, since the different phonon modes couple to the fluctuations in u²_R with different amplitudes.
which do not preserve the selected axis are broken. In the most straightforward case, that of the antiferromagnet, there are two such rotations for SU(2), and concomitantly two gapless magnon branches, with linearly vanishing dispersions ω_a(k). The situation is more subtle in the case of ferromagnets, because the total magnetization is conserved by the dynamics (unlike the total staggered magnetization in the case of antiferromagnets). Another wrinkle arises if there are long-ranged interactions present.
For our purposes, we can safely ignore the deep physical reasons underlying the gaplessness of Goldstone bosons and simply posit a gapless dispersion relation of the form ω(k) = A|k|^σ. The density of states for this excitation branch is then

$$g(\omega) = \mathcal{C}\, \omega^{\frac{d}{\sigma} - 1}\, \Theta(\omega_c - \omega)\,, \qquad (4.146)$$

where 𝒞 is a constant and ω_c is the cutoff, which is the bandwidth for this excitation branch.⁵ Normalizing the density of states for this branch results in the identification ω_c = (d/𝒞σ)^{σ/d}.
The heat capacity is then found to be

$$C_V = N k_B\, \mathcal{C} \int_0^{\omega_c} d\omega\, \omega^{\frac{d}{\sigma} - 1} \left( \frac{\hbar \omega}{2 k_B T} \right)^{\! 2} {\rm csch}^2 \left( \frac{\hbar \omega}{2 k_B T} \right) = \frac{d}{\sigma}\, N k_B \left( \frac{2T}{\Theta} \right)^{\! d/\sigma} \phi\!\left( \frac{\Theta}{2T} \right)\,, \qquad (4.147)$$

where Θ = ħω_c/k_B and

$$\phi(x) = \int_0^x dt\, t^{\frac{d}{\sigma} + 1}\, {\rm csch}^2 t = \begin{cases} \dfrac{\sigma}{d}\, x^{d/\sigma} & x \to 0 \\[2mm] 2^{-d/\sigma}\, \Gamma\!\left( 2 + \dfrac{d}{\sigma} \right) \zeta\!\left( 1 + \dfrac{d}{\sigma} \right) & x \to \infty\,, \end{cases} \qquad (4.148)$$
(4.148)
which is a generalization of our earlier results. Once again, we recover Dulong-Petit for
k
B
T
c
, with C
V
(T
c
/k
B
) = Nk
B
.
In an isotropic ferromagnet, i.e. a ferromagnetic material where there is full SU(2) symmetry in internal spin space, the magnons have a k² dispersion. Thus, a bulk three-dimensional isotropic ferromagnet will exhibit a heat capacity due to spin waves which behaves as T^{3/2} at low temperatures. For sufficiently low temperatures this will overwhelm the phonon contribution, which behaves as T³.
⁵ If ω(k) = A k^σ, then 𝒞 = 2^{1−d} π^{−d/2} σ^{−1} A^{−d/σ} / Γ(d/2).
4.6 The Ideal Bose Gas
We already derived, in §4.3.2, expressions for n(T, z) and p(T, z) for the ideal Bose gas (IBG) with ballistic dispersion ε(p) = p²/2m. We found

$$n(T, z) = g\, \lambda_T^{-d}\, \zeta_{d/2}(z) \qquad (4.149)$$
$$p(T, z) = g\, k_B T\, \lambda_T^{-d}\, \zeta_{\frac{d}{2}+1}(z)\,, \qquad (4.150)$$

where g is the internal (e.g. spin) degeneracy of each single particle energy level, and

$$\zeta_q(z) = \sum_{n=1}^\infty \frac{z^n}{n^q}\,. \qquad (4.151)$$
For bosons with a spectrum bounded from below by ε_min = 0, the fugacity z = e^{μ/k_B T} takes values on the interval z ∈ [0, 1].⁶
Clearly n(T, z) = g λ_T^{−d} ζ_{d/2}(z) is an increasing function of z for fixed T. In fig. 4.5 we plot the function ζ_s(z) versus z for three different values of s. We note that the maximum value ζ_s(z = 1) is finite if s > 1. Thus, for d > 2, there is a maximum density n_max(T) = g ζ(d/2) λ_T^{−d}, which is an increasing function of temperature T. Put another way, if we fix the density n, then there is a critical temperature T_c below which there is no solution to the equation n = n(T, z). The critical temperature T_c(n) is then determined by the relation

$$n = g\, \zeta\!\left( \frac{d}{2} \right) \left( \frac{m k_B T_c}{2\pi \hbar^2} \right)^{\! d/2} \quad \Longrightarrow \quad k_B T_c = \frac{2\pi \hbar^2}{m} \left( \frac{n}{g\, \zeta\!\left( \frac{d}{2} \right)} \right)^{\! 2/d}\,. \qquad (4.152)$$

What happens for T < T_c?
To understand the low temperature phase of the ideal Bose gas, recall that the density n = N/V is formally written as a sum,

$$n = \frac{N}{V} = \frac{1}{V} \sum_\alpha \frac{1}{z^{-1}\, e^{\varepsilon_\alpha / k_B T} - 1}\,. \qquad (4.153)$$
We presume the lowest energy eigenvalue is ε_min = 0, with degeneracy g. We separate out this term from the above sum, writing

$$n = \frac{1}{V}\, \frac{g}{z^{-1} - 1} + \frac{1}{V} \sum_{\alpha\, (\varepsilon_\alpha > 0)} \frac{1}{z^{-1}\, e^{\varepsilon_\alpha / k_B T} - 1}\,. \qquad (4.154)$$
Now V^{−1} is of course very small, since V is thermodynamically large, but if μ → 0 then z^{−1} − 1 is also very small and the ratio can be finite. Indeed, if the density of k = 0 bosons n₀ is finite, then their total number N₀ satisfies

$$N_0 = V n_0 = \frac{1}{z^{-1} - 1} \quad \Longrightarrow \quad z = \frac{1}{1 + N_0^{-1}}\,. \qquad (4.155)$$
⁶ It is easy to see that the chemical potential for noninteracting bosons can never exceed the minimum value $\varepsilon_{\rm min}$ of the single particle dispersion.
Figure 4.5: The function $\zeta_s(z)$ versus $z$ for $s = \frac{1}{2}$, $s = \frac{3}{2}$, and $s = \frac{5}{2}$. Note that $\zeta_s(1) = \zeta(s)$ diverges for $s \le 1$.
The chemical potential is then

$$ \mu = k_B T \ln z = -k_B T \ln\big(1 + N_0^{-1}\big) \approx -\frac{k_B T}{N_0} \longrightarrow 0^- \,. \qquad (4.156) $$

In other words, the chemical potential is infinitesimally negative, because $N_0$ is assumed to be thermodynamically large.
According to eqn. 4.14, the contribution to the pressure from the $k = 0$ states is

$$ p_0 = -\frac{k_B T}{V}\,\ln(1 - z) = \frac{k_B T}{V}\,\ln\big(1 + N_0\big) \longrightarrow 0^+ \,. \qquad (4.157) $$

So the $k = 0$ bosons, which we identify as the condensate, contribute nothing to the pressure.
Having separated out the $k = 0$ mode, we can now replace the remaining sum over $\alpha$ by the usual integral over $k$. We then have

$$ T < T_c: \qquad n = n_0 + g\,\zeta\big(\tfrac{d}{2}\big)\,\lambda_T^{-d} \qquad (4.158) $$

$$ \phantom{T < T_c:} \qquad p = g\,\zeta\big(\tfrac{d}{2}+1\big)\,k_B T\,\lambda_T^{-d} \qquad (4.159) $$

and

$$ T > T_c: \qquad n = g\,\zeta_{d/2}(z)\,\lambda_T^{-d} \qquad (4.160) $$

$$ \phantom{T > T_c:} \qquad p = g\,\zeta_{d/2+1}(z)\,k_B T\,\lambda_T^{-d}\,. \qquad (4.161) $$
The condensate fraction $n_0/n$ is unity at $T = 0$, when all particles are in the condensate with $k = 0$, and decreases with increasing $T$ until $T = T_c$, at which point it vanishes identically. Explicitly, we have

$$ \frac{n_0(T)}{n} = 1 - \frac{g\,\zeta(d/2)}{n\,\lambda_T^{d}} = 1 - \left(\frac{T}{T_c(n)}\right)^{\!d/2}. \qquad (4.162) $$
Let us compute the internal energy $E$ for the ideal Bose gas. We have

$$ \frac{\partial}{\partial\beta}\,(\beta\Omega) = \Omega + \beta\,\frac{\partial\Omega}{\partial\beta} = \Omega - T\,\frac{\partial\Omega}{\partial T} = \Omega + TS \qquad (4.163) $$

and therefore

$$ E = \Omega + TS + \mu N = \mu N + \frac{\partial}{\partial\beta}\,(\beta\Omega) = V\left(\mu\,n - \frac{\partial}{\partial\beta}\,(\beta p)\right) \qquad (4.164) $$

$$ \phantom{E} = \tfrac{d}{2}\,g\,V\,k_B T\,\lambda_T^{-d}\,\zeta_{d/2+1}(z)\,. \qquad (4.165) $$

This expression is valid at all temperatures, both above and below $T_c$.
We now investigate the heat capacity $C_{V,N} = \big(\frac{\partial E}{\partial T}\big)_{V,N}$. Since we have been working in the GCE, it is very important to note that $N$ is held constant when computing $C_{V,N}$. We'll also restrict our attention to the case $d = 3$, since the ideal Bose gas does not condense at finite $T$ for $d \le 2$ and $d > 3$ is unphysical. While we're at it, we'll also set $g = 1$.
The number of particles is

$$ N = \begin{cases} N_0 + \zeta\big(\tfrac{3}{2}\big)\,\dfrac{V}{\lambda_T^3} & (T < T_c) \\[2mm] \dfrac{V}{\lambda_T^3}\,\zeta_{3/2}(z) & (T > T_c)\,, \end{cases} \qquad (4.166) $$

and the energy is

$$ E = \tfrac{3}{2}\,k_B T\,\frac{V}{\lambda_T^3}\,\zeta_{5/2}(z)\,. \qquad (4.167) $$
For $T < T_c$, we have $z = 1$ and

$$ C_{V,N} = \left(\frac{\partial E}{\partial T}\right)_{\!V,N} = \tfrac{15}{4}\,\zeta\big(\tfrac{5}{2}\big)\,k_B\,\frac{V}{\lambda_T^3}\,. \qquad (4.168) $$

The molar heat capacity is therefore

$$ c_{V,N}(T,n) = N_A \cdot \frac{C_{V,N}}{N} = \tfrac{15}{4}\,\zeta\big(\tfrac{5}{2}\big)\,R\,\big(n\,\lambda_T^3\big)^{-1}\,. \qquad (4.169) $$
For $T > T_c$, we have

$$ dE\big|_V = \tfrac{15}{4}\,k_B T\,\zeta_{5/2}(z)\,\frac{V}{\lambda_T^3}\,\frac{dT}{T} + \tfrac{3}{2}\,k_B T\,\zeta_{3/2}(z)\,\frac{V}{\lambda_T^3}\,\frac{dz}{z}\,, \qquad (4.170) $$

where we have invoked eqn. 4.69. Taking the differential of $N$, we have

$$ dN = \tfrac{3}{2}\,\zeta_{3/2}(z)\,\frac{V}{\lambda_T^3}\,\frac{dT}{T} + \zeta_{1/2}(z)\,\frac{V}{\lambda_T^3}\,\frac{dz}{z}\,. \qquad (4.171) $$
Figure 4.6: Molar heat capacity of the ideal Bose gas. Note the cusp at $T = T_c$.
We set $dN = 0$, which fixes $dz$ in terms of $dT$, resulting in

$$ c_{V,N}(T,z) = \tfrac{3}{2}\,R\left[\frac{\tfrac{5}{2}\,\zeta_{5/2}(z)}{\zeta_{3/2}(z)} - \frac{\tfrac{3}{2}\,\zeta_{3/2}(z)}{\zeta_{1/2}(z)}\right]. \qquad (4.172) $$
To obtain $c_{V,N}(T,n)$, we must invert the relation

$$ n(T,z) = \lambda_T^{-3}\,\zeta_{3/2}(z) \qquad (4.173) $$

in order to obtain $z(T,n)$, and then insert this into eqn. 4.172. The results are shown in fig. 4.6. There are several noteworthy features of this plot. First of all, by dimensional analysis the function $c_{V,N}(T,n)$ is $R$ times a function of the dimensionless ratio $T/T_c(n) \propto T\,n^{-2/3}$. Second, the high temperature limit is $\tfrac{3}{2}R$, which is the classical value. Finally, there is a cusp at $T = T_c(n)$.
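Both limiting behaviors of eqn. 4.172 can be reproduced by brute force: evaluate the series definition of $\zeta_q(z)$ at small $z$ (classical limit) and at $z$ slightly below 1, where $\zeta_{1/2}(z)$ blows up and kills the second term. A Python sketch, with an arbitrary series cutoff:

```python
def zeta_q(z, q, nmax=200_000):
    # truncated series for the polylogarithm zeta_q(z), eqn. 4.151
    return sum(z**k / k**q for k in range(1, nmax + 1))

def cV_over_R(z):
    """Molar heat capacity of the d = 3 ideal Bose gas above T_c (eqn. 4.172), in units of R."""
    return 1.5 * (2.5 * zeta_q(z, 2.5) / zeta_q(z, 1.5)
                  - 1.5 * zeta_q(z, 1.5) / zeta_q(z, 0.5))

print(cV_over_R(1e-3))   # dilute limit: -> 3/2, the classical value
print(cV_over_R(0.999))  # near z = 1: approaches 15 zeta(5/2) / 4 zeta(3/2) ~ 1.93
```

The value at $z = 1$ matches eqn. 4.168 evaluated at $T = T_c$, which is how the two branches of the heat capacity join continuously at the cusp.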
4.6.1 Isotherms for the ideal Bose gas

Let $a$ be some length scale and define

$$ v_a = a^3 \;,\qquad p_a = \frac{2\pi\hbar^2}{m a^5} \;,\qquad T_a = \frac{2\pi\hbar^2}{m a^2 k_B}\,. \qquad (4.174) $$

Then we have

$$ \frac{v_a}{v} = \left(\frac{T}{T_a}\right)^{\!3/2} \zeta_{3/2}(z) + v_a\,n_0 \qquad (4.175) $$

$$ \frac{p}{p_a} = \left(\frac{T}{T_a}\right)^{\!5/2} \zeta_{5/2}(z)\,, \qquad (4.176) $$
Figure 4.7: Phase diagrams for the ideal Bose gas. Left panel: $(p,v)$ plane. The solid blue curves are isotherms, and the green hatched region denotes $v < v_c(T)$, where the system is partially condensed. Right panel: $(p,T)$ plane. The solid red curve is the coexistence curve $p_c(T)$, along which Bose condensation occurs. No distinct thermodynamic phase exists in the yellow hatched region above $p = p_c(T)$.
where $v = V/N$ is the volume per particle⁷ and $n_0$ is the condensate number density; $n_0$ vanishes for $T \ge T_c$, while $z = 1$ for $T \le T_c$. Note that the pressure is independent of volume for $T < T_c$. The isotherms in the $(p,v)$ plane are then flat for $v < v_c$. This resembles the coexistence region familiar from our study of the thermodynamics of the liquid-gas transition.
Recall the Gibbs-Duhem equation,

$$ d\mu = -s\,dT + v\,dp\,. \qquad (4.177) $$

Along a coexistence curve, we have the Clausius-Clapeyron relation,

$$ \left(\frac{dp}{dT}\right)_{\!\rm coex} = \frac{s_2 - s_1}{v_2 - v_1} = \frac{\ell}{T\,\Delta v}\,, \qquad (4.178) $$

where $\ell = T\,(s_2 - s_1)$ is the latent heat per mole, and $\Delta v = v_2 - v_1$. For ideal gas Bose condensation, the coexistence curve resembles the red curve in the right hand panel of fig. 4.7. There is no meaning to the shaded region where $p > p_c(T)$. Nevertheless, it is tempting to associate the curve $p = p_c(T)$ with the coexistence of the $k = 0$ condensate and the remaining uncondensed ($k \neq 0$) bosons.⁸
The entropy in the coexistence region is given by

$$ s = -\frac{1}{N}\left(\frac{\partial\Omega}{\partial T}\right)_{\!V} = \tfrac{5}{2}\,\zeta\big(\tfrac{5}{2}\big)\,k_B\,\frac{v}{\lambda_T^3} = \tfrac{5}{2}\,\frac{\zeta\big(\tfrac{5}{2}\big)}{\zeta\big(\tfrac{3}{2}\big)}\,k_B\left(1 - \frac{n_0}{n}\right). \qquad (4.179) $$

⁷ Note that in the thermodynamics chapter we used $v$ to denote the molar volume, $N_A\,V/N$.

⁸ The $k \neq 0$ particles are sometimes called the overcondensate.
Figure 4.8: Phase diagram of ⁴He. All phase boundaries are first order transition lines, with the exception of the normal liquid-superfluid transition, which is second order. (Source: University of Helsinki)
All the entropy is thus carried by the uncondensed bosons, and the condensate carries zero entropy. The Clausius-Clapeyron relation can then be interpreted as describing a phase equilibrium between the condensate, for which $s_0 = v_0 = 0$, and the uncondensed bosons, for which $s' = s(T)$ and $v' = v_c(T)$. So this identification forces us to conclude that the specific volume of the condensate is zero. This is certainly false in an interacting Bose gas! While one can identify, by analogy, a latent heat $\ell = T\,\Delta s = Ts$ in the Clapeyron equation, it is important to understand that there is no distinct thermodynamic phase associated with the region $p > p_c(T)$. Ideal Bose gas condensation is a second order transition, and not a first order transition.
4.6.2 The λ-transition in Liquid ⁴He

Helium has two stable isotopes. ⁴He is a boson, consisting of two protons, two neutrons, and two electrons (hence an even number of fermions). ³He is a fermion, with one less neutron than ⁴He. Each ⁴He atom can be regarded as a tiny hard sphere of mass $m = 6.65\times 10^{-24}\,$g and diameter $a = 2.65\,$Å. A sketch of the phase diagram is shown in fig. 4.8. At atmospheric pressure, helium liquefies at $T_l = 4.2\,$K. The gas-liquid transition is first order, as usual. However, as one continues to cool, a second transition sets in at $T = T_\lambda = 2.17\,$K (at $p = 1\,$atm). The λ-transition, so named for the λ-shaped anomaly in the specific heat in the vicinity of the transition, as shown in fig. 4.9, is continuous (i.e. second order).
If we pretend that ⁴He is a noninteracting Bose gas, then from the density of the liquid, $n = 2.2\times 10^{22}\,{\rm cm}^{-3}$, we obtain a Bose-Einstein condensation temperature $T_c = \frac{2\pi\hbar^2}{m k_B}\big(n/\zeta(\tfrac{3}{2})\big)^{2/3} = 3.16\,$K, which is in the right ballpark. The specific heat $C_p(T)$ is found to be singular at $T = T_\lambda$, with

$$ C_p(T) = A\,\big| T - T_\lambda(p) \big|^{-\alpha}. \qquad (4.180) $$
Figure 4.9: Specific heat of liquid ⁴He in the vicinity of the λ-transition. Data from M. J. Buckingham and W. M. Fairbank, in Progress in Low Temperature Physics, C. J. Gorter, ed. (North-Holland, 1961). Inset at upper right: more recent data of J. A. Lipa et al., Phys. Rev. B 68, 174518 (2003), performed in zero gravity earth orbit, to within $\Delta T = 2\,$nK of the transition.
$\alpha$ is an example of a critical exponent. We shall study the physics of critical phenomena later on in this course. For now, note that a cusp singularity of the type found in fig. 4.6 corresponds to $\alpha = -1$. The behavior of $C_p(T)$ in ⁴He is very nearly logarithmic in $|T - T_\lambda|$. In fact, both theory (renormalization group on the O(2) model) and experiment concur that $\alpha$ is almost zero but in fact slightly negative, with $\alpha = -0.0127 \pm 0.0003$ in the best experiments (Lipa et al., 2003). The transition is most definitely not an ideal Bose gas condensation. Theoretically, in the parlance of critical phenomena, IBG condensation and the λ-transition in ⁴He lie in different universality classes.⁹ Unlike the IBG, the condensed phase in ⁴He is a distinct thermodynamic phase, known as a superfluid.

Note that $C_p(T < T_c)$ for the IBG is not even defined, since for $T < T_c$ we have $p = p(T)$ and therefore $dp = 0$ requires $dT = 0$.
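The ballpark estimate $T_c = 3.16\,$K quoted above is quick to reproduce; a minimal sketch with standard (rounded) constants, not taken from the notes:

```python
import math

hbar = 1.0546e-34    # J s
kB   = 1.3807e-23    # J / K
m    = 6.65e-27      # kg, mass of a 4He atom (6.65e-24 g)
n    = 2.2e28        # m^-3, i.e. 2.2e22 cm^-3
zeta_3_2 = 2.6124    # zeta(3/2)

# Tc from eqn. 4.152 with g = 1, d = 3
Tc = (2 * math.pi * hbar**2 / (m * kB)) * (n / zeta_3_2)**(2.0 / 3.0)
print(Tc)   # ~3.1-3.2 K, to be compared with T_lambda = 2.17 K
```

That the noninteracting estimate lands within 50% of $T_\lambda$ is suggestive, but, as discussed above, the two transitions are in different universality classes.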
⁹ IBG condensation is in the universality class of the spherical model. The λ-transition is in the universality class of the XY model.
Figure 4.10: The fountain effect. In each case, a temperature gradient is maintained across a porous plug through which only superfluid can flow. This results in a pressure gradient which can result in a fountain or an elevated column in a U-tube.
4.6.3 Fountain effect in superfluid ⁴He

At temperatures $T < T_\lambda$, liquid ⁴He has a superfluid component which is a type of Bose condensate. In fact, there is an important difference between the condensate fraction $N_{k=0}/N$ and the superfluid density, which is denoted by the symbol $\rho_s$. In ⁴He, for example, at $T = 0$ the condensate fraction is only about 8%, while the superfluid fraction $\rho_s/\rho = 1$. The distinction between $N_0$ and $\rho_s$ is very interesting but lies beyond the scope of this course.

One aspect of the superfluid state is its complete absence of viscosity. For this reason, superfluids can flow through tiny cracks called microleaks that will not pass normal fluid. Consider then a porous plug which permits the passage of superfluid but not of normal fluid. The key feature of the superfluid component is that it has zero energy density. Therefore even though there is a transfer of particles across the plug, there is no energy exchange, and therefore a temperature gradient across the plug can be maintained.¹⁰

The elementary excitations in the superfluid state are sound waves called phonons. They are compressional waves, just like longitudinal phonons in a solid, but here in a liquid. Their dispersion is acoustic, given by $\omega(k) = ck$, where $c = 238\,$m/s.¹¹ They have no internal degrees of freedom, hence $g = 1$. Like phonons in a solid, the phonons in liquid helium are not conserved. Hence their chemical potential vanishes and these excitations are described by photon statistics. We can now compute the height difference $\Delta h$ in a U-tube experiment. Clearly $\Delta h = \Delta p/\rho g$, so we must find $p(T)$ for the helium. In the grand canonical ensemble,
¹⁰ Recall that two bodies in thermal equilibrium will have identical temperatures if they are free to exchange energy.

¹¹ The phonon velocity $c$ is slightly temperature dependent.
we have

$$ p = -\Omega/V = -k_B T \int\!\frac{d^3\!k}{(2\pi)^3}\,\ln\big(1 - e^{-\hbar c k/k_B T}\big) \qquad (4.181) $$

$$ \phantom{p} = -\frac{(k_B T)^4}{(\hbar c)^3}\,\frac{4\pi}{8\pi^3} \int\limits_0^\infty\! du\,u^2\,\ln\big(1 - e^{-u}\big) = \frac{\pi^2}{90}\,\frac{(k_B T)^4}{(\hbar c)^3}\,. \qquad (4.182) $$
Let's assume $T = 1\,$K. We'll need the density of liquid helium, $\rho = 148\,{\rm kg/m}^3$.

$$ \frac{dh}{dT} = \frac{2\pi^2}{45}\left(\frac{k_B T}{\hbar c}\right)^{\!3} \frac{k_B}{\rho\,g} \qquad (4.183) $$

$$ \phantom{\frac{dh}{dT}} = \frac{2\pi^2}{45}\left[\frac{(1.38\times 10^{-23}\,{\rm J/K})(1\,{\rm K})}{(1.055\times 10^{-34}\,{\rm J\,s})(238\,{\rm m/s})}\right]^{3} \times \frac{1.38\times 10^{-23}\,{\rm J/K}}{(148\,{\rm kg/m}^3)(9.8\,{\rm m/s}^2)} \qquad (4.184) $$

$$ \simeq 32\ {\rm cm/K}\,, \qquad (4.185) $$

a very noticeable effect!
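Evaluating eqn. 4.183 directly is a useful sanity check on the order of magnitude; a sketch using the constants quoted in the text:

```python
import math

hbar = 1.055e-34   # J s
kB   = 1.38e-23    # J / K
c    = 238.0       # m/s, phonon velocity in the superfluid
rho  = 148.0       # kg / m^3, density of liquid helium
grav = 9.8         # m / s^2
T    = 1.0         # K

# eqn. 4.183: dh/dT = (2 pi^2 / 45) (kB T / hbar c)^3 kB / (rho g)
dh_dT = (2 * math.pi**2 / 45) * (kB * T / (hbar * c))**3 * kB / (rho * grav)
print(dh_dT)   # in m/K; of order tens of centimeters per kelvin
```

Since $p \propto T^4$, the effect grows very rapidly as one warms toward $T_\lambda$.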
4.6.4 Bose condensation in optical traps

The 2001 Nobel Prize in Physics was awarded to Wieman, Cornell, and Ketterle for the experimental observation of Bose condensation in dilute atomic gases. The experimental techniques required to trap and cool such systems are a true tour de force, and we shall not enter into a discussion of the details here.¹²

The optical trapping of neutral bosonic atoms, such as ⁸⁷Rb, results in a confining potential $V(\mathbf{r})$ which is quadratic in the atomic positions. Thus, the single particle Hamiltonian for a given atom is written

$$ \hat{H} = -\frac{\hbar^2}{2m}\,\nabla^2 + \tfrac{1}{2}\,m\left(\omega_1^2\,x^2 + \omega_2^2\,y^2 + \omega_3^2\,z^2\right), \qquad (4.186) $$

where $\omega_{1,2,3}$ are the angular frequencies of the trap. This is an anisotropic three-dimensional harmonic oscillator, the solution of which is separable into a product of one-dimensional harmonic oscillator wavefunctions. The eigenspectrum is then given by a sum of one-dimensional spectra, viz.

$$ E_{n_1,n_2,n_3} = \big(n_1 + \tfrac{1}{2}\big)\hbar\omega_1 + \big(n_2 + \tfrac{1}{2}\big)\hbar\omega_2 + \big(n_3 + \tfrac{1}{2}\big)\hbar\omega_3\,. \qquad (4.187) $$

¹² Many reliable descriptions may be found on the web. Check Wikipedia, for example.
According to eqn. 4.16, the number of particles in the system is

$$ N = \sum_{n_1=0}^{\infty}\sum_{n_2=0}^{\infty}\sum_{n_3=0}^{\infty} \Big[ y^{-1}\,e^{n_1\hbar\omega_1/k_B T}\,e^{n_2\hbar\omega_2/k_B T}\,e^{n_3\hbar\omega_3/k_B T} - 1 \Big]^{-1} \qquad (4.188) $$

$$ \phantom{N} = \sum_{k=1}^{\infty} y^k \left(\frac{1}{1 - e^{-k\hbar\omega_1/k_B T}}\right) \left(\frac{1}{1 - e^{-k\hbar\omega_2/k_B T}}\right) \left(\frac{1}{1 - e^{-k\hbar\omega_3/k_B T}}\right), \qquad (4.189) $$

where we've defined

$$ y \equiv e^{\mu/k_B T}\,e^{-\hbar\omega_1/2k_B T}\,e^{-\hbar\omega_2/2k_B T}\,e^{-\hbar\omega_3/2k_B T}\,. \qquad (4.190) $$

Note that $y \in [0,1]$.
Let's assume that the trap is approximately isotropic, which entails that the frequency ratios $\omega_1/\omega_2$ etc. are all numbers on the order of one. Let us further assume that $k_B T \gg \hbar\omega_{1,2,3}$. Then

$$ \frac{1}{1 - e^{-k\hbar\omega_j/k_B T}} \approx \begin{cases} \dfrac{k_B T}{k\,\hbar\omega_j} & k < k^*(T) \\[2mm] 1 & k > k^*(T) \end{cases} \qquad (4.191) $$

where $k^*(T) = k_B T/\hbar\bar\omega \gg 1$, with

$$ \bar\omega = \big(\omega_1\,\omega_2\,\omega_3\big)^{1/3}. \qquad (4.192) $$
We then have

$$ N(T,y) \approx \frac{y^{k^*+1}}{1-y} + \left(\frac{k_B T}{\hbar\bar\omega}\right)^{\!3} \sum_{k=1}^{k^*} \frac{y^k}{k^3}\,, \qquad (4.193) $$

where the first term on the RHS is due to $k > k^*$ and the second term from $k \le k^*$ in the previous sum. Since $k^* \gg 1$ and since the sum of inverse cubes is convergent, we may safely extend the limit on the above sum to infinity. To help make more sense of the first term, write $N_0 = \big(y^{-1} - 1\big)^{-1}$ for the number of particles in the $(n_1,n_2,n_3) = (0,0,0)$ state. Then

$$ y = \frac{N_0}{N_0 + 1}\,. \qquad (4.194) $$
This is true always. The issue vis-a-vis Bose-Einstein condensation is whether $N_0 \gg 1$. At any rate, we now see that we can write

$$ N \approx N_0\,\big(1 + N_0^{-1}\big)^{-k^*} + \left(\frac{k_B T}{\hbar\bar\omega}\right)^{\!3} \zeta_3(y)\,. \qquad (4.195) $$

As for the first term, we have

$$ N_0\,\big(1 + N_0^{-1}\big)^{-k^*} = \begin{cases} 0 & N_0 \ll k^* \\ N_0 & N_0 \gg k^*\,. \end{cases} \qquad (4.196) $$
Thus, as in the case of IBG condensation of ballistic particles, we identify the critical temperature by the condition $y = N_0/(N_0+1) \approx 1$, and we have

$$ T_c = \frac{\hbar\bar\omega}{k_B}\left(\frac{N}{\zeta(3)}\right)^{\!1/3} = 4.5\left(\frac{\bar\nu}{100\ {\rm Hz}}\right) N^{1/3}\ [{\rm nK}]\,, \qquad (4.197) $$

where $\bar\nu = \bar\omega/2\pi$. We see that $k_B T_c \gg \hbar\bar\omega$ if the number of particles in the trap is large: $N \gg 1$. In this regime, we have

$$ T < T_c: \qquad N = N_0 + \zeta(3)\left(\frac{k_B T}{\hbar\bar\omega}\right)^{\!3} \qquad (4.198) $$

$$ T > T_c: \qquad N = \left(\frac{k_B T}{\hbar\bar\omega}\right)^{\!3} \zeta_3(y)\,. \qquad (4.199) $$
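The handy numerical form of eqn. 4.197 is simple to verify; a sketch (the trap frequency and atom number below are arbitrary illustrative choices):

```python
import math

h  = 6.626e-34    # J s
kB = 1.3807e-23   # J / K
zeta3 = 1.20206   # zeta(3)

def Tc_trap(nu_bar, N):
    """Eqn. 4.197 with hbar * omega_bar = h * nu_bar: condensation temperature
    of N bosons in a 3D harmonic trap of geometric-mean frequency nu_bar (Hz)."""
    return (h * nu_bar / kB) * (N / zeta3)**(1.0 / 3.0)

# 10^6 atoms in a 100 Hz trap: 4.5 * (10^6)^{1/3} nK = 450 nK
print(Tc_trap(100.0, 1e6) * 1e9)   # ~450 nK
```

Such nanokelvin scales are why evaporative cooling is needed on top of laser cooling in these experiments.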
It is interesting to note that BEC can also occur in two-dimensional traps, which is to say traps which are very anisotropic, with oblate equipotential surfaces $V(\mathbf{r}) = V_0$. This happens when $\hbar\omega_3 \gg k_B T \gg \hbar\omega_{1,2}$. We then have

$$ T_c^{(d=2)} = \frac{\hbar\bar\omega}{k_B}\left(\frac{6N}{\pi^2}\right)^{\!1/2} \qquad (4.200) $$

with $\bar\omega = \big(\omega_1\,\omega_2\big)^{1/2}$. The particle number then obeys a set of equations like those in eqns. 4.198 and 4.199, mutatis mutandis.¹³
For extremely prolate traps, with $\omega_3 \ll \omega_{1,2}$, the situation is different because $\zeta_1(y)$ diverges for $y = 1$. We then have

$$ N = N_0 + \frac{k_B T}{\hbar\omega_3}\,\ln\big(1 + N_0\big)\,. \qquad (4.201) $$

Here we have simply replaced $y$ by the equivalent expression $N_0/(N_0+1)$. If our criterion for condensation is that $N_0 = \alpha N$, where $\alpha$ is some fractional value, then we have

$$ T_c(\alpha) = (1-\alpha)\,\frac{\hbar\omega_3}{k_B}\,\frac{N}{\ln N}\,. \qquad (4.202) $$
4.6.5 Example problem from Fall 2004 UCSD graduate written exam

PROBLEM: A three-dimensional gas of noninteracting bosonic particles obeys the dispersion relation $\varepsilon(k) = A\,k^{1/2}$.

(a) Obtain an expression for the density $n(T,z)$, where $z = \exp(\mu/k_B T)$ is the fugacity. Simplify your expression as best you can, adimensionalizing any integral or infinite sum which may appear. You may find it convenient to define

$$ \zeta_\nu(z) \equiv \frac{1}{\Gamma(\nu)} \int\limits_0^\infty\! dt\,\frac{t^{\nu-1}}{z^{-1}\,e^t - 1} = \sum_{k=1}^{\infty} \frac{z^k}{k^\nu}\,. \qquad (4.203) $$
¹³ Explicitly, one replaces $\zeta(3)$ with $\zeta(2) = \frac{\pi^2}{6}$, $\zeta_3(y)$ with $\zeta_2(y)$, and $\big(k_B T/\hbar\bar\omega\big)^3$ with $\big(k_B T/\hbar\bar\omega\big)^2$.
Note $\zeta_\nu(1) = \zeta(\nu)$, the Riemann zeta function.

(b) Find the critical temperature for Bose condensation, $T_c(n)$. Your expression should only include the density $n$, the constant $A$, physical constants, and numerical factors (which may be expressed in terms of integrals or infinite sums).

(c) What is the condensate density $n_0$ when $T = \frac{1}{2} T_c$?

(d) Do you expect the second virial coefficient to be positive or negative? Explain your reasoning. (You don't have to do any calculation.)
SOLUTION: We work in the grand canonical ensemble, using Bose-Einstein statistics.

(a) The density for Bose-Einstein particles is given by

$$ n(T,z) = \int\!\frac{d^3\!k}{(2\pi)^3}\,\frac{1}{z^{-1}\exp(A k^{1/2}/k_B T) - 1} = \frac{1}{\pi^2}\left(\frac{k_B T}{A}\right)^{\!6} \int\limits_0^\infty\! ds\,\frac{s^5}{z^{-1} e^s - 1} = \frac{120}{\pi^2}\left(\frac{k_B T}{A}\right)^{\!6} \zeta_6(z)\,, \qquad (4.204) $$

where we have changed integration variables from $k$ to $s = A k^{1/2}/k_B T$, and we have defined the functions $\zeta_\nu(z)$ as above, in eqn. 4.203. Note $\zeta_\nu(1) = \zeta(\nu)$, the Riemann zeta function.
(b) Bose condensation sets in for $z = 1$, i.e. $\mu = 0$. Thus, the critical temperature $T_c$ and the density $n$ are related by

$$ n = \frac{120\,\zeta(6)}{\pi^2}\left(\frac{k_B T_c}{A}\right)^{\!6}, \qquad (4.205) $$

or

$$ T_c(n) = \frac{A}{k_B}\left(\frac{\pi^2\,n}{120\,\zeta(6)}\right)^{\!1/6}. \qquad (4.206) $$
(c) For $T < T_c$, we have

$$ n = n_0 + \frac{120\,\zeta(6)}{\pi^2}\left(\frac{k_B T}{A}\right)^{\!6} = n_0 + \left(\frac{T}{T_c}\right)^{\!6} n\,, \qquad (4.207) $$

where $n_0$ is the condensate density. Thus, at $T = \frac{1}{2} T_c$,

$$ n_0\big(T = \tfrac{1}{2} T_c\big) = \tfrac{63}{64}\,n\,. \qquad (4.208) $$
(d) The virial expansion of the equation of state is

$$ p = n k_B T \left(1 + B_2(T)\,n + B_3(T)\,n^2 + \ldots \right). $$

We expect $B_2(T) < 0$ for noninteracting bosons, reflecting the tendency of the bosons to condense. (Correspondingly, for noninteracting fermions we expect $B_2(T) > 0$.) For the curious, we compute $B_2(T)$ by eliminating the fugacity $z$ from the equations for $n(T,z)$ and $p(T,z)$. First, we find $p(T,z)$:

$$ p(T,z) = -k_B T \int\!\frac{d^3\!k}{(2\pi)^3}\,\ln\Big(1 - z\exp(-A k^{1/2}/k_B T)\Big) = -\frac{k_B T}{\pi^2}\left(\frac{k_B T}{A}\right)^{\!6} \int\limits_0^\infty\! ds\,s^5\,\ln\big(1 - z\,e^{-s}\big) = \frac{120\,k_B T}{\pi^2}\left(\frac{k_B T}{A}\right)^{\!6} \zeta_7(z)\,. \qquad (4.209) $$

Expanding in powers of the fugacity, we have

$$ n = \frac{120}{\pi^2}\left(\frac{k_B T}{A}\right)^{\!6}\left(z + \frac{z^2}{2^6} + \frac{z^3}{3^6} + \ldots\right) \qquad (4.210) $$

$$ \frac{p}{k_B T} = \frac{120}{\pi^2}\left(\frac{k_B T}{A}\right)^{\!6}\left(z + \frac{z^2}{2^7} + \frac{z^3}{3^7} + \ldots\right). \qquad (4.211) $$

Solving for $z(n)$ using the first equation, we obtain, to order $n^2$,

$$ z = \left(\frac{\pi^2 A^6\,n}{120\,(k_B T)^6}\right) - \frac{1}{2^6}\left(\frac{\pi^2 A^6\,n}{120\,(k_B T)^6}\right)^{\!2} + \mathcal{O}(n^3)\,. \qquad (4.212) $$

Plugging this into the equation for $p(T,z)$, we obtain the first nontrivial term in the virial expansion, with

$$ B_2(T) = -\frac{\pi^2}{15360}\left(\frac{A}{k_B T}\right)^{\!6}, \qquad (4.213) $$

which is negative, as expected. Note also that the ideal gas law is recovered for $T \to \infty$, at fixed $n$.
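The series elimination of $z$ can be checked without any algebra: evaluate the truncated series 4.210 and 4.211 at a small fugacity and extract $B_2$ from $p/nk_BT - 1 \approx B_2\,n$. A sketch, working in units where $k_B T/A = 1$:

```python
import math

C = 120.0 / math.pi**2   # common prefactor of eqns. 4.210-4.211 in these units

def n_of_z(z):           # eqn. 4.210, truncated
    return C * sum(z**k / k**6 for k in range(1, 8))

def p_of_z(z):           # eqn. 4.211 (p in units of kB T), truncated
    return C * sum(z**k / k**7 for k in range(1, 8))

z = 1e-4
n, p = n_of_z(z), p_of_z(z)
B2_numeric = (p / n - 1.0) / n      # from p = n kB T (1 + B2 n + ...)
B2_exact = -math.pi**2 / 15360.0    # eqn. 4.213 in the same units
print(B2_numeric, B2_exact)         # both ~ -6.43e-4
```

Taking $z$ small isolates the leading correction, so the numerical estimate matches eqn. 4.213 to high accuracy.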
4.7 The Ideal Fermi Gas

The grand potential of the ideal Fermi gas is, per eqn. 4.14,

$$ \Omega(T,V,\mu) = -V k_B T \sum_\alpha \ln\Big(1 + e^{\mu/k_B T}\,e^{-\varepsilon_\alpha/k_B T}\Big) \qquad (4.214) $$

$$ \phantom{\Omega(T,V,\mu)} = -V k_B T \int\limits_{-\infty}^{\infty}\! d\varepsilon\,g(\varepsilon)\,\ln\Big(1 + e^{(\mu-\varepsilon)/k_B T}\Big)\,. \qquad (4.215) $$
Figure 4.11: The Fermi distribution, $f(\epsilon) = \big[\exp(\epsilon/k_B T) + 1\big]^{-1}$. Here we have set $k_B = 1$ and taken $\mu = 2$, with $T = \frac{1}{20}$ (blue), $T = \frac{3}{4}$ (green), and $T = 2$ (red). In the $T \to 0$ limit, $f(\epsilon)$ approaches a step function $\Theta(-\epsilon)$.
The average number of particles in a state with energy $\varepsilon$ is

$$ n(\varepsilon) = \frac{1}{e^{(\varepsilon-\mu)/k_B T} + 1}\,, \qquad (4.216) $$

hence the total number of particles is

$$ N = V \int\limits_{-\infty}^{\infty}\! d\varepsilon\,g(\varepsilon)\,\frac{1}{e^{(\varepsilon-\mu)/k_B T} + 1}\,. \qquad (4.217) $$
4.7.1 The Fermi distribution

We define the function

$$ f(\epsilon) \equiv \frac{1}{e^{\epsilon/k_B T} + 1}\,, \qquad (4.218) $$

known as the Fermi distribution. In the $T \to \infty$ limit, $f(\epsilon) \to \frac{1}{2}$ for all finite values of $\epsilon$. As $T \to 0$, $f(\epsilon)$ approaches a step function $\Theta(-\epsilon)$. The average number of particles in a state of energy $\varepsilon$ in a system at temperature $T$ and chemical potential $\mu$ is $n(\varepsilon) = f(\varepsilon - \mu)$. In fig. 4.11 we plot $f(\varepsilon - \mu)$ versus $\varepsilon$ for three representative temperatures.
4.7.2 T = 0 and the Fermi surface

At $T = 0$, we therefore have $n(\varepsilon) = \Theta(\mu - \varepsilon)$, which says that all single particle energy states up to $\varepsilon = \mu$ are filled, and all energy states above $\varepsilon = \mu$ are empty. We call $\mu(T=0)$ the Fermi energy: $\varepsilon_F = \mu(T=0)$. If the single particle dispersion $\varepsilon(\mathbf{k})$ depends only on the wavevector $\mathbf{k}$, then the locus of points in $k$-space for which $\varepsilon(\mathbf{k}) = \varepsilon_F$ is called the Fermi surface. For isotropic systems, $\varepsilon(\mathbf{k}) = \varepsilon(k)$ is a function only of the magnitude $k = |\mathbf{k}|$, and the Fermi surface is a sphere in $d = 3$ or a circle in $d = 2$. The radius of this circle is the Fermi wavevector, $k_F$. When there is an internal (e.g. spin) degree of freedom, there is a Fermi surface and Fermi wavevector (for isotropic systems) for each polarization state of the internal degree of freedom.

Let's compute the Fermi wavevector $k_F$ and Fermi energy $\varepsilon_F$ for the IFG with a ballistic dispersion $\varepsilon(k) = \hbar^2 k^2/2m$. The number density is

$$ n = g \int\!\frac{d^d\!k}{(2\pi)^d}\;\Theta(k_F - k) = \frac{g\,\Omega_d}{(2\pi)^d}\,\frac{k_F^d}{d} = \begin{cases} g\,k_F/\pi & (d=1) \\ g\,k_F^2/4\pi & (d=2) \\ g\,k_F^3/6\pi^2 & (d=3)\,. \end{cases} \qquad (4.219) $$
Note that the form of $n(k_F)$ is independent of the dispersion relation, so long as it remains isotropic. Inverting the above expressions, we obtain $k_F(n)$:

$$ k_F = 2\pi \left(\frac{d\,n}{g\,\Omega_d}\right)^{\!1/d} = \begin{cases} \pi n/g & (d=1) \\ \big(4\pi n/g\big)^{1/2} & (d=2) \\ \big(6\pi^2 n/g\big)^{1/3} & (d=3)\,. \end{cases} \qquad (4.220) $$
The Fermi energy in each case, for ballistic dispersion, is therefore

$$ \varepsilon_F = \frac{\hbar^2 k_F^2}{2m} = \frac{2\pi^2\hbar^2}{m}\left(\frac{d\,n}{g\,\Omega_d}\right)^{\!2/d} = \begin{cases} \dfrac{\pi^2\hbar^2 n^2}{2 g^2 m} & (d=1) \\[2mm] \dfrac{2\pi\hbar^2 n}{g\,m} & (d=2) \\[2mm] \dfrac{\hbar^2}{2m}\left(\dfrac{6\pi^2 n}{g}\right)^{\!2/3} & (d=3)\,. \end{cases} \qquad (4.221) $$
Another useful result for the ballistic dispersion, which follows from the above, is that the density of states at the Fermi level is given by

$$ g(\varepsilon_F) = \frac{g\,\Omega_d}{(2\pi)^d}\,\frac{m\,k_F^{d-2}}{\hbar^2} = \frac{d}{2}\,\frac{n}{\varepsilon_F}\,. \qquad (4.222) $$

For the electron gas, we have $g = 2$. In a metal, one typically has $k_F \sim 0.5\,$Å$^{-1}$ to $2\,$Å$^{-1}$, and $\varepsilon_F \sim 1\,{\rm eV}$ to $10\,{\rm eV}$. Due to the effects of the crystalline lattice, electrons in a solid behave as if they had an effective mass $m^*$ which is typically on the order of the electron mass but very often about an order of magnitude smaller, particularly in semiconductors.
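Plugging a typical metallic density into eqns. 4.220 and 4.221 lands squarely in the quoted ranges; a sketch (the density below is a copper-like value, chosen for illustration):

```python
import math

hbar = 1.0546e-34   # J s
me   = 9.109e-31    # kg, free electron mass
eV   = 1.602e-19    # J

n  = 8.5e28                              # m^-3, copper-like electron density
kF = (3 * math.pi**2 * n)**(1.0 / 3.0)   # eqn. 4.220 with g = 2, d = 3
eF = hbar**2 * kF**2 / (2 * me)          # eqn. 4.221

print(kF * 1e-10)   # ~1.4 inverse Angstroms
print(eF / eV)      # ~7 eV
```

Note that $\varepsilon_F/k_B \sim 10^4$ to $10^5\,$K, so metals at room temperature are deep in the degenerate regime $k_B T \ll \varepsilon_F$.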
Nonisotropic dispersions $\varepsilon(\mathbf{k})$ are more interesting in that they give rise to non-spherical Fermi surfaces. The simplest example is that of a two-dimensional tight-binding model of electrons hopping on a square lattice, as may be appropriate in certain layered materials. The dispersion relation is then

$$ \varepsilon(k_x,k_y) = -2t\cos(k_x a) - 2t\cos(k_y a)\,, \qquad (4.223) $$

where $k_x$ and $k_y$ are confined to the interval $\big[-\frac{\pi}{a},\frac{\pi}{a}\big]$. The quantity $t$ has dimensions of energy and is known as the hopping integral. The Fermi surface is the set of points $(k_x,k_y)$ which satisfies $\varepsilon(k_x,k_y) = \varepsilon_F$. When $\varepsilon_F$ achieves its minimum value of $\varepsilon_F^{\rm min} = -4t$, the Fermi surface collapses to a point at $(k_x,k_y) = (0,0)$. For energies just above this minimum value, we can expand the dispersion in a power series, writing

$$ \varepsilon(k_x,k_y) = -4t + t a^2\big(k_x^2 + k_y^2\big) - \tfrac{1}{12}\,t a^4\big(k_x^4 + k_y^4\big) + \ldots\,. \qquad (4.224) $$
If we only work to quadratic order in $k_x$ and $k_y$, the dispersion is isotropic, and the Fermi surface is a circle, with $k_F^2 = (\varepsilon_F + 4t)/t a^2$. As the energy increases further, the continuous O(2) rotational invariance is broken down to the discrete group of rotations of the square, $C_{4v}$. The Fermi surfaces distort and eventually, at $\varepsilon_F = 0$, the Fermi surface is itself a square. As $\varepsilon_F$ increases further, the square turns back into a circle, but centered about the point $\big(\frac{\pi}{a},\frac{\pi}{a}\big)$. Note that everything is periodic in $k_x$ and $k_y$ modulo $\frac{2\pi}{a}$. The Fermi surfaces for this model are depicted in the upper right panel of fig. 4.12.
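The square shape at $\varepsilon_F = 0$ follows from the identity $\cos(\pi - \theta) = -\cos\theta$: along the line $k_x + k_y = \pi/a$ the two cosines in eqn. 4.223 cancel exactly. A quick check:

```python
import math

def eps(kx, ky, t=1.0, a=1.0):
    # square-lattice tight-binding dispersion, eqn. 4.223
    return -2 * t * math.cos(kx * a) - 2 * t * math.cos(ky * a)

# every point on |kx| + |ky| = pi/a lies on the eps_F = 0 Fermi surface
for kx in (0.1, 0.7, 1.3, 2.9):
    assert abs(eps(kx, math.pi - kx)) < 1e-12

# the band minimum eps_min = -4t sits at the zone center
assert abs(eps(0.0, 0.0) + 4.0) < 1e-12
print("half-filled Fermi surface is the square |kx| + |ky| = pi/a")
```

This perfect square at half filling (a "nested" Fermi surface) is special to the nearest-neighbor model; adding further-neighbor hoppings warps it.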
Fermi surfaces in three dimensions can be very interesting indeed, and of great importance in understanding the electronic properties of solids. Two examples are shown in the bottom panels of fig. 4.12. The electronic configuration of cesium (Cs) is [Xe] 6s¹. The 6s electrons hop from site to site on a body centered cubic (BCC) lattice, a generalization of the simple two-dimensional square lattice hopping model discussed above. The elementary unit cell in $k$ space, known as the first Brillouin zone, turns out to be a dodecahedron. In yttrium, the electronic structure is [Kr] 5s² 4d¹, and there are two electronic energy bands at the Fermi level, meaning two Fermi surfaces. Yttrium forms a hexagonal close packed (HCP) crystal structure, and its first Brillouin zone is shaped like a hexagonal pillbox.
4.7.3 Spin-split Fermi surfaces

Consider an electron gas in an external magnetic field $H$. The single particle Hamiltonian is then

$$ \hat{H} = \frac{p^2}{2m} + \mu_B H\,\sigma\,, \qquad (4.225) $$

where $\mu_B$ is the Bohr magneton,

$$ \mu_B = \frac{e\hbar}{2mc} = 5.788\times 10^{-9}\ {\rm eV/G} $$
$$ \mu_B/k_B = 6.717\times 10^{-5}\ {\rm K/G}\,, $$

where $m$ is the electron mass. What happens at $T = 0$ to a noninteracting electron gas in a magnetic field?
Figure 4.12: Fermi surfaces for two and three-dimensional structures. Upper left: free particles in two dimensions. Upper right: tight binding electrons on a square lattice. Lower left: Fermi surface for cesium, which is predominantly composed of electrons in the 6s orbital shell. Lower right: the Fermi surface of yttrium has two parts. One part (yellow) is predominantly due to 5s electrons, while the other (pink) is due to 4d electrons. (Source: www.phys.ufl.edu/fermisurface/)
Electrons of each spin polarization form their own Fermi surfaces. That is, there is an up spin Fermi surface, with Fermi wavevector $k_{F\uparrow}$, and a down spin Fermi surface, with Fermi wavevector $k_{F\downarrow}$. The individual Fermi energies, on the other hand, must be equal, hence

$$ \frac{\hbar^2 k_{F\uparrow}^2}{2m} + \mu_B H = \frac{\hbar^2 k_{F\downarrow}^2}{2m} - \mu_B H\,, \qquad (4.226) $$

which says

$$ k_{F\downarrow}^2 - k_{F\uparrow}^2 = \frac{2eH}{\hbar c}\,. \qquad (4.227) $$

The total density is

$$ n = \frac{k_{F\uparrow}^3}{6\pi^2} + \frac{k_{F\downarrow}^3}{6\pi^2} \quad\Longrightarrow\quad k_{F\uparrow}^3 + k_{F\downarrow}^3 = 6\pi^2 n\,. \qquad (4.228) $$
Clearly the down spin Fermi surface grows and the up spin Fermi surface shrinks with increasing $H$. Eventually, the minority spin Fermi surface vanishes altogether. This happens for the up spins when $k_{F\uparrow} = 0$. Solving for the critical field, we obtain

$$ H_c = \frac{\hbar c}{2e}\,\big(6\pi^2 n\big)^{2/3}\,. \qquad (4.229) $$
In real magnetic solids, like cobalt and nickel, the spin-split Fermi surfaces are not spheres, just like the case of the (spin degenerate) Fermi surfaces for Cs and Y shown in fig. 4.12.
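In SI units the condition $k_{F\uparrow} = 0$ reads $2\mu_B B_c = \hbar^2(6\pi^2 n)^{2/3}/2m$, and for metallic densities the required field is absurdly large, which is why full spin polarization of a simple metal is never achieved in the laboratory. A sketch with illustrative numbers:

```python
import math

hbar = 1.0546e-34   # J s
me   = 9.109e-31    # kg
muB  = 9.274e-24    # J / T (Bohr magneton, SI)

n = 8.5e28          # m^-3, a typical metallic electron density

kF_down = (6 * math.pi**2 * n)**(1.0 / 3.0)        # all electrons in one spin sea
Bc = hbar**2 * kF_down**2 / (2 * me) / (2 * muB)   # 2 muB Bc equals the down-spin Fermi energy
print(Bc)   # ~1e5 T, far beyond laboratory fields
```

The scale is set by $\varepsilon_F/\mu_B$, i.e. several eV against $\sim 10^{-4}\,$eV per tesla.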
4.7.4 The Sommerfeld expansion

In dealing with the ideal Fermi gas, we will repeatedly encounter integrals of the form

$$ \mathcal{I}(T,\mu) \equiv \int\limits_{-\infty}^{\infty}\! d\varepsilon\,f(\varepsilon - \mu)\,\phi(\varepsilon)\,. \qquad (4.230) $$

The Sommerfeld expansion provides a systematic way of expanding these expressions in powers of $T$ and is an important analytical tool in analyzing the low temperature properties of the ideal Fermi gas (IFG).

We start by defining

$$ \Phi(\varepsilon) \equiv \int\limits_{-\infty}^{\varepsilon}\! d\varepsilon'\,\phi(\varepsilon') \qquad (4.231) $$

so that $\phi(\varepsilon) = \Phi'(\varepsilon)$. We then have

$$ \mathcal{I} = \int\limits_{-\infty}^{\infty}\! d\varepsilon\,f(\varepsilon - \mu)\,\frac{d\Phi}{d\varepsilon} = -\int\limits_{-\infty}^{\infty}\! d\varepsilon\,f'(\varepsilon)\,\Phi(\mu + \varepsilon)\,, \qquad (4.232) $$

where we assume $\Phi(-\infty) = 0$. Next, we invoke Taylor's theorem, to write

$$ \Phi(\mu + \varepsilon) = \sum_{n=0}^{\infty} \frac{\varepsilon^n}{n!}\,\frac{d^n\Phi}{d\mu^n} = \exp\!\left(\varepsilon\,\frac{d}{d\mu}\right)\Phi(\mu)\,. \qquad (4.233) $$

This last expression involving the exponential of a differential operator may appear overly formal but it proves extremely useful. Since

$$ f'(\varepsilon) = -\frac{1}{k_B T}\,\frac{e^{\varepsilon/k_B T}}{\big(e^{\varepsilon/k_B T} + 1\big)^2}\,, \qquad (4.234) $$
Figure 4.13: Deformation of the complex integration contour in eqn. 4.237.
we can write

$$ \mathcal{I} = \int\limits_{-\infty}^{\infty}\! dv\,\frac{e^{vD}}{(e^v + 1)(e^{-v} + 1)}\,\Phi(\mu)\,, \qquad (4.235) $$

with $v = \varepsilon/k_B T$, where

$$ D \equiv k_B T\,\frac{d}{d\mu} \qquad (4.236) $$
is a dimensionless differential operator. The integral can now be done using the methods of complex integration:¹⁴

$$ \int\limits_{-\infty}^{\infty}\! dv\,\frac{e^{vD}}{(e^v + 1)(e^{-v} + 1)} = 2\pi i \sum_{n=0}^{\infty} {\rm Res}\!\left[\frac{e^{vD}}{(e^v + 1)(e^{-v} + 1)}\right]_{v = (2n+1)i\pi} $$
$$ = -2\pi i \sum_{n=0}^{\infty} D\,e^{(2n+1)i\pi D} = -\frac{2\pi i D\,e^{i\pi D}}{1 - e^{2\pi i D}} = \pi D\,\csc(\pi D) \qquad (4.237) $$
Thus,

$$ \mathcal{I}(T,\mu) = \pi D\,\csc(\pi D)\,\Phi(\mu)\,, \qquad (4.238) $$

which is to be understood as the differential operator $\pi D\,\csc(\pi D) = \pi D/\sin(\pi D)$ acting on the function $\Phi(\mu)$. Appealing once more to Taylor's theorem, we have

$$ \pi D\,\csc(\pi D) = 1 + \frac{\pi^2}{6}\,(k_B T)^2\,\frac{d^2}{d\mu^2} + \frac{7\pi^4}{360}\,(k_B T)^4\,\frac{d^4}{d\mu^4} + \ldots\,. \qquad (4.239) $$
¹⁴ Note that writing $v = (2n+1)\,i\pi + \epsilon$ we have $e^{\pm v} = -1 \mp \epsilon - \frac{1}{2}\epsilon^2 + \ldots$, so $(e^v + 1)(e^{-v} + 1) = -\epsilon^2 + \ldots$. We then expand $e^{vD} = e^{(2n+1)i\pi D}\big(1 + \epsilon D + \ldots\big)$ to find the residue: ${\rm Res} = -D\,e^{(2n+1)i\pi D}$.
Thus,

$$ \mathcal{I}(T,\mu) = \int\limits_{-\infty}^{\infty}\! d\varepsilon\,f(\varepsilon - \mu)\,\phi(\varepsilon) = \int\limits_{-\infty}^{\mu}\! d\varepsilon\,\phi(\varepsilon) + \frac{\pi^2}{6}\,(k_B T)^2\,\phi'(\mu) + \frac{7\pi^4}{360}\,(k_B T)^4\,\phi'''(\mu) + \ldots\,. \qquad (4.240) $$

If $\phi(\varepsilon)$ is a polynomial function of its argument, then each derivative effectively reduces the order of the polynomial by one degree, and the dimensionless parameter of the expansion is $(T/\mu)^2$. This procedure is known as the Sommerfeld expansion.
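It is instructive to test eqn. 4.240 against a brute-force evaluation of the integral for a non-polynomial $\phi$. The sketch below uses $\phi(\varepsilon) = \varepsilon^{1/2}\,\Theta(\varepsilon)$ (the shape of the ballistic density of states), with $k_B = 1$ and arbitrarily chosen $\mu$ and $T$:

```python
import math

T, mu = 0.05, 1.0    # units with kB = 1; T << mu so the expansion applies

def f(x):            # Fermi distribution, eqn. 4.218
    return 1.0 / (math.exp(x / T) + 1.0)

def phi(e):          # phi(eps) = sqrt(eps) for eps > 0, else 0
    return math.sqrt(e) if e > 0 else 0.0

# brute force: I(T, mu) = int d(eps) f(eps - mu) phi(eps)
de = 1e-4
I_num = sum(f(i * de - mu) * phi(i * de) * de for i in range(1, int(5.0 / de)))

# Sommerfeld expansion through (kB T)^2: int_0^mu phi + (pi^2/6) T^2 phi'(mu)
I_som = (2.0 / 3.0) * mu**1.5 + (math.pi**2 / 6) * T**2 * 0.5 / math.sqrt(mu)
print(I_num, I_som)   # agree to better than 1e-4
```

The residual discrepancy is dominated by the $(k_B T)^4$ term of eqn. 4.240, here of order $10^{-6}$.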
4.7.5 Chemical potential shift
As our first application of the Sommerfeld expansion formalism, let us compute $\mu(n,T)$ for the ideal Fermi gas. The number density $n(T,\mu)$ is

$$ n = \int\limits_{-\infty}^{\infty}\! d\varepsilon\,g(\varepsilon)\,f(\varepsilon - \mu) = \int\limits_{-\infty}^{\mu}\! d\varepsilon\,g(\varepsilon) + \frac{\pi^2}{6}\,(k_B T)^2\,g'(\mu) + \ldots\,. \qquad (4.241) $$

Let us write $\mu = \varepsilon_F + \delta\mu$, where $\varepsilon_F = \mu(T=0,n)$ is the Fermi energy, which is the chemical potential at $T = 0$. We then have

$$ n = \int\limits_{-\infty}^{\varepsilon_F + \delta\mu}\! d\varepsilon\,g(\varepsilon) + \frac{\pi^2}{6}\,(k_B T)^2\,g'(\varepsilon_F + \delta\mu) + \ldots = \int\limits_{-\infty}^{\varepsilon_F}\! d\varepsilon\,g(\varepsilon) + g(\varepsilon_F)\,\delta\mu + \frac{\pi^2}{6}\,(k_B T)^2\,g'(\varepsilon_F) + \ldots\,, \qquad (4.242) $$

from which we derive

$$ \delta\mu = -\frac{\pi^2}{6}\,(k_B T)^2\,\frac{g'(\varepsilon_F)}{g(\varepsilon_F)} + \mathcal{O}(T^4)\,. \qquad (4.243) $$
Note that $g'/g = (\ln g)'$. For a ballistic dispersion, assuming $g = 2$,

$$ g(\varepsilon) = 2 \int\!\frac{d^3\!k}{(2\pi)^3}\;\delta\!\left(\varepsilon - \frac{\hbar^2 k^2}{2m}\right) = \frac{m\,k(\varepsilon)}{\pi^2\hbar^2}\,\bigg|_{k(\varepsilon) = \frac{1}{\hbar}\sqrt{2m\varepsilon}}\,. \qquad (4.244) $$

Thus, $g(\varepsilon) \propto \varepsilon^{1/2}$ and $(\ln g)' = \frac{1}{2}\,\varepsilon^{-1}$, so

$$ \mu(n,T) = \varepsilon_F - \frac{\pi^2}{12}\,\frac{(k_B T)^2}{\varepsilon_F} + \ldots\,, \qquad (4.245) $$

where $\varepsilon_F(n) = \frac{\hbar^2}{2m}\,\big(3\pi^2 n\big)^{2/3}$.
4.7.6 Specific heat

The energy of the electron gas is

$$ \frac{E}{V} = \int\limits_{-\infty}^{\infty}\! d\varepsilon\,g(\varepsilon)\,\varepsilon\,f(\varepsilon - \mu) = \int\limits_{-\infty}^{\mu}\! d\varepsilon\,g(\varepsilon)\,\varepsilon + \frac{\pi^2}{6}\,(k_B T)^2\,\frac{d}{d\mu}\big[\mu\,g(\mu)\big] + \ldots $$
$$ = \int\limits_{-\infty}^{\varepsilon_F}\! d\varepsilon\,g(\varepsilon)\,\varepsilon + g(\varepsilon_F)\,\varepsilon_F\,\delta\mu + \frac{\pi^2}{6}\,(k_B T)^2\,\varepsilon_F\,g'(\varepsilon_F) + \frac{\pi^2}{6}\,(k_B T)^2\,g(\varepsilon_F) + \ldots $$
$$ = \varepsilon_0 + \frac{\pi^2}{6}\,(k_B T)^2\,g(\varepsilon_F) + \ldots\,, \qquad (4.246) $$

where

$$ \varepsilon_0 = \int\limits_{-\infty}^{\varepsilon_F}\! d\varepsilon\,g(\varepsilon)\,\varepsilon \qquad (4.247) $$
is the ground state energy density (i.e. ground state energy per unit volume). Thus,

$$ C_{V,N} = \left(\frac{\partial E}{\partial T}\right)_{\!V,N} = \frac{\pi^2}{3}\,V\,k_B^2\,T\,g(\varepsilon_F) \equiv V\,\gamma\,T\,, \qquad (4.248) $$

where

$$ \gamma = \frac{\pi^2}{3}\,k_B^2\,g(\varepsilon_F)\,. \qquad (4.249) $$
Note that the molar heat capacity is

$$ c_V = \frac{N_A}{N}\cdot C_V = \frac{\pi^2}{3}\,R\,\frac{k_B T\,g(\varepsilon_F)}{n} = \frac{\pi^2}{2}\left(\frac{k_B T}{\varepsilon_F}\right) R\,, \qquad (4.250) $$

where in the last expression on the RHS we have assumed a ballistic dispersion, for which

$$ \frac{g(\varepsilon_F)}{n} = \frac{g\,m\,k_F}{2\pi^2\hbar^2}\cdot\frac{6\pi^2}{g\,k_F^3} = \frac{3}{2\,\varepsilon_F}\,. \qquad (4.251) $$

The molar heat capacity in eqn. 4.250 is to be compared with the classical ideal gas value of $\frac{3}{2}R$. Relative to the classical ideal gas, the IFG value is reduced by a fraction of $(\pi^2/3)\,(k_B T/\varepsilon_F)$, which in most metals is very small and even at room temperature is only on the order of $10^{-2}$. Most of the heat capacity of metals at room temperature is due to the energy stored in lattice vibrations.
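As a concrete check of the smallness just claimed, one can evaluate the Sommerfeld coefficient $\gamma = c_V/T$ implied by eqn. 4.250 for a copper-like Fermi energy (an illustrative choice, not from the notes); the measured value for copper is somewhat larger, roughly $0.7$ in these units, the difference being attributed to band structure and interaction effects:

```python
import math

kB = 1.3807e-23   # J / K
R  = 8.314        # J / (mol K)
eV = 1.602e-19    # J

eF = 7.0 * eV     # copper-like Fermi energy (illustrative assumption)

# eqn. 4.250: c_V = (pi^2/2) (kB T / eF) R, hence gamma = c_V / T
gamma = (math.pi**2 / 2) * R * kB / eF
print(gamma * 1e3)   # ~0.5 mJ / (mol K^2)
```

At room temperature this gives $c_V^{\rm elec} \sim 0.15\,$J/mol K, indeed about $10^{-2}$ of $\frac{3}{2}R$.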
4.7.7 Magnetic susceptibility and Pauli paramagnetism

Magnetism has two origins: (i) orbital currents of charged particles, and (ii) intrinsic magnetic moment. The intrinsic magnetic moment $\mathbf{m}$ of a particle is related to its quantum mechanical spin via

$$ \mathbf{m} = g\mu_0\,\mathbf{S}/\hbar \;,\qquad \mu_0 = \frac{q\hbar}{2mc} = {\rm magneton}\,, \qquad (4.252) $$

where $g$ is the particle's $g$-factor, $\mu_0$ its magnetic moment, and $\mathbf{S}$ is the vector of quantum mechanical spin operators satisfying $\big[S^\alpha, S^\beta\big] = i\hbar\,\epsilon^{\alpha\beta\gamma}\,S^\gamma$, i.e. SU(2) commutation relations. The Hamiltonian for a single particle is then

$$ \hat{H} = \frac{1}{2m^*}\left(\mathbf{p} - \frac{q}{c}\,\mathbf{A}\right)^{\!2} - \mathbf{m}\cdot\mathbf{H} = \frac{1}{2m^*}\left(\mathbf{p} + \frac{e}{c}\,\mathbf{A}\right)^{\!2} + \frac{g}{2}\,\mu_B H\,\sigma\,, \qquad (4.253) $$

where in the last line we've restricted our attention to the electron, for which $q = -e$. The $g$-factor for an electron is $g = 2$ at tree level, and when radiative corrections are accounted for using quantum electrodynamics (QED) one finds $g = 2.0023193043617(15)$. For our purposes we can take $g = 2$, although we can always absorb the small difference into the definition of $\mu_B$, writing $\mu_B \to \tilde\mu_B = g e\hbar/4mc$. We've chosen the $\hat{z}$-axis in spin space to point in the direction of the magnetic field, and we wrote the eigenvalues of $S^z$ as $\frac{1}{2}\hbar\sigma$, where $\sigma = \pm 1$. The quantity $m^*$ is the effective mass of the electron, which we mentioned earlier. An important distinction is that it is $m^*$ which enters into the kinetic energy term $\mathbf{p}^2/2m^*$, but it is the electron mass $m$ itself ($m = 511\,$keV) which enters into the definition of the Bohr magneton. We shall discuss the consequences of this further below.

In the absence of orbital magnetic coupling, the single particle dispersion is

$$ \varepsilon_\sigma(k) = \frac{\hbar^2 k^2}{2m^*} + \sigma\,\mu_B H\,. \qquad (4.254) $$
At $T = 0$, we have the results of §4.7.3. At finite $T$, we once again use the Sommerfeld expansion. We then have

$$ n = \int\! d\varepsilon\,g_{\uparrow}(\varepsilon)\,f(\varepsilon - \mu) + \int\! d\varepsilon\,g_{\downarrow}(\varepsilon)\,f(\varepsilon - \mu) = \tfrac{1}{2}\int\! d\varepsilon\,\Big[ g(\varepsilon - \mu_B H) + g(\varepsilon + \mu_B H) \Big]\,f(\varepsilon - \mu) $$
$$ = \int\! d\varepsilon\,\Big[ g(\varepsilon) + \tfrac{1}{2}\,(\mu_B H)^2\,g''(\varepsilon) + \ldots \Big]\,f(\varepsilon - \mu)\,. \qquad (4.255) $$
Figure 4.14: Fermi distributions in the presence of an external Zeeman-coupled magnetic field.
We now invoke the Sommerfeld expansion to find the temperature dependence:

$$ n = \int\limits_{-\infty}^{\mu}\! d\varepsilon\,g(\varepsilon) + \frac{\pi^2}{6}\,(k_B T)^2\,g'(\mu) + \tfrac{1}{2}\,(\mu_B H)^2\,g'(\mu) + \ldots $$
$$ = \int\limits_{-\infty}^{\varepsilon_F}\! d\varepsilon\,g(\varepsilon) + g(\varepsilon_F)\,\delta\mu + \frac{\pi^2}{6}\,(k_B T)^2\,g'(\varepsilon_F) + \tfrac{1}{2}\,(\mu_B H)^2\,g'(\varepsilon_F) + \ldots\,. \qquad (4.256) $$
Note that the density of states for spin species $\sigma$ is

$$ g_\sigma(\varepsilon) = \tfrac{1}{2}\,g(\varepsilon - \sigma\,\mu_B H)\,, \qquad (4.257) $$

where $g(\varepsilon)$ is the total density of states per unit volume, for both spin species, in the absence of a magnetic field. We conclude that the chemical potential shift in an external field is

$$ \delta\mu(T,n,H) = -\left[\frac{\pi^2}{6}\,(k_B T)^2 + \tfrac{1}{2}\,(\mu_B H)^2\right] \frac{g'(\varepsilon_F)}{g(\varepsilon_F)} + \ldots\,. \qquad (4.258) $$
We next compute the difference n↑ − n↓ in the densities of up and down spin electrons:

n↑ − n↓ = ∫ dε [ g↑(ε) − g↓(ε) ] f(ε − μ)
  = ½ ∫ dε [ g(ε − μ_B H) − g(ε + μ_B H) ] f(ε − μ)
  = −μ_B H · πD csc(πD) g(μ) + O(H³) , (4.259)

with D = k_B T ∂/∂μ as in our earlier discussion of the Sommerfeld expansion. We needn't go beyond the trivial lowest order term in the Sommerfeld expansion, because H is already assumed to be small. Thus, the magnetization density is

M = −μ_B (n↑ − n↓) = μ_B² g(ε_F) H , (4.260)

and the magnetic susceptibility is

χ = (∂M/∂H)_{T,N} = μ_B² g(ε_F) . (4.261)

This is called the Pauli paramagnetic susceptibility.
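As a quick numerical sanity check on eqn. 4.260 (added here as an illustration, not part of the original notes), one can evaluate the spin densities by direct quadrature for the free-fermion density of states g(ε) = (3/2)√ε, in units where ε_F = μ = 1 and μ_B = 1, so the predicted susceptibility is χ = g(ε_F) = 3/2:

```python
import numpy as np

# Check of M = mu_B^2 g(eF) H (eqn 4.260) by direct integration.
# Units: eF = mu = 1, mu_B = 1, g(e) = (3/2) sqrt(e) so that n = 1 at T = 0.
T, H = 1e-3, 1e-3                    # both small compared to eF
e  = np.linspace(0.0, 3.0, 600001)
de = e[1] - e[0]
g  = 1.5 * np.sqrt(e)
f  = 1.0 / (np.exp(np.clip((e - 1.0) / T, -60.0, 60.0)) + 1.0)

def g_shift(delta):
    """g(e - delta), set to zero below the band bottom."""
    return np.interp(e - delta, e, g, left=0.0)

n_up = np.sum(0.5 * g_shift(+H) * f) * de   # sigma = +1 costs energy +mu_B H
n_dn = np.sum(0.5 * g_shift(-H) * f) * de

M   = -(n_up - n_dn)       # the moment of the sigma state is -mu_B sigma
chi = M / H
print(chi)                 # close to g(eF) = 1.5
```

The small residual deviation from 3/2 is the O(T²) Sommerfeld correction plus quadrature error.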
4.7.8 Landau diamagnetism

When orbital effects are included, the single particle energy levels are given by

ε(n, k_z, σ) = (n + ½) ℏω_c + ℏ²k_z²/2m* + σ μ_B H . (4.262)

Here n is a Landau level index, and ω_c = eH/m*c is the cyclotron frequency. Note that

μ_B H / ℏω_c = (gℏeH/4mc) · (m*c/ℏeH) = (g/4) · (m*/m) . (4.263)
Accordingly, we define the ratio r ≡ (g/2) × (m*/m). We can then write

ε(n, k_z, σ) = (n + ½ + ½σr) ℏω_c + ℏ²k_z²/2m* . (4.264)

The grand potential is then given by

Ω = −(HA/φ₀) · L_z k_B T ∫_{−∞}^{∞} (dk_z/2π) Σ_{n=0}^{∞} Σ_{σ=±1} ln[ 1 + e^{μ/k_BT} e^{−(n+½+½σr)ℏω_c/k_BT} e^{−ℏ²k_z²/2m*k_BT} ] . (4.265)
A few words are in order here regarding the prefactor. In the presence of a uniform magnetic field, the energy levels of a two-dimensional ballistic charged particle collapse into Landau levels. The number of states per Landau level scales with the area of the system, and is equal to the number of flux quanta through the system: N_φ = HA/φ₀, where φ₀ = hc/e is the Dirac flux quantum. Note that

(HA/φ₀) · L_z k_B T = ℏω_c · V/λ_T² , (4.266)

hence we can write

Ω(T, V, μ, H) = ℏω_c Σ_{n=0}^{∞} Σ_{σ=±1} Q( (n + ½ + ½σr) ℏω_c ) , (4.267)
where

Q(ε) = −(V/λ_T²) ∫_{−∞}^{∞} (dk_z/2π) ln[ 1 + e^{μ/k_BT} e^{−ε/k_BT} e^{−ℏ²k_z²/2m*k_BT} ] . (4.268)
We now invoke the Euler-MacLaurin formula,

Σ_{n=0}^{∞} F(n) = ∫_0^{∞} dx F(x) + ½ F(0) − (1/12) F′(0) + ... , (4.269)

resulting in

Ω = Σ_{σ=±1} { ∫_{½(1+σr)ℏω_c}^{∞} dε Q(ε) + ½ ℏω_c Q( ½(1+σr)ℏω_c )
  − (1/12)(ℏω_c)² Q′( ½(1+σr)ℏω_c ) + ... } . (4.270)
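The rapid convergence of the first few Euler-MacLaurin terms is easy to check numerically. The snippet below (an added illustration, not from the original text) applies eqn. 4.269 to F(x) = e^{−x/λ}, for which the sum is a geometric series known in closed form:

```python
import numpy as np

# Euler-MacLaurin check for F(x) = exp(-x/lam):
#   sum_{n>=0} F(n) = 1/(1 - e^{-1/lam})    (geometric series, exact)
#   integral_0^inf F dx = lam,  F(0)/2 = 1/2,  -F'(0)/12 = +1/(12*lam)
lam    = 5.0
exact  = 1.0 / (1.0 - np.exp(-1.0 / lam))
approx = lam + 0.5 + 1.0 / (12.0 * lam)
print(exact, approx)      # agree to about 1e-5 for lam = 5
```

The residual is set by the next Euler-MacLaurin term, of order 1/(720 λ³).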
We next expand in powers of the magnetic field H to obtain

Ω(T, V, μ, H) = 2 ∫_0^{∞} dε Q(ε) + ( 1/12 − ¼r² ) (ℏω_c)² Q′(0) + ... . (4.271)

Thus, the magnetic susceptibility is

χ = −(1/V) ∂²Ω/∂H² = ( r² − ⅓ ) · μ_B² · (m/m*)² · (2/V) Q′(0)
  = ( g²/4 − m²/3m*² ) · μ_B² · n² κ_T , (4.272)
where κ_T is the isothermal compressibility¹⁵. In most metals we have m* ≈ m and the term in brackets is positive (recall g ≈ 2). In semiconductors, however, we can have m* ≪ m; for example in GaAs we have m* = 0.067 m. Thus, semiconductors can have a diamagnetic response. If we take g = 2 and m* = m, we see that the orbital currents give rise to a diamagnetic contribution to the magnetic susceptibility which is exactly −⅓ times as large as the contribution arising from Zeeman coupling. The net result is then paramagnetic (χ > 0) and ⅔ as large as the Pauli susceptibility. The orbital currents can be understood within the context of Lenz's law.

¹⁵ We've used (2/V) Q′(0) = −(1/V) ∂²Ω/∂μ² = n² κ_T.

Exercise: Show that (2/V) Q′(0) = n² κ_T.
4.7.9 White dwarf stars

There is a nice discussion of this material in R. K. Pathria, Statistical Mechanics. As a model, consider a mass M ∼ 10³³ g of helium at nuclear densities of ρ ∼ 10⁷ g/cm³ and temperature T ∼ 10⁷ K. This temperature is much larger than the ionization energy of ⁴He, hence we may safely assume that all helium atoms are ionized. If there are N electrons, then the number of α particles (i.e. ⁴He nuclei) must be ½N. The mass of the α particle is m_α ≈ 4m_p. The total stellar mass M is almost completely due to α particle cores.

The electron density is then

n = N/V = (2 · M/4m_p)/V = ρ/2m_p ≈ 10³⁰ cm⁻³ , (4.273)

since M = N m_e + ½N · 4m_p ≈ 2N m_p. From the number density n we find for the electrons

k_F = (3π²n)^{1/3} = 2.14 × 10¹⁰ cm⁻¹ (4.274)
p_F = ℏ k_F = 2.26 × 10⁻¹⁷ g cm/s (4.275)
mc = (9.1 × 10⁻²⁸ g)(3 × 10¹⁰ cm/s) = 2.7 × 10⁻¹⁷ g cm/s . (4.276)

Since p_F ≈ mc, we conclude that the electrons are relativistic. The Fermi temperature will then be T_F ≈ mc² ≈ 10⁶ eV ≈ 10¹⁰ K. Thus, T ≪ T_F, which says that the electron gas is degenerate and may be considered to be at T ≈ 0. So we need to understand the ground state properties of the relativistic electron gas.
The kinetic energy is given by

ε(p) = √(p²c² + m²c⁴) − mc² . (4.277)

The velocity is

v = ∂ε/∂p = pc² / √(p²c² + m²c⁴) . (4.278)
The pressure in the ground state is

p₀ = ⅓ n ⟨p·v⟩
  = (1/3π²ℏ³) ∫_0^{p_F} dp p² · p²c² / √(p²c² + m²c⁴)
  = (m⁴c⁵/3π²ℏ³) ∫_0^{θ_F} dθ sinh⁴θ
  = (m⁴c⁵/96π²ℏ³) [ sinh(4θ_F) − 8 sinh(2θ_F) + 12 θ_F ] , (4.279)
where we use the substitution

p = mc sinhθ , v = c tanhθ ⟹ θ = ½ ln[ (c + v)/(c − v) ] . (4.280)

Note that p_F = ℏk_F = ℏ(3π²n)^{1/3}, and that

n = M/(2m_p V) ⟹ 3π²n = (9π/8) · M/(R³ m_p) . (4.281)
Now in equilibrium the pressure p is balanced by gravitational pressure. We have

dE₀ = −p₀ dV = −p₀(R) · 4πR² dR . (4.282)

This must be balanced by gravity:

dE_g = γ · (GM²/R²) dR , (4.283)

where γ depends on the radial mass distribution. Equilibrium then implies

p₀(R) = (γ/4π) · GM²/R⁴ . (4.284)

To find the relation R = R(M), we must solve

(γ/4π) · GM²/R⁴ = (m⁴c⁵/96π²ℏ³) [ sinh(4θ_F) − 8 sinh(2θ_F) + 12 θ_F ] . (4.285)
Note that

sinh(4θ_F) − 8 sinh(2θ_F) + 12 θ_F = { (96/15) θ_F⁵ as θ_F → 0 ; ½ e^{4θ_F} as θ_F → ∞ } . (4.286)
Figure 4.15: Mass-radius relationship for white dwarf stars. (Source: Wikipedia.)

Thus, we may write

p₀(R) = (γ/4π) · GM²/R⁴ = { (ℏ²/15π²m) · ( 9πM/8R³m_p )^{5/3} as θ_F → 0 ; (ℏc/12π²) · ( 9πM/8R³m_p )^{4/3} as θ_F → ∞ } . (4.287)
In the limit θ_F → 0, we solve for R(M) and find

R = (3/40) (9π)^{2/3} · ℏ² / ( γ G m_p^{5/3} m M^{1/3} ) ∝ M^{−1/3} . (4.288)
In the opposite limit θ_F → ∞, the R factors divide out and we obtain

M = M₀ = (9/64) (3π/γ³)^{1/2} (ℏc/G)^{3/2} · 1/m_p² . (4.289)
To find the R dependence, we must go beyond the lowest order expansion of eqn. 4.286, in which case we find

R = (9π/8)^{1/3} · (ℏ/mc) · (M/m_p)^{1/3} · [ 1 − (M/M₀)^{2/3} ]^{1/2} . (4.290)

The value M₀ is the limiting mass for a white dwarf. It is called the Chandrasekhar limit.
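Plugging in CGS values gives a feel for the scale of M₀. The snippet below (an added illustration; the O(1) shape factor γ, which depends on the radial mass profile and is left unspecified in the text, is set to 1 here) evaluates eqn. 4.289:

```python
import numpy as np

# Chandrasekhar mass, eqn 4.289, in CGS units, taking the shape factor gamma = 1.
hbar, c, G, m_p = 1.0546e-27, 2.998e10, 6.674e-8, 1.673e-24
gamma = 1.0
M0 = (9.0/64.0) * np.sqrt(3.0*np.pi/gamma**3) * (hbar*c/G)**1.5 / m_p**2
M_sun = 1.989e33
print(M0, M0/M_sun)   # of order 1e33 g, i.e. of order one solar mass
```

With γ = 1 this gives roughly 0.8 M_sun; the observed Chandrasekhar limit of about 1.4 M_sun corresponds to a different value of γ.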
Chapter 5

Interacting Systems

5.1 References

• M. Kardar, Statistical Physics of Particles (Cambridge, 2007)
A superb modern text, with many insightful presentations of key concepts.

• L. E. Reichl, A Modern Course in Statistical Physics (2nd edition, Wiley, 1998)
A comprehensive graduate level text with an emphasis on nonequilibrium phenomena.

• M. Plischke and B. Bergersen, Equilibrium Statistical Physics (3rd edition, World Scientific, 2006)
An excellent graduate level text. Less insightful than Kardar but still a good modern treatment of the subject. Good discussion of mean field theory.

• E. M. Lifshitz and L. P. Pitaevskii, Statistical Physics (part I, 3rd edition, Pergamon, 1980)
This is volume 5 in the famous Landau and Lifshitz Course of Theoretical Physics. Though dated, it still contains a wealth of information and physical insight.

• J.-P. Hansen and I. R. McDonald, Theory of Simple Liquids (Academic Press, 1990)
An advanced, detailed discussion of liquid state physics.
5.2 Ising Model

5.2.1 Definition

The simplest model of an interacting system consists of a lattice L of sites, each of which contains a spin σ_i which may be either up (σ_i = +1) or down (σ_i = −1). The Hamiltonian is

Ĥ = −J Σ_{⟨ij⟩} σ_i σ_j − μ₀H Σ_i σ_i . (5.1)

When J > 0, the preferred (i.e. lowest energy) configuration of neighboring spins is that they are aligned, i.e. σ_i σ_j = +1. The interaction is then called ferromagnetic. When J < 0 the preference is for anti-alignment, i.e. σ_i σ_j = −1, which is antiferromagnetic.

This model is not exactly solvable in general. In one dimension, the solution is quite straightforward. In two dimensions, Onsager's solution of the model (with H = 0) is among the most celebrated results in statistical physics. In higher dimensions the system has been studied by numerical simulations (the Monte Carlo method) and by field theoretic calculations (renormalization group), but no exact solutions exist.
5.2.2 Ising model in one dimension

Consider a one-dimensional ring of N sites. The ordinary canonical partition function is then

Z_ring = Tr e^{−βĤ} = Σ_{{σ_n}} Π_{n=1}^{N} e^{βJσ_nσ_{n+1}} e^{βμ₀Hσ_n} = Tr( R^N ) , (5.2)

where σ_{N+1} ≡ σ₁ owing to periodic (ring) boundary conditions, and where R is a 2×2 transfer matrix,

R_{σσ′} = e^{βJσσ′} e^{βμ₀H(σ+σ′)/2} (5.3)
  = ( e^{βJ} e^{βμ₀H} , e^{−βJ} ; e^{−βJ} , e^{βJ} e^{−βμ₀H} ) (5.4)
  = e^{βJ} cosh(βμ₀H) + e^{βJ} sinh(βμ₀H) τ^z + e^{−βJ} τ^x , (5.5)

where τ^α are the Pauli matrices. Since the trace of a matrix is invariant under a similarity transformation, we have
Z(T, H, N) = λ₊^N + λ₋^N , (5.6)

where

λ_±(T, H) = e^{βJ} cosh(βμ₀H) ± √( e^{2βJ} sinh²(βμ₀H) + e^{−2βJ} ) (5.7)

are the eigenvalues of R. In the thermodynamic limit, N → ∞, and the λ₊^N term dominates exponentially. We therefore have

F(T, H, N) = −N k_B T ln λ₊(T, H) . (5.8)
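Transfer matrix results like eqns. 5.6 and 5.7 are easy to validate against direct enumeration for small N. The check below is an added illustration (the parameter values are arbitrary):

```python
import numpy as np
from itertools import product

# Compare Z = lam_+^N + lam_-^N (eqns 5.6, 5.7) with brute-force enumeration.
J, mu0H, beta, N = 1.0, 0.3, 0.7, 6

Z_brute = 0.0
for s in product((1, -1), repeat=N):
    E = -J*sum(s[n]*s[(n+1) % N] for n in range(N)) - mu0H*sum(s)
    Z_brute += np.exp(-beta*E)

bJ, bH = beta*J, beta*mu0H
disc  = np.sqrt(np.exp(2*bJ)*np.sinh(bH)**2 + np.exp(-2*bJ))
lam_p = np.exp(bJ)*np.cosh(bH) + disc
lam_m = np.exp(bJ)*np.cosh(bH) - disc
Z_tm  = lam_p**N + lam_m**N
print(Z_brute, Z_tm)   # identical up to roundoff
```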
From the free energy, we can compute the magnetization,

M = −(∂F/∂H)_{T,N} = N μ₀ sinh(βμ₀H) / √( sinh²(βμ₀H) + e^{−4βJ} ) (5.9)

and the zero field isothermal susceptibility,

χ(T) = (1/N) ∂M/∂H |_{H=0} = (μ₀²/k_B T) e^{2J/k_BT} . (5.10)

Note that in the noninteracting limit J → 0 we recover the familiar result for a free spin. The effect of the interactions at low temperature is to vastly increase the susceptibility. Rather than a set of independent single spins, the system effectively behaves as if it were composed of large blocks of spins, where the block size is the correlation length, to be derived below.
The physical properties of the system are often elucidated by evaluation of various correlation functions. In this case, we define

C(n) ≡ ⟨σ₁ σ_{n+1}⟩ = Tr( σ₁ R_{σ₁σ₂} ··· σ_{n+1} R_{σ_{n+1}σ_{n+2}} ··· R_{σ_Nσ₁} ) / Tr( R^N )
  = Tr( Σ R^n Σ R^{N−n} ) / Tr( R^N ) , (5.11)

where 0 < n < N, and where

Σ = ( 1 , 0 ; 0 , −1 ) . (5.12)

To compute this ratio, we decompose R in terms of its eigenvectors, writing

R = λ₊ |+⟩⟨+| + λ₋ |−⟩⟨−| . (5.13)

Then

C(n) = [ λ₊^N Σ₊₊² + λ₋^N Σ₋₋² + ( λ₊^{N−n} λ₋^{n} + λ₊^{n} λ₋^{N−n} ) Σ₊₋ Σ₋₊ ] / ( λ₊^N + λ₋^N ) , (5.14)

where

Σ_{μμ′} = ⟨μ| Σ |μ′⟩ . (5.15)
5.2.3 H = 0

Consider the case H = 0, where R = e^{βJ} + e^{−βJ} τ^x, where τ^x is the Pauli matrix. Then

Σ |±⟩ = (1/√2) ( |↑⟩ ∓ |↓⟩ ) = |∓⟩ , (5.16)

i.e. the eigenvectors of R are

ψ_± = (1/√2) (1 , ±1)ᵀ , (5.17)

and Σ₊₊ = Σ₋₋ = 0, while Σ₊₋ = Σ₋₊ = 1. The corresponding eigenvalues are

λ₊ = 2 cosh(βJ) , λ₋ = 2 sinh(βJ) . (5.18)

The correlation function is then found to be

C(n) ≡ ⟨σ₁ σ_{n+1}⟩ = [ λ₊^{N−|n|} λ₋^{|n|} + λ₊^{|n|} λ₋^{N−|n|} ] / ( λ₊^N + λ₋^N )
  = [ tanh^{|n|}(βJ) + tanh^{N−|n|}(βJ) ] / [ 1 + tanh^N(βJ) ] (5.19)
  ≈ tanh^{|n|}(βJ) (N → ∞) . (5.20)

This result is also valid for n < 0, provided |n| ≤ N. We see that we may write

C(n) = e^{−|n|/ξ(T)} , (5.21)

where the correlation length is

ξ(T) = 1 / ln ctnh(J/k_BT) . (5.22)

Note that ξ(T) grows as T → 0, as ξ ≈ ½ e^{2J/k_BT}.
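Eqn. 5.19 can likewise be checked by enumeration on a small ring; the snippet below (an added illustration) computes ⟨σ₁σ₄⟩ both ways:

```python
import numpy as np
from itertools import product

# Check C(n) of eqn 5.19 at H = 0, for a ring with N = 8 and separation n = 3.
J, beta, N, n = 1.0, 0.5, 8, 3
num = den = 0.0
for s in product((1, -1), repeat=N):
    w = np.exp(beta*J*sum(s[i]*s[(i+1) % N] for i in range(N)))
    num += s[0]*s[n]*w
    den += w
C_brute = num/den

t = np.tanh(beta*J)
C_tm = (t**n + t**(N - n)) / (1.0 + t**N)
print(C_brute, C_tm)   # agree
```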
5.2.4 Chain with free ends

When the chain has free ends, there are N−1 links, and the partition function is

Z_chain = Σ_{σσ′} ⟨σ| R^{N−1} |σ′⟩ (5.23)
  = Σ_{σσ′} [ λ₊^{N−1} ψ₊(σ) ψ₊(σ′) + λ₋^{N−1} ψ₋(σ) ψ₋(σ′) ] , (5.24)

where ψ_±(σ) = ⟨σ|±⟩. When H = 0, we make use of eqn. 5.17 to obtain

R^{N−1} = ½ ( 1 , 1 ; 1 , 1 ) ( 2 cosh βJ )^{N−1} + ½ ( 1 , −1 ; −1 , 1 ) ( 2 sinh βJ )^{N−1} , (5.25)

and therefore

Z_chain = 2^N cosh^{N−1}(βJ) . (5.26)
There's a nifty trick to obtaining the partition function for the Ising chain which amounts to a change of variables. We define

ν_n ≡ σ_n σ_{n+1} (n = 1 , ... , N−1) . (5.27)

Thus, ν₁ = σ₁σ₂, ν₂ = σ₂σ₃, etc. Note that each ν_j takes the values ±1. The Hamiltonian for the chain is

Ĥ_chain = −J Σ_{n=1}^{N−1} σ_n σ_{n+1} = −J Σ_{n=1}^{N−1} ν_n . (5.28)

The state of the system is defined by the N Ising variables {σ₁ , ν₁ , ... , ν_{N−1}}. Note that σ₁ doesn't appear in the Hamiltonian. Thus, the interacting model is recast as N−1 noninteracting Ising spins, and the partition function is

Z_chain = Tr e^{−βĤ_chain} = Σ_{σ₁} Σ_{ν₁} ··· Σ_{ν_{N−1}} e^{βJν₁} e^{βJν₂} ··· e^{βJν_{N−1}}
  = 2 ( Σ_ν e^{βJν} )^{N−1} = 2^N cosh^{N−1}(βJ) . (5.29)
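The free-end result of eqns. 5.26 and 5.29 can also be verified by direct summation; the following check is an added illustration:

```python
import numpy as np
from itertools import product

# Verify Z_chain = 2^N cosh^{N-1}(beta J) (eqns 5.26, 5.29) for open boundaries.
J, beta, N = 1.3, 0.4, 7
Z_brute = 0.0
for s in product((1, -1), repeat=N):
    E = -J*sum(s[i]*s[i+1] for i in range(N - 1))   # N-1 links, free ends
    Z_brute += np.exp(-beta*E)
Z_exact = 2.0**N * np.cosh(beta*J)**(N - 1)
print(Z_brute, Z_exact)   # agree
```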
5.3 Potts Model

5.3.1 Definition

The Potts model is defined by the Hamiltonian

Ĥ = −J Σ_{⟨ij⟩} δ_{σ_i,σ_j} − h Σ_i δ_{σ_i,1} . (5.30)

Here, the spin variables σ_i take values in the set {1, 2, ..., q} on each site. The equivalent of an external magnetic field in the Ising case is a field h which prefers a particular value of σ (σ = 1 in the above Hamiltonian). Once again, it is not possible to compute the partition function on general lattices, however in one dimension we may once again find Z using the transfer matrix method.
5.3.2 Transfer matrix

On a ring of N sites, we have

Z = Tr e^{−βĤ} = Σ_{{σ_n}} e^{βhδ_{σ₁,1}} e^{βJδ_{σ₁,σ₂}} ··· e^{βhδ_{σ_N,1}} e^{βJδ_{σ_N,σ₁}} (5.31)
  = Tr( R^N ) , (5.32)

where the q×q transfer matrix R is given by

R_{σσ′} = e^{βJδ_{σσ′}} e^{½βhδ_{σ,1}} e^{½βhδ_{σ′,1}} =
  { e^{β(J+h)} if σ = σ′ = 1
  { e^{βJ} if σ = σ′ ≠ 1
  { e^{βh/2} if σ = 1 and σ′ ≠ 1
  { e^{βh/2} if σ ≠ 1 and σ′ = 1
  { 1 if σ ≠ 1 and σ′ ≠ 1 and σ ≠ σ′ .
(5.33)
In matrix form,

R = ( e^{β(J+h)} e^{βh/2} e^{βh/2} ··· e^{βh/2}
      e^{βh/2}   e^{βJ}   1        ··· 1
      e^{βh/2}   1        e^{βJ}   ··· 1
      ⋮          ⋮        ⋮        ⋱   ⋮
      e^{βh/2}   1        1        ··· e^{βJ} ) . (5.34)

The matrix R has q eigenvalues λ_j, with j = 1, ..., q. The partition function for the Potts chain is then

Z = Σ_{j=1}^{q} λ_j^N . (5.35)
We can actually find the eigenvalues of R analytically. To this end, consider the vectors

|φ⟩ = (1, 0, ..., 0)ᵀ , |ψ⟩ = ( q − 1 + e^{βh} )^{−1/2} ( e^{βh/2}, 1, ..., 1 )ᵀ . (5.36)

Then R may be written as

R = ( e^{βJ} − 1 ) I + ( q − 1 + e^{βh} ) |ψ⟩⟨ψ| + ( e^{βJ} − 1 )( e^{βh} − 1 ) |φ⟩⟨φ| , (5.37)

where I is the q×q identity matrix. When h = 0, we have a simpler form,

R = ( e^{βJ} − 1 ) I + q |ψ⟩⟨ψ| . (5.38)
From this we can read off the eigenvalues:

λ₁ = e^{βJ} + q − 1 (5.39)
λ_j = e^{βJ} − 1 , j ∈ {2, ..., q} , (5.40)

since |ψ⟩ is an eigenvector with eigenvalue λ = e^{βJ} + q − 1, and any vector orthogonal to |ψ⟩ has eigenvalue λ = e^{βJ} − 1. The partition function is then

Z = ( e^{βJ} + q − 1 )^N + (q − 1)( e^{βJ} − 1 )^N . (5.41)

In the thermodynamic limit N → ∞, only the λ₁ eigenvalue contributes, and we have

F(T, N, h = 0) = −N k_B T ln( e^{J/k_BT} + q − 1 ) for N → ∞ . (5.42)
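As with the Ising ring, the eigenvalue result of eqn. 5.41 can be tested by direct summation over states for small q and N (an added check; the parameters are arbitrary):

```python
import numpy as np
from itertools import product

# Verify Z = (e^{beta J} + q - 1)^N + (q - 1)(e^{beta J} - 1)^N (eqn 5.41), h = 0.
J, beta, q, N = 1.0, 0.6, 3, 5
Z_brute = 0.0
for s in product(range(1, q + 1), repeat=N):
    E = -J*sum(1.0 if s[i] == s[(i+1) % N] else 0.0 for i in range(N))
    Z_brute += np.exp(-beta*E)

x = np.exp(beta*J)
Z_exact = (x + q - 1)**N + (q - 1)*(x - 1)**N
print(Z_brute, Z_exact)   # agree
```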
When h is nonzero, the calculation becomes somewhat more tedious, but still relatively easy. The problem is that |φ⟩ and |ψ⟩ are not orthogonal, so we define

|χ⟩ = ( |φ⟩ − |ψ⟩⟨ψ|φ⟩ ) / √( 1 − ⟨φ|ψ⟩² ) , (5.43)

where

x ≡ ⟨φ|ψ⟩ = ( e^{βh} / ( q − 1 + e^{βh} ) )^{1/2} . (5.44)

Now we have ⟨χ|ψ⟩ = 0, with ⟨χ|χ⟩ = 1 and ⟨ψ|ψ⟩ = 1, with

|φ⟩ = √(1 − x²) |χ⟩ + x |ψ⟩ , (5.45)

and the transfer matrix is then

R = ( e^{βJ} − 1 ) I + ( q − 1 + e^{βh} ) |ψ⟩⟨ψ|
  + ( e^{βJ} − 1 )( e^{βh} − 1 ) [ (1 − x²) |χ⟩⟨χ| + x² |ψ⟩⟨ψ| + x √(1 − x²) ( |χ⟩⟨ψ| + |ψ⟩⟨χ| ) ]
  = ( e^{βJ} − 1 ) I + [ ( q − 1 + e^{βh} ) + ( e^{βJ} − 1 )( e^{βh} − 1 ) · e^{βh}/( q − 1 + e^{βh} ) ] |ψ⟩⟨ψ| (5.46)
  + ( e^{βJ} − 1 )( e^{βh} − 1 ) · ( q − 1 )/( q − 1 + e^{βh} ) · |χ⟩⟨χ|
  + ( e^{βJ} − 1 )( e^{βh} − 1 ) · [ (q − 1) e^{βh} ]^{1/2}/( q − 1 + e^{βh} ) · ( |χ⟩⟨ψ| + |ψ⟩⟨χ| ) ,

which in the two-dimensional subspace spanned by |χ⟩ and |ψ⟩ is of the form

R = ( a , c ; c , b ) . (5.47)
Recall that for any 2×2 Hermitian matrix,

M = a₀ I + a·τ = ( a₀ + a₃ , a₁ − ia₂ ; a₁ + ia₂ , a₀ − a₃ ) , (5.48)

the characteristic polynomial is

P(λ) = det( λI − M ) = ( λ − a₀ )² − a₁² − a₂² − a₃² , (5.49)

and hence the eigenvalues are

λ_± = a₀ ± √( a₁² + a₂² + a₃² ) . (5.50)

For the transfer matrix of eqn. 5.46, we obtain, after a little work,

λ_{1,2} = e^{βJ} − 1 + ½ [ q − 1 + e^{βh} + ( e^{βJ} − 1 )( e^{βh} − 1 ) ] (5.51)
  ± ½ √{ [ q − 1 + e^{βh} + ( e^{βJ} − 1 )( e^{βh} − 1 ) ]² − 4(q − 1)( e^{βJ} − 1 )( e^{βh} − 1 ) } .

There are q − 2 other eigenvalues, however, associated with the (q−2)-dimensional subspace orthogonal to |χ⟩ and |ψ⟩. Clearly all these eigenvalues are given by

λ_j = e^{βJ} − 1 , j ∈ {3 , ... , q} . (5.52)
The partition function is then

Z = λ₁^N + λ₂^N + (q − 2) λ₃^N , (5.53)

and in the thermodynamic limit N → ∞ the maximum eigenvalue λ₁ dominates. Note that we recover the correct limit as h → 0.
5.4 Weakly Nonideal Gases

Consider the ordinary canonical partition function for a nonideal system of identical point particles:

Z(T, V, N) = (1/N!) ∫ Π_{i=1}^{N} ( d^d p_i d^d x_i / h^d ) e^{−Ĥ/k_BT} (5.54)
  = ( λ_T^{−Nd} / N! ) ∫ Π_{i=1}^{N} d^d x_i exp[ −(1/k_BT) Σ_{i<j} u(|x_i − x_j|) ] . (5.55)
Here, we have assumed a many body Hamiltonian of the form

Ĥ = Σ_{i=1}^{N} p_i²/2m + Σ_{i<j} u( |x_i − x_j| ) , (5.56)

in which massive nonrelativistic particles interact via a two-body central potential. As before, λ_T = √(2πℏ²/mk_BT) is the thermal wavelength. Consider the function e^{−βu_ij}, where u_ij ≡ u(|x_i − x_j|). We assume that at very short distances there is a strong repulsion between particles, i.e. u_ij → ∞ as r_ij = |x_i − x_j| → 0, and that u_ij → 0 as r_ij → ∞. Thus, e^{−βu_ij} vanishes as r_ij → 0 and approaches unity as r_ij → ∞. For our purposes, it will prove useful to define the function

f(r) = e^{−βu(r)} − 1 , (5.57)

called the Mayer function after Josef Mayer. We can now write

Z(T, V, N) = λ_T^{−Nd} Q_N(T, V) , (5.58)
where the configuration integral Q_N(T, V) is given by

Q_N(T, V) = (1/N!) ∫ d^d x₁ ··· ∫ d^d x_N Π_{i<j} ( 1 + f_ij ) . (5.59)

A typical potential we might consider is the semi-phenomenological Lennard-Jones potential,

u(r) = 4ε [ (σ/r)^{12} − (σ/r)^6 ] . (5.60)

This accounts for a long-distance attraction due to mutually induced electric dipole fluctuations, and a strong short-ranged repulsion, phenomenologically modelled with a r^{−12} potential, which mimics a hard core due to overlap of the atomic electron distributions. Setting u′(r) = 0 we obtain r* = 2^{1/6} σ ≈ 1.12246 σ at the minimum, where u(r*) = −ε. In contrast to the Boltzmann weight e^{−βu(r)}, the Mayer function f(r) vanishes as r → ∞, behaving as f(r) ≈ −βu(r). The Mayer function also depends on temperature. Sketches of u(r) and f(r) for the Lennard-Jones model are shown in fig. 5.1.
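For a concrete feel for how the Mayer function enters, one can integrate it numerically for the Lennard-Jones potential. The snippet below is an added illustration in reduced units (ε = σ = k_B = 1); it anticipates the result B₂(T) = −½ ∫d³r f(r) derived in §5.4.3, and exhibits the sign change of B₂ between low and high temperature:

```python
import numpy as np

# Second virial coefficient B2(T) = -2*pi * int_0^inf r^2 f(r) dr for the
# Lennard-Jones potential, in reduced units (eps = sigma = k_B = 1).
def B2_LJ(T, rmax=12.0, npts=240000):
    r = np.linspace(1e-4, rmax, npts)
    u = 4.0*(r**-12 - r**-6)
    f = np.exp(-np.clip(u/T, -700.0, 700.0)) - 1.0   # Mayer function
    return -2.0*np.pi*np.sum(r*r*f)*(r[1] - r[0])

print(B2_LJ(1.0), B2_LJ(5.0))   # negative at low T, positive at high T
```

The temperature at which B₂ vanishes is the Boyle temperature, about 3.4 ε/k_B for Lennard-Jones.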
5.4.1 Mayer cluster expansion

We may expand the product in eqn. 5.59 as

Π_{i<j} ( 1 + f_ij ) = 1 + Σ_{i<j} f_ij + Σ_{i<j , k<l , (ij)≠(kl)} f_ij f_kl + ... . (5.61)

As there are ½N(N−1) possible pairings, there are 2^{N(N−1)/2} terms in the expansion of the above product. Each such term may be represented by a graph, as shown in fig. 5.2. For each such term, we draw a connection between dots representing different particles i and j if the factor f_ij appears in the term under consideration. The contribution for any given graph may be written as a product over contributions from each of its disconnected component clusters. For example, in the case of the term in fig. 5.2, the contribution to the configurational integral would be

Q = (1/N!) ∫ d^d x₁ d^d x₄ d^d x₇ d^d x₉ f_{1,4} f_{4,7} f_{4,9} f_{7,9}
  × ∫ d^d x₂ d^d x₅ d^d x₆ f_{2,5} f_{2,6} × ∫ d^d x₃ d^d x_{10} f_{3,10} × ∫ d^d x₈ d^d x_{11} f_{8,11} . (5.62)
Figure 5.1: Bottom panel: Lennard-Jones potential u(r) = 4ε ( x^{−12} − x^{−6} ), with x = r/σ and ε = 1. Note the weak attractive tail and the strong repulsive core. Top panel: Mayer function f(r, T) = e^{−u(r)/k_BT} − 1 for k_BT = 0.8 (blue), k_BT = 1.5 (green), and k_BT = 5 (red).
We will refer to a given product of Mayer functions which arises from this expansion as a term.

Figure 5.2: Diagrammatic interpretation of a term involving a product of eight Mayer functions.

The particular labels we assign to each vertex of a given graph don't affect the overall value of the graph. Now a given unlabeled graph consists of a certain number of connected subgraphs. For a system with N particles, we may then write

N = Σ_γ m_γ n_γ , (5.63)

where γ ranges over all possible connected subgraphs, and

m_γ = number of connected subgraphs of type γ in the unlabeled graph
n_γ = number of vertices in the connected subgraph γ .

Note that the single vertex counts as a connected subgraph, with n_γ = 1. We now ask: how many ways are there of assigning the N labels to the N vertices of a given unlabeled graph? One might first think the answer is simply N!, however this is too big, because different assignments of the labels to the vertices may not result in a distinct graph. To see this, consider the examples in fig. 5.3. In the first example, an unlabeled graph with four vertices consists of two identical connected subgraphs. Given any assignment of labels to the vertices, then, we can simply exchange the two subgraphs and get the same term. So we should divide N! by the product Π_γ m_γ!. But even this is not enough, because within each connected subgraph γ there may be permutations which leave the integrand unchanged, as shown in the second and third examples in fig. 5.3. We define the symmetry factor s_γ as the number of permutations of the labels which leaves a given connected subgraph γ invariant. Examples of symmetry factors are shown in fig. 5.4. Consider, for example, the third subgraph in the top row. Clearly one can rotate the figure about its horizontal symmetry axis to obtain a new labeling which represents the same term. This twofold axis is the only symmetry the diagram possesses, hence s_γ = 2. For the first diagram in the second row, one can rotate either of the triangles about the horizontal symmetry axis. One can also rotate the figure in the plane by 180° so as to exchange the two triangles. Thus, there are 2 × 2 × 2 = 8 symmetry operations which result in the same term, and s_γ = 8. Finally, the last subgraph in the second row consists of five vertices each of which is connected to the other four. Therefore any permutation of the labels results in the same term, and s_γ = 5! = 120. In addition to dividing by the product Π_γ m_γ!, we must then also divide by Π_γ s_γ^{m_γ}.
We can now write the partition function as

Z = ( λ_T^{−Nd} / N! ) Σ_{{m_γ}} N! Π_γ [ 1 / ( m_γ! s_γ^{m_γ} ) ] ( ∫ d^d x₁ ··· d^d x_{n_γ} Π_{i<j}^{γ} f_ij )^{m_γ} δ_{N , Σ_γ m_γ n_γ} , (5.64)

where the product Π_{i<j}^{γ} is over all links in the subgraph γ. The final Kronecker delta enforces the constraint N = Σ_γ m_γ n_γ. We next define the cluster integral b_γ as

b_γ(T) ≡ (1/s_γ) (1/V) ∫ d^d x₁ ··· d^d x_{n_γ} Π_{i<j}^{γ} f_ij . (5.65)
Since f_ij = f( |x_i − x_j| ), the product Π_{i<j}^{γ} f_ij is invariant under simultaneous translation of all the coordinate vectors by any constant vector, and hence the integral over the n_γ position variables contains exactly one factor of the volume, which cancels with the prefactor in the above definition of b_γ. Thus, each cluster integral is intensive, scaling as V⁰.¹

¹ We assume that the long-ranged behavior of f(r) ≈ −βu(r) is integrable.

Figure 5.3: Different assignations of labels to vertices may not result in a distinct term in the expansion of the configuration integral.

If we compute the grand partition function, then the fixed N constraint is relaxed, and we can do the sums:
Ξ = e^{−βΩ} = Σ_{{m_γ}} ( e^{βμ} λ_T^{−d} )^{Σ_γ m_γ n_γ} Π_γ (1/m_γ!) ( V b_γ )^{m_γ}
  = Π_γ Σ_{m_γ=0}^{∞} (1/m_γ!) [ ( e^{βμ} λ_T^{−d} )^{n_γ} V b_γ ]^{m_γ}
  = exp[ V Σ_γ ( e^{βμ} λ_T^{−d} )^{n_γ} b_γ ] . (5.66)
Thus,

Ω(T, V, μ) = −V k_B T Σ_γ ( e^{βμ} λ_T^{−d} )^{n_γ} b_γ(T) , (5.67)

and we can write

p = k_B T Σ_γ ( z λ_T^{−d} )^{n_γ} b_γ(T) (5.68)
n = Σ_γ n_γ ( z λ_T^{−d} )^{n_γ} b_γ(T) , (5.69)

where z = exp(βμ) is the fugacity, and where b_• ≡ 1 for the trivial single-site cluster. As we did in the case of ideal quantum gas statistical mechanics, we can systematically invert the relation n = n(z, T) to obtain z = z(n, T), and then insert this into the equation for p(z, T) to obtain the equation of state p = p(n, T). This yields the virial expansion of the equation of state,

p = n k_B T [ 1 + B₂(T) n + B₃(T) n² + ... ] . (5.70)
5.4.2 Cookbook recipe

Just follow these simple steps!

• The pressure and number density are written as an expansion over unlabeled connected clusters γ, viz.

p = k_B T Σ_γ ( z λ_T^{−d} )^{n_γ} b_γ
n = Σ_γ n_γ ( z λ_T^{−d} )^{n_γ} b_γ .

• For each term in each of these sums, draw the unlabeled connected cluster γ.

• Assign labels 1 , 2 , ... , n_γ to the vertices, where n_γ is the total number of vertices in the cluster γ. It doesn't matter how you assign the labels.

• Write down the product Π_{i<j}^{γ} f_ij. The factor f_ij appears in the product if there is a link in your (now labeled) cluster between sites i and j.

• The symmetry factor s_γ is the number of elements of the symmetric group S_{n_γ} which leave the product Π_{i<j}^{γ} f_ij invariant. The identity permutation always leaves the product invariant, so s_γ ≥ 1.

• The cluster integral is

b_γ(T) ≡ (1/s_γ) (1/V) ∫ d^d x₁ ··· d^d x_{n_γ} Π_{i<j}^{γ} f_ij .

Due to translation invariance, b_γ(T) ∝ V⁰. One can therefore set x₁ ≡ 0, eliminate the volume factor, and perform the integral over the remaining n_γ − 1 coordinates.

• This procedure generates expansions for p(T, z) and n(T, z) in powers of the fugacity z = e^{βμ}. To obtain something useful like p(T, n), we invert the equation n = n(T, z) to find z = z(T, n), and then substitute into the equation p = p(T, z) to obtain p = p(T, z(T, n)) = p(T, n). The result is the virial expansion,

p = n k_B T [ 1 + B₂(T) n + B₃(T) n² + ... ] .
5.4.3 Lowest order expansion

We have, denoting the two-site cluster with a single link by −, the three-site chain by ∧, and the three-site triangle by △,

b_−(T) = (1/2V) ∫ d^d x₁ ∫ d^d x₂ f( |x₁ − x₂| ) = ½ ∫ d^d r f(r) (5.71)
Figure 5.4: The symmetry factor s_γ for a connected subgraph γ is the number of permutations of its indices which leaves the term Π_{(ij)∈γ} f_ij invariant.
and

b_∧(T) = (1/2V) ∫ d^d x₁ ∫ d^d x₂ ∫ d^d x₃ f( |x₁ − x₂| ) f( |x₁ − x₃| )
  = ½ ∫ d^d r ∫ d^d r′ f(r) f(r′) = 2 ( b_− )² (5.72)

and

b_△(T) = (1/6V) ∫ d^d x₁ ∫ d^d x₂ ∫ d^d x₃ f( |x₁ − x₂| ) f( |x₁ − x₃| ) f( |x₂ − x₃| )
  = (1/6) ∫ d^d r ∫ d^d r′ f(r) f(r′) f( |r − r′| ) . (5.73)
We may now write

p = k_B T { z λ_T^{−d} + ( z λ_T^{−d} )² b_−(T) + ( z λ_T^{−d} )³ ( b_∧ + b_△ ) + O(z⁴) } (5.74)

n = z λ_T^{−d} + 2 ( z λ_T^{−d} )² b_−(T) + 3 ( z λ_T^{−d} )³ ( b_∧ + b_△ ) + O(z⁴) . (5.75)
We invert by writing

z λ_T^{−d} = n + α₂ n² + α₃ n³ + ... (5.76)

and substituting into the equation for n(z, T), yielding

n = ( n + α₂ n² + α₃ n³ ) + 2 ( n + α₂ n² )² b_− + 3 n³ ( b_∧ + b_△ ) + O(n⁴) . (5.77)

Thus,

0 = ( α₂ + 2b_− ) n² + ( α₃ + 4α₂ b_− + 3b_∧ + 3b_△ ) n³ + ... . (5.78)
We therefore conclude

α₂ = −2b_− (5.79)
α₃ = −4α₂ b_− − 3b_∧ − 3b_△ = 8b_−² − 6b_−² − 3b_△ = 2b_−² − 3b_△ . (5.80)
We now insert eqn. 5.76 with the determined values of α_{2,3} into the equation for p(z, T), obtaining

p/k_BT = n − 2b_− n² + ( 2b_−² − 3b_△ ) n³ + ( n − 2b_− n² )² b_− + n³ ( 2b_−² + b_△ ) + O(n⁴) (5.81)
  = n − b_− n² − 2b_△ n³ + O(n⁴) . (5.82)

Thus,

B₂(T) = −b_−(T) , B₃(T) = −2b_△(T) . (5.83)
5.4.4 Hard sphere gas in three dimensions

The hard sphere potential is given by

u(r) = { ∞ if r ≤ a ; 0 if r > a } . (5.84)

Here a is the diameter of the spheres. The corresponding Mayer function is then temperature independent, and given by

f(r) = { −1 if r ≤ a ; 0 if r > a } . (5.85)

We then have

b_−(T) = ½ ∫ d³r f(r) = −(2π/3) a³ . (5.86)
The calculation of b_△ is more challenging. We have

b_△ = (1/6) ∫ d³ρ ∫ d³r f(ρ) f(r) f( |r − ρ| ) . (5.87)

We must first compute the volume of overlap for spheres of radius a (recall a is the diameter of the constituent hard sphere particles) centered at 0 and at ρ:

ν(ρ) = ∫ d³r f(r) f( |r − ρ| ) = 2π ∫_{ρ/2}^{a} dz ( a² − z² ) = (4π/3) a³ − π a² ρ + (π/12) ρ³ . (5.88)
Figure 5.5: The overlap of hard sphere Mayer functions. The shaded volume is ν.

We then integrate over the region |ρ| < a, to obtain

b_△ = −(1/6) · 4π ∫_0^a dρ ρ² [ (4π/3) a³ − π a² ρ + (π/12) ρ³ ] = −(5π²/36) a⁶ . (5.89)
Thus,

p = n k_B T { 1 + (2π/3) a³ n + (5π²/18) a⁶ n² + O(n³) } . (5.90)
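The radial integral in eqn. 5.89 is a one-liner to verify numerically (an added check, not part of the original text; a = 1):

```python
import numpy as np

# Verify b_triangle = -(4*pi/6) * int_0^a rho^2 nu(rho) d(rho) = -5 pi^2 a^6 / 36.
a   = 1.0
rho = np.linspace(0.0, a, 200001)
nu  = (4*np.pi/3)*a**3 - np.pi*a**2*rho + (np.pi/12)*rho**3   # overlap volume, eqn 5.88
b_tri = -(4*np.pi/6.0)*np.sum(rho**2 * nu)*(rho[1] - rho[0])
print(b_tri, -5*np.pi**2/36)   # agree
```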
5.4.5 Weakly attractive tail

Suppose

u(r) = { ∞ if r ≤ a ; −u₀(r) if r > a } . (5.91)

Then the corresponding Mayer function is

f(r) = { −1 if r ≤ a ; e^{βu₀(r)} − 1 if r > a } . (5.92)

Thus,

b_−(T) = ½ ∫ d³r f(r) = −(2π/3) a³ + 2π ∫_a^{∞} dr r² [ e^{βu₀(r)} − 1 ] . (5.93)
Thus, the second virial coefficient is

B₂(T) = −b_−(T) ≈ (2π/3) a³ − (2π/k_BT) ∫_a^{∞} dr r² u₀(r) , (5.94)

where we have assumed k_BT ≫ u₀(r). We see that the second virial coefficient changes sign at some temperature T₀, from a negative low temperature value to a positive high temperature value.
5.4.6 Spherical potential well

Consider an attractive spherical well potential with an infinitely repulsive core,

u(r) = { ∞ if r ≤ a ; −ε if a < r < R ; 0 if r > R } . (5.95)

Then the corresponding Mayer function is

f(r) = { −1 if r ≤ a ; e^{βε} − 1 if a < r < R ; 0 if r > R } . (5.96)

Writing s ≡ R/a, we have

B₂(T) = −b_−(T) = −½ ∫ d³r f(r) (5.97)
  = −½ { (−1) · (4π/3) a³ + ( e^{βε} − 1 ) · (4π/3) a³ ( s³ − 1 ) }
  = (2π/3) a³ { 1 − ( s³ − 1 )( e^{βε} − 1 ) } . (5.98)

To find the temperature T₀ where B₂(T) changes sign, we set B₂(T₀) = 0 and obtain

k_B T₀ = ε / ln( s³ / (s³ − 1) ) . (5.99)
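The sign change of B₂ is easily located numerically and compared with eqn. 5.99 (an added check; units a = ε = k_B = 1, with s = 1.5):

```python
import numpy as np

# Locate the zero of B2(T) = (2*pi/3) a^3 [1 - (s^3 - 1)(e^{eps/T} - 1)]  (eqn 5.98)
# and compare with k_B T0 = eps / ln(s^3/(s^3 - 1))  (eqn 5.99).
a, eps, s = 1.0, 1.0, 1.5

def B2(T):
    return (2*np.pi/3)*a**3 * (1.0 - (s**3 - 1.0)*(np.exp(eps/T) - 1.0))

lo, hi = 0.1, 100.0            # B2(lo) < 0 < B2(hi); bisect the sign change
for _ in range(100):
    mid = 0.5*(lo + hi)
    if B2(mid) < 0.0:
        lo = mid
    else:
        hi = mid

T0 = eps / np.log(s**3/(s**3 - 1.0))
print(mid, T0)   # both approximately 2.85 for s = 1.5
```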
Recall in our study of the thermodynamics of the Joule-Thompson effect in §1.10.6 that the throttling process is isenthalpic. The temperature change, when a gas is pushed (or escapes) through a porous plug from a high pressure region to a low pressure one, is

ΔT = ∫_{p₁}^{p₂} dp ( ∂T/∂p )_H , (5.100)

where

( ∂T/∂p )_H = (1/C_p) [ T ( ∂V/∂T )_p − V ] . (5.101)
Figure 5.6: An attractive spherical well with a repulsive core u(r) and its associated Mayer function f(r).

Appealing to the virial expansion, and working to lowest order in corrections to the ideal gas law, we have

p = (N/V) k_B T + (N²/V²) k_B T B₂(T) + ... , (5.102)

and we compute ( ∂V/∂T )_p by setting

0 = dp = −( N k_B T/V² ) dV + ( N k_B/V ) dT − ( 2N²/V³ ) k_B T B₂(T) dV + ( N²/V² ) d[ k_B T B₂(T) ] + ... . (5.103)

Dividing by dT, we find

T ( ∂V/∂T )_p − V = N [ T (dB₂/dT) − B₂ ] . (5.104)
The temperature where ( ∂T/∂p )_H changes sign is called the inversion temperature T*. To find the inversion point, we set T* B₂′(T*) = B₂(T*), i.e.

d ln B₂ / d ln T |_{T*} = 1 . (5.105)

If we approximate B₂(T) ≈ A − B/T, then the inversion temperature follows simply:

B/T* = A − B/T* ⟹ T* = 2B/A . (5.106)
5.4.7 Hard spheres with a hard wall

Consider a hard sphere gas in three dimensions in the presence of a hard wall at z = 0. The gas is confined to the region z > 0. The total potential energy is now

W(x₁ , ... , x_N) = Σ_i v(x_i) + Σ_{i<j} u( x_i − x_j ) , (5.107)

where

v(r) = v(z) = { ∞ if z ≤ ½a ; 0 if z > ½a } , (5.108)
and u(r) is given in eqn. 5.84. The grand potential is written as a series in the total particle number N, and is given by

Ξ = e^{−βΩ} = 1 + ξ ∫ d³r e^{−βv(z)} + ½ ξ² ∫ d³r ∫ d³r′ e^{−βv(z)} e^{−βv(z′)} e^{−βu(r−r′)} + ... , (5.109)

where ξ = z λ_T^{−3}, with z = e^{μ/k_BT} the fugacity. Taking the logarithm, and invoking the Taylor series ln(1 + δ) = δ − ½δ² + ⅓δ³ − ..., we obtain
−βΩ = ξ ∫_{z > a/2} d³r + ½ ξ² ∫_{z > a/2} d³r ∫_{z′ > a/2} d³r′ [ e^{−βu(r−r′)} − 1 ] + ... . (5.110)

The volume is V = ∫_{z>0} d³r. Dividing by V, we have, in the thermodynamic limit,

−βΩ/V = βp = ξ + ½ ξ² (1/V) ∫_{z > a/2} d³r ∫_{z′ > a/2} d³r′ [ e^{−βu(r−r′)} − 1 ] + ...
  = ξ − (2π/3) a³ ξ² + O(ξ³) . (5.111)
The number density is

n = ξ ∂(βp)/∂ξ = ξ − (4π/3) a³ ξ² + O(ξ³) , (5.112)

and inverting to obtain ξ(n) and then substituting into the pressure equation, we obtain the lowest order virial expansion for the equation of state,

p = k_B T { n + (2π/3) a³ n² + ... } . (5.113)

As expected, the presence of the wall does not affect a bulk property such as the equation of state.
Next, let us compute the number density n(z), given by

n(z) = ⟨ Σ_i δ( r − r_i ) ⟩ . (5.114)

Due to translational invariance in the (x, y) plane, we know that the density must be a function of z alone. The presence of the wall at z = 0 breaks translational symmetry in the z direction. The number density is

n(z) = Tr[ e^{β(μN̂ − Ĥ)} Σ_{i=1}^{N} δ( r − r_i ) ] / Tr e^{β(μN̂ − Ĥ)}
  = Ξ^{−1} { ξ e^{−βv(z)} + ξ² e^{−βv(z)} ∫ d³r′ e^{−βv(z′)} e^{−βu(r−r′)} + ... }
  = ξ e^{−βv(z)} + ξ² e^{−βv(z)} ∫ d³r′ e^{−βv(z′)} [ e^{−βu(r−r′)} − 1 ] + ... . (5.115)
Figure 5.7: In the presence of a hard wall, the Mayer sphere is cut off on the side closest to the wall. The resulting density n(z) vanishes for z < ½a since the center of each sphere must be at least one radius (½a) away from the wall. Between z = ½a and z = (3/2)a there is a density enhancement. If the calculation were carried out to higher order, n(z) would exhibit damped spatial oscillations with wavelength ∼ a.

Note that the term in square brackets in the last line is the Mayer function f(r − r′) = e^{−βu(r−r′)} − 1. Consider the function

e^{−βv(z)} e^{−βv(z′)} f(r − r′) = { 0 if z < ½a or z′ < ½a ; 0 if |r − r′| > a ; −1 if z > ½a and z′ > ½a and |r − r′| < a } . (5.116)
Now consider the integral of the above function with respect to r′. Clearly the result depends on the value of z. If z > (3/2)a, then there is no excluded region in r′ and the integral is (−1) times the full Mayer sphere volume, i.e. −(4π/3)a³. If z < ½a the integral vanishes due to the e^{−βv(z)} factor. For z infinitesimally larger than ½a, the integral is (−1) times half the Mayer sphere volume, i.e. −(2π/3)a³. For z ∈ [ a/2 , 3a/2 ] the integral interpolates between −(2π/3)a³ and −(4π/3)a³. Explicitly, one finds by elementary integration,

∫ d³r′ e^{−βv(z)} e^{−βv(z′)} f(r − r′) =
  { 0 if z < ½a
  { −[ 1 + (3/2)( z/a − ½ ) − ½ ( z/a − ½ )³ ] · (2π/3) a³ if ½a < z < (3/2)a
  { −(4π/3) a³ if z > (3/2)a .
(5.117)
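The middle case of eqn. 5.117 can be checked by Monte Carlo sampling of the cut-off Mayer sphere (an added illustration; a = 1):

```python
import numpy as np

# Volume of the Mayer sphere (radius a, center at height z) restricted to z' > a/2,
# compared against the closed form implied by eqn. 5.117.
rng = np.random.default_rng(0)
a = 1.0

def overlap_mc(z, n=400000):
    pts = rng.uniform(-a, a, size=(n, 3))       # sample the bounding cube
    inside = (pts**2).sum(axis=1) < a*a         # inside the Mayer sphere
    above  = pts[:, 2] + z > 0.5*a              # not excluded by the wall
    return (inside & above).mean() * (2*a)**3

def overlap_exact(z):
    t = z/a - 0.5
    return (2*np.pi/3)*a**3 * (1.0 + 1.5*t - 0.5*t**3)

for z in (0.6, 1.0, 1.4):
    print(z, overlap_mc(z), overlap_exact(z))
```

The Monte Carlo estimates agree with the closed form to within sampling error, interpolating between half and the full sphere volume as z runs from a/2 to 3a/2.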
After substituting ξ = n + (4π/3) a³ n² + O(n³) to relate the fugacity ξ to the bulk density n = n_∞, we obtain the desired result:
\[
n(z) =
\begin{cases}
0 & \text{if } z < \frac{1}{2} a \\[1ex]
n + \Big[ 1 - \frac{3}{2}\big( \frac{z}{a} - \frac{1}{2} \big) + \frac{1}{2}\big( \frac{z}{a} - \frac{1}{2} \big)^{3} \Big] \cdot \frac{2\pi}{3}\, a^3\, n^2 & \text{if } \frac{1}{2} a < z < \frac{3}{2} a \\[1ex]
n & \text{if } z > \frac{3}{2} a\ .
\end{cases} \tag{5.118}
\]
A sketch is provided in the right hand panel of fig. 5.7. Note that the density n(z) vanishes identically for z < a/2 due to the exclusion of the hard spheres by the wall. For z between a/2 and 3a/2, there is a density enhancement, the origin of which has a simple physical interpretation. Since the wall excludes particles from the region z < a/2, there is an empty slab of thickness a/2 coating the interior of the wall. There are then no particles in this region to exclude neighbors to their right, hence the density builds up just on the other side of this slab. The effect vanishes to the order of the calculation past z = 3a/2, where n(z) = n returns to its bulk value. Had we calculated to higher order, we'd have found damped oscillations with spatial period ≈ a.
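The piecewise profile of eqn. 5.118 is easy to tabulate. A minimal sketch (the bulk density n = 0.2 and diameter a = 1 are hypothetical inputs) verifying the contact enhancement at z = a/2 and the continuous rejoining of the bulk value at z = 3a/2:

```python
import math

def n_profile(z, n=0.2, a=1.0):
    """Density profile n(z) of eqn. 5.118, valid to second order in the bulk density n."""
    if z < 0.5*a:
        return 0.0
    if z > 1.5*a:
        return n
    t = z/a - 0.5
    return n + (1.0 - 1.5*t + 0.5*t**3) * (2.0*math.pi/3.0) * a**3 * n**2

# contact enhancement: n(a/2) = n + (2pi/3) a^3 n^2
assert abs(n_profile(0.5) - (0.2 + (2*math.pi/3)*0.2**2)) < 1e-12
# the profile rejoins the bulk value continuously at z = 3a/2
assert abs(n_profile(1.5) - 0.2) < 1e-12
assert abs(n_profile(1.5 - 1e-9) - 0.2) < 1e-6
```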
5.5 Liquid State Physics
5.5.1 The many-particle distribution function
The virial expansion is typically applied to low-density systems. When the density is high, i.e. when na³ ∼ 1, where a is a typical molecular or atomic length scale, the virial expansion is impractical. There are too many terms to compute, and to make progress one must use sophisticated resummation techniques to investigate the high density regime.
To elucidate the physics of liquids, it is useful to consider the properties of various correlation functions. These objects are derived from the general N-body Boltzmann distribution,
\[
f(x_1, \ldots, x_N;\, p_1, \ldots, p_N) =
\begin{cases}
Z_N^{-1} \cdot \frac{1}{N!}\; e^{-\beta \hat H_N(p,\,x)} & \text{OCE} \\[1ex]
\Xi^{-1} \cdot \frac{1}{N!}\; e^{\beta \mu N}\, e^{-\beta \hat H_N(p,\,x)} & \text{GCE}\ .
\end{cases} \tag{5.119}
\]
We assume a Hamiltonian of the form
\[
\hat H_N = \sum_{i=1}^N \frac{p_i^2}{2m} + W(x_1, \ldots, x_N)\ . \tag{5.120}
\]
The quantity
\[
f(x_1, \ldots, x_N;\, p_1, \ldots, p_N)\; \frac{d^d x_1\, d^d p_1}{h^d} \cdots \frac{d^d x_N\, d^d p_N}{h^d} \tag{5.121}
\]
is the probability of finding N particles in the system, with particle #1 lying within d^d x_1 of x_1 and having momentum within d^d p_1 of p_1, etc. If we compute averages of quantities which only depend on the positions x_j and not on the momenta p_j, then we may integrate out the momenta to obtain, in the OCE,
\[
P(x_1, \ldots, x_N) = Q_N^{-1} \cdot \frac{1}{N!}\; e^{-\beta W(x_1,\, \ldots,\, x_N)}\ , \tag{5.122}
\]
where W is the total potential energy,
\[
W(x_1, \ldots, x_N) = \sum_i v(x_i) + \sum_{i<j} u(x_i - x_j) + \sum_{i<j<k} w(x_i - x_j,\, x_j - x_k) + \ldots\ , \tag{5.123}
\]
and Q_N is the configuration integral,
\[
Q_N(T, V) = \frac{1}{N!} \int d^d x_1 \cdots \int d^d x_N\; e^{-\beta W(x_1,\, \ldots,\, x_N)}\ . \tag{5.124}
\]
We will, for the most part, consider only two-body central potentials as contributing to W, which is to say we will only retain the middle term on the RHS. Note that P(x_1, …, x_N) is invariant under any permutation of the particle labels.
5.5.2 Averages over the distribution
To compute an average, one integrates over the distribution:
\[
\big\langle F(x_1, \ldots, x_N) \big\rangle = \int d^d x_1 \cdots \int d^d x_N\; P(x_1, \ldots, x_N)\, F(x_1, \ldots, x_N)\ . \tag{5.125}
\]
The overall N-particle probability density is normalized according to
\[
\int d^d x_1 \cdots \int d^d x_N\; P(x_1, \ldots, x_N) = 1\ . \tag{5.126}
\]
The average local density is
\[
\begin{aligned}
n_1(r) &= \Big\langle \sum_i \delta(r - x_i) \Big\rangle & (5.127) \\
&= N \int d^d x_2 \cdots \int d^d x_N\; P(r,\, x_2, \ldots, x_N)\ . & (5.128)
\end{aligned}
\]
Note that the local density obeys the sum rule
\[
\int d^d r\; n_1(r) = N\ . \tag{5.129}
\]
In a translationally invariant system, n_1 = n = N/V is a constant independent of position. The boundaries of a system will in general break translational invariance, so in order to maintain the notion of a translationally invariant system of finite total volume, one must impose periodic boundary conditions.
The two-particle density matrix n_2(r_1, r_2) is defined by
\[
\begin{aligned}
n_2(r_1, r_2) &= \Big\langle \sum_{i \neq j} \delta(r_1 - x_i)\, \delta(r_2 - x_j) \Big\rangle & (5.130) \\
&= N(N-1) \int d^d x_3 \cdots \int d^d x_N\; P(r_1, r_2,\, x_3, \ldots, x_N)\ . & (5.131)
\end{aligned}
\]
As in the case of the one-particle density matrix, i.e. the local density n_1(r), the two-particle density matrix satisfies a sum rule:
\[
\int d^d r_1 \int d^d r_2\; n_2(r_1, r_2) = N(N-1)\ . \tag{5.132}
\]
Generalizing further, one defines the k-particle density matrix as
\[
\begin{aligned}
n_k(r_1, \ldots, r_k) &= \Big\langle {\sum_{i_1 \cdots i_k}}' \delta(r_1 - x_{i_1}) \cdots \delta(r_k - x_{i_k}) \Big\rangle & (5.133) \\
&= \frac{N!}{(N-k)!} \int d^d x_{k+1} \cdots \int d^d x_N\; P(r_1, \ldots, r_k,\, x_{k+1}, \ldots, x_N)\ , & (5.134)
\end{aligned}
\]
where the prime on the sum indicates that all the indices i_1, …, i_k are distinct. The corresponding sum rule is then
\[
\int d^d r_1 \cdots \int d^d r_k\; n_k(r_1, \ldots, r_k) = \frac{N!}{(N-k)!}\ . \tag{5.135}
\]
The average potential energy can be expressed in terms of the distribution functions. Assuming only two-body interactions, we have
\[
\begin{aligned}
\langle W \rangle &= \Big\langle \sum_{i<j} u(x_i - x_j) \Big\rangle \\
&= \frac{1}{2} \int d^d r_1 \int d^d r_2\; u(r_1 - r_2)\, \Big\langle \sum_{i \neq j} \delta(r_1 - x_i)\, \delta(r_2 - x_j) \Big\rangle \\
&= \frac{1}{2} \int d^d r_1 \int d^d r_2\; u(r_1 - r_2)\, n_2(r_1, r_2)\ . \tag{5.136}
\end{aligned}
\]
As the separations r_ij = |r_i − r_j| get large, we expect the correlations to vanish, in which case
\[
\begin{aligned}
n_k(r_1, \ldots, r_k) &= \Big\langle {\sum_{i_1 \cdots i_k}}' \delta(r_1 - x_{i_1}) \cdots \delta(r_k - x_{i_k}) \Big\rangle \\
&\xrightarrow{\ r_{ij} \to \infty\ } {\sum_{i_1 \cdots i_k}}' \big\langle \delta(r_1 - x_{i_1}) \big\rangle \cdots \big\langle \delta(r_k - x_{i_k}) \big\rangle \\
&= \frac{N!}{(N-k)!} \cdot \frac{1}{N^k}\; n_1(r_1) \cdots n_1(r_k) \\
&= \Big( 1 - \frac{1}{N} \Big) \Big( 1 - \frac{2}{N} \Big) \cdots \Big( 1 - \frac{k-1}{N} \Big)\; n_1(r_1) \cdots n_1(r_k)\ . \tag{5.137}
\end{aligned}
\]
The k-particle distribution function is defined as the ratio
\[
g_k(r_1, \ldots, r_k) \equiv \frac{n_k(r_1, \ldots, r_k)}{n_1(r_1) \cdots n_1(r_k)}\ . \tag{5.138}
\]
For large separations, then,
\[
g_k(r_1, \ldots, r_k) \xrightarrow{\ r_{ij} \to \infty\ } \prod_{j=1}^{k-1} \Big( 1 - \frac{j}{N} \Big)\ . \tag{5.139}
\]
For isotropic systems, the two-particle distribution function g_2(r_1, r_2) depends only on the magnitude |r_1 − r_2|. As a function of this scalar separation, the function is known as the radial distribution function:
\[
g(r) \equiv g_2(r) = \frac{1}{n^2} \Big\langle \sum_{i \neq j} \delta(r - x_i)\, \delta(x_j) \Big\rangle = \frac{1}{V n^2} \Big\langle \sum_{i \neq j} \delta(r - x_i + x_j) \Big\rangle\ . \tag{5.140}
\]
The radial distribution function is of great importance in the physics of liquids because
• thermodynamic properties of the system can be related to g(r)
• g(r) is directly measurable by scattering experiments
For example, in an isotropic system the average potential energy is given by
\[
\begin{aligned}
\langle W \rangle &= \frac{1}{2} \int d^d r_1 \int d^d r_2\; u(r_1 - r_2)\, n_2(r_1, r_2) \\
&= \frac{1}{2}\, n^2 \int d^d r_1 \int d^d r_2\; u(r_1 - r_2)\, g\big( |r_1 - r_2| \big) \\
&= \frac{N^2}{2V} \int d^d r\; u(r)\, g(r)\ . \tag{5.141}
\end{aligned}
\]
For a three-dimensional system, the average internal (i.e. potential) energy per particle is
\[
\frac{\langle W \rangle}{N} = 2\pi n \int_0^\infty dr\, r^2\, g(r)\, u(r)\ . \tag{5.142}
\]
Intuitively, f(r) dr ≡ 4πr² n g(r) dr is the average number of particles lying at a radial distance between r and r + dr from a given reference particle. The total potential energy of interaction with the reference particle is then f(r) u(r) dr. Now integrate over all r and divide by two to avoid double-counting. This recovers eqn. 5.142.
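Eqn. 5.142 can be evaluated by straightforward quadrature once g(r) and u(r) are specified. A sketch follows, using a hypothetical hard core with a −ε(a/r)⁶ attractive tail and the dilute-limit g(r) ≈ Θ(r − a), a toy model (not from the text) chosen because the integral then has the closed form ⟨W⟩/N = −(2π/3) n ε a³.

```python
import math

def energy_per_particle(n, u, g, a, rmax=60.0, nstep=200_000):
    """<W>/N = 2*pi*n * integral_0^inf dr r^2 g(r) u(r)  (eqn. 5.142),
    computed by the trapezoidal rule on [a, rmax] (g vanishes for r < a here)."""
    h = (rmax - a) / nstep
    s = 0.0
    for i in range(nstep + 1):
        r = a + i*h
        w = 0.5 if i in (0, nstep) else 1.0
        s += w * r*r * g(r) * u(r)
    return 2.0*math.pi*n * s * h

# hypothetical inputs: hard-core diameter a, tail depth eps, bulk density n
a, eps, n = 1.0, 1.0, 0.05
wN = energy_per_particle(n, lambda r: -eps*(a/r)**6, lambda r: 1.0, a)
exact = -(2.0*math.pi/3.0) * n * eps * a**3
assert abs(wN - exact) < 1e-4 * abs(exact)
```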
In the OCE, g(r) obeys the sum rule
\[
\int d^d r\; g(r) = \frac{V}{N^2} \cdot N(N-1) = V - \frac{V}{N}\ , \tag{5.143}
\]
hence
\[
n \int d^d r\; \big[ g(r) - 1 \big] = -1\ . \tag{5.144}
\]
The function h(r) g(r) 1 is called the pair correlation function.
Figure 5.8: Pair distribution functions for hard spheres of diameter a at filling fraction η = (π/6) a³n = 0.49 (left) and for liquid Argon at T = 85 K (right). Molecular dynamics data for hard spheres (points) is compared with the result of the Percus-Yevick approximation (see below in §5.5.8). Reproduced (without permission) from J.-P. Hansen and I. R. McDonald, Theory of Simple Liquids, fig. 5.5. Experimental data on liquid argon are from the neutron scattering work of J. L. Yarnell et al., Phys. Rev. A 7, 2130 (1973). The data (points) are compared with molecular dynamics calculations by Verlet (1967) for a Lennard-Jones fluid.
In the grand canonical formulation, we have
\[
\begin{aligned}
\int d^3 r\; h(r) &= \frac{V}{\langle N \rangle^2}\, \langle N^2 \rangle - \frac{V}{\langle N \rangle} - V \\
&= V \bigg[ \frac{\langle N^2 \rangle - \langle N \rangle^2}{\langle N \rangle^2} - \frac{1}{\langle N \rangle} \bigg] = k_{\rm B} T\, \kappa_T - \frac{1}{n}\ , \tag{5.145}
\end{aligned}
\]
where κ_T is the isothermal compressibility. Note that in an ideal gas we have h(r) = 0 and κ_T = κ_T^0 ≡ 1/nk_BT.
Self-condensed systems, such as liquids and solids far from criticality, are nearly incompressible, hence 0 < nk_BT κ_T ≪ 1, and therefore
\[
n \int d^3 r\; h(r) \approx -1\ . \tag{5.146}
\]
The above equation is an equality if the system is incompressible (κ_T = 0).
Figure 5.9: Pair distribution functions for liquid water. From A. K. Soper, Chem Phys.
202, 295 (1996).
5.5.3 Virial equation of state
The virial of a mechanical system is defined to be
\[
G = \sum_i x_i \cdot F_i\ , \tag{5.147}
\]
where F_i is the total force acting on particle i. If we average G over time, we obtain
\[
\langle G \rangle = \lim_{T \to \infty} \frac{1}{T} \int_0^T dt\; \sum_i x_i \cdot F_i = -\lim_{T \to \infty} \frac{1}{T} \int_0^T dt\; \sum_i m\, \dot x_i^2 = -3 N k_{\rm B} T\ . \tag{5.148}
\]
Here, we have made use of
\[
x_i \cdot F_i = m\, x_i \cdot \ddot x_i = -m\, \dot x_i^2 + \frac{d}{dt} \big( m\, x_i \cdot \dot x_i \big)\ , \tag{5.149}
\]
as well as ergodicity and equipartition of kinetic energy. We have also assumed three space dimensions. In a bounded system, there are two contributions to the force F_i. One contribution is from the surfaces which enclose the system. This is given by [2]
\[
\langle G \rangle_{\rm surfaces} = \Big\langle \sum_i x_i \cdot F_i^{(\rm surf)} \Big\rangle = -3pV\ . \tag{5.150}
\]
The remaining contribution is due to the interparticle forces. Thus,
\[
\frac{p}{k_{\rm B} T} = \frac{N}{V} - \frac{1}{3 V k_{\rm B} T} \Big\langle \sum_i x_i \cdot \nabla_{\!i} W \Big\rangle\ . \tag{5.151}
\]
Invoking the definition of g(r), we have
\[
p = n k_{\rm B} T \Bigg\{ 1 - \frac{2\pi n}{3 k_{\rm B} T} \int_0^\infty dr\, r^3\, g(r)\, u'(r) \Bigg\}\ . \tag{5.152}
\]
As an alternate derivation, consider the First Law of Thermodynamics,
\[
d\Omega = -S\, dT - p\, dV - N\, d\mu\ , \tag{5.153}
\]
from which we derive
\[
p = -\bigg( \frac{\partial \Omega}{\partial V} \bigg)_{T, \mu} = -\bigg( \frac{\partial F}{\partial V} \bigg)_{T, N}\ . \tag{5.154}
\]
Now let V → ℓ³V, where ℓ is a scale parameter. Then
\[
p = -\frac{\partial \Omega}{\partial V} = -\frac{1}{3V}\, \frac{\partial}{\partial \ell} \bigg|_{\ell = 1} \Omega(T,\, \ell^3 V,\, \mu)\ . \tag{5.155}
\]
Now
\[
\begin{aligned}
\Xi(T,\, \ell^3 V,\, \mu) &= \sum_{N=0}^\infty \frac{1}{N!}\; e^{\beta \mu N}\, \lambda_T^{-3N} \int_{\ell^3 V} d^3 x_1 \cdots \int_{\ell^3 V} d^3 x_N\; e^{-W(x_1,\, \ldots,\, x_N)/k_{\rm B} T} \\
&= \sum_{N=0}^\infty \frac{1}{N!} \Big( e^{\beta \mu}\, \lambda_T^{-3} \Big)^{N} \ell^{3N} \int_V d^3 x_1 \cdots \int_V d^3 x_N\; e^{-W(\ell x_1,\, \ldots,\, \ell x_N)/k_{\rm B} T}\ . \tag{5.156}
\end{aligned}
\]
Thus,
\[
\begin{aligned}
p &= -\frac{1}{3V}\, \frac{\partial \Omega(\ell^3 V)}{\partial \ell} \bigg|_{\ell = 1} = \frac{k_{\rm B} T}{3V}\, \frac{1}{\Xi}\, \frac{\partial \Xi(\ell^3 V)}{\partial \ell} \bigg|_{\ell = 1} & (5.157) \\
&= \frac{k_{\rm B} T}{3V}\, \frac{1}{\Xi} \sum_{N=0}^\infty \frac{1}{N!} \Big( z\, \lambda_T^{-3} \Big)^{N} \int_V d^3 x_1 \cdots \int_V d^3 x_N\; e^{-W(x_1,\, \ldots,\, x_N)/k_{\rm B} T} \Bigg\{ 3N - \frac{1}{k_{\rm B} T} \sum_i x_i \cdot \frac{\partial W}{\partial x_i} \Bigg\} \\
&= n k_{\rm B} T - \frac{1}{3V} \bigg\langle \frac{\partial W}{\partial \ell} \bigg\rangle_{\ell = 1}\ . & (5.158)
\end{aligned}
\]
[2] To derive this expression, one can assume the system is in a rectangular box, and write Σ_i F_i^(surf) = −Σ_{a=1}^{6} Σ_j 2p_a ê_a δ(t − t_{a,j}), where a ∈ {1, …, 6} labels the six faces and ê_a is the unit normal to the a-th surface. Then computing the time average and identifying the rate at which momentum is transferred to a given face as pA_a, we recover eqn. 5.150.
Finally, from W = Σ_{i<j} u(x_ij) we have
\[
\bigg\langle \frac{\partial W}{\partial \ell} \bigg\rangle_{\ell = 1} = \Big\langle \sum_{i<j} x_{ij}\, u'(x_{ij}) \Big\rangle = \frac{2\pi N^2}{V} \int_0^\infty dr\, r^3\, g(r)\, u'(r)\ , \tag{5.159}
\]
and hence
\[
p = n k_{\rm B} T - \frac{2\pi}{3}\, n^2 \int_0^\infty dr\, r^3\, g(r)\, u'(r)\ . \tag{5.160}
\]
Note that the density n enters the equation of state explicitly on the RHS of the above
equation, but also implicitly through the pair distribution function g(r), which has implicit
dependence on both n and T.
5.5.4 Correlations and scattering
Consider the scattering of a light or particle beam (i.e. photons or neutrons) from a liquid. We label the states of the beam particles by their wavevector k and we assume a general dispersion ε_k. For photons, ε_k = ℏc|k|, while for neutrons ε_k = ℏ²k²/2m_n. We assume a single scattering process with the liquid, during which the total momentum and energy of the liquid plus beam are conserved. We write
\[
\begin{aligned}
k' &= k + q & (5.161) \\
\varepsilon_{k'} &= \varepsilon_k + \hbar\omega\ , & (5.162)
\end{aligned}
\]
where k' is the final state of the scattered beam particle. Thus, the fluid transfers momentum Δp = ℏq and energy ℏω to the beam.
Now consider the scattering process between an initial state | i, k ⟩ and a final state | j, k' ⟩, where these states describe both the beam and the liquid. According to Fermi's Golden Rule, the scattering rate is
\[
\Gamma_{i k \to j k'} = \frac{2\pi}{\hbar}\, \Big| \big\langle\, j, k' \,\big|\, \mathcal V \,\big|\, i, k \,\big\rangle \Big|^2\, \delta\big( E_j - E_i + \hbar\omega \big)\ , \tag{5.163}
\]
where 𝒱 is the scattering potential and E_i is the initial internal energy of the liquid. If r is the position of the beam particle and x_l are the positions of the liquid particles, then
\[
\mathcal V(r) = \sum_{l=1}^N v(r - x_l)\ . \tag{5.164}
\]
The differential scattering cross section (per unit frequency per unit solid angle) is
\[
\frac{\partial^2 \sigma}{\partial \Omega\, \partial \omega} = \frac{\hbar V^2}{4\pi}\, \frac{g(\varepsilon_{k'})}{|v_{k'}|} \sum_{i,j} P_i\, \Gamma_{i k \to j k'}\ , \tag{5.165}
\]
Figure 5.10: In a scattering experiment, a beam of particles interacts with a sample and the beam particles scatter off the sample particles. A momentum ℏq and energy ℏω are transferred to the beam particle during such a collision. If ω = 0, the scattering is said to be elastic. For ω ≠ 0, the scattering is inelastic.
where
\[
g(\varepsilon) = \int \frac{d^d k}{(2\pi)^d}\; \delta(\varepsilon - \varepsilon_k) \tag{5.166}
\]
is the density of states for the beam particle and
\[
P_i = \frac{1}{Z}\; e^{-\beta E_i}\ . \tag{5.167}
\]
Consider now the matrix element
\[
\begin{aligned}
\big\langle\, j, k' \,\big|\, \mathcal V \,\big|\, i, k \,\big\rangle &= \frac{1}{V}\, \Big\langle\, j\, \Big|\, \sum_{l=1}^N \int d^d r\; e^{i(k - k') \cdot r}\; v(r - x_l)\, \Big|\, i\, \Big\rangle \\
&= \frac{1}{V}\; \hat v(q)\, \Big\langle\, j\, \Big|\, \sum_{l=1}^N e^{-i q \cdot x_l}\, \Big|\, i\, \Big\rangle\ , \tag{5.168}
\end{aligned}
\]
where we have assumed that the incident and scattered beams are plane waves. We then have
\[
\begin{aligned}
\frac{\partial^2 \sigma}{\partial \Omega\, \partial \omega} &= \frac{\hbar V^2}{4\pi}\, \frac{g(\varepsilon_{k+q})}{|v_{k+q}|} \cdot \frac{2\pi}{\hbar} \cdot \frac{|\hat v(q)|^2}{V^2} \sum_i P_i \sum_j \Big| \Big\langle\, j\, \Big|\, \sum_{l=1}^N e^{-i q \cdot x_l}\, \Big|\, i\, \Big\rangle \Big|^2\, \delta\big( E_j - E_i + \hbar\omega \big) & (5.169) \\
&= \frac{g(\varepsilon_{k+q})}{4\pi \hbar\, |v_{k+q}|}\; N\, |\hat v(q)|^2\; S(q, \omega)\ , & (5.170)
\end{aligned}
\]
where S(q, ω) is the dynamic structure factor,
\[
S(q, \omega) = \frac{2\pi \hbar}{N} \sum_i P_i \sum_j \Big| \Big\langle\, j\, \Big|\, \sum_{l=1}^N e^{-i q \cdot x_l}\, \Big|\, i\, \Big\rangle \Big|^2\, \delta\big( E_j - E_i + \hbar\omega \big)\ . \tag{5.171}
\]
Note that for an arbitrary operator A,
\[
\begin{aligned}
\sum_j \big| \langle\, j\, |\, A\, |\, i\, \rangle \big|^2\, \delta\big( E_j - E_i + \hbar\omega \big) &= \frac{1}{2\pi\hbar} \sum_j \int_{-\infty}^\infty dt\; e^{i(E_j - E_i + \hbar\omega)\, t/\hbar}\; \big\langle\, i\, \big|\, A^\dagger \big|\, j\, \big\rangle \big\langle\, j\, \big|\, A\, \big|\, i\, \big\rangle \\
&= \frac{1}{2\pi\hbar} \int_{-\infty}^\infty dt\; e^{i\omega t}\; \Big\langle\, i\, \Big|\, A^\dagger\, e^{i \hat H t/\hbar}\, A\, e^{-i \hat H t/\hbar}\, \Big|\, i\, \Big\rangle \\
&= \frac{1}{2\pi\hbar} \int_{-\infty}^\infty dt\; e^{i\omega t}\; \big\langle\, i\, \big|\, A^\dagger(0)\, A(t)\, \big|\, i\, \big\rangle\ . \tag{5.172}
\end{aligned}
\]
Thus,
\[
\begin{aligned}
S(q, \omega) &= \frac{1}{N} \int_{-\infty}^\infty dt\; e^{i\omega t} \sum_i P_i \sum_{l, l'} \Big\langle\, i\, \Big|\, e^{i q \cdot x_l(0)}\, e^{-i q \cdot x_{l'}(t)}\, \Big|\, i\, \Big\rangle & (5.173) \\
&= \frac{1}{N} \int_{-\infty}^\infty dt\; e^{i\omega t} \sum_{l, l'} \Big\langle\, e^{i q \cdot x_l(0)}\, e^{-i q \cdot x_{l'}(t)}\, \Big\rangle\ , & (5.174)
\end{aligned}
\]
where the angular brackets in the last line denote a thermal expectation value of a quantum mechanical operator. If we integrate over all frequencies, we obtain the equal time correlator,
\[
\begin{aligned}
S(q) &= \int_{-\infty}^\infty \frac{d\omega}{2\pi}\; S(q, \omega) & (5.175) \\
&= \frac{1}{N} \sum_{l, l'} \big\langle\, e^{i q \cdot (x_l - x_{l'})}\, \big\rangle = N\, \delta_{q,0} + 1 + n \int d^d r\; e^{i q \cdot r}\, \big[ g(r) - 1 \big]\ , & (5.176)
\end{aligned}
\]
known as the static structure factor [3]. Note that S(q = 0) = N, since all the phases e^{iq·(x_i − x_j)} are then unity. As q → ∞, the phases oscillate rapidly with changes in the distances |x_i − x_j|, and average out to zero. However, the diagonal terms in the sum, i.e. those with i = j, always contribute a total of 1 to S(q). Therefore in the q → ∞ limit we have S(q → ∞) = 1.
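For an isotropic h(r) = g(r) − 1 in d = 3 and q ≠ 0, the Fourier transform in eqn. 5.176 reduces to S(q) = 1 + (4πn/q) ∫₀^∞ dr r h(r) sin(qr). A minimal numerical sketch follows; the exponential h(r) below is a hypothetical model (not from the text), chosen because its transform is known in closed form, ĥ(q) = −8πAξ³/(1 + q²ξ²)².

```python
import math

def S_of_q(q, h, n, rmax=60.0, nstep=200_000):
    """S(q) = 1 + (4*pi*n/q) * integral_0^inf dr r h(r) sin(qr), for q > 0,
    by a simple Riemann sum truncated at rmax."""
    dr = rmax / nstep
    s = 0.0
    for i in range(1, nstep + 1):
        r = i * dr
        s += r * h(r) * math.sin(q * r)
    return 1.0 + 4.0*math.pi*n/q * s * dr

# hypothetical model h(r) = -A exp(-r/xi) at density n
A, xi, n = 0.5, 1.0, 0.1
for q in (0.5, 1.0, 2.0):
    exact = 1.0 - n * 8.0*math.pi*A*xi**3 / (1.0 + (q*xi)**2)**2
    assert abs(S_of_q(q, lambda r: -A*math.exp(-r/xi), n) - exact) < 1e-3
```

Note that S(q) approaches 1 at large q, as argued above, since the transform of any integrable h(r) decays.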
In general, the detectors used in a scattering experiment are sensitive to the energy of the scattered beam particles, although there is always a finite experimental resolution, both in q and ω. This means that what is measured is actually something like
\[
S_{\rm meas}(q, \omega) = \int d^d q' \int d\omega'\; F(q - q')\, G(\omega - \omega')\, S(q', \omega')\ , \tag{5.177}
\]
where F and G are essentially Gaussian functions of their argument, with width given by the experimental resolution. If one integrates over all frequencies ω, i.e. if one simply counts
[3] We may write δ_{q,0} = (2π)^d δ(q) / V.
Figure 5.11: Comparison of the static structure factor as determined by neutron scattering work of J. L. Yarnell et al., Phys. Rev. A 7, 2130 (1973) with molecular dynamics calculations by Verlet (1967) for a Lennard-Jones fluid.
scattered particles as a function of q but without any discrimination of their energies, then one measures the static structure factor S(q). Elastic scattering is determined by S(q, ω = 0), i.e. no energy transfer.
5.5.5 Correlation and response
Suppose an external potential v(x) is also present. Then
\[
P(x_1, \ldots, x_N) = \frac{1}{Q_N[v]} \cdot \frac{1}{N!}\; e^{-\beta W(x_1,\, \ldots,\, x_N)}\; e^{-\beta \sum_i v(x_i)}\ , \tag{5.178}
\]
where
\[
Q_N[v] = \frac{1}{N!} \int d^d x_1 \cdots \int d^d x_N\; e^{-\beta W(x_1,\, \ldots,\, x_N)}\; e^{-\beta \sum_i v(x_i)}\ . \tag{5.179}
\]
The Helmholtz free energy is then
\[
F = -\frac{1}{\beta} \ln \Big( \lambda_T^{-dN}\, Q_N[v] \Big)\ . \tag{5.180}
\]
Now consider the functional derivative
\[
\frac{\delta F}{\delta v(r)} = -\frac{1}{\beta} \cdot \frac{1}{Q_N} \cdot \frac{\delta Q_N}{\delta v(r)}\ . \tag{5.181}
\]
Using
\[
\sum_i v(x_i) = \int d^d r\; v(r) \sum_i \delta(r - x_i)\ , \tag{5.182}
\]
we obtain
\[
\frac{\delta F}{\delta v(r)} = \int d^d x_1 \cdots \int d^d x_N\; P(x_1, \ldots, x_N) \sum_i \delta(r - x_i) = n_1(r)\ , \tag{5.183}
\]
which is the local density at r.
Next, consider the response function,
\[
\begin{aligned}
\chi(r, r') &\equiv \frac{\delta n_1(r)}{\delta v(r')} = \frac{\delta^2 F[v]}{\delta v(r)\, \delta v(r')} & (5.184) \\
&= \frac{1}{\beta} \cdot \frac{1}{Q_N^2} \cdot \frac{\delta Q_N}{\delta v(r)} \cdot \frac{\delta Q_N}{\delta v(r')} - \frac{1}{\beta} \cdot \frac{1}{Q_N} \cdot \frac{\delta^2 Q_N}{\delta v(r)\, \delta v(r')} \\
&= \beta\, \Big[ n_1(r)\, n_1(r') - n_1(r)\, \delta(r - r') - n_2(r, r') \Big]\ . & (5.185)
\end{aligned}
\]
In an isotropic system, χ(r, r') = χ(r − r') is a function of the coordinate separation, and
\[
\begin{aligned}
-k_{\rm B} T\, \chi(r - r') &= -n^2 + n\, \delta(r - r') + n^2\, g\big( |r - r'| \big) \\
&= n^2\, h\big( |r - r'| \big) + n\, \delta(r - r')\ . \tag{5.186}
\end{aligned}
\]
Taking the Fourier transform,
\[
-k_{\rm B} T\, \hat\chi(q) = n + n^2\, \hat h(q) = n\, S(q)\ . \tag{5.187}
\]
We may also write
\[
\frac{\kappa_T}{\kappa_T^0} = 1 + n\, \hat h(0) = -\frac{k_{\rm B} T}{n}\, \hat\chi(0)\ , \tag{5.188}
\]
i.e. κ_T = −χ̂(0)/n².
What does this all mean? Suppose we have an isotropic system which is subjected to a weak, spatially inhomogeneous potential v(r). We expect that the density n(r) in the presence of the inhomogeneous potential will itself be inhomogeneous. The first corrections to the v = 0 value n = n_0 are linear in v, and given by
\[
\begin{aligned}
\delta n(r) &= \int d^d r'\; \chi(r, r')\, v(r') \\
&= -\beta n_0\, v(r) - \beta n_0^2 \int d^d r'\; h(r - r')\, v(r')\ . \tag{5.189}
\end{aligned}
\]
Note that if v(r) > 0, it becomes energetically more costly for a particle to be at r. Accordingly, the density response is negative, and proportional to the ratio v(r)/k_BT; this is the first term in the above equation. If there were no correlations between the particles, then h = 0 and this would be the entire story. However, the particles in general are correlated. Consider, for example, the case of hard spheres of diameter a, and let there be a repulsive potential at r = 0. This means that it is less likely for a particle to be centered anywhere within a distance a of the origin. But then it will be more likely to find a particle in the next shell of radial thickness a.
5.5.6 BBGKY hierarchy
The distribution functions satisfy a hierarchy of integro-differential equations known as the BBGKY hierarchy [4]. In homogeneous systems, we have
\[
g_k(r_1, \ldots, r_k) = \frac{N!}{(N-k)!} \cdot \frac{1}{n^k} \int d^d x_{k+1} \cdots \int d^d x_N\; P(r_1, \ldots, r_k,\, x_{k+1}, \ldots, x_N)\ , \tag{5.190}
\]
where
\[
P(x_1, \ldots, x_N) = \frac{1}{Q_N} \cdot \frac{1}{N!}\; e^{-\beta W(x_1,\, \ldots,\, x_N)}\ . \tag{5.191}
\]
Taking the gradient with respect to r_1, we have
\[
\frac{\partial}{\partial r_1}\, g_k(r_1, \ldots, r_k) = \frac{1}{Q_N} \cdot \frac{n^{-k}}{(N-k)!} \int d^d x_{k+1} \cdots \int d^d x_N\; e^{-\beta \sum_{k<i<j} u(x_{ij})}\; \frac{\partial}{\partial r_1} \Big[ e^{-\beta \sum_{i<j \le k} u(r_{ij})}\; e^{-\beta \sum_{i \le k < j} u(r_i - x_j)} \Big]\ , \tag{5.192}
\]
where Σ_{k<i<j} means to sum on indices i and j such that i < j and k < i, i.e.
\[
\begin{aligned}
\sum_{k<i<j} u(x_{ij}) &\equiv \sum_{i=k+1}^{N-1} \sum_{j=i+1}^{N} u\big( x_i - x_j \big) \\
\sum_{i<j \le k} u(r_{ij}) &\equiv \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} u\big( r_i - r_j \big) \\
\sum_{i \le k < j} u(r_i - x_j) &= \sum_{i=1}^{k} \sum_{j=k+1}^{N} u(r_i - x_j)\ .
\end{aligned}
\]
Now
\[
\frac{\partial}{\partial r_1} \Big[ e^{-\beta \sum_{i<j \le k} u(r_{ij})}\; e^{-\beta \sum_{i \le k < j} u(r_i - x_j)} \Big] = -\beta \Bigg\{ \sum_{1<j \le k} \frac{\partial u(r_1 - r_j)}{\partial r_1} + \sum_{k<j} \frac{\partial u(r_1 - x_j)}{\partial r_1} \Bigg\} \times \Big[ e^{-\beta \sum_{i<j \le k} u(r_{ij})}\; e^{-\beta \sum_{i \le k < j} u(r_i - x_j)} \Big]\ , \tag{5.193}
\]
[4] So named after Bogoliubov, Born, Green, Kirkwood, and Yvon.
hence
\[
\begin{aligned}
\frac{\partial}{\partial r_1}\, g_k(r_1, \ldots, r_k) &= -\beta \sum_{j=2}^{k} \frac{\partial u(r_1 - r_j)}{\partial r_1}\, g_k(r_1, \ldots, r_k) & (5.194) \\
&\quad - \beta\, (N-k)\, \frac{N!}{(N-k)!}\, \frac{1}{n^k} \int d^d x_{k+1}\; \frac{\partial u(r_1 - x_{k+1})}{\partial r_1} \int d^d x_{k+2} \cdots \int d^d x_N\; P(r_1, \ldots, r_k,\, x_{k+1}, \ldots, x_N) \\
&= -\beta \sum_{j=2}^{k} \frac{\partial u(r_1 - r_j)}{\partial r_1}\, g_k(r_1, \ldots, r_k) & (5.195) \\
&\quad - \beta\, n \int d^d x_{k+1}\; \frac{\partial u(r_1 - x_{k+1})}{\partial r_1}\, g_{k+1}(r_1, \ldots, r_k,\, x_{k+1})\ .
\end{aligned}
\]
Thus, we obtain the BBGKY hierarchy:
\[
\begin{aligned}
-k_{\rm B} T\, \frac{\partial}{\partial r_1}\, g_k(r_1, \ldots, r_k) &= \sum_{j=2}^{k} \frac{\partial u(r_1 - r_j)}{\partial r_1}\, g_k(r_1, \ldots, r_k) & (5.196) \\
&\quad + n \int d^d r'\; \frac{\partial u(r_1 - r')}{\partial r_1}\, g_{k+1}(r_1, \ldots, r_k,\, r')\ .
\end{aligned}
\]
The BBGKY hierarchy is an infinite tower of coupled integro-differential equations, relating g_k to g_{k+1} for all k. If we approximate g_k at some level k in terms of equal or lower order distributions, then we obtain a closed set of equations which in principle can be solved, at least numerically. For example, the Kirkwood approximation closes the hierarchy at order k = 2 by imposing the condition
\[
g_3(r_1, r_2, r_3) \approx g(r_1 - r_2)\; g(r_1 - r_3)\; g(r_2 - r_3)\ . \tag{5.197}
\]
This results in the single integro-differential equation
\[
-k_{\rm B} T\, \nabla g(r) = g(r)\, \nabla u(r) + n \int d^d r'\; g(r)\, g(r')\, g(r - r')\, \nabla u(r - r')\ . \tag{5.198}
\]
This is known as the Born-Green-Yvon (BGY) equation. In practice, the BGY equation,
which is solved numerically, gives adequate results only at low densities.
5.5.7 Ornstein-Zernike theory
The direct correlation function c(r) is defined by the equation
\[
h(r) = c(r) + n \int d^3 r'\; h(r - r')\, c(r')\ , \tag{5.199}
\]
where h(r) = g(r) − 1 and we assume an isotropic system. This is called the Ornstein-Zernike equation. The first term, c(r), accounts for local correlations, which are then propagated in the second term to account for long-ranged correlations.
The OZ equation is an integral equation, but it becomes a simple algebraic one upon Fourier transforming:
\[
\hat h(q) = \hat c(q) + n\, \hat h(q)\, \hat c(q)\ , \tag{5.200}
\]
the solution of which is
\[
\hat h(q) = \frac{\hat c(q)}{1 - n\, \hat c(q)}\ . \tag{5.201}
\]
The static structure factor is then
\[
S(q) = 1 + n\, \hat h(q) = \frac{1}{1 - n\, \hat c(q)}\ . \tag{5.202}
\]
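In Fourier space the OZ relation is pure algebra, so any closure that supplies ĉ(q) immediately yields ĥ(q) and S(q). A short sketch (the Gaussian ĉ(q) below is a hypothetical stand-in for a real closure) confirming that eqns. 5.201 and 5.202 are consistent with S(q) = 1 + n ĥ(q):

```python
import math

def h_from_c(c_hat_q, n):
    """h-hat(q) = c-hat(q) / (1 - n c-hat(q))   (eqn. 5.201)."""
    return c_hat_q / (1.0 - n * c_hat_q)

def S_from_c(c_hat_q, n):
    """S(q) = 1 / (1 - n c-hat(q))              (eqn. 5.202)."""
    return 1.0 / (1.0 - n * c_hat_q)

# hypothetical Gaussian direct correlation function, c-hat(q) = c0 exp(-(bq)^2)
n, c0, b = 0.8, -2.0, 1.0
for q in (0.0, 1.0, 3.0):
    c = c0 * math.exp(-(b*q)**2)
    # the two forms agree identically: S(q) = 1 + n h-hat(q)
    assert abs(S_from_c(c, n) - (1.0 + n * h_from_c(c, n))) < 1e-12
```

The identity holds exactly: 1 + nĉ/(1 − nĉ) = 1/(1 − nĉ).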
In the grand canonical ensemble, we can write
\[
\kappa_T = \frac{1 + n\, \hat h(0)}{n k_{\rm B} T} = \frac{1}{n k_{\rm B} T} \cdot \frac{1}{1 - n\, \hat c(0)} \qquad \Longrightarrow \qquad n\, \hat c(0) = 1 - \frac{\kappa_T^0}{\kappa_T}\ , \tag{5.203}
\]
where κ_T^0 = 1/nk_BT is the ideal gas isothermal compressibility.
At this point, we have merely substituted one unknown function, h(r), for another, namely
c(r). To close the system, we need to relate c(r) to h(r) again in some way. There are
various approximation schemes which do just this.
5.5.8 Percus-Yevick equation
In the Percus-Yevick approximation, we take
\[
c(r) = \big[ 1 - e^{\beta u(r)} \big]\, g(r)\ . \tag{5.204}
\]
Note that c(r) vanishes whenever the potential u(r) itself vanishes. This results in the following integral equation for the pair distribution function g(r):
\[
g(r) = e^{-\beta u(r)} + n\, e^{-\beta u(r)} \int d^3 r'\; \big[ g(r - r') - 1 \big] \big[ 1 - e^{\beta u(r')} \big]\, g(r')\ . \tag{5.205}
\]
This is the Percus-Yevick equation. Remarkably, the Percus-Yevick (PY) equation can be solved analytically for the case of hard spheres, where u(r) = ∞ for r ≤ a and u(r) = 0 for r > a, where a is the hard sphere diameter. Define the function y(r) = e^{βu(r)} g(r), in which case
\[
c(r) = y(r)\, f(r) = \begin{cases} -y(r)\ , & r \le a \\ 0\ , & r > a\ . \end{cases} \tag{5.206}
\]
Here, f(r) = e^{−βu(r)} − 1 is the Mayer function. We remark that the definition of y(r) may cause some concern for the hard sphere system, because of the e^{βu(r)} term, which diverges severely for r ≤ a. However, g(r) vanishes in this limit, and their product y(r) is in fact finite! The PY equation may then be written for the function y(r) as
nite! The PY equation may then be written for the function y(r) as
y(r) = 1 +n
_
r

<a
d
3
r

y(r

) n
_
r

<a
|rr

|>a
d
3
r

y(r

) y(r r

) . (5.207)
This has been solved using Laplace transform methods by M. S. Wertheim, J. Math. Phys. 5, 643 (1964). The final result for c(r) is
\[
c(r) = -\Big[ \lambda_1 + 6\eta\, \lambda_2 \Big( \frac{r}{a} \Big) + \frac{1}{2}\, \eta\, \lambda_1 \Big( \frac{r}{a} \Big)^{3} \Big]\, \Theta(a - r)\ , \tag{5.208}
\]
where η = (π/6) a³n is the packing fraction and
\[
\lambda_1 = \frac{(1 + 2\eta)^2}{(1 - \eta)^4} \qquad , \qquad \lambda_2 = -\frac{(1 + \frac{1}{2}\eta)^2}{(1 - \eta)^4}\ . \tag{5.209}
\]
This leads to the equation of state
\[
p = n k_{\rm B} T\; \frac{1 + \eta + \eta^2}{(1 - \eta)^3}\ . \tag{5.210}
\]
This gets B_2 and B_3 exactly right. The accuracy of the PY approximation for higher order virial coefficients is shown in table 5.1.
To obtain the equation of state from eqn. 5.208, we invoke the compressibility equation,
\[
n k_{\rm B} T\, \kappa_T = k_{\rm B} T \bigg( \frac{\partial n}{\partial p} \bigg)_{T} = \frac{1}{1 - n\, \hat c(0)}\ . \tag{5.211}
\]
We therefore need
\[
\begin{aligned}
\hat c(0) &= \int d^3 r\; c(r) & (5.212) \\
&= -4\pi a^3 \int_0^1 dx\; x^2 \Big[ \lambda_1 + 6\eta\, \lambda_2\, x + \frac{1}{2}\, \eta\, \lambda_1\, x^3 \Big] \\
&= -4\pi a^3 \Big[ \frac{1}{3}\, \lambda_1 + \frac{3}{2}\, \eta\, \lambda_2 + \frac{1}{12}\, \eta\, \lambda_1 \Big]\ .
\end{aligned}
\]
With η = (π/6) a³n and using the definitions of λ_{1,2} in eqn. 5.209, one finds
\[
1 - n\, \hat c(0) = \frac{1 + 4\eta + 4\eta^2}{(1 - \eta)^4}\ . \tag{5.213}
\]
We then have, from the compressibility equation,
\[
\frac{\partial p}{\partial \eta} = \frac{6\, k_{\rm B} T}{\pi a^3} \cdot \frac{1 + 4\eta + 4\eta^2}{(1 - \eta)^4}\ . \tag{5.214}
\]
Integrating, we obtain p(η) up to a constant. The constant is set so that p = 0 when n = 0. The result is eqn. 5.210.
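The integration step can be checked numerically: differentiating the claimed result, eqn. 5.210, must reproduce the right-hand side of eqn. 5.214. A sketch in units k_BT = a = 1 (the particular η values tested are arbitrary choices):

```python
import math

def p_PY(eta):
    """PY (compressibility-route) equation of state, eqn. 5.210,
    with n = 6*eta/(pi*a^3), in units k_B T = a = 1."""
    return (6.0/math.pi) * eta * (1.0 + eta + eta**2) / (1.0 - eta)**3

def dp_deta(eta):
    """Right-hand side of eqn. 5.214 in the same units."""
    return (6.0/math.pi) * (1.0 + 4.0*eta + 4.0*eta**2) / (1.0 - eta)**4

for eta in (0.1, 0.3, 0.45):
    d = 1e-6
    fd = (p_PY(eta + d) - p_PY(eta - d)) / (2*d)   # central difference dp/d(eta)
    assert abs(fd - dp_deta(eta)) < 1e-4 * dp_deta(eta)
```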
Another commonly used scheme is the hypernetted chains (HNC) approximation, for which
\[
c(r) = -\beta u(r) + h(r) - \ln\big[ 1 + h(r) \big]\ . \tag{5.215}
\]
The rationale behind the HNC and other such approximation schemes is rooted in diagrammatic approaches, which are extensions of the Mayer cluster expansion to the computation of correlation functions. For details and references to their application in the literature, see Hansen and McDonald (1990) and Reichl (1998).
quantity      exact     PY       HNC
B_4 / B_2^3   0.28695   0.2969   0.2092
B_5 / B_2^4   0.1103    0.1211   0.0493
B_6 / B_2^5   0.0386    0.0281   0.0449
B_7 / B_2^6   0.0138    0.0156

Table 5.1: Comparison of exact (Monte Carlo) results to those of the Percus-Yevick (PY) and hypernetted chains (HNC) approximations for hard spheres in three dimensions. Sources: Hansen and McDonald (1990) and Reichl (1998).
5.5.9 Long wavelength behavior and the Ornstein-Zernike approximation
Let's expand the direct correlation function ĉ(q) in powers of the wavevector q, viz.
\[
\hat c(q) = \hat c(0) + c_2\, q^2 + c_4\, q^4 + \ldots\ . \tag{5.216}
\]
Here we have assumed spatial isotropy. Then
\[
1 - n\, \hat c(q) = \frac{1}{S(q)} = 1 - n\, \hat c(0) - n c_2\, q^2 + \ldots \equiv \xi^{-2} R^2 + q^2 R^2 + \mathcal O(q^4)\ , \tag{5.217}
\]
where
\[
R^2 = -n c_2 = \frac{2\pi n}{3} \int_0^\infty dr\, r^4\, c(r) \tag{5.218}
\]
and
\[
\xi^{-2} = \frac{1 - n\, \hat c(0)}{R^2} = \frac{1 - 4\pi n \int_0^\infty dr\, r^2\, c(r)}{\frac{2\pi n}{3} \int_0^\infty dr\, r^4\, c(r)}\ . \tag{5.219}
\]
The quantity R(T) tells us something about the effective range of the interactions, while ξ(T) is the correlation length. As we approach a critical point, the correlation length diverges as a power law:
\[
\xi(T) \approx A\, |T - T_c|^{-\nu}\ . \tag{5.220}
\]
The susceptibility is given by
\[
\hat\chi(q) = -\beta n\, S(q) = -\frac{\beta n}{R^2} \cdot \frac{1}{\xi^{-2} + q^2 + \mathcal O(q^4)}\ . \tag{5.221}
\]
In the Ornstein-Zernike approximation, one drops the O(q⁴) terms in the denominator and retains only the long wavelength behavior of the direct correlation function. Thus,
\[
\hat\chi^{\rm OZ}(q) = -\frac{\beta n}{R^2} \cdot \frac{1}{\xi^{-2} + q^2}\ . \tag{5.222}
\]
We now apply the inverse Fourier transform back to real space to obtain χ^OZ(r). In d = 1 dimension the result can be obtained exactly:
\[
\chi^{\rm OZ}_{d=1}(x) = -\frac{n}{k_{\rm B} T R^2} \int_{-\infty}^\infty \frac{dq}{2\pi}\; \frac{e^{iqx}}{\xi^{-2} + q^2} = -\frac{n\, \xi}{2 k_{\rm B} T R^2}\; e^{-|x|/\xi}\ . \tag{5.223}
\]
In higher dimensions d > 1 we can obtain the result asymptotically in two limits:
• Take r → ∞ with ξ fixed. Then
\[
\chi^{\rm OZ}_{d}(r) \simeq -C_d\, n \cdot \frac{\xi^{(3-d)/2}}{k_{\rm B} T\, R^2} \cdot \frac{e^{-r/\xi}}{r^{(d-1)/2}} \cdot \bigg\{ 1 + \mathcal O\Big( \frac{d-3}{r/\xi} \Big) \bigg\}\ , \tag{5.224}
\]
where the C_d are dimensionless constants.
• Take ξ → ∞ with r fixed; this is the limit T → T_c at fixed r. In dimensions d > 2 we obtain
\[
\chi^{\rm OZ}_{d}(r) \simeq -\frac{C_d'\, n}{k_{\rm B} T R^2} \cdot \frac{e^{-r/\xi}}{r^{d-2}} \cdot \bigg\{ 1 + \mathcal O\Big( \frac{d-3}{r/\xi} \Big) \bigg\}\ . \tag{5.225}
\]
In d = 2 dimensions we obtain
\[
\chi^{\rm OZ}_{d=2}(r) \simeq -\frac{C_2'\, n}{k_{\rm B} T R^2} \cdot \ln\Big( \frac{\xi}{r} \Big)\, e^{-r/\xi} \cdot \bigg\{ 1 + \mathcal O\Big( \frac{1}{\ln(\xi/r)} \Big) \bigg\}\ , \tag{5.226}
\]
where the C_d' are dimensionless constants.
At criticality, ξ → ∞, and clearly our results in d = 1 and d = 2 dimensions are nonsensical, as they are divergent. To correct this behavior, M. E. Fisher in 1963 suggested that the OZ correlation functions in the r → ∞ limit be replaced by
\[
\chi(r) \simeq -C'\, n \cdot \frac{\xi^{\eta}}{k_{\rm B} T R^2} \cdot \frac{e^{-r/\xi}}{r^{d-2+\eta}}\ , \tag{5.227}
\]
a result known as anomalous scaling. Here, η is the anomalous scaling exponent.
Recall that the isothermal compressibility is given by κ_T = −χ̂(0)/n². Near criticality, the integral in χ̂(0) is dominated by the large-r part, since ξ → ∞. Thus, using Fisher's anomalous scaling,
\[
\kappa_T = -\frac{\hat\chi(0)}{n^2} = -\frac{1}{n^2} \int d^d r\; \chi(r) \approx A \int d^d r\; \frac{e^{-r/\xi}}{r^{d-2+\eta}} = B\, \xi^{2-\eta} \approx C\, \big| T - T_c \big|^{-\nu (2-\eta)}\ , \tag{5.228}
\]
where A, B, and C are temperature-dependent constants which are nonsingular at T = T_c. Thus, since κ_T ∝ |T − T_c|^{−γ}, we conclude
\[
\gamma = \nu\, (2 - \eta)\ , \tag{5.229}
\]
a scaling relation usually known as Fisher's scaling law.
5.6 Coulomb Systems : Plasmas and the Electron Gas
5.6.1 Electrostatic potential
Coulomb systems are particularly interesting in statistical mechanics because of their long-ranged forces, which result in the phenomenon of screening. Long-ranged forces wreak havoc with the Mayer cluster expansion, since the Mayer function is no longer integrable. Thus, the virial expansion fails, and new techniques need to be applied to reveal the physics of plasmas.
The potential energy of a Coulomb system is
\[
U = \frac{1}{2} \int d^d r \int d^d r'\; \rho(r)\, u(r - r')\, \rho(r')\ , \tag{5.230}
\]
where ρ(r) is the charge density and u(r), which has the dimensions of (energy)/(charge)², satisfies
\[
\nabla^2 u(r - r') = -4\pi\, \delta(r - r')\ . \tag{5.231}
\]
Thus,
\[
u(r - r') = \begin{cases} -2\, |x - x'|\ , & d = 1 \\ -2 \ln |r - r'|\ , & d = 2 \\ |r - r'|^{-1}\ , & d = 3\ . \end{cases} \tag{5.232}
\]
For discrete particles, the charge density ρ(r) is given by
\[
\rho(r) = \sum_i q_i\, \delta(r - x_i)\ , \tag{5.233}
\]
where q_i is the charge of the i-th particle. We will assume two types of charges: q = ±e, with e > 0. The electric potential is
\[
\begin{aligned}
\phi(r) &= \int d^d r'\; u(r - r')\, \rho(r') & (5.234) \\
&= \sum_i q_i\, u(r - x_i)\ . & (5.235)
\end{aligned}
\]
This satisfies the Poisson equation,
\[
\nabla^2 \phi(r) = -4\pi \rho(r)\ . \tag{5.236}
\]
The total potential energy can be written as
\[
\begin{aligned}
U &= \frac{1}{2} \int d^d r\; \phi(r)\, \rho(r) & (5.237) \\
&= \frac{1}{2} \sum_i q_i\, \phi(x_i)\ , & (5.238)
\end{aligned}
\]
5.6.2 Debye-Hückel theory
We now write the grand partition function:
\[
\begin{aligned}
\Xi(T, V, \mu_+, \mu_-) &= \sum_{N_+ = 0}^\infty \sum_{N_- = 0}^\infty \frac{1}{N_+!}\; e^{\beta \mu_+ N_+}\, \lambda_+^{-d N_+} \cdot \frac{1}{N_-!}\; e^{\beta \mu_- N_-}\, \lambda_-^{-d N_-} \\
&\qquad \times \int d^d r_1 \cdots \int d^d r_{N_+ + N_-}\; e^{-\beta U(r_1,\, \ldots,\, r_{N_+ + N_-})}\ . \tag{5.239}
\end{aligned}
\]
We now adopt a mean field approach, known as Debye-Hückel theory, writing
\[
\begin{aligned}
\rho(r) &= \rho^{\rm av}(r) + \delta\rho(r) & (5.240) \\
\phi(r) &= \phi^{\rm av}(r) + \delta\phi(r)\ . & (5.241)
\end{aligned}
\]
We then have
\[
U = \frac{1}{2} \int d^d r\; \big[ \rho^{\rm av}(r) + \delta\rho(r) \big] \big[ \phi^{\rm av}(r) + \delta\phi(r) \big] = U_0 + \int d^d r\; \phi^{\rm av}(r)\, \delta\rho(r) + \frac{1}{2} \int d^d r\; \delta\rho(r)\, \delta\phi(r)\ , \tag{5.242}
\]
where U_0 ≡ ½ ∫ d^d r ρ^av(r) φ^av(r), and the final fluctuation term is ignored.
We apply the mean field approximation in each region of space, which leads to
\[
\begin{aligned}
\Omega(T, V, \mu_+, \mu_-) &= -k_{\rm B} T\, \lambda_+^{-d}\, z_+ \int d^d r\; \exp\bigg( -\frac{e\, \phi^{\rm av}(r)}{k_{\rm B} T} \bigg) \\
&\quad - k_{\rm B} T\, \lambda_-^{-d}\, z_- \int d^d r\; \exp\bigg( +\frac{e\, \phi^{\rm av}(r)}{k_{\rm B} T} \bigg)\ , \tag{5.243}
\end{aligned}
\]
where
\[
\lambda_\pm = \bigg( \frac{2\pi \hbar^2}{m_\pm\, k_{\rm B} T} \bigg)^{1/2} \qquad , \qquad z_\pm = \exp\bigg( \frac{\mu_\pm}{k_{\rm B} T} \bigg)\ . \tag{5.244}
\]
The charge density is therefore
\[
\rho(r) = e\, \lambda_+^{-d}\, z_+ \exp\bigg( -\frac{e\, \phi(r)}{k_{\rm B} T} \bigg) - e\, \lambda_-^{-d}\, z_- \exp\bigg( +\frac{e\, \phi(r)}{k_{\rm B} T} \bigg)\ , \tag{5.245}
\]
where we have now dropped the superscript on φ^av(r) for convenience. At r → ∞, we assume charge neutrality and φ(∞) = 0. Thus
\[
n_+(\infty) = n_-(\infty) = \lambda_+^{-d}\, z_+ = \lambda_-^{-d}\, z_- = \frac{1}{2}\, n_\infty\ , \tag{5.246}
\]
where n_∞ is the total ionic density at infinity. Therefore,
\[
\rho(r) = -e\, n_\infty \sinh\bigg( \frac{e\, \phi(r)}{k_{\rm B} T} \bigg)\ . \tag{5.247}
\]
We now invoke Poisson's equation,
\[
\nabla^2 \phi = 4\pi e\, n_\infty \sinh(\beta e \phi) - 4\pi \rho_{\rm ext}\ , \tag{5.248}
\]
where ρ_ext is an externally imposed charge density.
If eφ ≪ k_BT, we can expand the sinh function and obtain
\[
\nabla^2 \phi = \kappa_{\rm D}^2\, \phi - 4\pi \rho_{\rm ext}\ , \tag{5.249}
\]
where
\[
\kappa_{\rm D} = \bigg( \frac{4\pi\, n_\infty\, e^2}{k_{\rm B} T} \bigg)^{1/2} \qquad , \qquad \lambda_{\rm D} = \bigg( \frac{k_{\rm B} T}{4\pi\, n_\infty\, e^2} \bigg)^{1/2}\ . \tag{5.250}
\]
The quantity λ_D is known as the Debye screening length. Consider, for example, a point charge Q located at the origin. We then solve Poisson's equation in the weak field limit,
\[
\nabla^2 \phi = \kappa_{\rm D}^2\, \phi - 4\pi Q\, \delta(r)\ . \tag{5.251}
\]
Fourier transforming, we obtain
\[
-q^2\, \hat\phi(q) = \kappa_{\rm D}^2\, \hat\phi(q) - 4\pi Q \qquad \Longrightarrow \qquad \hat\phi(q) = \frac{4\pi Q}{q^2 + \kappa_{\rm D}^2}\ . \tag{5.252}
\]
Transforming back to real space, we obtain, in three dimensions, the Yukawa potential,
\[
\phi(r) = \int \frac{d^3 q}{(2\pi)^3}\; \frac{4\pi Q\, e^{i q \cdot r}}{q^2 + \kappa_{\rm D}^2} = \frac{Q}{r}\; e^{-\kappa_{\rm D} r}\ . \tag{5.253}
\]
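Away from the origin, the Yukawa form of eqn. 5.253 must satisfy the homogeneous screened Poisson equation ∇²φ = κ_D²φ, where for an isotropic function ∇²φ = r⁻¹ d²(rφ)/dr². A quick finite-difference check (κ_D = 2 and the sample radii are arbitrary choices, not from the text):

```python
import math

def phi(r, Q=1.0, kappa=2.0):
    """Screened (Yukawa) potential phi(r) = (Q/r) exp(-kappa_D r), eqn. 5.253."""
    return Q * math.exp(-kappa * r) / r

kappa, h = 2.0, 1e-4
for r in (0.5, 1.0, 2.5):
    u = lambda x: x * phi(x, kappa=kappa)            # u(r) = r phi(r)
    # radial Laplacian (1/r) d^2(r phi)/dr^2 by central differences
    lap = (u(r + h) - 2*u(r) + u(r - h)) / (h*h) / r
    # must equal kappa_D^2 phi away from the delta function at the origin
    assert abs(lap - kappa**2 * phi(r, kappa=kappa)) < 1e-4
```

The check works because rφ = Q e^{−κ_D r}, whose second derivative is exactly κ_D² rφ.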
This solution must break down sufficiently close to r = 0, since the assumption eφ(r) ≪ k_BT is no longer valid there. However, for larger r, the Yukawa form is increasingly accurate.
For another example, consider an electrolyte held between two conducting plates, one at potential φ(x = 0) = 0 and the other at potential φ(x = L) = V, where x is normal to the plane of the plates. Again assuming a weak field eφ ≪ k_BT, we solve ∇²φ = κ_D²φ and obtain
\[
\phi(x) = A\, e^{\kappa_{\rm D} x} + B\, e^{-\kappa_{\rm D} x}\ . \tag{5.254}
\]
We fix the constants A and B by invoking the boundary conditions, which results in
\[
\phi(x) = V\; \frac{\sinh(\kappa_{\rm D}\, x)}{\sinh(\kappa_{\rm D}\, L)}\ . \tag{5.255}
\]
Debye-Hückel theory is valid provided n_∞ λ_D³ ≫ 1, so that the statistical assumption of many charges in a screening volume is justified.
5.6.3 The electron gas : Thomas-Fermi screening
Assuming k_BT ≪ ε_F, thermal fluctuations are unimportant and we may assume T = 0. In the same spirit as the Debye-Hückel approach, we assume a slowly varying mean electrostatic potential φ(r). Locally, we can write
\[
\varepsilon_{\rm F} = \frac{\hbar^2 k_{\rm F}^2}{2m} - e\, \phi(r)\ . \tag{5.256}
\]
Thus, the Fermi wavevector k_F is spatially varying, according to the relation
\[
k_{\rm F}(r) = \bigg[ \frac{2m}{\hbar^2} \Big( \varepsilon_{\rm F} + e\, \phi(r) \Big) \bigg]^{1/2}\ . \tag{5.257}
\]
The local electron number density is
\[
n(r) = \frac{k_{\rm F}^3(r)}{3\pi^2} = n_\infty \bigg( 1 + \frac{e\, \phi(r)}{\varepsilon_{\rm F}} \bigg)^{3/2}\ . \tag{5.258}
\]
In the presence of a uniform compensating positive background charge ρ_+ = e n_∞, Poisson's equation takes the form
\[
\nabla^2 \phi = 4\pi e\, n_\infty \Bigg[ \bigg( 1 + \frac{e\, \phi(r)}{\varepsilon_{\rm F}} \bigg)^{3/2} - 1 \Bigg] - 4\pi \rho_{\rm ext}(r)\ . \tag{5.259}
\]
If eφ ≪ ε_F, we may expand in powers of the ratio, obtaining
\[
\nabla^2 \phi = \frac{6\pi\, n_\infty\, e^2}{\varepsilon_{\rm F}}\, \phi - 4\pi \rho_{\rm ext} \equiv \kappa_{\rm TF}^2\, \phi - 4\pi \rho_{\rm ext}(r)\ . \tag{5.260}
\]
Here, κ_TF is the Thomas-Fermi wavevector,
\[
\kappa_{\rm TF} = \bigg( \frac{6\pi\, n_\infty\, e^2}{\varepsilon_{\rm F}} \bigg)^{1/2}\ . \tag{5.261}
\]
Thomas-Fermi theory is valid provided n_∞ λ_TF³ ≫ 1, where λ_TF = κ_TF^{−1}, so that the statistical assumption of many electrons in a screening volume is justified.
One important application of Thomas-Fermi screening is to the theory of metals. In a
metal, the outer, valence electrons of each atom are stripped away from the positively
charged ionic core and enter into itinerant, plane-wave-like states. These states disperse
with some (k) function (that is periodic in the Brillouin zone, i.e. under k k+G, where
G is a reciprocal lattice vector), and at T = 0 this energy band is lled up to the Fermi level

F
, as Fermi statistics dictates. (In some cases, there may be several bands at the Fermi
level, as we saw in the case of yttrium.) The set of ionic cores then acts as a neutralizing
positive background. In a perfect crystal, the ionic cores are distributed periodically, and
the positive background is approximately uniform. A charged impurity in a metal, such as
a zinc atom in a copper matrix, has a dierent nuclear charge and a dierent valency than
5.6. COULOMB SYSTEMS : PLASMAS AND THE ELECTRON GAS 291
the host. The charge of the ionic core, when valence electrons are stripped away, differs
from that of the host ions, and therefore the impurity acts as a local charge impurity. For
example, copper has an electronic configuration of [Ar] 3d¹⁰ 4s¹. The 4s electron forms an
energy band which contains the Fermi surface. Zinc has a configuration of [Ar] 3d¹⁰ 4s², and
in a Cu matrix the Zn gives up its two 4s electrons into the 4s conduction band, leaving
behind a charge +2 ionic core. The Cu cores have charge +1 since each copper atom
contributed only one 4s electron to the conduction band. The conduction band electrons
neutralize the uniform positive background of the Cu ion cores. What is left is an extra
Q = +e nuclear charge at the Zn site, and one extra 4s conduction band electron. The
Q = +e impurity is, however, screened by the electrons, and at distances greater than an
atomic radius the potential that a given electron sees due to the Zn core is of the Yukawa
form,

    φ(r) = (Q/r) e^{−κ_TF r} .   (5.262)
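A quick numerical sanity check (a sketch, with arbitrary illustrative values of Q and κ_TF): the Yukawa form above satisfies the linearized screening equation ∇²φ = κ_TF² φ away from the origin, which can be verified with a finite-difference radial Laplacian, ∇²φ = (rφ)″/r.

```python
import numpy as np

# Verify that phi(r) = (Q/r) exp(-kappa*r) obeys laplacian(phi) = kappa^2 phi
# for r > 0.  For a spherically symmetric function, laplacian(phi) = (r*phi)''/r.
kappa, Q = 1.7, 1.0                      # illustrative values, arbitrary units
r = np.linspace(0.5, 10.0, 2001)
h = r[1] - r[0]
u = Q * np.exp(-kappa * r)               # u(r) = r * phi(r)
u_dd = (u[2:] - 2.0*u[1:-1] + u[:-2]) / h**2   # centered second difference
lap_phi = u_dd / r[1:-1]
phi = Q * np.exp(-kappa * r[1:-1]) / r[1:-1]
assert np.allclose(lap_phi, kappa**2 * phi, rtol=1e-4)
```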
We should take care, however, that the dispersion ε(k) for the conduction band in a metal
is not necessarily of the free electron form ε(k) = ℏ²k²/2m. To linear order in the potential,
however, the change in the local electronic density is

    δn(r) = eφ(r) g(ε_F) ,   (5.263)

where g(ε_F) is the density of states at the Fermi energy. Thus, in a metal, we should write

    ∇²φ = 4πe δn = 4πe² g(ε_F) φ ≡ κ²_TF φ ,   (5.264)

where

    κ_TF = √( 4πe² g(ε_F) ) .   (5.265)
The value of g(ε_F) will depend on the form of the dispersion. For ballistic bands with an
effective mass m*, the formula in eqn. 5.260 still applies.

The Thomas-Fermi atom
Consider an ion formed of a nucleus of charge +Ze and an electron cloud of charge −Ne.
The net ionic charge is then (Z − N)e. Since we will be interested in atomic scales, we can
no longer assume a weak field limit and we must retain the full nonlinear screening theory,
for which

    ∇²φ(r) = 4πe · (2m)^{3/2}/(3π²ℏ³) · ( ε_F + eφ(r) )^{3/2} − 4πZe δ(r) .   (5.266)
We assume an isotropic solution. It is then convenient to define

    ε_F + eφ(r) = (Ze²/r) χ(r/r₀) ,   (5.267)

where r₀ is yet to be determined. As r → 0 we expect χ → 1 since the nuclear charge is
then unscreened. We then have

    ∇² [ (Ze²/r) χ(r/r₀) ] = (1/r₀²) (Ze²/r) χ″(r/r₀) ,   (5.268)
292 CHAPTER 5. INTERACTING SYSTEMS
Figure 5.12: The Thomas-Fermi atom consists of a nuclear charge +Ze surrounded by N
electrons distributed in a cloud. The electric potential φ(r) felt by any electron at position
r is screened by the electrons within this radius, resulting in a self-consistent potential
satisfying ε_F + eφ(r) = (Ze²/r) χ(r/r₀).
thus we arrive at the Thomas-Fermi equation,

    χ″(t) = t^{−1/2} χ^{3/2}(t) ,   (5.269)

with r = t r₀, provided we take

    r₀ = (ℏ²/2me²) ( 3π/4√Z )^{2/3} = 0.885 Z^{−1/3} a_B ,   (5.270)

where a_B = ℏ²/me² = 0.529 Å is the Bohr radius. The TF equation is subject to the following
boundary conditions:

• At short distances, the nucleus is unscreened, i.e.

    χ(0) = 1 .   (5.271)
• For positive ions, with N < Z, there is perfect screening at the ionic boundary
  R = t* r₀, where χ(t*) = 0. This requires

    E = −∇φ = [ (Ze/R²) χ(R/r₀) − (Ze/R r₀) χ′(R/r₀) ] r̂ = ( (Z − N) e/R² ) r̂ .   (5.272)

  Since χ(t*) = 0, this requires

    t* χ′(t*) = −( 1 − N/Z ) .   (5.273)
For an atom, with N = Z, the asymptotic solution to the TF equation is a power law, and
by inspection is found to be χ(t) ∼ C t⁻³, where C is a constant. The constant follows from
the TF equation, which yields 12 C = C^{3/2}, hence C = 144. Thus, a neutral TF atom has
a density with a power law tail, n(r) ∝ (χ/r)^{3/2} ∼ r⁻⁶. TF ions with N > Z are unstable.
Chapter 6
Mean Field Theory
6.1 References
• M. Kardar, Statistical Physics of Particles (Cambridge, 2007)
  A superb modern text, with many insightful presentations of key concepts.

• M. Plischke and B. Bergersen, Equilibrium Statistical Physics (3rd edition, World
  Scientific, 2006)
  An excellent graduate level text. Less insightful than Kardar but still a good modern
  treatment of the subject. Good discussion of mean field theory.

• G. Parisi, Statistical Field Theory (Addison-Wesley, 1988)
  An advanced text focusing on field theoretic approaches, covering mean field and
  Landau-Ginzburg theories before moving on to renormalization group and beyond.

• J. P. Sethna, Entropy, Order Parameters, and Complexity (Oxford, 2006)
  An excellent introductory text with a very modern set of topics and exercises.
296 CHAPTER 6. MEAN FIELD THEORY
6.2 The Lattice Gas and the Ising Model
The usual description of a fluid follows from a continuum Hamiltonian of the form

    Ĥ(p, x) = Σ_{i=1}^{N} p_i²/2m + Σ_{i<j} u(x_i − x_j) .   (6.1)
The potential u(r) is typically central, depending only on the magnitude |r|, and short-
ranged. Now consider a discretized version of the fluid, in which we divide up space into cells
(cubes, say), each of which can accommodate at most one fluid particle (due to excluded
volume effects). That is, each cube has a volume on the order of a³, where a is the diameter
of the fluid particles. In a given cube i we set the occupancy n_i = 1 if a fluid particle is
present and n_i = 0 if there is no fluid particle present. We then have that the potential
energy is

    U = Σ_{i<j} u(x_i − x_j) = ½ Σ_{R≠R′} V_{RR′} n_R n_{R′} ,   (6.2)
where V_{RR′} ≈ v(R − R′), where R_k is the position at the center of cube k. The grand
partition function is then approximated as

    Ξ(T, V, μ) ≈ Σ_{n_R} ( Π_R ξ^{n_R} ) exp( −½ β Σ_{R≠R′} V_{RR′} n_R n_{R′} ) ,   (6.3)

where

    ξ = e^{βμ} λ_T^{−d} a^d ,   (6.4)

where a is the side length of each cube (chosen to be on the order of the hard sphere
diameter). The λ_T^{−d} factor arises from the integration over the momenta. Note Σ_R n_R = N
is the total number of fluid particles, so

    Π_R ξ^{n_R} = ξ^N = e^{βμN} λ_T^{−Nd} a^{Nd} .   (6.5)
Thus, we can write a lattice Hamiltonian,

    Ĥ = ½ Σ_{R≠R′} V_{RR′} n_R n_{R′} − k_B T ln ξ Σ_R n_R
      = −½ Σ_{R≠R′} J_{RR′} σ_R σ_{R′} − H Σ_R σ_R + E₀ ,   (6.6)

where σ_R ≡ 2n_R − 1 is a spin variable taking the possible values −1, +1, and

    J_{RR′} = −¼ V_{RR′}  ,  H = ½ k_B T ln ξ − ¼ Σ′_{R′} V_{RR′} ,   (6.7)

where the prime on the sum indicates that R′ = R is to be excluded. For the Lennard-Jones
system, V_{RR′} = v(R − R′) < 0 is due to the attractive tail of the potential, hence J_{RR′}
is positive, which prefers alignment of the spins σ_R and σ_{R′}. This interaction is therefore
ferromagnetic. The spin Hamiltonian in eqn. 6.6 is known as the Ising model.
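The mapping of eqns. 6.6 and 6.7 can be checked configuration by configuration. Below is a brute-force sketch on a small ring with nearest-neighbor attraction only; the values of v, the ring size, and the combination k_B T ln ξ are arbitrary illustrative choices, and the constant E₀ is worked out from the same change of variables n = (1 + σ)/2.

```python
import itertools

Ns, v, kT_ln_xi = 6, -1.3, 0.7          # sites, attraction v < 0, k_B T ln(xi)
V = [[0.0]*Ns for _ in range(Ns)]
for i in range(Ns):                      # nearest-neighbor ring
    V[i][(i+1) % Ns] = V[(i+1) % Ns][i] = v

def H_gas(n):                            # lattice gas Hamiltonian, eqn 6.6 (first form)
    return 0.5*sum(V[i][j]*n[i]*n[j] for i in range(Ns) for j in range(Ns)
                   if i != j) - kT_ln_xi*sum(n)

J = [[-0.25*V[i][j] for j in range(Ns)] for i in range(Ns)]       # eqn 6.7
H_field = 0.5*kT_ln_xi - 0.25*sum(V[0][j] for j in range(1, Ns))  # same on every site here
# E0 follows from substituting n = (1 + sigma)/2 into H_gas:
E0 = 0.125*sum(V[i][j] for i in range(Ns) for j in range(Ns) if i != j) \
     - 0.5*Ns*kT_ln_xi

def H_ising(s):                          # Ising Hamiltonian, eqn 6.6 (second form)
    return (-0.5*sum(J[i][j]*s[i]*s[j] for i in range(Ns) for j in range(Ns)
                     if i != j) - H_field*sum(s) + E0)

for n in itertools.product((0, 1), repeat=Ns):   # all 2^Ns occupancy configurations
    s = tuple(2*ni - 1 for ni in n)
    assert abs(H_gas(n) - H_ising(s)) < 1e-12
```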
6.2. THE LATTICE GAS AND THE ISING MODEL 297
Figure 6.1: The lattice gas model. An occupied cell corresponds to n = 1 (σ = +1), and a
vacant cell to n = 0 (σ = −1).
6.2.1 Fluid and magnetic phase diagrams

The physics of the liquid-gas transition in fact has a great deal in common with that of
the transition between a magnetized and unmagnetized state of a magnetic system. The
correspondences are¹

    p ⟷ H  ,  v ⟷ −m ,

where m is the magnetization density, defined here to be the total magnetization M divided
by the number of lattice sites 𝒩:²

    m = M/𝒩 = (1/𝒩) Σ_R ⟨σ_R⟩ .   (6.8)
Sketches of the phase diagrams are reproduced in fig. 6.2. Of particular interest is the
critical point, which occurs at (T_c, p_c) in the fluid system and (T_c, H_c) in the magnetic
system, with H_c = 0 by symmetry.

In the fluid, the coexistence curve in the (p, T) plane separates high density (liquid) and
low density (vapor) phases. The specific volume v (or the density n = v⁻¹) jumps discon-
tinuously across the coexistence curve. In the magnet, the coexistence curve in the (H, T)
plane separates positive magnetization and negative magnetization phases. The magnetiza-
tion density m jumps discontinuously across the coexistence curve. For T > T_c, the latter
system is a paramagnet, in which the magnetization varies smoothly as a function of H.
¹ One could equally well identify the second correspondence as n ⟷ m between density (rather than
specific volume) and magnetization. One might object that H is more properly analogous to μ. However,
since μ = μ(p, T) it can equally be regarded as analogous to p. Note also that p = z k_B T λ_T^{−d} for the
ideal gas, in which case ξ = z (a/λ_T)^d is proportional to p.
² Note the distinction between the number of lattice sites 𝒩 and the number of occupied cells N. According
to our definitions, N = ½(M + 𝒩).
Figure 6.2: Comparison of the liquid-gas phase diagram with that of the Ising ferromagnet.

This behavior is most apparent in the bottom panel of the figure, where v(p) and m(H)
curves are shown.

For T < T_c, the fluid exists in a two phase region, which is spatially inhomogeneous,
supporting local regions of high and low density. There is no stable homogeneous ther-
modynamic phase for (T, v) within the two phase region shown in the middle left panel.
Similarly, for the magnet, there is no stable homogeneous thermodynamic phase at fixed
temperature T and magnetization m if (T, m) lies within the coexistence region. Rather,
the system consists of blobs where the spin is predominantly up, and blobs where the spin
is predominantly down.
Note also the analogy between the isothermal compressibility κ_T and the isothermal sus-
ceptibility χ_T:

    κ_T = −(1/v) (∂v/∂p)_T  ,  κ_T(T_c, p_c) = ∞ ,
    χ_T = (∂m/∂H)_T  ,  χ_T(T_c, H_c) = ∞ .
The order parameter for a second order phase transition is a quantity which vanishes in
the disordered phase and is finite in the ordered phase. For the fluid, the order parameter
can be chosen to be (v_vap − v_liq), the difference in the specific volumes of the vapor and
liquid phases. In the vicinity of the critical point, the system exhibits power law behavior
in many physical quantities, viz.

    m(T, H_c) ∼ (T_c − T)_+^β
    χ(T, H_c) ∼ |T − T_c|^{−γ}
    C_M(T, H_c) ∼ |T − T_c|^{−α}
    m(T_c, H) ∼ ±|H|^{1/δ} .   (6.9)

The quantities α, β, γ, and δ are the critical exponents associated with the transition.
These exponents satisfy certain equalities, such as the Rushbrooke and Griffiths relations
and hyperscaling,³

    α + 2β + γ = 2   (Rushbrooke)
    β + γ = βδ   (Griffiths)
    2 − α = d ν   (hyperscaling) .   (6.10)

Originally such relations were derived as inequalities, and only after the advent of scal-
ing and renormalization group theories was it realized that they held as equalities. We
shall have much more to say about critical behavior later on, when we discuss scaling and
renormalization.
6.2.2 Gibbs-Duhem relation for magnetic systems

Homogeneity of E(S, M, 𝒩) means E = TS + HM + μ𝒩, and, after invoking the First Law
dE = T dS + H dM + μ d𝒩, we have

    S dT + M dH + 𝒩 dμ = 0 .   (6.11)

Now consider two magnetic phases in coexistence. We must have dμ₁ = dμ₂, hence

    dμ₁ = −s₁ dT − m₁ dH = −s₂ dT − m₂ dH = dμ₂ ,   (6.12)

³ In the third of these exponent equalities, d is the dimension of space and ν is the correlation
length exponent.
where m = M/𝒩 is the magnetization per site and s = S/𝒩 is the specific entropy. Thus,
we obtain the Clapeyron equation for magnetic systems,

    (dH/dT)_coex = −(s₁ − s₂)/(m₁ − m₂) .   (6.13)

Thus, if m₁ ≠ m₂ and (dH/dT)_coex = 0, then we must have s₁ = s₂, which says that there is
no latent heat associated with the transition. This absence of latent heat is a consequence
of the symmetry which guarantees that F(T, H, 𝒩) = F(T, −H, 𝒩).
6.3 Order-Disorder Transitions
Another application of the Ising model lies in the theory of order-disorder transitions in
alloys. Examples include Cu₃Au, CuZn, and other compounds. In CuZn, the Cu and Zn
atoms occupy sites of a body centered cubic (BCC) lattice, forming an alloy known as
β-brass. Below T_c ≃ 740 K, the atoms are ordered, with the Cu preferentially occupying one
simple cubic sublattice and the Zn preferentially occupying the other.
The energy is a sum of pairwise interactions, with a given link contributing ε_AA, ε_BB, or
ε_AB, depending on whether it is an A-A, B-B, or A-B/B-A link. Here A and B represent
Cu and Zn, respectively. Thus, we can write the energy of the link ⟨ij⟩ as

    E_ij = ε_AA P_i^A P_j^A + ε_BB P_i^B P_j^B + ε_AB ( P_i^A P_j^B + P_i^B P_j^A ) ,   (6.14)

where

    P_i^A = ½(1 + σ_i) = { 1 if site i contains Cu ; 0 if site i contains Zn }
    P_i^B = ½(1 − σ_i) = { 1 if site i contains Zn ; 0 if site i contains Cu } .
The Hamiltonian is then

    Ĥ = Σ_{⟨ij⟩} E_ij
      = Σ_{⟨ij⟩} [ ¼(ε_AA + ε_BB − 2ε_AB) σ_i σ_j + ¼(ε_AA − ε_BB)(σ_i + σ_j) + ¼(ε_AA + ε_BB + 2ε_AB) ]
      = −J Σ_{⟨ij⟩} σ_i σ_j − H Σ_i σ_i + E₀ ,   (6.15)

where the exchange constant J and the magnetic field H are given by

    J = ¼( 2ε_AB − ε_AA − ε_BB )  ,  H = ¼ z ( ε_BB − ε_AA ) ,   (6.16)
6.4. MEAN FIELD THEORY 301
Figure 6.3: Order-disorder transition on the square lattice. Below T = T_c, order develops
spontaneously on the two √2 × √2 sublattices. There is perfect sublattice order at T = 0
(left panel).
and E₀ = ⅛ N z (ε_AA + ε_BB + 2ε_AB), where N is the total number of lattice sites and z = 8 is
the lattice coordination number, which is the number of nearest neighbors of any given site.
Note that

    2ε_AB > ε_AA + ε_BB  ⟹  J > 0   (ferromagnetic)
    2ε_AB < ε_AA + ε_BB  ⟹  J < 0   (antiferromagnetic) .

The antiferromagnetic case is depicted in fig. 6.3.
6.4 Mean Field Theory
Consider the Ising model Hamiltonian,

    Ĥ = −J Σ_{⟨ij⟩} σ_i σ_j − H Σ_i σ_i ,   (6.17)

where the first sum on the RHS is over all links of the lattice. Each spin can be either up
(σ = +1) or down (σ = −1). We further assume that the spins are located on a Bravais
lattice⁴ and that the coupling J_ij = J(|R_i − R_j|), where R_i is the position of the i-th spin.
On each site i we decompose σ_i into a contribution from its thermodynamic average and a
fluctuation term, i.e.

    σ_i = ⟨σ_i⟩ + δσ_i .   (6.18)

⁴ A Bravais lattice is one in which any site is equivalent to any other site through an appropriate discrete
translation. Examples of Bravais lattices include the linear chain, square, triangular, simple cubic, face-
centered cubic, etc. lattices. The honeycomb lattice is not a Bravais lattice, because there are two sets of
inequivalent sites: those in the center of a Y and those in the center of an upside down Y.
We will write ⟨σ_i⟩ ≡ m, the local magnetization (dimensionless), and assume that m is
independent of position i. Then

    σ_i σ_j = (m + δσ_i)(m + δσ_j) = m² + m(δσ_i + δσ_j) + δσ_i δσ_j .   (6.19)

The last term on the RHS of eqn. 6.19 is quadratic in the fluctuations,
and we assume this to be negligibly small. Thus, we obtain the mean field Hamiltonian

    Ĥ_MF = ½ N z J m² − ( H + zJm ) Σ_i σ_i ,   (6.20)

where N is the total number of lattice sites. The first term is a constant, although the value
of m is yet to be determined. The Boltzmann weights are then completely determined by the
second term, which is just what we would write down for a Hamiltonian of noninteracting
spins in an effective mean field

    H_eff = H + zJm .   (6.21)
In other words, H_eff = H_ext + H_int, where the external field is the applied field H_ext = H, and
the internal field is H_int = zJm. The internal field accounts for the interaction with the
average values of all other spins coupled to a spin at a given site, hence it is often called
the mean field. Since the spins are noninteracting, we have

    m = ( e^{βH_eff} − e^{−βH_eff} ) / ( e^{βH_eff} + e^{−βH_eff} ) = tanh( (H + zJm)/k_B T ) .   (6.22)

It is a simple matter to solve for the free energy, given the noninteracting Hamiltonian Ĥ_MF.
The partition function is

    Z = Tr e^{−βĤ_MF} = e^{−½βNzJm²} ( Σ_σ e^{β(H+zJm)σ} )^N = e^{−βF} .   (6.23)
We now define dimensionless variables:

    f ≡ F/NzJ  ,  θ ≡ k_B T/zJ  ,  h ≡ H/zJ ,   (6.24)

and obtain the dimensionless free energy

    f(m, h, θ) = ½ m² − θ ln( e^{(m+h)/θ} + e^{−(m+h)/θ} ) .   (6.25)

Differentiating with respect to m gives the mean field equation,

    m = tanh( (m + h)/θ ) ,   (6.26)

which is equivalent to the self-consistency requirement, m = ⟨σ_i⟩.
Figure 6.4: Left panel: self-consistency equation m = tanh(m/θ) at temperatures θ = 1.5
(dark red) and θ = 0.65 (blue). Right panel: mean field free energy, with energy shifted by
θ ln 2 so that f(m = 0, θ) = 0.
6.4.1 h = 0

When h = 0 the mean field equation becomes

    m = tanh( m/θ ) .   (6.27)

This nonlinear equation can be solved graphically, as in the top panel of fig. 6.4. The RHS
is a tanh function which gets steeper with decreasing θ. If, at m = 0, the slope of tanh(m/θ)
is smaller than unity, then the curve y = tanh(m/θ) will intersect y = m only at m = 0.
However, if the slope is larger than unity, there will be three such intersections. Since the
slope is 1/θ, we identify θ_c = 1 as the mean field transition temperature.
In the low temperature phase θ < 1, there are three solutions to the mean field equation.
One solution is always at m = 0. The other two solutions must be related by the m → −m
symmetry of the free energy (when h = 0). The exact free energies are plotted in the
bottom panel of fig. 6.4, but it is possible to make analytical progress by assuming m is
small and Taylor expanding the free energy f(m, θ) in powers of m:

    f(m, θ) = ½ m² − θ ln 2 − θ ln cosh(m/θ)
            = −θ ln 2 + ½(1 − θ⁻¹) m² + m⁴/12θ³ − m⁶/45θ⁵ + … .   (6.28)
Note that the sign of the quadratic term is positive for θ > 1 and negative for θ < 1. Thus,
the shape of the free energy f(m, θ) as a function of m qualitatively changes at this point,
θ_c = 1, the mean field transition temperature, also known as the critical temperature.
For θ > θ_c, the free energy f(m, θ) has a single minimum at m = 0. Below θ_c, the curvature
at m = 0 reverses, and m = 0 becomes a local maximum. There are then two equivalent
minima symmetrically displaced on either side of m = 0. Differentiating with respect to m,
we find these local minima. For θ < θ_c, the local minima are found at

    m² = 3θ²(1 − θ) = 3(1 − θ) + O( (1 − θ)² ) .   (6.29)

Thus, we find for |θ − 1| ≪ 1,

    m(θ, h = 0) = ±√3 (1 − θ)_+^{1/2} ,   (6.30)

where the + subscript indicates that this solution is only for 1 − θ > 0. For θ > 1 the only
solution is m = 0. The exponent with which m(θ) vanishes as θ → θ_c⁻ is denoted β, i.e.
m(θ, h = 0) ∝ (θ_c − θ)_+^β.
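The mean field equation is easy to solve numerically; the sketch below uses a plain fixed-point iteration (illustrative starting guess and tolerance) and compares the result just below θ_c with the asymptotic form m ≈ √(3(1 − θ)).

```python
import math

def mf_solve(theta, h, m0=0.5, tol=1e-12):
    """Solve m = tanh((m + h)/theta) by fixed-point iteration (a sketch)."""
    m = m0
    for _ in range(10000):
        m_new = math.tanh((m + h)/theta)
        if abs(m_new - m) < tol:
            break
        m = m_new
    return m

m = mf_solve(theta=0.99, h=0.0)
assert abs(m - math.tanh(m/0.99)) < 1e-9        # self-consistent solution
assert abs(m - math.sqrt(3*0.01)) < 0.01        # m ~ sqrt(3(1-theta)) = 0.173...
```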
6.4.2 Specific heat

We can now expand the free energy f(θ, h = 0). We find

    f(θ, h = 0) = { −θ ln 2   if θ > θ_c
                  { −θ ln 2 − ¾(1 − θ)² + O( (1 − θ)⁴ )   if θ < θ_c .   (6.31)

Thus, if we compute the heat capacity, we find in the vicinity of θ = θ_c

    c_V = −θ ∂²f/∂θ² = { 0   if θ > θ_c
                       { 3/2   if θ < θ_c .   (6.32)

Thus, the specific heat is discontinuous at θ = θ_c. We emphasize that this is only valid near
θ = θ_c = 1. The general result valid for all θ is⁵
5
c
V
() =
1


m
2
() m
4
()
1 +m
2
()
, (6.33)
With this expression one can check both limits 0 and
c
. As 0 the magneti-
zation saturates and one has m
2
() 1 4 e
2/
. The numerator then vanishes as e
2/
,
which overwhelms the denominator that itself vanishes as
2
. As a result, c
V
( 0) = 0,
as expected. As 1, invoking m
2
3(1 ) we recover c
V
(

c
) =
3
2
.
In the theory of critical phenomena, c
V
() [
c
[

as
c
. We see that mean eld
theory yields = 0.
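The closed form above can be checked against a direct numerical second derivative of the free energy, c_V = −θ ∂²f/∂θ². A sketch at an arbitrarily chosen θ = 0.8 (iteration count and step size are illustrative):

```python
import math

def m_of(theta):
    m = 0.999
    for _ in range(5000):                 # fixed-point iteration of m = tanh(m/theta)
        m = math.tanh(m / theta)
    return m

def f_of(theta):                          # free energy (6.25) at h = 0, on shell
    m = m_of(theta)
    return 0.5*m*m - theta*math.log(2.0*math.cosh(m/theta))

theta, d = 0.8, 1e-4
cV_numeric = -theta*(f_of(theta + d) - 2.0*f_of(theta) + f_of(theta - d))/d**2
m = m_of(theta)
cV_formula = (m**2 - m**4)/(theta*(theta - 1.0 + m**2))
assert abs(cV_numeric - cV_formula) < 1e-3
```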
⁵ To obtain this result, one writes f = f(θ, m(θ)) and then differentiates twice with respect to θ, using
the chain rule. Along the way, any naked (i.e. undifferentiated) term proportional to ∂f/∂m may be dropped,
since this vanishes at any θ by the mean field equation.
Figure 6.5: Results at finite field h = 0.1. Mean field free energy f(m, h, θ) (bottom; energy
shifted by θ ln 2) and self-consistency equation m = tanh( (m + h)/θ ) (top) at temperatures
θ = 1.5 (dark red), θ = 0.9 (dark green), and θ = 0.65 (blue).
6.4.3 h ≠ 0

Consider without loss of generality the case h > 0. The minimum of the free energy
f(m, h, θ) now lies at m > 0 for any θ. At low temperatures, the double well structure we
found in the h = 0 case is tilted so that the right well lies lower in energy than the left well.
This is depicted in fig. 6.5. As the temperature is raised, the local minimum at m < 0
vanishes, annihilating with the local maximum in a saddle-node bifurcation. To find where
this happens, one sets ∂f/∂m = 0 and ∂²f/∂m² = 0 simultaneously, resulting in

    h*(θ) = √(1 − θ) − (θ/2) ln( (1 + √(1 − θ)) / (1 − √(1 − θ)) ) .   (6.34)

The solutions lie at h = ±h*(θ). For θ < θ_c = 1 and h ∈ [ −h*(θ) , +h*(θ) ], there are three
solutions to the mean field equation. Equivalently we could in principle invert the above
expression to obtain θ*(h). For θ > θ*(h), there is only a single global minimum in the free
energy f(m) and there is no local minimum. Note θ*(h = 0) = 1.
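Eqn. 6.34 can be verified directly: at h = h*(θ) the point m* = −√(1 − θ) must be a degenerate critical point of f, with both first and second m-derivatives vanishing. A quick check at an arbitrary temperature:

```python
import math

theta = 0.7
s = math.sqrt(1.0 - theta)
h_star = s - 0.5*theta*math.log((1.0 + s)/(1.0 - s))   # eqn 6.34
m_star = -s
t = math.tanh((m_star + h_star)/theta)
df  = m_star - t                    # df/dm  = m - tanh((m+h)/theta)
d2f = 1.0 - (1.0 - t*t)/theta       # d2f/dm2 = 1 - sech^2((m+h)/theta)/theta
assert abs(df) < 1e-12 and abs(d2f) < 1e-12
```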
Assuming h ≪ |θ − 1| ≪ 1, the mean field solution for m(θ, h) will also be small, and we
expand the free energy in m, and to linear order in h:

    f(m, h, θ) = −θ ln 2 + ½(1 − θ⁻¹) m² + m⁴/12θ³ − hm/θ
               = f₀ + ½(θ − 1) m² + (1/12) m⁴ − hm + … .   (6.35)
Exponent   MFT   2D Ising (exact)   3D Ising (numerical)   CO₂ (expt.)
   α        0           0                  0.125              ≲ 0.1
   β       1/2         1/8                 0.313               0.35
   γ        1          7/4                 1.25                1.26
   δ        3          15                  5                   4.2

Table 6.1: Critical exponents from mean field theory as compared with exact results for the
two-dimensional Ising model, numerical results for the three-dimensional Ising model, and
experiments on the liquid-gas transition in CO₂. Source: H. E. Stanley, Phase Transitions
and Critical Phenomena.
Setting ∂f/∂m = 0, we obtain

    ⅓ m³ + (θ − 1) m − h = 0 .   (6.36)

If θ > 1 then we have a solution m = h/(θ − 1). The m³ term can be ignored because it is
higher order in h, and we have assumed h ≪ |θ − 1| ≪ 1. This is known as the Curie-Weiss
law. The magnetic susceptibility behaves as

    χ(θ) = ∂m/∂h = 1/(θ − 1) ∝ |θ − 1|^{−γ} ,   (6.37)

where the susceptibility critical exponent is γ = 1. If θ < 1 then while there is still a
solution at m = h/(θ − 1), it lies at a local maximum of the free energy, as shown in fig. 6.5.
The minimum of the free energy occurs close to the h = 0 solution m = m₀(θ) ≡ √(3(1 − θ)),
and writing m = m₀ + δm we find δm to linear order in h as δm(θ, h) = h/2(1 − θ). Thus,

    m(θ, h) = √(3(1 − θ)) + h/2(1 − θ) .   (6.38)

Once again, we find that χ(θ) diverges as |θ − 1|^{−γ} with γ = 1. The exponent on either
side of the transition is the same.
Finally, we can set θ = θ_c and examine m(h). We find, from eqn. 6.36,

    m(θ = θ_c, h) = (3h)^{1/3} ∝ h^{1/δ} ,   (6.39)

where δ is a new critical exponent. Mean field theory gives δ = 3. Note that at θ = θ_c = 1
we have m = tanh(m + h), and inverting we find

    h(m, θ = θ_c) = ½ ln( (1 + m)/(1 − m) ) − m = m³/3 + m⁵/5 + … ,   (6.40)

which is consistent with what we just found for m(h, θ = θ_c).
How well does mean field theory do in describing the phase transition of the Ising model?
In table 6.1 we compare our mean field results for the exponents α, β, γ, and δ with exact
values for the two-dimensional Ising model, numerical work on the three-dimensional Ising
model, and experiments on the liquid-gas transition in CO₂. The first thing to note is that
the exponents are dependent on the dimension of space, and this is something that mean
field theory completely misses. In fact, it turns out that the mean field exponents are exact
provided d > d_u, where d_u is the upper critical dimension of the theory. For the Ising
model, d_u = 4, and above four dimensions (which is of course unphysical) the mean field
exponents are in fact exact. We see that all in all the MFT results compare better with the
three dimensional exponent values than with the two-dimensional ones. This makes sense,
since MFT does better in higher dimensions. The reason for this is that higher dimensions
means more nearest neighbors, which has the effect of reducing the relative importance of
the fluctuations we neglected to include.
6.4.4 Magnetization dynamics

Dissipative processes drive physical systems to minimum energy states. We can crudely
model the dissipative dynamics of a magnet by writing the phenomenological equation

    dm/ds = −∂f/∂m ,   (6.41)

where s is a dimensionless time variable. Under these dynamics, the free energy is never
increasing:

    df/ds = (∂f/∂m)(dm/ds) = −( ∂f/∂m )² ≤ 0 .   (6.42)

Clearly the fixed point of these dynamics, where ṁ = 0, is a solution to the mean field
equation ∂f/∂m = 0.
The phase flow for the equation ṁ = −f′(m) is shown in fig. 6.6. As we have seen, for any
value of h there is a temperature θ* below which the free energy f(m) has two local minima
and one local maximum. When h = 0 the minima are degenerate, but at finite h one of the
minima is a global minimum. Thus, for θ < θ*(h) there are three solutions to the mean field
equations. In the language of dynamical systems, under the dynamics of eqn. 6.41, minima
of f(m) correspond to attractive fixed points and maxima to repulsive fixed points. If h > 0,
the rightmost of these fixed points corresponds to the global minimum of the free energy. As
θ is increased, this fixed point evolves smoothly. At θ = θ*, the (metastable) local minimum
and the local maximum coalesce and annihilate in a saddle-node bifurcation. However at
h = 0 all three fixed points coalesce at θ = θ_c and the bifurcation is a supercritical pitchfork.
As a function of θ at finite h, the dynamics are said to exhibit an imperfect bifurcation, which
is a deformed supercritical pitchfork.
The solution set for the mean field equation is simply expressed by inverting the tanh
function to obtain h(θ, m). One readily finds

    h(θ, m) = (θ/2) ln( (1 + m)/(1 − m) ) − m .   (6.43)

As we see in the bottom panel of fig. 6.7, m(h) becomes multivalued for h ∈ [ −h*(θ) , +h*(θ) ],
where h*(θ) is given in eqn. 6.34. Now imagine that θ < θ_c and we slowly ramp the field
Figure 6.6: Dissipative magnetization dynamics ṁ = −f′(m). Bottom panel shows h*(θ)
from eqn. 6.34. For (θ, h) within the blue shaded region, the free energy f(m) has a global
minimum plus a local minimum and a local maximum. Otherwise f(m) has only a single
global minimum. Top panels show an imperfect bifurcation in the magnetization dynamics
at h = 0.0215, for which θ* = 0.90. Temperatures shown: θ = 0.65 (blue), θ = θ*(h) = 0.90
(green), and θ = 1.2. The rightmost stable fixed point corresponds to the global minimum
of the free energy. The bottom of the middle two upper panels shows h = 0, where both
of the attractive fixed points and the repulsive fixed point coalesce into a single attractive
fixed point (supercritical pitchfork bifurcation).
h from a large negative value to a large positive value, and then slowly back down to its
original value. On the time scale of the magnetization dynamics, we can regard h(s) as
a constant. (Remember the time variable is s here.) Thus, m(s) will flow to the nearest
stable fixed point. Initially the system starts with m = −1 and h large and negative, and
there is only one fixed point, at m* ≈ −1. As h slowly increases, the fixed point value m*
also slowly increases. As h exceeds −h*(θ), a saddle-node bifurcation occurs, and two new
fixed points are created at positive m, one stable and one unstable. The global minimum
of the free energy still lies at the fixed point with m* < 0. However, when h crosses h = 0,
the global minimum of the free energy lies at the most positive fixed point m*. The dy-
namics, however, keep the system stuck in what is a metastable phase. This persists until
Figure 6.7: Top panel: hysteresis as a function of ramping the dimensionless magnetic field
h at θ = 0.40. Dark red arrows below the curve follow evolution of the magnetization on slow
increase of h. Dark grey arrows above the curve follow evolution of the magnetization on
slow decrease of h. Bottom panel: solution set for m(θ, h) as a function of h at temperatures
θ = 0.40 (blue), θ = θ_c = 1.0 (dark green), and θ = 1.25 (red).
h = +h*(θ), at which point another saddle-node bifurcation occurs, and the attractive fixed
point at m* < 0 annihilates with the repulsive fixed point. The dynamics then act quickly
to drive m to the only remaining fixed point. This process is depicted in the top panel of
fig. 6.7. As one can see from the figure, the system follows a stable fixed point until the
fixed point disappears, even though that fixed point may not always correspond to a global
minimum of the free energy. The resulting m(h) curve is then not reversible as a function of
time, and it possesses a characteristic shape known as a hysteresis loop. Etymologically, the
word hysteresis derives from the Greek ὑστέρησις, which means 'lagging behind'. Systems
which are hysteretic exhibit a history-dependence to their status, which is not uniquely
determined by external conditions. Hysteresis may be exhibited with respect to changes in
applied magnetic field, changes in temperature, or changes in other externally determined
parameters.
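The hysteresis loop itself can be reproduced in a few lines by integrating the relaxational dynamics of eqn. 6.41 with forward Euler steps while h is ramped slowly; all step sizes and the ramp rate below are arbitrary illustrative choices. At θ = 0.40 the magnetization should jump near ±h*(0.40) ≈ ±0.36.

```python
import math

theta, ds = 0.40, 0.05
m = -1.0
h_flip_up = h_flip_dn = None
ramp = [i*1e-4 - 0.5 for i in range(10001)]      # h from -0.5 up to +0.5
for h in ramp:
    for _ in range(20):                          # relax toward the nearest fixed point
        m += ds*(math.tanh((m + h)/theta) - m)   # dm/ds = -df/dm
    if h_flip_up is None and m > 0.0:
        h_flip_up = h                            # field at which m jumps up
for h in reversed(ramp):                         # and slowly back down again
    for _ in range(20):
        m += ds*(math.tanh((m + h)/theta) - m)
    if h_flip_dn is None and m < 0.0:
        h_flip_dn = h                            # field at which m jumps down
assert 0.3 < h_flip_up < 0.45 and -0.45 < h_flip_dn < -0.3
assert abs(h_flip_up + h_flip_dn) < 0.02         # the loop is symmetric
```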
6.4.5 Beyond nearest neighbors

Suppose we had started with the more general model,

    Ĥ = −Σ_{i<j} J_ij σ_i σ_j − H Σ_i σ_i
      = −½ Σ_{i≠j} J_ij σ_i σ_j − H Σ_i σ_i ,   (6.44)

where J_ij is the coupling between spins on sites i and j. In the top equation above, each
pair (ij) is counted once in the interaction term; this may be replaced by a sum over all i
and j if we include a factor of ½.⁶ The resulting mean field Hamiltonian is then

    Ĥ_MF = ½ N Ĵ(0) m² − ( H + Ĵ(0) m ) Σ_i σ_i .   (6.45)

Here, Ĵ(q) is the Fourier transform of the interaction matrix J_ij:⁷

    Ĵ(q) = Σ_R J(R) e^{−iq·R} .   (6.46)

For nearest neighbor interactions only, one has Ĵ(0) = zJ, where z is the lattice coordination
number, i.e. the number of nearest neighbors of any given site. The scaled free energy is
as in eqn. 6.25, with f = F/NĴ(0), θ = k_B T/Ĵ(0), and h = H/Ĵ(0). The analysis proceeds
precisely as before, and we conclude θ_c = 1, i.e. k_B T_c^MF = Ĵ(0).
6.5 Ising Model with Long-Ranged Forces
Consider an Ising model where J_ij = J/N for all i and j, so that there is a very weak
interaction between every pair of spins. The Hamiltonian is then

    Ĥ = −(J/2N) ( Σ_i σ_i )² − H Σ_k σ_k .   (6.47)

The partition function is

    Z = Tr_{σ_i} exp[ (βJ/2N) ( Σ_i σ_i )² + βH Σ_i σ_i ] .   (6.48)
⁶ The self-interaction terms with i = j contribute a constant to Ĥ and may be either included or excluded.
However, this property only pertains to the σ_i = ±1 model. For higher spin versions of the Ising model, say
where S_i ∈ {−1, 0, +1}, then S_i² is not constant and we should explicitly exclude the self-interaction terms.
⁷ The sum in the discrete Fourier transform is over all direct Bravais lattice vectors and the wavevector q
may be restricted to the first Brillouin zone. These terms are familiar from elementary solid state physics.
6.5. ISING MODEL WITH LONG-RANGED FORCES 311
We now invoke the Gaussian integral,

    ∫_{−∞}^{∞} dx e^{−αx² − βx} = √(π/α) e^{β²/4α} .   (6.49)
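This identity is easy to confirm numerically (α and β below are arbitrary illustrative values):

```python
import numpy as np

alpha, beta = 1.3, 0.7
x, dx = np.linspace(-30.0, 30.0, 400001, retstep=True)
lhs = np.exp(-alpha*x*x - beta*x).sum() * dx     # the integrand decays fast, so a
                                                 # simple Riemann sum is very accurate
rhs = np.sqrt(np.pi/alpha) * np.exp(beta**2/(4.0*alpha))
assert abs(lhs - rhs) < 1e-8
```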
Thus,

    exp[ (βJ/2N) ( Σ_i σ_i )² ] = ( NβJ/2π )^{1/2} ∫_{−∞}^{∞} dm e^{−½NβJm² + βJm Σ_i σ_i} ,   (6.50)

and we can write the partition function as

    Z = ( NβJ/2π )^{1/2} ∫_{−∞}^{∞} dm e^{−½NβJm²} ( Σ_σ e^{β(H+Jm)σ} )^N
      = ( N/2πθ )^{1/2} ∫_{−∞}^{∞} dm e^{−N A(m)/θ} ,   (6.51)

where θ = k_B T/J, h = H/J, and

    A(m) = ½ m² − θ ln[ 2 cosh( (h + m)/θ ) ] .   (6.52)
Since N → ∞, we can perform the integral using the method of steepest descents. Thus,
we must set

    dA/dm |_{m*} = 0  ⟹  m* = tanh( (m* + h)/θ ) .   (6.53)

Expanding about m = m*, we write

    A(m) = A(m*) + ½ A″(m*) (m − m*)² + ⅙ A‴(m*) (m − m*)³ + … .   (6.54)
Performing the integrations, we obtain

    Z = ( N/2πθ )^{1/2} e^{−N A(m*)/θ} ∫_{−∞}^{∞} dν exp[ −(N A″(m*)/2θ) ν² − (N A‴(m*)/6θ) ν³ + … ]
      = ( A″(m*) )^{−1/2} e^{−N A(m*)/θ} ( 1 + O(N⁻¹) ) .   (6.55)

The corresponding free energy per site is

    f = F/NJ = A(m*) + (θ/2N) ln A″(m*) + O(N⁻²) ,   (6.56)

where m* is the solution to the mean field equation which minimizes A(m). Mean field
theory is exact for this model!
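This exactness is easy to check numerically: for eqn. 6.47 the trace can be organized as a binomial sum over the number of up spins, and the resulting exact free energy per site should approach the minimized A(m*) of eqn. 6.52 as N grows. A sketch (θ and h are arbitrary illustrative values, in units of J):

```python
import math

theta, h = 1.5, 0.3                 # theta = k_B T / J, h = H / J

def exact_f(N):
    # -theta * ln(Z) / N with Z summed exactly over the 2^N states, grouped
    # by the number k of up spins (log-sum-exp for numerical stability)
    logs = []
    for k in range(N + 1):
        M = 2*k - N
        E = -M*M/(2.0*N) - h*M      # energy in units of J
        logs.append(math.lgamma(N + 1) - math.lgamma(k + 1)
                    - math.lgamma(N - k + 1) - E/theta)
    mx = max(logs)
    return -theta*(mx + math.log(sum(math.exp(L - mx) for L in logs)))/N

def A(m):                           # eqn 6.52
    return 0.5*m*m - theta*math.log(2.0*math.cosh((h + m)/theta))

A_star = min(A(i/10000.0 - 1.0) for i in range(20001))   # grid minimum of A(m)
assert abs(exact_f(4000) - A_star) < 1e-3
```

The residual difference at finite N is consistent with the O(1/N) correction in eqn. 6.56.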
6.6 Variational Density Matrix Method
Suppose we are given a Hamiltonian Ĥ. From this we construct the free energy, F:

    F = E − TS = Tr(ϱ Ĥ) + k_B T Tr(ϱ ln ϱ) .   (6.57)

Here, ϱ is the density matrix.⁸ A physical density matrix must be (i) normalized (i.e.
Tr ϱ = 1), (ii) Hermitian, and (iii) non-negative definite (i.e. all the eigenvalues of ϱ must
be non-negative).
Our goal is to extremize the free energy subject to the various constraints on ϱ. Let us
assume that ϱ is diagonal in the basis of eigenstates of Ĥ, i.e.

    ϱ = Σ_γ P_γ |γ⟩⟨γ| ,   (6.58)

where P_γ is the probability that the system is in state |γ⟩. Then

    F = Σ_γ E_γ P_γ + k_B T Σ_γ P_γ ln P_γ .   (6.59)

Thus, the free energy is a function of the set {P_γ}. We now extremize F subject to the
normalization constraint. This means we form the extended function

    F*( {P_γ}, λ ) = F( {P_γ} ) + λ ( Σ_γ P_γ − 1 ) ,   (6.60)

and then freely extremize over both the probabilities {P_γ} as well as the Lagrange multiplier
λ.
. This yields the Boltzmann distribution,
    P_γ^eq = (1/Z) exp( −E_γ/k_B T ) ,   (6.61)

where Z = Σ_γ e^{−E_γ/k_B T} = Tr e^{−Ĥ/k_B T} is the canonical partition function, which is related
to λ through

    λ = −k_B T ( ln Z − 1 ) .   (6.62)

Note that the Boltzmann weights are, appropriately, all positive.
If the spectrum of Ĥ is bounded from below, our extremum should in fact yield a minimum
for the free energy F. Furthermore, since we have freely minimized over all the probabil-
ities, subject to the single normalization constraint, any distribution {P_γ} other than the
equilibrium one must yield a greater value of F.

Alas, the Boltzmann distribution, while exact, is often intractable to evaluate. For one-
dimensional systems, there are general methods such as the transfer matrix approach which
do permit an exact evaluation of the free energy. However, beyond one dimension the
situation is in general hopeless. A family of solvable (integrable) models exists in two di-
mensions, but their solutions require specialized techniques and are extremely difficult. The
idea behind the variational density matrix approximation is to construct a tractable trial
density matrix ϱ which depends on a set of variational parameters {x_α}, and to minimize
the free energy with respect to this (finite) set.

⁸ How do we take the logarithm of a matrix? The rule is this: A = ln B if B = exp(A). The exponential
of a matrix may be evaluated via its Taylor expansion.
6.6.1 Variational density matrix for the Ising model

Consider once again the Ising model Hamiltonian,

Ĥ = −Σ_{i<j} J_ij σ_i σ_j − H Σ_i σ_i .   (6.63)

The states of the system | γ ⟩ may be labeled by the values of the spin variables: | γ ⟩ = | σ₁, σ₂, ... ⟩. We assume the density matrix is diagonal in this basis, i.e.

⟨ σ₁, σ₂, ... | ρ_N | σ′₁, σ′₂, ... ⟩ ≡ ρ_N(σ) δ_{σ,σ′} ,   (6.64)

where

δ_{σ,σ′} = Π_i δ_{σ_i, σ′_i} .   (6.65)

Indeed, this is the case for the exact density matrix, which is to say the Boltzmann weight,

ρ_N(σ₁, σ₂, ...) = (1/Z) e^{−βĤ(σ₁,...,σ_N)} .   (6.66)

We now write a trial density matrix which is a product over contributions from independent single sites:

ρ_N(σ₁, σ₂, ...) = Π_i ρ(σ_i) ,   (6.67)

where

ρ(σ) = ((1+m)/2) δ_{σ,1} + ((1−m)/2) δ_{σ,−1} .   (6.68)

Note that we've changed our notation slightly. We are denoting by ρ(σ) the corresponding diagonal element of the matrix

ρ = diag( (1+m)/2 , (1−m)/2 ) ,   (6.69)

and the full density matrix is a tensor product over the single site matrices:

ρ_N = ρ ⊗ ρ ⊗ ··· ⊗ ρ .   (6.70)

Note that ρ and hence ρ_N are appropriately normalized. The variational parameter here is m, which, if ρ is to be non-negative definite, must satisfy −1 ≤ m ≤ 1. The quantity m has the physical interpretation of the average spin on any given site, since

⟨σ_i⟩ = Σ_σ ρ(σ) σ = m .   (6.71)
We may now evaluate the average energy:

E = Tr(ρ_N Ĥ) = −Σ_{i<j} J_ij m² − H Σ_i m
  = −½ N Ĵ(0) m² − NHm ,   (6.72)

where once again Ĵ(0) is the discrete Fourier transform of J(R) at wavevector q = 0. The entropy is given by

S = −k_B Tr(ρ_N ln ρ_N) = −N k_B Tr(ρ ln ρ)
  = −N k_B { ((1+m)/2) ln((1+m)/2) + ((1−m)/2) ln((1−m)/2) } .   (6.73)
We now define the dimensionless free energy per site: f ≡ F/N Ĵ(0). We have

f(m, h, θ) = −½ m² − hm + θ { ((1+m)/2) ln((1+m)/2) + ((1−m)/2) ln((1−m)/2) } ,   (6.74)

where θ ≡ k_B T/Ĵ(0) is the dimensionless temperature, and h ≡ H/Ĵ(0) the dimensionless magnetic field, as before. We extremize f(m) by setting

∂f/∂m = 0 = −m − h + (θ/2) ln( (1+m)/(1−m) ) .   (6.75)

Solving for m, we obtain

m = tanh( (m + h)/θ ) ,   (6.76)

which is precisely what we found in eqn. 6.26.
Note that the optimal value of m indeed satisfies the requirement |m| ≤ 1 of non-negative probability. This nonlinear equation may be solved graphically. For h = 0, the unmagnetized solution m = 0 always applies. However, for θ < 1 there are two additional solutions at m = ±m_A(θ), with m_A(θ) = √(3(1−θ)) + O((1−θ)^{3/2}) for θ close to (but less than) one. These solutions, which are related by the Z₂ symmetry of the h = 0 model, are in fact the low energy solutions. This is shown clearly in figure 6.8, where the variational free energy f(m, θ) is plotted as a function of m for a range of temperatures interpolating between high and low values. At the critical temperature θ_c = 1, the lowest energy state changes from being unmagnetized (high temperature) to magnetized (low temperature).
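The self-consistency equation (6.76) is also easy to solve numerically by simple iteration, which makes the graphical discussion concrete. A minimal sketch (the starting guess and the two temperatures are arbitrary illustrative choices):

```python
import math

def solve_m(theta, h=0.0, m0=0.9, tol=1e-12):
    """Iterate the mean field equation m = tanh((m + h)/theta), eqn. 6.76."""
    m = m0
    for _ in range(10000):
        m_new = math.tanh((m + h) / theta)
        if abs(m_new - m) < tol:
            return m_new
        m = m_new
    return m

# Below theta_c = 1 a spontaneous magnetization appears at h = 0 ...
m_low = solve_m(theta=0.9)
# ... which near theta_c should track m_A(theta) = sqrt(3(1 - theta)):
approx = math.sqrt(3 * (1 - 0.9))
# Above theta_c the iteration collapses to the unmagnetized solution m = 0.
m_high = solve_m(theta=1.5)
```

The fixed-point iteration converges to the stable magnetized branch when one exists, and to m = 0 otherwise, in accord with figure 6.8.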
For h > 0, there is no longer a Z₂ symmetry (i.e. σ_i → −σ_i ∀ i). The high temperature solution now has m > 0 (or m < 0 if h < 0), and this smoothly varies as θ is lowered, approaching the completely polarized limit m = 1 as θ → 0. At very high temperatures, the argument of the tanh function is small, and we may approximate tanh(x) ≈ x, in which case

m(h, θ) = h/(θ − θ_c) .   (6.77)

This is called the Curie-Weiss law. One can infer θ_c from the high temperature susceptibility χ(θ) = (∂m/∂h)|_{h=0} by plotting χ⁻¹ versus θ and extrapolating to obtain the θ-intercept.
Figure 6.8: Variational field free energy f̃ = f(m, h, θ) + θ ln 2 versus magnetization m at six equally spaced temperatures interpolating between high (θ = 1.25, red) and low (θ = 0.75, blue) values. Top panel: h = 0. Bottom panel: h = 0.06.

In our case, χ(θ) = (θ − θ_c)⁻¹. For low θ and weak h, there are two inequivalent minima in the free energy.
When m is small, it is appropriate to expand f(m, h, θ), obtaining

f(m, h, θ) = −θ ln 2 − hm + ½ (θ − 1) m² + (θ/12) m⁴ + (θ/30) m⁶ + (θ/56) m⁸ + ... .   (6.78)
This is known as the Landau expansion of the free energy in terms of the order parameter m. An order parameter φ is a thermodynamic variable which distinguishes ordered and disordered phases. Typically φ = 0 in the disordered (high temperature) phase, and φ ≠ 0 in the ordered (low temperature) phase. When the order sets in continuously, i.e. when φ is continuous across θ_c, the phase transition is said to be second order. When φ changes abruptly, the transition is first order. It is also quite commonplace to observe phase transitions between two ordered states. For example, a crystal, which is an ordered state, may change its lattice structure, say from a high temperature tetragonal phase to a low temperature orthorhombic phase. When the high T phase possesses the same symmetries as the low T phase, as in the tetragonal-to-orthorhombic example, the transition may be second order. When the two symmetries are completely unrelated, for example in a hexagonal-to-tetragonal transition, or in a transition between a ferromagnet and an antiferromagnet, the transition is in general first order.
Throughout this discussion, we have assumed that the interactions J_ij are predominantly ferromagnetic, i.e. J_ij > 0, so that all the spins prefer to align. When J_ij < 0, the interaction is said to be antiferromagnetic and prefers anti-alignment of the spins (i.e. σ_i σ_j = −1). Clearly not every pair of spins can be anti-aligned; there are two possible spin states and a thermodynamically extensive number of spins. But on the square lattice, for example, if the only interactions J_ij are between nearest neighbors and the interactions are antiferromagnetic, then the lowest energy configuration (T = 0 ground state) will be one in which spins on opposite sublattices are anti-aligned. The square lattice is bipartite: it breaks up into two interpenetrating sublattices A and B (which are themselves square lattices, rotated by 45° with respect to the original, and with a lattice constant which is larger by a factor of √2), such that any site in A has nearest neighbors in B, and vice versa. The honeycomb lattice is another example of a bipartite lattice. So is the simple cubic lattice. The triangular lattice, however, is not bipartite (it is tripartite). Consequently, with nearest neighbor antiferromagnetic interactions, the triangular lattice Ising model is highly frustrated. The moral of the story is this: antiferromagnetic interactions can give rise to complicated magnetic ordering, and, when frustrated by the lattice geometry, may have finite specific entropy even at T = 0.
6.6.2 Mean Field Theory of the Potts Model

The Hamiltonian for the Potts model is

Ĥ = −Σ_{i<j} J_ij δ_{σ_i,σ_j} − H Σ_i δ_{σ_i,1} .   (6.79)

Here, σ_i ∈ {1, ..., q}, with integer q. This is the so-called q-state Potts model. The quantity H is analogous to an external magnetic field, and preferentially aligns (for H > 0) the local spins in the σ = 1 direction. We will assume H ≥ 0.

The q-component set is conveniently taken to be the integers from 1 to q, but it could be anything, such as

σ_i ∈ { tomato, penny, ostrich, Grateful Dead ticket from 1987, ... } .   (6.80)

The interaction energy is −J_ij if sites i and j contain the same object (q possibilities), and 0 if i and j contain different objects (q² − q possibilities).
The two-state Potts model is equivalent to the Ising model. Let the allowed values of σ be ±1. Then the quantity

δ_{σ,σ′} = ½ + ½ σσ′   (6.81)

equals 1 if σ = σ′, and is zero otherwise. The three-state Potts model cannot be written as a simple three-state Ising model, i.e. one with a bilinear interaction σσ′ where σ ∈ {−1, 0, +1}. However, it is straightforward to verify the identity

δ_{σ,σ′} = 1 + ½ σσ′ + (3/2) σ²σ′² − (σ² + σ′²) .   (6.82)

Thus, the q = 3-state Potts model is equivalent to a S = 1 (three-state) Ising model which includes both bilinear (σσ′) and biquadratic (σ²σ′²) interactions, as well as a local field term which couples to the square of the spin, σ². In general one can find such correspondences for higher q Potts models, but, as should be expected, the interactions become increasingly complex, with bi-cubic, bi-quartic, bi-quintic, etc. terms.
Getting back to the mean field theory, we write the single site variational density matrix ρ as a diagonal matrix with entries

ρ(σ) = x δ_{σ,1} + ((1−x)/(q−1)) (1 − δ_{σ,1}) ,   (6.83)

with ρ_N(σ₁, ..., σ_N) = ρ(σ₁) ··· ρ(σ_N). Note that Tr(ρ) = 1. The variational parameter is x. When x = q⁻¹, all states are equally probable. But for x > q⁻¹, the state σ = 1 is preferred, and the other (q−1) states have identical but smaller probabilities. It is a simple matter to compute the energy and entropy:

E = Tr(ρ_N Ĥ) = −½ N Ĵ(0) { x² + (1−x)²/(q−1) } − NHx
S = −k_B Tr(ρ_N ln ρ_N) = −N k_B { x ln x + (1−x) ln((1−x)/(q−1)) } .   (6.84)
The dimensionless free energy per site is then

f(x, θ, h) = −½ { x² + (1−x)²/(q−1) } + θ { x ln x + (1−x) ln((1−x)/(q−1)) } − hx ,   (6.85)

where h = H/Ĵ(0). We now extremize with respect to x to obtain the mean field equation,

∂f/∂x = 0 = −x + (1−x)/(q−1) + θ ln x − θ ln((1−x)/(q−1)) − h .   (6.86)

Note that for h = 0, x = q⁻¹ is a solution, corresponding to a disordered state in which all states are equally probable. At high temperatures, for small h, we expect the deviation x − q⁻¹ to be proportional to h.
Indeed, using Mathematica one can set

x ≡ q⁻¹ + s ,   (6.87)

and expand the mean field equation in powers of s. One obtains

h = ( q(qθ − 1)/(q − 1) ) s − ( θ q³ (q − 2)/(2 (q − 1)²) ) s² + O(s³) .   (6.88)

For weak fields, |h| ≪ 1, and we have

s(θ, h) = ( (q − 1)/(q (qθ − 1)) ) h + O(h²) ,   (6.89)

which again is of the Curie-Weiss form. The difference s = x − q⁻¹ is the order parameter for the transition.
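The mean field equation (6.86) can also be solved numerically, which provides a check on the weak-field result (6.89). A minimal sketch (the values q = 3, θ = 1, h = 10⁻³ are arbitrary illustrative choices; bisection works because at high temperature the left hand side of (6.86) is monotonically increasing in x):

```python
import math

def potts_mf(x, q, theta, h):
    """Left hand side of the mean field equation (6.86) for the q-state Potts model."""
    return (-x + (1 - x) / (q - 1)
            + theta * (math.log(x) - math.log((1 - x) / (q - 1))) - h)

def solve_x(q, theta, h, lo=1e-9, hi=1 - 1e-9):
    # Bisection on (0, 1); valid at high theta, where the root is unique.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if potts_mf(mid, q, theta, h) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

q, theta, h = 3, 1.0, 1e-3
s = solve_x(q, theta, h) - 1 / q                 # order parameter x - 1/q
s_linear = (q - 1) * h / (q * (q * theta - 1))   # prediction of eqn. 6.89
```

For this weak field the numerical s agrees with the linear-response prediction to well within the O(h²) corrections.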
Finally, one can expand the free energy in powers of s, obtaining the Landau expansion,

f(s, θ, h) = −(2h + 1)/2q − θ ln q − hs + ( q(qθ−1)/(2(q−1)) ) s² − ( θ q³ (q−2)/(6 (q−1)²) ) s³
  + ( θ q⁴ (q² − 3q + 3)/(12 (q−1)³) ) s⁴ − (θ q⁴/20) ( 1 − (q−1)⁻⁴ ) s⁵ + (θ q⁵/30) ( 1 + (q−1)⁻⁵ ) s⁶ + ... .   (6.90)

Note that, for q = 2, the coefficients of s³, s⁵, and higher order odd powers of s vanish in the Landau expansion. This is consistent with what we found for the Ising model, and is related to the Z₂ symmetry of that model. For q ≥ 3, there is a cubic term in the mean field free energy, and thus we generically expect a first order transition, as we shall see below when we discuss Landau theory.
6.6.3 Mean Field Theory of the XY Model

Consider the so-called XY model, in which each site contains a continuous planar spin, represented by an angular variable φ_i ∈ [−π, π]:

Ĥ = −Σ_{i<j} J_ij cos(φ_i − φ_j) − H Σ_i cos φ_i .   (6.91)

We write the (diagonal elements of the) full density matrix once again as a product:

ρ_N(φ₁, φ₂, ...) = Π_i ρ(φ_i) .   (6.92)

Our goal will be to extremize the free energy with respect to the function ρ(φ). To this end, we compute

E = Tr(ρ_N Ĥ) = −½ N Ĵ(0) | Tr(ρ e^{iφ}) |² − N H Tr(ρ cos φ) .   (6.93)

The entropy is

S = −N k_B Tr(ρ ln ρ) .   (6.94)

Note that for any function A(φ), we have⁹

Tr(ρ A) ≡ ∫_{−π}^{π} (dφ/2π) ρ(φ) A(φ) .   (6.95)

⁹ The denominator of 2π in the measure is not necessary, and in fact it is even slightly cumbersome. It divides out whenever we take a ratio to compute a thermodynamic average. I introduce this factor to preserve the relation Tr 1 = 1. I personally find unnormalized traces to be profoundly unsettling on purely aesthetic grounds.
We now extremize the functional F[ρ(φ)] = E − TS with respect to ρ(φ), under the condition that Tr ρ = 1. We therefore use Lagrange's method of undetermined multipliers, writing

F̃ = F − N k_B T λ (Tr ρ − 1) .   (6.96)

Note that F̃ is a function of the Lagrange multiplier λ and a functional of the density matrix ρ(φ). The prefactor N k_B T which multiplies λ is of no mathematical consequence; we could always redefine the multiplier to be λ′ ≡ λ N k_B T. It is present only to maintain homogeneity and proper dimensionality of F̃, with λ itself dimensionless and of order N⁰. We now have

δF̃/δρ(φ) = δ/δρ(φ) { −½ N Ĵ(0) | Tr(ρ e^{iφ′}) |² − N H Tr(ρ cos φ′) + N k_B T Tr(ρ ln ρ) − N k_B T λ (Tr ρ − 1) } .
To this end, we note that

δ/δρ(φ) Tr(ρ A) = δ/δρ(φ) ∫_{−π}^{π} (dφ′/2π) ρ(φ′) A(φ′) = A(φ)/2π .   (6.97)

Thus, we have

δF̃/δρ(φ) = −½ N Ĵ(0) (1/2π) { Tr′(ρ e^{iφ′}) e^{−iφ} + Tr′(ρ e^{−iφ′}) e^{iφ} } − N H (cos φ)/2π + N k_B T (1/2π) { ln ρ(φ) + 1 } − N k_B T λ/2π ,   (6.98)

where Tr′ denotes the trace over the primed angular variable.
Now let us define

Tr(ρ e^{iφ}) ≡ ∫_{−π}^{π} (dφ/2π) ρ(φ) e^{iφ} ≡ m e^{iφ₀} .   (6.99)

Setting the variation (6.98) to zero, we then have

ln ρ(φ) = (Ĵ(0)/k_B T) m cos(φ − φ₀) + (H/k_B T) cos φ + λ − 1 .   (6.100)

Clearly the free energy will be reduced if φ₀ = 0, so that the mean field is maximal and aligns with the external field, which prefers φ = 0. Thus, we conclude

ρ(φ) = C exp( (H_eff/k_B T) cos φ ) ,   (6.101)

where

H_eff = Ĵ(0) m + H   (6.102)
and C = e^{λ−1}. The value of λ is then determined by invoking the constraint,

Tr ρ = 1 = C ∫_{−π}^{π} (dφ/2π) exp( (H_eff/k_B T) cos φ ) = C I₀(H_eff/k_B T) ,   (6.103)

where I₀(z) is the modified Bessel function. We are free to define

ε ≡ H_eff/k_B T ,   (6.104)

and to treat ε as our single variational parameter. We then have the normalized density matrix

ρ(φ) = e^{ε cos φ} / ∫_{−π}^{π} (dφ′/2π) e^{ε cos φ′} = e^{ε cos φ} / I₀(ε) .   (6.105)
We next compute the following averages:

⟨e^{±iφ}⟩ = ∫_{−π}^{π} (dφ/2π) ρ(φ) e^{±iφ} = I₁(ε)/I₀(ε)   (6.106)

⟨cos(φ − φ′)⟩ = Re ⟨e^{iφ} e^{−iφ′}⟩ = ( I₁(ε)/I₀(ε) )² ,   (6.107)

as well as

Tr(ρ ln ρ) = ∫_{−π}^{π} (dφ/2π) ( e^{ε cos φ}/I₀(ε) ) { ε cos φ − ln I₀(ε) } = ε I₁(ε)/I₀(ε) − ln I₀(ε) .   (6.108)

The dimensionless free energy per site is therefore

f(ε, h, θ) = −½ ( I₁(ε)/I₀(ε) )² + (θε − h) I₁(ε)/I₀(ε) − θ ln I₀(ε) ,   (6.109)

with θ = k_B T/Ĵ(0) and h = H/Ĵ(0) and f = F/N Ĵ(0) as before.
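Since f(ε) in eqn. 6.109 involves only I₀ and I₁, it is straightforward to minimize numerically over the single variational parameter ε, using the series of eqn. 6.110 below for the Bessel functions. A minimal sketch (the series cutoff, grid resolution, and the two temperatures are arbitrary illustrative choices):

```python
import math

def bessel_i(nu, z, terms=40):
    """Modified Bessel function I_nu(z) for integer nu >= 0, summed from the
    series of eqn. 6.110 (for integer nu, Gamma(k + nu + 1) = (k + nu)!)."""
    return sum((z / 2) ** (2 * k + nu) / (math.factorial(k) * math.factorial(k + nu))
               for k in range(terms))

def f_xy(eps, theta, h=0.0):
    """Dimensionless XY free energy per site, eqn. 6.109."""
    ratio = bessel_i(1, eps) / bessel_i(0, eps)
    return -0.5 * ratio ** 2 + (theta * eps - h) * ratio - theta * math.log(bessel_i(0, eps))

def minimize_eps(theta, grid=1000, eps_max=10.0):
    # Crude grid search over eps >= 0; adequate to locate a 1D minimum.
    return min((f_xy(i * eps_max / grid, theta), i * eps_max / grid)
               for i in range(grid + 1))[1]

eps_low = minimize_eps(theta=0.4)    # below theta_c = 1/2: ordered, eps > 0
eps_high = minimize_eps(theta=0.6)   # above theta_c: minimum stays at eps = 0
```

The minimizer jumps from ε = 0 to ε > 0 as θ crosses ½, consistent with the second order transition derived below.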
For small ε, we may expand the Bessel functions, using

I_ν(z) = (½z)^ν Σ_{k=0}^{∞} (¼z²)^k / ( k! Γ(k + ν + 1) ) ,   (6.110)

to obtain

f(ε, h, θ) = ¼ (θ − ½) ε² + (1/64) (2 − 3θ) ε⁴ − ½ h ε + (1/16) h ε³ + ... .   (6.111)

This predicts a second order phase transition at θ_c = ½.¹⁰ Note also the Curie-Weiss form of the susceptibility at high θ:

∂f/∂ε = 0 ⟹ ε = h/(θ − θ_c) + ... .   (6.112)

¹⁰ Note that the coefficient of the quartic term in ε is negative for θ > 2/3. At θ = θ_c = ½, the coefficient is positive, but for larger θ one must include higher order terms in the Landau expansion.
6.7 Landau Theory of Phase Transitions

Landau's theory of phase transitions is based on an expansion of the free energy of a thermodynamic system in terms of an order parameter, which is nonzero in an ordered phase and zero in a disordered phase. For example, the magnetization M of a ferromagnet in zero external field but at finite temperature typically vanishes for temperatures T > T_c, where T_c is the critical temperature, also called the Curie temperature in a ferromagnet. A low order expansion in powers of the order parameter is appropriate sufficiently close to the phase transition, i.e. at temperatures such that the order parameter, if nonzero, is still small.

The simplest example is the quartic free energy,

f(m, h = 0, θ) = f₀ + ½ a m² + ¼ b m⁴ ,   (6.113)

where f₀ = f₀(θ), a = a(θ), and b = b(θ). Here, θ is a dimensionless measure of the temperature. If for example the local exchange energy in the ferromagnet is J, then we might define θ = k_B T/zJ, as before. Let us assume b > 0, which is necessary if the free energy is to be bounded from below.¹¹ The equation of state,

∂f/∂m = 0 = am + bm³ ,   (6.114)
has three solutions in the complex m plane: (i) m = 0, (ii) m = +√(−a/b), and (iii) m = −√(−a/b). The latter two solutions lie along the (physical) real axis if a < 0. We assume that there exists a unique temperature θ_c where a(θ_c) = 0. Minimizing f, we find

θ < θ_c : f(θ) = f₀ − a²/4b
θ > θ_c : f(θ) = f₀ .   (6.115)
The free energy is continuous at θ_c since a(θ_c) = 0. The specific heat, however, is discontinuous across the transition, with

c(θ_c⁺) − c(θ_c⁻) = −θ_c (∂²/∂θ²)|_{θ=θ_c} ( a²/4b ) = −θ_c [a′(θ_c)]² / ( 2 b(θ_c) ) .   (6.116)
The presence of a magnetic field h breaks the Z₂ symmetry of m → −m. The free energy becomes

f(m, h, θ) = f₀ + ½ a m² + ¼ b m⁴ − hm ,   (6.117)

and the mean field equation is

b m³ + a m − h = 0 .   (6.118)

This is a cubic equation for m with real coefficients, and as such it can either have three real solutions or one real solution and two complex solutions related by complex conjugation.

¹¹ It is always the case that f is bounded from below, on physical grounds. Were b negative, we'd have to consider higher order terms in the Landau expansion.
Figure 6.9: Phase diagram for the quartic mean field theory f = f₀ + ½am² + ¼bm⁴ − hm, with b > 0. There is a first order line at h = 0 extending from a = −∞ and terminating in a critical point at a = 0. For |h| < h*(a) (dashed red line) there are three solutions to the mean field equation, corresponding to one global minimum, one local minimum, and one local maximum. Insets show behavior of the free energy f(m).
Clearly we must have a < 0 in order to have three real roots, since bm³ + am is monotonically increasing otherwise. The boundary between these two classes of solution sets occurs when two roots coincide, which means f′(m) = 0 as well as f″(m) = 0. Simultaneously solving these two equations, we find

h*(a) = (2/3^{3/2}) (−a)^{3/2} / b^{1/2} ,   (6.119)

or, equivalently,

a*(h) = −(3/2^{2/3}) b^{1/3} |h|^{2/3} .   (6.120)

If, for fixed h, we have a < a*(h), then there will be three real solutions to the mean field equation f′(m) = 0, one of which is a global minimum (the one for which m·h > 0). For a > a*(h) there is only a single global minimum, at which m also has the same sign as h.
If we solve the mean field equation perturbatively in h/a, we find

m(a, h) = h/a − (b/a⁴) h³ + O(h⁵)   (a > 0)
m(a, h) = ± |a|^{1/2}/b^{1/2} + h/2|a| ∓ ( 3 b^{1/2}/8 |a|^{5/2} ) h² + O(h³)   (a < 0) .   (6.121)
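Both the root counting across the spinodal line h*(a) of eqn. 6.119 and the a > 0 perturbative branch of eqn. 6.121 can be verified with a numerical root finder. A minimal sketch (the parameter values a = ∓1, b = 1, and the test fields are arbitrary illustrative choices):

```python
import numpy as np

def real_roots(a, b, h, tol=1e-9):
    """Real solutions of the mean field equation b m^3 + a m - h = 0 (eqn. 6.118)."""
    roots = np.roots([b, 0.0, a, -h])
    return sorted(r.real for r in roots if abs(r.imag) < tol)

a, b = -1.0, 1.0
h_star = (2.0 / 3.0 ** 1.5) * (-a) ** 1.5 / b ** 0.5   # spinodal field, eqn. 6.119

assert len(real_roots(a, b, 0.5 * h_star)) == 3   # |h| < h*: three real solutions
assert len(real_roots(a, b, 2.0 * h_star)) == 1   # |h| > h*: a single solution
```

For a > 0 the unique root is well approximated by the perturbative expression h/a − (b/a⁴)h³.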
6.7.1 Cubic terms in Landau theory: first order transitions

Next, consider a free energy with a cubic term,

f = f₀ + ½ a m² − (1/3) y m³ + ¼ b m⁴ ,   (6.122)

with b > 0 for stability. Without loss of generality, we may assume y > 0 (else send m → −m). Note that we no longer have m → −m (i.e. Z₂) symmetry. The cubic term favors positive m. What is the phase diagram in the (a, y) plane?

Extremizing the free energy with respect to m, we obtain

∂f/∂m = 0 = am − ym² + bm³ .   (6.123)

This cubic equation factorizes into a linear and quadratic piece, and hence may be solved simply. The three solutions are m = 0 and

m = m_± ≡ y/2b ± √( (y/2b)² − a/b ) .   (6.124)
We now see that for y² < 4ab there is only one real solution, at m = 0, while for y² > 4ab there are three real solutions. Which solution has lowest free energy? To find out, we compare the energy f(0) with f(m₊).¹² Thus, we set f(m) = f(0), i.e.

½ a m² − (1/3) y m³ + ¼ b m⁴ = 0 ,   (6.125)

and we now have two quadratic equations to solve simultaneously:

0 = a − ym + bm²
0 = ½ a − (1/3) ym + ¼ bm² .   (6.126)

Eliminating the quadratic term gives m = 3a/y. Finally, substituting m = m₊ gives us a relation between a, b, and y:

y² = (9/2) ab .   (6.127)
Thus, we have the following:

a > y²/4b : 1 real root m = 0
y²/4b > a > 2y²/9b : 3 real roots; minimum at m = 0
2y²/9b > a : 3 real roots; minimum at m = m₊ = y/2b + √( (y/2b)² − a/b )   (6.128)
The solution m = 0 lies at a local minimum of the free energy for a > 0 and at a local maximum for a < 0. Over the range y²/4b > a > 2y²/9b, then, there is a global minimum at m = 0, a local minimum at m = m₊, and a local maximum at m = m₋, with m₊ > m₋ > 0. For 2y²/9b > a > 0, there is a local minimum at m = 0, a global minimum at m = m₊, and a local maximum at m = m₋, again with m₊ > m₋ > 0. For a < 0, there is a local maximum at m = 0, a local minimum at m = m₋, and a global minimum at m = m₊, with m₊ > 0 > m₋. See fig. 6.10.

¹² We needn't waste our time considering the m = m₋ solution, since the cubic term prefers positive m.
Figure 6.10: Behavior of the quartic free energy f(m) = ½am² − (1/3)ym³ + ¼bm⁴. A: y² < 4ab; B: 4ab < y² < (9/2)ab; C and D: y² > (9/2)ab. The thick black line denotes a line of first order transitions, where the order parameter is discontinuous across the transition.
6.7.2 Magnetization dynamics

Suppose we now impose some dynamics on the system, of the simple relaxational type

∂m/∂t = −Γ ∂f/∂m ,   (6.129)

where Γ is a phenomenological kinetic coefficient. Assuming y > 0 and b > 0, it is convenient to adimensionalize by writing

m ≡ (y/b) u ,  a ≡ (y²/b) r ,  t ≡ (b/Γy²) s .   (6.130)

Then we obtain

∂u/∂s = −∂φ/∂u ,   (6.131)

where the dimensionless free energy function is

φ(u) = ½ r u² − (1/3) u³ + ¼ u⁴ .   (6.132)

We see that there is a single control parameter, r. The fixed points of the dynamics are then the stationary points of φ(u), where φ′(u) = 0, with

φ′(u) = u (r − u + u²) .   (6.133)
Figure 6.11: Fixed points for φ(u) = ½ru² − (1/3)u³ + ¼u⁴ and flow under the dynamics u̇ = −φ′(u). Solid curves represent stable fixed points and dashed curves unstable fixed points. Magenta arrows show behavior under slowly increasing control parameter r and dark blue arrows show behavior under slowly decreasing r. For u > 0 there is a hysteresis loop. The thick black curve shows the equilibrium thermodynamic value of u(r), i.e. that value which minimizes the free energy φ(u). There is a first order phase transition at r = 2/9, where the thermodynamic value of u jumps from u = 0 to u = 2/3.
The solutions to φ′(u) = 0 are then given by

u* = 0 ,  u*_± = ½ ± √( ¼ − r ) .   (6.134)

For r > ¼ there is one fixed point at u = 0, which is attractive under the dynamics u̇ = −φ′(u) since φ″(0) = r. At r = ¼ there occurs a saddle-node bifurcation and a pair of fixed points is generated, one stable and one unstable. As we see from fig. 6.11, the interior fixed point is always unstable and the two exterior fixed points are always stable. At r = 0 there is a transcritical bifurcation where two fixed points of opposite stability collide and bounce off one another (metaphorically speaking).

At the saddle-node bifurcation, r = ¼ and u = ½, and we find φ(u = ½; r = ¼) = 1/192, which is positive. Thus, the thermodynamic state of the system remains at u = 0 until the value of φ(u₊) crosses zero. This occurs when φ(u) = 0 and φ′(u) = 0, the simultaneous solution of which yields r = 2/9 and u = 2/3.
Suppose we slowly ramp the control parameter r up and down as a function of the dimensionless time s. Under the dynamics of eqn. 6.131, u(s) flows to the first stable fixed point encountered; this is always the case for a dynamical system with a one-dimensional phase space. Then as r is further varied, u follows the position of whatever locally stable fixed point it initially encountered. Thus, u(r(s)) evolves smoothly until a bifurcation is encountered. The situation is depicted by the arrows in fig. 6.11. The equilibrium thermodynamic value for u(r) is discontinuous; there is a first order phase transition at r = 2/9, as we've already seen. As r is increased, u(r) follows a trajectory indicated by the magenta arrows. For a negative initial value of u, the evolution as a function of r will be reversible. However, if u(0) is initially positive, then the system exhibits hysteresis, as shown. Starting with a large positive value of r, u(s) quickly evolves to u = 0⁺, which means a positive infinitesimal value. Then as r is decreased, the system remains at u = 0⁺ even through the first order transition, because u = 0 is an attractive fixed point. However, once r begins to go negative, the u = 0 fixed point becomes repulsive, and u(s) quickly flows to the stable fixed point u₊ = ½ + √(¼ − r). Further decreasing r, the system remains on this branch. If r is later increased, then u(s) remains on the upper branch past r = 0, until the u₊ fixed point annihilates with the unstable fixed point at u₋ = ½ − √(¼ − r), at which time u(s) quickly flows down to u = 0⁺ again.
6.7.3 Sixth order Landau theory: tricritical point

Finally, consider a model with Z₂ symmetry, with the Landau free energy

f = f₀ + ½ a m² + ¼ b m⁴ + (1/6) c m⁶ ,   (6.135)

with c > 0 for stability. We seek the phase diagram in the (a, b) plane. Extremizing f with respect to m, we obtain

∂f/∂m = 0 = m (a + b m² + c m⁴) ,   (6.136)

which is a quintic with five solutions over the complex m plane. One solution is obviously m = 0. The other four are

m = ± √( −b/2c ± √( (b/2c)² − a/c ) ) .   (6.137)
For each ± symbol in the above equation, there are two options, hence four roots in all.

If a > 0 and b > 0, then four of the roots are imaginary and there is a unique minimum at m = 0.

For a < 0, there are only three solutions to f′(m) = 0 for real m, since the − choice for the sign under the radical leads to imaginary roots. One of the solutions is m = 0. The other two are

m = ± √( −b/2c + √( (b/2c)² − a/c ) ) .   (6.138)
Figure 6.12: Behavior of the sextic free energy f(m) = ½am² + ¼bm⁴ + (1/6)cm⁶. A: a > 0 and b > 0; B: a < 0 and b > 0; C: a < 0 and b < 0; D: a > 0 and b < −(4/√3)√(ac); E: a > 0 and −(4/√3)√(ac) < b < −2√(ac); F: a > 0 and −2√(ac) < b < 0. The thick dashed line is a line of second order transitions, which meets the thick solid line of first order transitions at the tricritical point, (a, b) = (0, 0).
The most interesting situation is a > 0 and b < 0. If a > 0 and b < −2√(ac), all five roots are real. There must be three minima, separated by two local maxima. Clearly if m* is a solution, then so is −m*. Thus, the only question is whether the outer minima are of lower energy than the minimum at m = 0. We assess this by demanding f(m*) = f(0), where m* is the position of the largest root (i.e. the rightmost minimum). This gives a second quadratic equation,

0 = ½ a + ¼ b m² + (1/6) c m⁴ ,   (6.139)

which together with equation 6.136 gives

b = −(4/√3) √(ac) .   (6.140)
Figure 6.13: Free energy φ(u) = ½ru² − ¼u⁴ + (1/6)u⁶ for several different values of the control parameter r.

Thus, we have the following, for fixed a > 0:

b > −2√(ac) : 1 real root m = 0
−2√(ac) > b > −(4/√3)√(ac) : 5 real roots; minimum at m = 0   (6.141)
−(4/√3)√(ac) > b : 5 real roots; minima at m = ± √( −b/2c + √( (b/2c)² − a/c ) )

The point (a, b) = (0, 0), which lies at the confluence of a first order line and a second order line, is known as a tricritical point.
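The first order condition (6.140) can be checked numerically: at b = −(4/√3)√(ac) the outer minima of the sextic free energy are exactly degenerate with the one at m = 0. A minimal sketch (a = c = 1 are arbitrary illustrative values; f₀ is set to zero):

```python
import math

def f_sextic(m, a, b, c):
    """Sextic Landau free energy, eqn. 6.135, taking f0 = 0."""
    return 0.5 * a * m ** 2 + 0.25 * b * m ** 4 + c * m ** 6 / 6.0

a, c = 1.0, 1.0
b_star = -4.0 * math.sqrt(a * c) / math.sqrt(3.0)   # first order line, eqn. 6.140
# Outer minima from eqn. 6.138 (m2 is the square of the outer root):
m2 = -b_star / (2 * c) + math.sqrt((b_star / (2 * c)) ** 2 - a / c)
m_star = math.sqrt(m2)

assert abs(f_sextic(m_star, a, b_star, c)) < 1e-12     # degenerate with f(0) = 0
assert abs(a + b_star * m2 + c * m2 ** 2) < 1e-12      # and stationary, eqn. 6.136
```

Here one also finds m*² = √(3a/c), which follows from combining eqns. 6.136 and 6.139.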
6.7.4 Hysteresis for the sextic potential

Once again, we consider the dissipative dynamics ṁ = −Γ f′(m). We adimensionalize by writing

m ≡ √(|b|/c) u ,  a ≡ (b²/c) r ,  t ≡ (c/Γb²) s .   (6.142)

Then we obtain once again the dimensionless equation

∂u/∂s = −∂φ/∂u ,   (6.143)

where

φ(u) = ½ r u² ± ¼ u⁴ + (1/6) u⁶ .   (6.144)
In the above equation, the coefficient of the quartic term is positive if b > 0 and negative if b < 0; that is, the sign of the quartic term is sgn(b). When b > 0 we can ignore the sextic term for sufficiently small u, and we recover the quartic free energy studied earlier. There is then a second order transition at r = 0.

New and interesting behavior occurs for b < 0. The fixed points of the dynamics are obtained by setting φ′(u) = 0. We have

φ(u) = ½ r u² − ¼ u⁴ + (1/6) u⁶
φ′(u) = u (r − u² + u⁴) .   (6.145)

Thus, the equation φ′(u) = 0 factorizes into a linear factor u and a quartic factor u⁴ − u² + r which is quadratic in u². Thus, we can easily obtain the roots:
r < 0 : u* = 0 ,  u* = ± √( ½ + √(¼ − r) )
0 < r < ¼ : u* = 0 ,  u* = ± √( ½ + √(¼ − r) ) ,  u* = ± √( ½ − √(¼ − r) )
r > ¼ : u* = 0 .   (6.146)
In fig. 6.14, we plot the fixed points and the hysteresis loops for this system. At r = ¼, there are two symmetrically located saddle-node bifurcations at u = ±1/√2. We find φ(u = ±1/√2, r = ¼) = 1/48, which is positive, indicating that the stable fixed point u* = 0 remains the thermodynamic minimum for the free energy φ(u) as r is decreased through r = ¼. Setting φ(u) = 0 and φ′(u) = 0 simultaneously, we obtain r = 3/16 and u = ±√3/2. The thermodynamic value for u therefore jumps discontinuously from u = 0 to u = ±√3/2 (either branch) at r = 3/16; this is a first order transition.
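The special values quoted above (φ = 1/48 at the saddle-node, and the first order point at r = 3/16, u = ±√3/2) follow directly from eqns. 6.144 and 6.145 and are easy to verify numerically:

```python
import math

def phi(u, r):
    """phi(u) = r u^2/2 - u^4/4 + u^6/6, eqn. 6.144 with b < 0."""
    return 0.5 * r * u ** 2 - 0.25 * u ** 4 + u ** 6 / 6.0

def phi_prime(u, r):
    """phi'(u) = u (r - u^2 + u^4), eqn. 6.145."""
    return u * (r - u ** 2 + u ** 4)

# Saddle-node bifurcation: at r = 1/4, u = 1/sqrt(2), phi = 1/48 > 0 ...
assert abs(phi(1 / math.sqrt(2), 0.25) - 1 / 48) < 1e-12
# ... so the first order transition happens later, at r = 3/16, u = sqrt(3)/2,
# where the outer minimum becomes degenerate with phi(0) = 0 and is stationary:
u1 = math.sqrt(3) / 2
assert abs(phi(u1, 3 / 16)) < 1e-12
assert abs(phi_prime(u1, 3 / 16)) < 1e-12
```

A barrier (φ > 0) separates the degenerate minima at the transition, as required for a first order jump.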
Under the dissipative dynamics considered here, the system exhibits hysteresis, as indicated in the figure, where the arrows show the evolution of u(s) for very slowly varying r(s). When the control parameter r is large and positive, the flow is toward the sole fixed point at u* = 0. At r = ¼, two simultaneous saddle-node bifurcations take place at u* = ±1/√2; the outer branch is stable and the inner branch unstable in both cases. At r = 0 there is a subcritical pitchfork bifurcation, and the fixed point at u* = 0 becomes unstable.
Suppose one starts off with r ≫ ¼ with some value u > 0. The flow u̇ = −φ′(u) then rapidly results in u → 0⁺. This is the high temperature phase in which there is no magnetization. Now let r decrease slowly, using s as the dimensionless time variable. The scaled magnetization u(s) = u*(r(s)) will remain pinned at the fixed point u* = 0⁺. As r passes through r = ¼, two new stable values of u* appear, but our system remains at u = 0⁺, since u* = 0 is a stable fixed point. But after the subcritical pitchfork, u* = 0 becomes unstable. The magnetization u(s) then flows rapidly to the stable fixed point at u* = 1, and follows the curve u*(r) = ( ½ + (¼ − r)^{1/2} )^{1/2} for all r < 0.
Now suppose we start increasing r (i.e. increasing temperature). The magnetization follows the stable fixed point u*(r) = ( ½ + (¼ − r)^{1/2} )^{1/2} past r = 0, beyond the first order phase transition point at r = 3/16, and all the way up to r = ¼, at which point this fixed point is annihilated at a saddle-node bifurcation. The flow then rapidly takes u → u* = 0⁺, where it remains as r continues to be increased further.

Figure 6.14: Fixed points φ′(u*) = 0 for the sextic potential φ(u) = ½ru² − ¼u⁴ + (1/6)u⁶, and corresponding dynamical flow (arrows) under u̇ = −φ′(u). Solid curves show stable fixed points and dashed curves show unstable fixed points. The thick solid black and solid grey curves indicate the equilibrium thermodynamic values for u; note the overall u → −u symmetry. Within the region r ∈ [0, ¼] the dynamics are irreversible and the system exhibits the phenomenon of hysteresis. There is a first order phase transition at r = 3/16.
Within the region r ∈ [0, ¼] of control parameter space, the dynamics are said to be irreversible and the behavior of u(s) is said to be hysteretic.
6.8 Correlation and Response in Mean Field Theory

Consider the Ising model,

Ĥ = −½ Σ_{i,j} J_ij σ_i σ_j − Σ_k H_k σ_k ,   (6.147)

where the local magnetic field on site k is now H_k. We assume without loss of generality that J_ii = 0. Now consider the partition function Z = Tr e^{−βĤ} as a function of the temperature T and the local field values H_i. We have

∂Z/∂H_i = β Tr( σ_i e^{−βĤ} ) = βZ ⟨σ_i⟩
∂²Z/∂H_i ∂H_j = β² Tr( σ_i σ_j e^{−βĤ} ) = β²Z ⟨σ_i σ_j⟩ .   (6.148)
Thus,

m_i = −∂F/∂H_i = ⟨σ_i⟩
χ_ij = ∂m_i/∂H_j = −∂²F/∂H_i ∂H_j = (1/k_B T) { ⟨σ_i σ_j⟩ − ⟨σ_i⟩⟨σ_j⟩ } .   (6.149)

Expressions such as ⟨σ_i⟩, ⟨σ_i σ_j⟩, etc. are in general called correlation functions. For example, we define the spin-spin correlation function C_ij as

C_ij ≡ ⟨σ_i σ_j⟩ − ⟨σ_i⟩⟨σ_j⟩ .   (6.150)
Expressions such as ∂F/∂H_i and ∂²F/∂H_i ∂H_j are called response functions. The above relation between correlation functions and response functions, C_ij = k_B T χ_ij, is valid only for the equilibrium distribution. In particular, this relationship is invalid if one uses an approximate distribution, such as the variational density matrix formalism of mean field theory.

The question then arises: within mean field theory, which is more accurate: correlation functions or response functions? A simple argument suggests that the response functions are more accurate representations of the real physics. To see this, let's write the variational density matrix ρ_var as the sum of the exact equilibrium (Boltzmann) distribution ρ_eq = Z⁻¹ exp(−βĤ) plus a deviation δρ:

ρ_var = ρ_eq + δρ .   (6.151)
Then if we calculate a correlator using the variational distribution, we have
\[
\langle\sigma_i\,\sigma_j\rangle_{\rm var} = \mathrm{Tr}\,\big[\varrho_{\rm var}\,\sigma_i\,\sigma_j\big]
= \mathrm{Tr}\,\big[\varrho_{\rm eq}\,\sigma_i\,\sigma_j\big] + \mathrm{Tr}\,\big[\delta\varrho\,\sigma_i\,\sigma_j\big]\ . \tag{6.152}
\]
Thus, the variational density matrix gets the correlator right to first order in $\delta\varrho$. On the other hand, the free energy is given by
\[
F_{\rm var} = F_{\rm eq} + \sum_{\sigma}\frac{\partial F}{\partial\varrho_{\sigma}}\bigg|_{\varrho_{\rm eq}}\!\delta\varrho_{\sigma}
+ \frac{1}{2}\sum_{\sigma,\sigma'}\frac{\partial^2\! F}{\partial\varrho_{\sigma}\,\partial\varrho_{\sigma'}}\bigg|_{\varrho_{\rm eq}}\!\delta\varrho_{\sigma}\,\delta\varrho_{\sigma'} + \ldots\ . \tag{6.153}
\]
Here $\sigma$ denotes a state of the system, i.e. $|\,\sigma\,\rangle = |\,\sigma_1,\ldots,\sigma_N\,\rangle$, where every spin polarization is specified. Since the free energy is an extremum (and in fact an absolute minimum) with respect to the distribution, the second term on the RHS vanishes. This means that the free energy is accurate to second order in the deviation $\delta\varrho$.
6.8.1 Calculation of the response functions
Consider the variational density matrix
\[
\varrho(\boldsymbol\sigma) = \prod_i \varrho_i(\sigma_i)\ , \tag{6.154}
\]
where
\[
\varrho_i(\sigma_i) = \bigg(\frac{1+m_i}{2}\bigg)\,\delta_{\sigma_i,1} + \bigg(\frac{1-m_i}{2}\bigg)\,\delta_{\sigma_i,-1}\ . \tag{6.155}
\]
The variational energy $E = \mathrm{Tr}\,(\varrho\,\hat H)$ is
\[
E = -\frac{1}{2}\sum_{ij} J_{ij}\,m_i\,m_j - \sum_i H_i\,m_i \tag{6.156}
\]
and the entropy $S = -k_{\rm B}\,\mathrm{Tr}\,(\varrho\,\ln\varrho)$ is
\[
S = -k_{\rm B}\sum_i\bigg\{\bigg(\frac{1+m_i}{2}\bigg)\ln\!\bigg(\frac{1+m_i}{2}\bigg) + \bigg(\frac{1-m_i}{2}\bigg)\ln\!\bigg(\frac{1-m_i}{2}\bigg)\bigg\}\ . \tag{6.157}
\]
Setting the variation $\frac{\partial F}{\partial m_i} = 0$, with $F = E - TS$, we obtain the mean field equations,
\[
m_i = \tanh\!\big(\beta J_{ij}\,m_j + \beta H_i\big)\ , \tag{6.158}
\]
where we use the summation convention: $J_{ij}\,m_j \equiv \sum_j J_{ij}\,m_j$. Suppose $T > T_{\rm c}$ and $m_i$ is small. Then we can expand the RHS of the above mean field equations, obtaining
\[
\big(k_{\rm B}T\,\delta_{ij} - J_{ij}\big)\,m_j = H_i\ . \tag{6.159}
\]
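Away from the linearized regime, the coupled mean field equations (6.158) are readily solved by fixed-point iteration. A minimal sketch in Python; the ring geometry, coupling strength, and field value here are illustrative assumptions, not taken from the text:

```python
import numpy as np

# Nearest-neighbor ring of N sites: J_ij = J for |i - j| = 1 (mod N).
N, J, kT = 10, 1.0, 2.5          # k_B T above the MFT T_c = 2J for this lattice
Jmat = np.zeros((N, N))
for i in range(N):
    Jmat[i, (i + 1) % N] = Jmat[i, (i - 1) % N] = J
H = 0.05 * np.ones(N)            # small uniform local field

# Iterate m_i <- tanh[(J_ij m_j + H_i)/k_B T] until converged.
m = np.zeros(N)
for _ in range(10000):
    m_new = np.tanh((Jmat @ m + H) / kT)
    if np.max(np.abs(m_new - m)) < 1e-12:
        m = m_new
        break
    m = m_new

print(m)   # a uniform solution: every site has the same magnetization
```

Above $T_{\rm c}$ the iteration contracts (the slope of the RHS is less than one), so convergence is rapid.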
Thus, the susceptibility tensor $\chi$ is the inverse of the matrix $(k_{\rm B}T\cdot\mathbb{I} - J)$:
\[
\chi_{ij} = \frac{\partial m_i}{\partial H_j} = \big(k_{\rm B}T\cdot\mathbb{I} - J\big)^{-1}_{ij}\ , \tag{6.160}
\]
where $\mathbb{I}$ is the identity. Note also that so-called connected averages of the kind in eqn. 6.150 vanish identically if we compute them using our variational density matrix, since all the sites are independent, hence
\[
\langle\sigma_i\,\sigma_j\rangle = \mathrm{Tr}\,\big[\varrho_{\rm var}\,\sigma_i\,\sigma_j\big] = \mathrm{Tr}\,\big[\varrho_i\,\sigma_i\big]\cdot\mathrm{Tr}\,\big[\varrho_j\,\sigma_j\big] = \langle\sigma_i\rangle\cdot\langle\sigma_j\rangle\ , \tag{6.161}
\]
and therefore $C_{ij} = 0$ if we compute the correlation functions themselves from the variational density matrix, rather than from the free energy $F$. As we have argued above, the latter approximation is more accurate.
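Eqn. 6.160 is easy to verify numerically. For a nearest-neighbor ring (an illustrative assumption), the row sums of $\chi$ give the uniform susceptibility $1/\big(k_{\rm B}T - \hat J(0)\big)$ with $\hat J(0) = 2J$, which diverges as $T \to T_{\rm c}$ from above. A sketch:

```python
import numpy as np

N, J, kT = 8, 1.0, 3.0                    # k_B T above k_B T_c = J_hat(0) = 2J
Jmat = np.zeros((N, N))
for i in range(N):
    Jmat[i, (i + 1) % N] = Jmat[i, (i - 1) % N] = J

# Susceptibility tensor chi = (k_B T I - J)^{-1}, eqn 6.160.
chi = np.linalg.inv(kT * np.eye(N) - Jmat)

# Uniform susceptibility: sum_j chi_ij = 1/(k_B T - J_hat(0)).
print(chi.sum(axis=1))                     # each entry equals 1/(3 - 2) = 1
```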
Assuming $J_{ij} = J(\mathbf R_i - \mathbf R_j)$, where $\mathbf R_i$ is a Bravais lattice site, we can Fourier transform the above equation, resulting in
\[
\hat m(\mathbf q) = \frac{\hat H(\mathbf q)}{k_{\rm B}T - \hat J(\mathbf q)} \equiv \hat\chi(\mathbf q)\,\hat H(\mathbf q)\ . \tag{6.162}
\]
Once again, our definition of the lattice Fourier transform of a function $\phi(\mathbf R)$ is
\[
\hat\phi(\mathbf q) \equiv \sum_{\mathbf R}\phi(\mathbf R)\,e^{-i\mathbf q\cdot\mathbf R}
\qquad,\qquad
\phi(\mathbf R) = \Omega\!\int\limits_{\hat\Omega}\!\frac{d^d\! q}{(2\pi)^d}\;\hat\phi(\mathbf q)\,e^{i\mathbf q\cdot\mathbf R}\ , \tag{6.163}
\]
where $\Omega$ is the unit cell in real space, called the Wigner-Seitz cell, and $\hat\Omega$ is the first Brillouin zone, which is the unit cell in reciprocal space. Similarly, we have
\[
\hat J(\mathbf q) = \sum_{\mathbf R} J(\mathbf R)\,\Big(1 - i\,\mathbf q\!\cdot\!\mathbf R - \tfrac{1}{2}(\mathbf q\!\cdot\!\mathbf R)^2 + \ldots\Big)
= \hat J(0)\cdot\big(1 - q^2\,R_*^2 + \mathcal O(q^4)\big)\ , \tag{6.164}
\]
where
\[
R_*^2 = \frac{\sum_{\mathbf R}\mathbf R^2\,J(\mathbf R)}{2d\sum_{\mathbf R} J(\mathbf R)}\ . \tag{6.165}
\]
Here we have assumed inversion symmetry for the lattice, in which case
\[
\sum_{\mathbf R} R^\mu\,R^\nu\,J(\mathbf R) = \frac{1}{d}\,\delta^{\mu\nu}\sum_{\mathbf R}\mathbf R^2\,J(\mathbf R)\ . \tag{6.166}
\]
On cubic lattices with nearest neighbor interactions only, one has $R_* = a/\sqrt{2d}$, where $a$ is the lattice constant and $d$ is the dimension of space.
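The result $R_* = a/\sqrt{2d}$ follows from applying eqn. 6.165 to the $2d$ nearest-neighbor vectors $\pm a\,\hat{\mathbf e}_\mu$, with $J(\mathbf R)$ constant on those vectors. A quick numerical check (a sketch):

```python
import numpy as np

a, d = 1.0, 3
# Nearest-neighbor separation vectors +-a e_mu on the d-dimensional cubic lattice.
R_vecs = [sgn * a * e for e in np.eye(d) for sgn in (+1, -1)]

# Eqn 6.165 with J(R) = J on the 2d neighbor vectors; the constant J cancels.
num = sum(np.dot(R, R) for R in R_vecs)   # sum_R R^2 J(R) / J
den = 2 * d * len(R_vecs)                 # 2d sum_R J(R) / J
R_star = np.sqrt(num / den)

print(R_star, a / np.sqrt(2 * d))         # both equal a/sqrt(2d)
```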
Thus, with the identification $k_{\rm B}T_{\rm c} = \hat J(0)$, we have
\[
\hat\chi(\mathbf q) = \frac{1}{k_{\rm B}(T - T_{\rm c}) + k_{\rm B}T_{\rm c}\,R_*^2\,q^2 + \mathcal O(q^4)}
= \frac{1}{k_{\rm B}T_{\rm c}\,R_*^2}\cdot\frac{1}{\xi^{-2} + q^2 + \mathcal O(q^4)}\ , \tag{6.167}
\]
where
\[
\xi = R_*\cdot\bigg(\frac{T - T_{\rm c}}{T_{\rm c}}\bigg)^{\!-1/2} \tag{6.168}
\]
is the correlation length. With the definition
\[
\xi(T) \propto |T - T_{\rm c}|^{-\nu} \tag{6.169}
\]
as $T \to T_{\rm c}$, we obtain the mean field correlation length exponent $\nu = \frac{1}{2}$. The exact result for the two-dimensional Ising model is $\nu = 1$, whereas $\nu \approx 0.6$ for the $d = 3$ Ising model. Note that $\hat\chi(\mathbf q = 0, T)$ diverges as $(T - T_{\rm c})^{-1}$ for $T > T_{\rm c}$.
In real space, we have
\[
m_i = \sum_j \chi_{ij}\,H_j\ , \tag{6.170}
\]
where
\[
\chi_{ij} = \Omega\!\int\!\frac{d^d\! q}{(2\pi)^d}\;\hat\chi(\mathbf q)\,e^{i\mathbf q\cdot(\mathbf R_i - \mathbf R_j)}\ . \tag{6.171}
\]
Note that $\hat\chi(\mathbf q)$ is properly periodic under $\mathbf q \to \mathbf q + \mathbf G$, where $\mathbf G$ is a reciprocal lattice vector, which satisfies $e^{i\mathbf G\cdot\mathbf R} = 1$ for any direct Bravais lattice vector $\mathbf R$. Indeed, we have
\[
\hat\chi^{-1}(\mathbf q) = k_{\rm B}T - \hat J(\mathbf q) = k_{\rm B}T - J\sum_{\boldsymbol\delta} e^{i\mathbf q\cdot\boldsymbol\delta}\ , \tag{6.172}
\]
where $\boldsymbol\delta$ is a nearest neighbor separation vector, and where in the second line we have assumed nearest neighbor interactions only. On cubic lattices in $d$ dimensions, there are $2d$ nearest neighbor separation vectors, $\boldsymbol\delta = \pm a\,\hat{\mathbf e}_\mu$, where $\mu \in \{1,\ldots,d\}$. The real space susceptibility is then
\[
\chi(\mathbf R) = \int\limits_{-\pi}^{\pi}\!\frac{d\theta_1}{2\pi}\cdots\!\int\limits_{-\pi}^{\pi}\!\frac{d\theta_d}{2\pi}\;
\frac{e^{in_1\theta_1}\cdots e^{in_d\theta_d}}{k_{\rm B}T - \big(2J\cos\theta_1 + \ldots + 2J\cos\theta_d\big)}\ , \tag{6.173}
\]
where $\mathbf R = a\sum_{\mu=1}^{d} n_\mu\,\hat{\mathbf e}_\mu$ is a general direct lattice vector for the cubic Bravais lattice in $d$ dimensions, and the $n_\mu$ are integers.
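In $d = 1$ the integral in eqn. 6.173 can be evaluated numerically, and the ratio $\chi(n+1)/\chi(n)$ of successive lattice sites approaches a constant, confirming the exponential decay announced in eqn. 6.174 below. A sketch (the parameter values are illustrative):

```python
import numpy as np

J, kT = 1.0, 3.0                      # above the mean field transition k_B T_c = 2J
theta = np.linspace(-np.pi, np.pi, 4096, endpoint=False)

def chi(n):
    """d = 1 lattice susceptibility, eqn 6.173, up to an overall constant.

    The mean over a uniform grid on [-pi, pi) approximates
    (1/2pi) * integral dtheta, which is spectrally accurate for this
    smooth periodic integrand.
    """
    return np.mean(np.cos(n * theta) / (kT - 2 * J * np.cos(theta)))

ratios = [chi(n + 1) / chi(n) for n in range(1, 6)]
print(ratios)   # all ratios equal: chi(n) decays exponentially in n
```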
The long distance behavior was discussed in chapter 5 (see §5.5.9 on Ornstein-Zernike theory$^{13}$). For convenience we reiterate those results:

In $d = 1$,
\[
\chi_{d=1}(x) = \bigg(\frac{\xi}{2k_{\rm B}T_{\rm c}\,R_*^2}\bigg)\,e^{-|x|/\xi}\ . \tag{6.174}
\]
In $d > 1$, with $r \to \infty$ and $\xi$ fixed,
\[
\chi^{\rm OZ}_{d}(\mathbf r) \simeq C_d\cdot\frac{\xi^{(3-d)/2}}{k_{\rm B}T\,R_*^2}\cdot\frac{e^{-r/\xi}}{r^{(d-1)/2}}\cdot\bigg\{1 + \mathcal O\bigg(\frac{d-3}{r/\xi}\bigg)\bigg\}\ , \tag{6.175}
\]
where the $C_d$ are dimensionless constants.
In $d > 2$, with $\xi \to \infty$ and $r$ fixed (i.e. $T \to T_{\rm c}$ at fixed separation $r$),
\[
\chi_{d}(\mathbf r) \simeq \frac{C'_d}{k_{\rm B}T\,R_*^2}\cdot\frac{e^{-r/\xi}}{r^{d-2}}\cdot\bigg\{1 + \mathcal O\bigg(\frac{d-3}{r/\xi}\bigg)\bigg\}\ . \tag{6.176}
\]
In $d = 2$ dimensions we obtain
\[
\chi_{d=2}(\mathbf r) \simeq \frac{C'_2}{k_{\rm B}T\,R_*^2}\cdot\ln\!\bigg(\frac{r}{\xi}\bigg)\,e^{-r/\xi}\cdot\bigg\{1 + \mathcal O\bigg(\frac{1}{\ln(r/\xi)}\bigg)\bigg\}\ , \tag{6.177}
\]
where the $C'_d$ are dimensionless constants.

$^{13}$There is a sign difference between the particle susceptibility defined in chapter 5 and the spin susceptibility defined here. The origin of the difference is that the single particle potential $v$ as defined was repulsive for $v > 0$, meaning the local density response $\delta n$ should be negative, while in the current discussion a positive magnetic field $H$ prefers $m > 0$.
6.9 Global Symmetries
Interacting systems can be broadly classified according to their global symmetry group. Consider the following five examples:
\[
\begin{aligned}
\hat H_{\rm Ising} &= -\sum_{i<j} J_{ij}\,\sigma_i\,\sigma_j
&\qquad \sigma_i &\in \{-1,+1\}\\
\hat H_{p\text{-clock}} &= -\sum_{i<j} J_{ij}\,\cos\!\bigg(\frac{2\pi(n_i - n_j)}{p}\bigg)
&\qquad n_i &\in \{1,2,\ldots,p\}\\
\hat H_{q\text{-Potts}} &= -\sum_{i<j} J_{ij}\,\delta_{\sigma_i,\sigma_j}
&\qquad \sigma_i &\in \{1,2,\ldots,q\}\\
\hat H_{\rm XY} &= -\sum_{i<j} J_{ij}\,\cos(\phi_i - \phi_j)
&\qquad \phi_i &\in \big[0,2\pi\big)\\
\hat H_{{\rm O}(n)} &= -\sum_{i<j} J_{ij}\,\hat{\boldsymbol\Omega}_i\cdot\hat{\boldsymbol\Omega}_j
&\qquad \hat{\boldsymbol\Omega}_i &\in S^{n-1}\ .
\end{aligned}
\tag{6.178}
\]
The Ising Hamiltonian is left invariant by the global symmetry group $\mathbb Z_2$, which has two elements, $I$ and $\eta$, with
\[
\eta\,\sigma_i = -\sigma_i\ . \tag{6.179}
\]
$I$ is the identity, and $\eta^2 = I$. By simultaneously reversing all the spins $\sigma_i \to -\sigma_i$, the interactions remain invariant.
The degrees of freedom of the $p$-state clock model are integer variables $n_i$, each of which ranges from 1 to $p$. The Hamiltonian is invariant under the discrete group $\mathbb Z_p$, whose $p$ elements are generated by the single operation $\eta$, where
\[
\eta\,n_i = \begin{cases} n_i + 1 & \text{if } n_i \in \{1,2,\ldots,p-1\}\\ 1 & \text{if } n_i = p\ . \end{cases} \tag{6.180}
\]
Think of a clock with one hand and $p$ hour markings consecutively spaced by an angle $2\pi/p$. On each site $i$, a hand points to one of the $p$ hour marks; this determines $n_i$. The operation $\eta$ simply advances all the hours by one tick, with hour $p$ advancing to hour 1, just as 23:00 military time is followed one hour later by 00:00. The interaction $\cos\big(2\pi(n_i - n_j)/p\big)$ is invariant under such an operation. The $p$ elements of the group $\mathbb Z_p$ are then
\[
\big\{ I\,,\,\eta\,,\,\eta^2\,,\,\ldots\,,\,\eta^{p-1}\big\}\ . \tag{6.181}
\]
We've already met up with the $q$-state Potts model, where each site supports a spin $\sigma_i$ which can be in any of $q$ possible states, which we may label by integers $\{1,\ldots,q\}$. The energy of two interacting sites $i$ and $j$ is $-J_{ij}$ if $\sigma_i = \sigma_j$ and zero otherwise. This energy function is invariant under global operations of the symmetric group on $q$ characters, $S_q$, which is the group of permutations of the sequence $\{1,2,3,\ldots,q\}$. The group $S_q$ has $q!$
Figure 6.15: A domain wall in a one-dimensional Ising model.
elements. Note the difference between a $\mathbb Z_q$ symmetry and an $S_q$ symmetry. In the former case, the Hamiltonian is invariant only under the $q$-element cyclic permutations, e.g.
\[
\eta = \begin{pmatrix} 1 & 2 & \cdots & q-1 & q\\ 2 & 3 & \cdots & q & 1\end{pmatrix}
\]
and its powers $\eta^l$ with $l = 0,\ldots,q-1$.
All of these models (the Ising, $p$-state clock, and $q$-state Potts models) possess a global symmetry group which is discrete. That is, each of the symmetry groups $\mathbb Z_2$, $\mathbb Z_p$, $S_q$ is a discrete group, with a finite number of elements. The XY Hamiltonian $\hat H_{\rm XY}$ on the other hand is invariant under a continuous group of transformations $\phi_i \to \phi_i + \alpha$, where $\phi_i$ is the angle variable on site $i$. More to the point, we could write the interaction term $\cos(\phi_i - \phi_j)$ as $\frac{1}{2}\big(z^*_i\,z_j + z_i\,z^*_j\big)$, where $z_i = e^{i\phi_i}$ is a phase which lives on the unit circle, and $z^*_i$ is the complex conjugate of $z_i$. The model is then invariant under the global transformation $z_i \to e^{i\alpha}\,z_i$. The phases $e^{i\alpha}$ form a group under multiplication, called ${\rm U}(1)$, which is the same as ${\rm O}(2)$. Equivalently, we could write the interaction as $\hat{\boldsymbol\Omega}_i\cdot\hat{\boldsymbol\Omega}_j$, where $\hat{\boldsymbol\Omega}_i = (\cos\phi_i\,,\,\sin\phi_i)$, which explains the ${\rm O}(2)$ symmetry, since the symmetry operations are global rotations in the plane, which is to say the two-dimensional orthogonal group. This last representation generalizes nicely to unit vectors in $n$ dimensions, where
\[
\hat{\boldsymbol\Omega} = (\Omega^1, \Omega^2, \ldots, \Omega^n) \tag{6.182}
\]
with $\hat{\boldsymbol\Omega}^2 = 1$. The dot product $\hat{\boldsymbol\Omega}_i\cdot\hat{\boldsymbol\Omega}_j$ is then invariant under global rotations in this $n$-dimensional space, which is the group ${\rm O}(n)$.
6.9.1 Lower critical dimension

Depending on whether the global symmetry group of a model is discrete or continuous, there exists a lower critical dimension $d_\ell$ at or below which no phase transition may take place at finite temperature. That is, for $d \leq d_\ell$, the critical temperature is $T_{\rm c} = 0$. Owing to its neglect of fluctuations, mean field theory generally overestimates the value of $T_{\rm c}$ because it overestimates the stability of the ordered phase. Indeed, there are many examples where mean field theory predicts a finite $T_{\rm c}$ when the actual critical temperature is $T_{\rm c} = 0$. This happens whenever $d \leq d_\ell$.
Figure 6.16: Domain walls in the two-dimensional (left) and three-dimensional (right) Ising model.

Let's test the stability of the ordered (ferromagnetic) state of the one-dimensional Ising model at low temperatures. We consider order-destroying domain wall excitations which interpolate between regions of degenerate, symmetry-related ordered phase, i.e. $\uparrow\uparrow\uparrow\uparrow\uparrow$ and $\downarrow\downarrow\downarrow\downarrow\downarrow$. For a system with a discrete symmetry at low temperatures, the domain wall is abrupt, on the scale of a single lattice spacing. If the exchange energy is $J$, then the energy of a single domain wall is $2J$, since a link of energy $-J$ is replaced with one of energy $+J$. However, there are $N$ possible locations for the domain wall, hence its entropy is $k_{\rm B}\ln N$. For a system with $M$ domain walls, the free energy is
\[
F = 2MJ - k_{\rm B}T\,\ln\binom{N}{M}
= N\cdot\Big[2Jx + k_{\rm B}T\,\big(x\ln x + (1-x)\ln(1-x)\big)\Big]\ ,
\]
where $x = M/N$ is the density of domain walls, and where we have used Stirling's approximation for $k!$ when $k$ is large. Extremizing with respect to $x$, we find
\[
\frac{x}{1-x} = e^{-2J/k_{\rm B}T} \qquad\Longrightarrow\qquad x = \frac{1}{e^{2J/k_{\rm B}T} + 1}\ . \tag{6.183}
\]
The average distance between domain walls is $x^{-1}$, which is finite for finite $T$. Thus, the thermodynamic state of the system is disordered, with no net average magnetization.
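The extremization leading to eqn. 6.183 can be reproduced by minimizing the free energy per site $f(x) = 2Jx + k_{\rm B}T\big[x\ln x + (1-x)\ln(1-x)\big]$ numerically; a sketch with illustrative parameter values:

```python
import numpy as np
from scipy.optimize import minimize_scalar

J, kT = 1.0, 0.5

def f(x):
    """Free energy per site of the 1d Ising domain-wall gas."""
    return 2 * J * x + kT * (x * np.log(x) + (1 - x) * np.log(1 - x))

res = minimize_scalar(f, bounds=(1e-9, 1 - 1e-9), method="bounded")
x_exact = 1.0 / (np.exp(2 * J / kT) + 1.0)    # eqn 6.183

print(res.x, x_exact)   # the numerical minimum agrees with the closed form
```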
Consider next an Ising domain wall in $d$ dimensions. Let the linear dimension of the system be $L\cdot a$, where $L$ is a real number and $a$ is the lattice constant. Then the energy of a single domain wall which partitions the entire system is $2J\,L^{d-1}$. The domain wall entropy is difficult to compute, because the wall can fluctuate significantly, but for a single domain wall we have $S \gtrsim k_{\rm B}\ln L$. Thus, the free energy $F = 2J\,L^{d-1} - k_{\rm B}T\ln L$ is dominated by the energy term if $d > 1$, suggesting that the system may be ordered. We can do a slightly better job in $d = 2$ by writing
\[
Z \approx \exp\!\bigg( L^d\sum_P N_P\;e^{-2PJ/k_{\rm B}T}\bigg)\ , \tag{6.184}
\]
where the sum is over all closed loops of perimeter $P$, and $N_P$ is the number of such loops. An example of such a loop circumscribing a domain is depicted in the left panel of fig. 6.16. It turns out that
\[
N_P \approx \kappa^P\,P^{-\theta}\cdot\big(1 + \mathcal O(P^{-1})\big)\ , \tag{6.185}
\]
338 CHAPTER 6. MEAN FIELD THEORY
where = z 1 with z the lattice coordination number, and is some exponent. We can
understand the
P
factor in the following way. At each step along the perimeter of the loop,
there are = z1 possible directions to go (since one doesnt backtrack). The fact that
the loop must avoid overlapping itself and must return to its original position to be closed
leads to the power law term P

, which is subleading since


P
P

= exp(P ln ln P)
and P ln P for P 1. Thus,
F
1

L
d

P
P

e
(ln 2J)P
, (6.186)
which diverges if ln > 2J, i.e. if T > 2J/k
B
ln(z 1). We identify this singularity with
the phase transition. The high temperature phase involves a proliferation of such loops.
The excluded volume eects between the loops, which we have not taken into account, then
enter in an essential way so that the sum converges. Thus, we have the following picture:
ln < 2J : large loops suppressed ; ordered phase
ln > 2J : large loops proliferate ; disordered phase .
On the square lattice, we obtain
\[
k_{\rm B}T^{\rm approx}_{\rm c} = \frac{2J}{\ln 3} = 1.82\,J
\qquad,\qquad
k_{\rm B}T^{\rm exact}_{\rm c} = \frac{2J}{\sinh^{-1}(1)} = 2.27\,J\ .
\]
The agreement is better than we should reasonably expect from such a crude argument.
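The two numbers quoted above are easy to check (for the square lattice, $z = 4$, so $\kappa = 3$, and $\sinh^{-1}(1) = \ln(1+\sqrt 2)$):

```python
import numpy as np

J = 1.0
Tc_approx = 2 * J / np.log(3.0)        # loop-counting estimate: 2J/ln 3
Tc_exact  = 2 * J / np.arcsinh(1.0)    # Onsager: 2J/ln(1 + sqrt(2))

print(Tc_approx, Tc_exact)             # 1.820..., 2.269...
```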
Nota bene: Beware of arguments which allegedly prove the existence of an ordered phase. Generally speaking, any approximation will underestimate the entropy, and thus will overestimate the stability of the putative ordered phase.
6.9.2 Continuous symmetries

When the global symmetry group is continuous, the domain walls interpolate smoothly between ordered phases. The energy generally involves a stiffness term,
\[
E = \frac{1}{2}\,\rho_{\rm s}\!\int\! d^d r\;(\nabla\theta)^2\ , \tag{6.187}
\]
where $\theta(\mathbf r)$ is the angle of a local rotation about a single axis and where $\rho_{\rm s}$ is the spin stiffness. Of course, in ${\rm O}(n)$ models, the rotations can be with respect to several different axes simultaneously.
Figure 6.17: A domain wall in an XY ferromagnet.

In the ordered phase, we have $\theta(\mathbf r) = \theta_0$, a constant. Now imagine a domain wall in which $\theta(\mathbf r)$ rotates by $2\pi n$ across the width of the sample. We write $\theta(\mathbf r) = 2\pi nx/L$, where $L$ is the linear size of the sample (here with dimensions of length) and $n$ is an integer telling us how many complete twists the order parameter field makes. The domain wall then resembles that in fig. 6.17. The gradient energy is
\[
E = \frac{1}{2}\,\rho_{\rm s}\,L^{d-1}\!\int\limits_0^L\!dx\,\bigg(\frac{2\pi n}{L}\bigg)^{\!2} = 2\pi^2 n^2\rho_{\rm s}\,L^{d-2}\ . \tag{6.188}
\]
Recall that in the case of discrete symmetry, the domain wall energy scaled as $E \propto L^{d-1}$. Thus, with $S \gtrsim k_{\rm B}\ln L$ for a single wall, we see that the entropy term dominates if $d \leq 2$, in which case there is no finite temperature phase transition. Thus, the lower critical dimension $d_\ell$ depends on whether the global symmetry is discrete or continuous, with
\[
\begin{aligned}
\text{discrete global symmetry} &\ \Longrightarrow\ d_\ell = 1\\
\text{continuous global symmetry} &\ \Longrightarrow\ d_\ell = 2\ .
\end{aligned}
\]
Note that all along we have assumed local, short-ranged interactions. Long-ranged interactions can enhance order and thereby suppress $d_\ell$.
Thus, we expect that for models with discrete symmetries, $d_\ell = 1$ and there is no finite temperature phase transition for $d \leq 1$. For models with continuous symmetries, $d_\ell = 2$, and we expect $T_{\rm c} = 0$ for $d \leq 2$. In this context we should emphasize that the two-dimensional XY model does exhibit a phase transition at finite temperature, called the Kosterlitz-Thouless transition. However, this phase transition is not associated with the breaking of the continuous global ${\rm O}(2)$ symmetry; rather, it has to do with the unbinding of vortices and antivortices. So there is still no true long-ranged order below the critical temperature $T_{\rm KT}$, even though there is a phase transition!
6.10 Random Systems: Imry-Ma Argument

Oftentimes, particularly in condensed matter systems, intrinsic randomness exists due to quenched impurities, grain boundaries, immobile vacancies, etc. How does this quenched randomness affect a system's attempt to order at $T = 0$? This question was taken up in a beautiful and brief paper by J. Imry and S.-K. Ma, Phys. Rev. Lett. 35, 1399 (1975).
Imry and Ma considered models in which there are short-ranged interactions and a random local field coupling to the local order parameter:
\[
\hat H_{\rm RFI} = -J\sum_{\langle ij\rangle}\sigma_i\,\sigma_j - \sum_i H_i\,\sigma_i \tag{6.189}
\]
\[
\hat H_{{\rm RFO}(n)} = -J\sum_{\langle ij\rangle}\hat{\boldsymbol\Omega}_i\cdot\hat{\boldsymbol\Omega}_j - \sum_i H^\alpha_i\,\Omega^\alpha_i\ , \tag{6.190}
\]
where
\[
\big\langle\!\big\langle H^\alpha_i\big\rangle\!\big\rangle = 0
\qquad,\qquad
\big\langle\!\big\langle H^\alpha_i\,H^\beta_j\big\rangle\!\big\rangle = \Gamma\,\delta^{\alpha\beta}\,\delta_{ij}\ , \tag{6.191}
\]
where $\langle\langle\,\cdot\,\rangle\rangle$ denotes a configurational average over the disorder. Imry and Ma reasoned that a system could try to lower its free energy by forming domains in which the order parameter took advantage of local fluctuations in the random field. The size of these domains is assumed to be $L_d$, a length scale to be determined. See the sketch in the left panel of fig. 6.18.

There are two contributions to the energy of a given domain: bulk and surface terms. The bulk energy is
\[
E_{\rm bulk} = -H_{\rm rms}\,(L_d/a)^{d/2}\ , \tag{6.192}
\]
where $a$ is the lattice spacing. This is because when we add together $(L_d/a)^d$ random fields, the magnitude of the result is proportional to the square root of the number of terms, i.e. to $(L_d/a)^{d/2}$. The quantity $H_{\rm rms} = \sqrt{\Gamma}$ is the root-mean-square fluctuation in the random field at a given site. The surface energy is
\[
E_{\rm surface} \sim
\begin{cases}
J\,(L_d/a)^{d-1} & \text{(discrete symmetry)}\\
J\,(L_d/a)^{d-2} & \text{(continuous symmetry)}\ .
\end{cases} \tag{6.193}
\]
We compute the critical dimension $d_{\rm c}$ by balancing the bulk and surface energies,
\[
\begin{aligned}
d - 1 &= \tfrac{1}{2}\,d \quad\Longrightarrow\quad d_{\rm c} = 2 \qquad\text{(discrete)}\\
d - 2 &= \tfrac{1}{2}\,d \quad\Longrightarrow\quad d_{\rm c} = 4 \qquad\text{(continuous)}\ .
\end{aligned}
\]
The total free energy is $F = (V/L_d^d)\cdot\Delta E$, where $\Delta E = E_{\rm bulk} + E_{\rm surf}$. Thus, the free energy per unit cell is
\[
f = \frac{F}{V/a^d} \approx J\,\bigg(\frac{a}{L_d}\bigg)^{\!\frac{1}{2}d_{\rm c}} - H_{\rm rms}\,\bigg(\frac{a}{L_d}\bigg)^{\!\frac{1}{2}d}\ . \tag{6.194}
\]
If $d < d_{\rm c}$, the surface term dominates for small $L_d$ and the bulk term dominates for large $L_d$. There is a global minimum at
\[
\frac{L_d}{a} = \bigg(\frac{d_{\rm c}}{d}\cdot\frac{J}{H_{\rm rms}}\bigg)^{\!\frac{2}{d_{\rm c}-d}}\ . \tag{6.195}
\]
For $d > d_{\rm c}$, the relative dominance of the bulk and surface terms is reversed, and there is a global maximum at this value of $L_d$.
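The optimal domain size of eqn. 6.195 can be checked by minimizing the free energy per unit cell of eqn. 6.194 directly. A sketch with illustrative values ($d = 1$ with a discrete symmetry, so $d_{\rm c} = 2$):

```python
import numpy as np
from scipy.optimize import minimize_scalar

J, H_rms, a = 1.0, 0.1, 1.0
d, d_c = 1, 2                      # discrete symmetry in d = 1, so d_c = 2

def f(L):
    """Free energy per unit cell, eqn 6.194, as a function of L_d."""
    return J * (a / L) ** (0.5 * d_c) - H_rms * (a / L) ** (0.5 * d)

res = minimize_scalar(f, bounds=(a, 1e6 * a), method="bounded")
L_opt = a * ((d_c / d) * (J / H_rms)) ** (2.0 / (d_c - d))   # eqn 6.195

print(res.x, L_opt)   # both close to 400
```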
Figure 6.18: Left panel: Imry-Ma domains for an ${\rm O}(2)$ model. The arrows point in the direction of the local order parameter field $\langle\hat{\boldsymbol\Omega}(\mathbf r)\rangle$. Right panel: free energy density as a function of domain size $L_d$. Keep in mind that the minimum possible value for $L_d$ is the lattice spacing $a$.
Sketches of the free energy $f(L_d)$ in both cases are provided in the right panel of fig. 6.18. We must keep in mind that the domain size $L_d$ cannot become smaller than the lattice spacing $a$. Hence we should draw a vertical line on the graph at $L_d = a$ and discard the portion $L_d < a$ as unphysical. For $d < d_{\rm c}$, we see that the state with $L_d = \infty$, i.e. the ordered state, is never the state of lowest free energy. In dimensions $d < d_{\rm c}$, the ordered state is always unstable to domain formation in the presence of a random field.
For $d > d_{\rm c}$, there are two possibilities, depending on the relative size of $J$ and $H_{\rm rms}$. We can see this by evaluating $f(L_d = a) = J - H_{\rm rms}$ and $f(L_d = \infty) = 0$. Thus, if $J > H_{\rm rms}$, the minimum energy state occurs for $L_d = \infty$. In this case, the system has an ordered ground state, and we expect a finite temperature transition to a disordered state at some critical temperature $T_{\rm c} > 0$. If, on the other hand, $J < H_{\rm rms}$, then the fluctuations in $H$ overwhelm the exchange energy at $T = 0$, and the ground state is disordered down to the very smallest length scale (i.e. the lattice spacing $a$).

Please read the essay, "Memories of Shang-Keng Ma," at http://sip.clarku.edu/skma.html.
6.11 Ginzburg-Landau Theory

Including gradient terms in the free energy, we write
\[
F\big[m(\mathbf x)\,,\,h(\mathbf x)\big] = \int\! d^d x\,\Big[ f_0 + \tfrac{1}{2}\,a\,m^2 + \tfrac{1}{4}\,b\,m^4 + \tfrac{1}{6}\,c\,m^6 - h\,m + \tfrac{1}{2}\,\kappa\,(\nabla m)^2 + \ldots\Big]\ . \tag{6.196}
\]
In principle, any term which does not violate the appropriate global symmetry will turn up in such an expansion of the free energy, with some coefficient. Examples include $hm^3$ (both $m$ and $h$ are odd under time reversal), $m^2(\nabla m)^2$, etc. We now ask: what function $m(\mathbf x)$ extremizes the free energy functional $F\big[m(\mathbf x)\,,\,h(\mathbf x)\big]$? The answer is that $m(\mathbf x)$ must satisfy the corresponding Euler-Lagrange equation, which for the above functional is
\[
a\,m + b\,m^3 + c\,m^5 - h - \kappa\,\nabla^2 m = 0\ . \tag{6.197}
\]
If $a > 0$ and $h$ is small (we assume $b > 0$ and $c > 0$), we may neglect the $m^3$ and $m^5$ terms and write
\[
\big(a - \kappa\,\nabla^2\big)\,m = h\ , \tag{6.198}
\]
whose solution is obtained by Fourier transform as
\[
\hat m(\mathbf q) = \frac{\hat h(\mathbf q)}{a + \kappa\,q^2}\ , \tag{6.199}
\]
which, with $h(\mathbf x)$ appropriately defined, recapitulates the result in eqn. 6.162. Thus, we conclude that
\[
\hat\chi(\mathbf q) = \frac{1}{a + \kappa\,q^2}\ , \tag{6.200}
\]
which should be compared with eqn. 6.167. For continuous functions, we have
\[
\hat m(\mathbf q) = \int\! d^d x\; m(\mathbf x)\,e^{-i\mathbf q\cdot\mathbf x} \tag{6.201}
\]
\[
m(\mathbf x) = \int\!\frac{d^d q}{(2\pi)^d}\;\hat m(\mathbf q)\,e^{i\mathbf q\cdot\mathbf x}\ . \tag{6.202}
\]
We can then derive the result
\[
m(\mathbf x) = \int\! d^d x'\;\chi(\mathbf x - \mathbf x')\,h(\mathbf x')\ , \tag{6.203}
\]
where
\[
\chi(\mathbf x - \mathbf x') = \frac{1}{\kappa}\int\!\frac{d^d q}{(2\pi)^d}\;\frac{e^{i\mathbf q\cdot(\mathbf x - \mathbf x')}}{q^2 + \xi^{-2}}\ , \tag{6.204}
\]
where the correlation length is $\xi = \sqrt{\kappa/a} \propto (T - T_{\rm c})^{-1/2}$, as before.
If $a < 0$ then there is a spontaneous magnetization and we write $m(\mathbf x) = m_0 + \delta m(\mathbf x)$. Assuming $h$ is weak, we then have two equations
\[
a + b\,m_0^2 + c\,m_0^4 = 0 \tag{6.205}
\]
\[
\big(a + 3b\,m_0^2 + 5c\,m_0^4 - \kappa\,\nabla^2\big)\,\delta m = h\ . \tag{6.206}
\]
If $-a > 0$ is small, we have $m_0^2 = -a/b$ and
\[
\widehat{\delta m}(\mathbf q) = \frac{\hat h(\mathbf q)}{-2a + \kappa\,q^2}\ , \tag{6.207}
\]
6.11.1 Domain wall profile
A particularly interesting application of Ginzburg-Landau theory is to the modeling of the spatial profile of defects such as vortices and domain walls. Consider, for example, the case of Ising ($\mathbb Z_2$) symmetry with $h = 0$. We expand the free energy density to order $m^4$:
\[
F\big[m(\mathbf x)\big] = \int\! d^d x\,\Big[ f_0 + \tfrac{1}{2}\,a\,m^2 + \tfrac{1}{4}\,b\,m^4 + \tfrac{1}{2}\,\kappa\,(\nabla m)^2\Big]\ . \tag{6.208}
\]
We assume $a < 0$, corresponding to $T < T_{\rm c}$. Consider now a domain wall, where $m(x \to -\infty) = -m_0$ and $m(x \to +\infty) = +m_0$, where $m_0$ is the equilibrium magnetization, which we obtain from the Euler-Lagrange equation,
\[
a\,m + b\,m^3 - \kappa\,\nabla^2 m = 0\ , \tag{6.209}
\]
assuming a uniform solution where $\nabla m = 0$. This gives $m_0 = \sqrt{|a|/b}$. It is useful to scale $m(\mathbf x)$ by $m_0$, writing $m(\mathbf x) = m_0\,\phi(\mathbf x)$. The scaled order parameter function $\phi(\mathbf x)$ will interpolate between $\phi(-\infty) = -1$ and $\phi(+\infty) = 1$.
It also proves useful to rescale position, writing $\mathbf x = (2\kappa/|a|)^{1/2}\,\boldsymbol\zeta$. Then we obtain
\[
\tfrac{1}{2}\,\nabla^2\phi = -\phi + \phi^3\ . \tag{6.210}
\]
We assume $\phi(\boldsymbol\zeta)$ is only a function of one coordinate, $\zeta \equiv \zeta_1$. Then the Euler-Lagrange equation becomes
\[
\frac{d^2\phi}{d\zeta^2} = -2\phi + 2\phi^3 \equiv -\frac{\partial U}{\partial\phi}\ , \tag{6.211}
\]
where
\[
U(\phi) = -\tfrac{1}{2}\,\big(\phi^2 - 1\big)^2\ . \tag{6.212}
\]
The potential $U(\phi)$ is an inverted double well, with maxima at $\phi = \pm 1$. The equation $\ddot\phi = -U'(\phi)$, where the dot denotes differentiation with respect to $\zeta$, is simply Newton's second law with 'time' replaced by space. In order to have a stationary solution at $\zeta \to \pm\infty$ where $\phi = \pm 1$, the total energy must be $E = U(\phi = \pm 1) = 0$, where $E = \tfrac{1}{2}\dot\phi^2 + U(\phi)$. This leads to the first order differential equation
\[
\frac{d\phi}{d\zeta} = 1 - \phi^2\ , \tag{6.213}
\]
with solution
\[
\phi(\zeta) = \tanh(\zeta)\ . \tag{6.214}
\]
Restoring the dimensionful constants,
\[
m(x) = \sqrt{\frac{|a|}{b}}\;\tanh\!\Bigg(\sqrt{\frac{|a|}{2\kappa}}\;x\Bigg)\ . \tag{6.215}
\]
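One can confirm the profile by substituting it back into the Euler-Lagrange equation (6.209) and checking that the residual vanishes; a finite-difference sketch with illustrative coefficient values:

```python
import numpy as np

a_c, b, kappa = -1.0, 1.0, 1.0       # a_c is the GL coefficient a, with a < 0
x = np.linspace(-10.0, 10.0, 20001)
dx = x[1] - x[0]

m0 = np.sqrt(abs(a_c) / b)
m = m0 * np.tanh(np.sqrt(abs(a_c) / (2 * kappa)) * x)   # eqn 6.215

# Central-difference second derivative on the interior points.
m_xx = (m[2:] - 2.0 * m[1:-1] + m[:-2]) / dx**2
residual = a_c * m[1:-1] + b * m[1:-1] ** 3 - kappa * m_xx   # LHS of eqn 6.209

print(np.max(np.abs(residual)))   # small (set by the grid spacing)
```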
6.11.2 Derivation of Ginzburg-Landau free energy

We can make some progress in systematically deriving the Ginzburg-Landau free energy. Consider the Ising model,
\[
\frac{\hat H}{k_{\rm B}T} = -\frac{1}{2}\sum_{i,j} K_{ij}\,\sigma_i\,\sigma_j - \sum_i h_i\,\sigma_i + \frac{1}{2}\sum_i K_{ii}\ , \tag{6.216}
\]
where now $K_{ij} = J_{ij}/k_{\rm B}T$ and $h_i = H_i/k_{\rm B}T$ are the interaction energies and local magnetic fields in units of $k_{\rm B}T$. The last term on the RHS above cancels out any contribution from diagonal elements of $K_{ij}$. Our derivation makes use of a generalization of the Gaussian integral,
\[
\int\limits_{-\infty}^{\infty}\!dx\;e^{-\frac{1}{2}ax^2 - bx} = \bigg(\frac{2\pi}{a}\bigg)^{\!1/2} e^{b^2/2a}\ . \tag{6.217}
\]
The generalization is
\[
\int\limits_{-\infty}^{\infty}\!dx_1\cdots\!\int\limits_{-\infty}^{\infty}\!dx_N\;e^{-\frac{1}{2}A_{ij}x_i x_j - b_i x_i} = \frac{(2\pi)^{N/2}}{\sqrt{\det A}}\;e^{\frac{1}{2}A^{-1}_{ij}b_i b_j}\ , \tag{6.218}
\]
where we use the Einstein convention of summing over repeated indices, and where we assume that the matrix $A$ is positive definite (else the integral diverges). This allows us to write
\[
\begin{aligned}
Z &= e^{-\frac{1}{2}K_{ii}}\;\mathrm{Tr}\,\Big[ e^{\frac{1}{2}K_{ij}\sigma_i\sigma_j}\,e^{h_i\sigma_i}\Big]\\
&= {\det}^{-1/2}(2\pi K)\;e^{-\frac{1}{2}K_{ii}}\!\int\!d\phi_1\cdots\!\int\!d\phi_N\;e^{-\frac{1}{2}K^{-1}_{ij}\phi_i\phi_j}\;\mathrm{Tr}\;e^{(\phi_i + h_i)\,\sigma_i}\\
&= {\det}^{-1/2}(2\pi K)\;e^{-\frac{1}{2}K_{ii}}\!\int\!d\phi_1\cdots\!\int\!d\phi_N\;e^{-\frac{1}{2}K^{-1}_{ij}\phi_i\phi_j}\;e^{\sum_i\ln[2\cosh(\phi_i + h_i)]}\\
&\equiv \int\!d\phi_1\cdots\!\int\!d\phi_N\;e^{-\Phi(\phi_1,\ldots,\phi_N)}\ ,
\end{aligned}
\tag{6.219}
\]
where
\[
\Phi = \frac{1}{2}\sum_{i,j} K^{-1}_{ij}\,\phi_i\,\phi_j - \sum_i\ln\cosh(\phi_i + h_i) + \frac{1}{2}\ln\det(2\pi K) + \frac{1}{2}\,\mathrm{Tr}\,K - N\ln 2\ . \tag{6.220}
\]
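The multivariate Gaussian identity (6.218) underlying this Hubbard-Stratonovich transformation can be checked numerically for a small matrix; a sketch for $N = 2$ using direct quadrature (the matrix and vector values are arbitrary):

```python
import numpy as np
from scipy.integrate import dblquad

A = np.array([[2.0, 0.5], [0.5, 1.5]])   # positive definite
b = np.array([0.3, -0.2])

def integrand(x2, x1):
    x = np.array([x1, x2])
    return np.exp(-0.5 * x @ A @ x - b @ x)

# The +-10 limits are effectively infinite for this sharply decaying integrand.
lhs, _ = dblquad(integrand, -10, 10, -10, 10)
rhs = (2 * np.pi) / np.sqrt(np.linalg.det(A)) * np.exp(0.5 * b @ np.linalg.inv(A) @ b)

print(lhs, rhs)   # agree to quadrature accuracy
```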
We assume the model is defined on a Bravais lattice, in which case we can write $\phi_i = \phi_{\mathbf R_i}$. We can then define the Fourier transforms,
\[
\phi_{\mathbf R} = \frac{1}{\sqrt N}\sum_{\mathbf q}\hat\phi_{\mathbf q}\,e^{i\mathbf q\cdot\mathbf R} \tag{6.221}
\]
\[
\hat\phi_{\mathbf q} = \frac{1}{\sqrt N}\sum_{\mathbf R}\phi_{\mathbf R}\,e^{-i\mathbf q\cdot\mathbf R} \tag{6.222}
\]
and
\[
\hat K(\mathbf q) = \sum_{\mathbf R} K(\mathbf R)\,e^{-i\mathbf q\cdot\mathbf R}\ . \tag{6.223}
\]
A few remarks about the lattice structure and periodic boundary conditions are in order. For a Bravais lattice, we can write each direct lattice vector $\mathbf R$ as a sum over $d$ basis vectors with integer coefficients, viz.
\[
\mathbf R = \sum_{\mu=1}^{d} n_\mu\,\mathbf a_\mu\ , \tag{6.224}
\]
where $d$ is the dimension of space. The reciprocal lattice vectors $\mathbf b_\mu$ satisfy
\[
\mathbf a_\mu\cdot\mathbf b_\nu = 2\pi\,\delta_{\mu\nu}\ , \tag{6.225}
\]
and any wavevector $\mathbf q$ may be expressed as
\[
\mathbf q = \frac{1}{2\pi}\sum_{\mu=1}^{d}\theta_\mu\,\mathbf b_\mu\ . \tag{6.226}
\]
We can impose periodic boundary conditions on a system of size $M_1\times M_2\times\cdots\times M_d$ by requiring
\[
\phi_{\mathbf R + \sum_{\mu=1}^{d} l_\mu M_\mu\mathbf a_\mu} = \phi_{\mathbf R}\ . \tag{6.227}
\]
This leads to the quantization of the wavevectors, which must then satisfy
\[
e^{iM_\mu\,\mathbf q\cdot\mathbf a_\mu} = e^{iM_\mu\theta_\mu} = 1\ , \tag{6.228}
\]
and therefore $\theta_\mu = 2\pi m_\mu/M_\mu$, where $m_\mu$ is an integer. There are then $M_1 M_2\cdots M_d = N$ independent values of $\mathbf q$, which can be taken to be those corresponding to $m_\mu \in \{1,\ldots,M_\mu\}$.
Let's now expand the function $\Phi\big(\vec\phi\,\big)$ in powers of the $\phi_i$, and to first order in the external fields $h_i$. We obtain
\[
\Phi = \frac{1}{2}\sum_{\mathbf q}\big(\hat K^{-1}(\mathbf q) - 1\big)\,\big|\hat\phi_{\mathbf q}\big|^2 + \frac{1}{12}\sum_{\mathbf R}\phi^4_{\mathbf R} - \sum_{\mathbf R} h_{\mathbf R}\,\phi_{\mathbf R} + \mathcal O\big(\phi^6, h^2\big)
+ \frac{1}{2}\,\mathrm{Tr}\,K + \frac{1}{2}\,\mathrm{Tr}\ln(2\pi K) - N\ln 2\ . \tag{6.229}
\]
On a $d$-dimensional lattice, for a model with nearest neighbor interactions $K_1$ only, we have $\hat K(\mathbf q) = K_1\sum_{\boldsymbol\delta} e^{i\mathbf q\cdot\boldsymbol\delta}$, where $\boldsymbol\delta$ is a nearest neighbor separation vector. These are the eigenvalues of the matrix $K_{ij}$. We note that $K_{ij}$ is then not positive definite, since there are negative eigenvalues$^{14}$. To fix this, we can add a term $K_0$ everywhere along the diagonal. We then have
\[
\hat K(\mathbf q) = K_0 + K_1\sum_{\boldsymbol\delta}\cos(\mathbf q\cdot\boldsymbol\delta)\ . \tag{6.230}
\]

$^{14}$To evoke a negative eigenvalue on a $d$-dimensional cubic lattice, set $q_\mu = \frac{\pi}{a}$ for all $\mu$. The eigenvalue is then $-2dK_1$.
Here we have used the inversion symmetry of the Bravais lattice to eliminate the imaginary term. The eigenvalues are all positive so long as $K_0 > zK_1$, where $z$ is the lattice coordination number. We can therefore write $\hat K(\mathbf q) = \hat K(0) - \alpha\,q^2$ for small $q$, with $\alpha > 0$. Thus, we can write
\[
\hat K^{-1}(\mathbf q) - 1 = a + \kappa\,q^2 + \ldots\ . \tag{6.231}
\]
To lowest order in $q$ the RHS is isotropic if the lattice has cubic symmetry, but anisotropy will enter in higher order terms. We'll assume isotropy at this level. This is not necessary, but it makes the discussion somewhat less involved. We can now write down our Ginzburg-Landau free energy density:
\[
\mathcal F = \frac{a}{2}\,\phi^2 + \frac{1}{2}\,\kappa\,|\nabla\phi|^2 + \frac{1}{12}\,\phi^4 - h\,\phi\ , \tag{6.232}
\]
valid to lowest nontrivial order in derivatives, and to sixth order in $\phi$.

One might wonder what we have gained over the inhomogeneous variational density matrix treatment, where we found
\[
F = -\frac{1}{2}\sum_{\mathbf q}\hat J(\mathbf q)\,|\hat m(\mathbf q)|^2 - \sum_{\mathbf q}\hat H(\mathbf q)\,\hat m(\mathbf q)
+ k_{\rm B}T\sum_i\bigg\{\bigg(\frac{1+m_i}{2}\bigg)\ln\!\bigg(\frac{1+m_i}{2}\bigg) + \bigg(\frac{1-m_i}{2}\bigg)\ln\!\bigg(\frac{1-m_i}{2}\bigg)\bigg\}\ . \tag{6.233}
\]
Surely we could expand $\hat J(\mathbf q) = \hat J(0) - \frac{1}{2}\alpha\,q^2 + \ldots$ and obtain a similar expression for $\mathcal F$. However, such a derivation using the variational density matrix is only approximate. The method outlined in this section is exact.
Let's return to our complete expression for $\Phi$:
\[
\Phi\big[\vec\phi\,\big] = \Phi_0\big[\vec\phi\,\big] + \sum_{\mathbf R} v(\phi_{\mathbf R})\ , \tag{6.234}
\]
where
\[
\Phi_0\big[\vec\phi\,\big] = \frac{1}{2}\sum_{\mathbf q} G^{-1}(\mathbf q)\,\big|\hat\phi(\mathbf q)\big|^2 + \frac{1}{2}\,\mathrm{Tr}\,\bigg[\frac{1}{1 + G^{-1}}\bigg] + \frac{1}{2}\,\mathrm{Tr}\ln\bigg[\frac{2\pi}{1 + G^{-1}}\bigg] - N\ln 2\ . \tag{6.235}
\]
Here we have defined
\[
v(\phi) = \tfrac{1}{2}\,\phi^2 - \ln\cosh\phi = \frac{\phi^4}{12} - \frac{\phi^6}{45} + \frac{17\,\phi^8}{2520} + \ldots \tag{6.236}
\]
and
\[
G(\mathbf q) = \frac{\hat K(\mathbf q)}{1 - \hat K(\mathbf q)}\ . \tag{6.237}
\]
We now want to compute
\[
Z = \int\! D\vec\phi\;e^{-\Phi_0(\vec\phi\,)}\,e^{-\sum_{\mathbf R} v(\phi_{\mathbf R})}\ , \tag{6.238}
\]
where
\[
D\vec\phi \equiv d\phi_1\,d\phi_2\cdots d\phi_N\ . \tag{6.239}
\]
We expand the second exponential factor in a Taylor series, allowing us to write
\[
Z = Z_0\cdot\bigg\{ 1 - \sum_{\mathbf R}\big\langle v(\phi_{\mathbf R})\big\rangle + \frac{1}{2}\sum_{\mathbf R}\sum_{\mathbf R'}\big\langle v(\phi_{\mathbf R})\,v(\phi_{\mathbf R'})\big\rangle + \ldots\bigg\}\ , \tag{6.240}
\]
where
\[
Z_0 = \int\! D\vec\phi\;e^{-\Phi_0(\vec\phi\,)}
\qquad,\qquad
\ln Z_0 = \frac{1}{2}\,\mathrm{Tr}\,\bigg[\ln(1+G) - \frac{G}{1+G}\bigg] + N\ln 2 \tag{6.241}
\]
and
\[
\big\langle F\big[\vec\phi\,\big]\big\rangle = \frac{\int\! D\vec\phi\;F\,e^{-\Phi_0}}{\int\! D\vec\phi\;e^{-\Phi_0}}\ . \tag{6.242}
\]
To evaluate the various terms in the expansion of eqn. 6.240, we invoke Wick's theorem, which says
\[
\big\langle x_{i_1} x_{i_2}\cdots x_{i_{2L}}\big\rangle
= \frac{\int\limits_{-\infty}^{\infty}\!dx_1\cdots\!\int\limits_{-\infty}^{\infty}\!dx_N\;e^{-\frac{1}{2}\mathcal G^{-1}_{ij}x_i x_j}\;x_{i_1} x_{i_2}\cdots x_{i_{2L}}}
{\int\limits_{-\infty}^{\infty}\!dx_1\cdots\!\int\limits_{-\infty}^{\infty}\!dx_N\;e^{-\frac{1}{2}\mathcal G^{-1}_{ij}x_i x_j}}
= \sum_{\substack{\text{all distinct}\\ \text{pairings}}}\mathcal G_{j_1 j_2}\,\mathcal G_{j_3 j_4}\cdots\mathcal G_{j_{2L-1}\,j_{2L}}\ , \tag{6.243}
\]
where the sets $\{j_1,\ldots,j_{2L}\}$ are all permutations of the set $\{i_1,\ldots,i_{2L}\}$
. In particular, we have
\[
\big\langle x^4_i\big\rangle = 3\,\big(\mathcal G_{ii}\big)^2\ . \tag{6.244}
\]
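Eqn. 6.244 is a special case of Wick's theorem that is easy to test by Monte Carlo sampling of a single Gaussian variable (a sketch; the variance, sample size, and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2 = 1.7                               # plays the role of the propagator G_ii
x = rng.normal(0.0, np.sqrt(sigma2), size=1_000_000)

lhs = np.mean(x**4)
rhs = 3.0 * sigma2**2                      # Wick: <x^4> = 3 (G_ii)^2

print(lhs, rhs)   # agree to sampling accuracy (well under 2%)
```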
In our case, we have
\[
\big\langle \phi^4_{\mathbf R}\big\rangle = 3\,\Bigg(\frac{1}{N}\sum_{\mathbf q} G(\mathbf q)\Bigg)^{\!2}\ . \tag{6.245}
\]
Thus, if we write $v(\phi) \approx \frac{1}{12}\,\phi^4$ and retain only the quartic term in $v(\phi)$, we obtain
\[
\begin{aligned}
\frac{F}{k_{\rm B}T} = -\ln Z_0 + \sum_{\mathbf R}\big\langle v(\phi_{\mathbf R})\big\rangle
&= \frac{1}{2}\,\mathrm{Tr}\,\bigg[\frac{G}{1+G} - \ln(1+G)\bigg] + \frac{1}{4N}\,\big(\mathrm{Tr}\,G\big)^2 - N\ln 2\\
&= -N\ln 2 + \frac{1}{4N}\,\big(\mathrm{Tr}\,G\big)^2 - \frac{1}{4}\,\mathrm{Tr}\,\big(G^2\big) + \mathcal O\big(G^3\big)\ .
\end{aligned}
\tag{6.246}
\]
Note that if we set $K_{ij}$ to be diagonal, then $\hat K(\mathbf q)$ and hence $G(\mathbf q)$ are constant functions of $\mathbf q$. The $\mathcal O\big(G^2\big)$ term then vanishes, which is required since the free energy cannot depend on the diagonal elements of $K_{ij}$.

6.12 Ginzburg Criterion
Let us define $A(T, H, V, N)$ to be the usual (i.e. thermodynamic) Helmholtz free energy. Then
\[
e^{-\beta A} = \int\! Dm\;e^{-\beta F[m(\mathbf x)]}\ , \tag{6.247}
\]
where the functional $F[m(\mathbf x)]$ is of the Ginzburg-Landau form, given in eqn. 6.208. The integral above is a functional integral. We can give it a more precise meaning by defining its measure in the case of periodic functions $m(\mathbf x)$ confined to a rectangular box. Then we can expand
\[
m(\mathbf x) = \frac{1}{\sqrt V}\sum_{\mathbf q}\hat m_{\mathbf q}\,e^{i\mathbf q\cdot\mathbf x}\ , \tag{6.248}
\]
and we define the measure
\[
Dm \equiv dm_0\prod_{\substack{\mathbf q\\ q_x > 0}} d\,{\rm Re}\,\hat m_{\mathbf q}\;d\,{\rm Im}\,\hat m_{\mathbf q}\ . \tag{6.249}
\]
Note that the fact that $m(\mathbf x) \in \mathbb R$ means that $\hat m_{-\mathbf q} = \hat m^*_{\mathbf q}$. We'll assume $T > T_{\rm c}$ and $H = 0$, and we'll explore the limit $T \to T^+_{\rm c}$ from above to analyze the properties of the critical region close to $T_{\rm c}$. In this limit we can ignore all but the quadratic terms in $m$, and we have
\[
e^{-\beta A} = \int\! Dm\;\exp\bigg[-\frac{\beta}{2}\sum_{\mathbf q}\big(a + \kappa\,q^2\big)\,|\hat m_{\mathbf q}|^2\bigg] \tag{6.250}
\]
\[
\hphantom{e^{-\beta A}} = \prod_{\mathbf q}\bigg(\frac{k_{\rm B}T}{a + \kappa\,q^2}\bigg)^{\!1/2}\ . \tag{6.251}
\]
Thus,
\[
A = \frac{1}{2}\,k_{\rm B}T\sum_{\mathbf q}\ln\!\bigg(\frac{a + \kappa\,q^2}{k_{\rm B}T}\bigg)\ . \tag{6.252}
\]
We now assume that $a(T) = \alpha\,t$, where $t$ is the dimensionless quantity
\[
t = \frac{T - T_{\rm c}}{T_{\rm c}}\ , \tag{6.253}
\]
known as the reduced temperature.
We now compute the heat capacity $C_V = -T\,\frac{\partial^2\! A}{\partial T^2}$. We are really only interested in the singular contributions to $C_V$, which means that we're only interested in differentiating with respect to $T$ as it appears in $a(T)$. We divide by $\mathcal N k_{\rm B}$, where $\mathcal N$ is the number of unit cells of our system, which we presume is a lattice-based model. Note $\mathcal N \sim V/a^d$, where $V$ is the volume and $a$ the lattice constant. The dimensionless heat capacity per lattice site is then
\[
c \equiv \frac{C_V}{\mathcal N k_{\rm B}} = \frac{\alpha^2 a^d}{2\,\kappa^2}\int\limits^{\Lambda}\!\frac{d^d\! q}{(2\pi)^d}\;\frac{1}{\big(\xi^{-2} + q^2\big)^2}\ , \tag{6.254}
\]
6.13. APPENDIX I : EQUIVALENCE OF THE MEAN FIELD DESCRIPTIONS 349
where = (/t)
1/2
[t[
1/2
is the correlation length, and where a
1
is an ultraviolet
cuto. We dene R

(/)
1/2
, in which case
c = R
4

a
d

4d

1
2
/
_
d
d
q
(2)
d
1
(1 + q
2
)
2
, (6.255)
where $\bar q \equiv q\,\xi$. Thus,
\[
c(t) \sim
\begin{cases}
\text{const.} & \text{if } d > 4\\
-\ln t & \text{if } d = 4\\
t^{\frac{d}{2}-2} & \text{if } d < 4\ .
\end{cases} \tag{6.256}
\]
For $d > 4$, mean field theory is qualitatively accurate, with finite corrections. In dimensions $d \leq 4$, the mean field result is overwhelmed by fluctuation contributions as $t \to 0^+$ (i.e. as $T \to T^+_{\rm c}$). We see that MFT is sensible provided the fluctuation contributions are small, i.e. provided
\[
R_*^{-4}\,a^d\,\xi^{4-d} \ll 1\ , \tag{6.257}
\]
which entails $t \gg t_{\rm G}$, where
\[
t_{\rm G} = \bigg(\frac{a}{R_*}\bigg)^{\!\frac{2d}{4-d}} \tag{6.258}
\]
is the Ginzburg reduced temperature. The criterion for the sufficiency of mean field theory, namely $t \gg t_{\rm G}$, is known as the Ginzburg criterion. The region $|t| < t_{\rm G}$ is known as the critical region.
In a lattice ferromagnet, as we have seen, $R_* \sim a$ is on the scale of the lattice spacing itself, hence $t_{\rm G} \sim 1$ and the critical regime is very large. Mean field theory then fails quickly as $T \to T_{\rm c}$. In a (conventional) three-dimensional superconductor, $R_*$ is on the order of the Cooper pair size, and $R_*/a \sim 10^2$ to $10^3$, hence $t_{\rm G} = (a/R_*)^6 \sim 10^{-18}$ to $10^{-12}$ is negligibly narrow. The mean field theory of the superconducting transition, BCS theory, is then valid essentially all the way to $T = T_{\rm c}$.
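The quoted range for $t_{\rm G}$ follows directly from eqn. 6.258 with $d = 3$, for which the exponent is $2d/(4-d) = 6$:

```python
# Ginzburg reduced temperature t_G = (a/R_*)^{2d/(4-d)} for d = 3.
d = 3
for ratio in (1e2, 1e3):          # representative values of R_*/a for a superconductor
    t_G = (1.0 / ratio) ** (2 * d / (4 - d))
    print(ratio, t_G)             # 1e-12 and 1e-18, respectively
```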
6.13 Appendix I: Equivalence of the Mean Field Descriptions

In both the variational density matrix and mean field Hamiltonian methods as applied to the Ising model, we obtained the same result, $m = \tanh\big((m + h)/\theta\big)$. What is perhaps not obvious is whether these theories are in fact the same, i.e. if their respective free energies agree. Indeed, the two free energy functions,
\[
\begin{aligned}
f_{\rm A}(m, h, \theta) &= -\tfrac{1}{2}\,m^2 - hm + \theta\,\bigg\{\bigg(\frac{1+m}{2}\bigg)\ln\!\bigg(\frac{1+m}{2}\bigg) + \bigg(\frac{1-m}{2}\bigg)\ln\!\bigg(\frac{1-m}{2}\bigg)\bigg\}\\
f_{\rm B}(m, h, \theta) &= +\tfrac{1}{2}\,m^2 - \theta\,\ln\!\Big( e^{+(m+h)/\theta} + e^{-(m+h)/\theta}\Big)\ ,
\end{aligned}
\tag{6.259}
\]
where $f_{\rm A}$ is the variational density matrix result and $f_{\rm B}$ is the mean field Hamiltonian result, clearly are different functions of their arguments. However, it turns out that upon minimizing with respect to $m$ in each case, the resulting free energies obey $f_{\rm A}(h,\theta) = f_{\rm B}(h,\theta)$. This agreement may seem surprising. The first method utilizes an approximate (variational) density matrix applied to the exact Hamiltonian $\hat H$. The second method approximates the Hamiltonian as $\hat H_{\rm MF}$, but otherwise treats it exactly. The two Landau expansions seem hopelessly different:
\[
f_{\rm A}(m, h, \theta) = -\theta\,\ln 2 - hm + \tfrac{1}{2}\,(\theta - 1)\,m^2 + \frac{\theta}{12}\,m^4 + \frac{\theta}{30}\,m^6 + \ldots \tag{6.260}
\]
\[
f_{\rm B}(m, h, \theta) = -\theta\,\ln 2 + \tfrac{1}{2}\,m^2 - \frac{(m+h)^2}{2\,\theta} + \frac{(m+h)^4}{12\,\theta^3} - \frac{(m+h)^6}{45\,\theta^5} + \ldots\ . \tag{6.261}
\]
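The claimed equality of the minimized free energies can be checked numerically by minimizing the two expressions of eqn. 6.259 over $m$; a sketch (the values of $\theta$ and $h$ are arbitrary, and with $h > 0$ we restrict attention to the $m > 0$ branch):

```python
import numpy as np
from scipy.optimize import minimize_scalar

theta, h = 0.5, 0.1    # theta = k_B T / J_hat(0) < 1: ordered regime

def fA(m):
    """Variational density matrix free energy of eqn 6.259."""
    s = ((1 + m) / 2) * np.log((1 + m) / 2) + ((1 - m) / 2) * np.log((1 - m) / 2)
    return -0.5 * m**2 - h * m + theta * s

def fB(m):
    """Mean field Hamiltonian free energy of eqn 6.259."""
    return 0.5 * m**2 - theta * np.log(2.0 * np.cosh((m + h) / theta))

resA = minimize_scalar(fA, bounds=(0.0, 1 - 1e-9), method="bounded")
resB = minimize_scalar(fB, bounds=(0.0, 2.0), method="bounded")

print(resA.fun, resB.fun)   # the two minimized free energies coincide
```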
We shall now prove that these two methods, the variational density matrix and the mean
eld approach, are in fact equivalent, and yield the same free energy f(h, ).
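Before giving the general proof, the claimed agreement can be checked numerically in the Ising case. The sketch below (not part of the original text; the sample values of $h$ and $\theta$ are arbitrary) uses the fact that both $f_A$ and $f_B$ are stationary at the common solution of $m = \tanh\big((m+h)/\theta\big)$, and evaluates both free energies there.

```python
import math

def f_A(m, h, th):
    # variational density matrix free energy, first line of eqn 6.259
    s = ((1 + m)/2)*math.log((1 + m)/2) + ((1 - m)/2)*math.log((1 - m)/2)
    return -0.5*m*m - h*m + th*s

def f_B(m, h, th):
    # mean field Hamiltonian free energy, second line of eqn 6.259
    return 0.5*m*m - th*math.log(math.exp((m + h)/th) + math.exp(-(m + h)/th))

def solve_m(h, th, m=0.5, iters=200):
    # fixed-point iteration of the common mean field equation
    for _ in range(iters):
        m = math.tanh((m + h)/th)
    return m

h, th = 0.1, 0.7                 # arbitrary sample point with theta < 1
m = solve_m(h, th)
print(m, f_A(m, h, th), f_B(m, h, th))   # the two free energies agree
```

At the stationary point the two values coincide to machine precision, even though $f_A(m)$ and $f_B(m)$ differ away from it.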
Let us generalize the Ising model and write

\hat H = -\sum_{i<j} J_{ij}\, \varepsilon(\sigma_i, \sigma_j) - \sum_i \Phi(\sigma_i) .   (6.262)

Here, each spin $\sigma_i$ may take on any of $K$ possible values, $s_1, \ldots, s_K$. For the $S = 1$ Ising model, we would have $K = 3$ possibilities, with $s_1 = -1$, $s_2 = 0$, and $s_3 = +1$. But the set $\{s_\alpha\}$, with $\alpha \in \{1, \ldots, K\}$, is completely arbitrary$^{15}$. The 'local field' term $\Phi(\sigma)$ is also a completely arbitrary function. It may be linear, with $\Phi(\sigma) = H\sigma$, for example, but it could also contain terms quadratic in $\sigma$, or whatever one desires.

The symmetric, dimensionless interaction function $\varepsilon(\sigma,\sigma') = \varepsilon(\sigma',\sigma)$ is a real symmetric $K \times K$ matrix. According to the singular value decomposition theorem, any such matrix may be written in the form

\varepsilon(\sigma,\sigma') = \sum_{p=1}^{N_s} A_p\, \lambda_p(\sigma)\, \lambda_p(\sigma') ,   (6.263)

where the $A_p$ are coefficients (the singular values), and the $\{\lambda_p(\sigma)\}$ are the singular vectors. The number of terms $N_s$ in this decomposition is such that $N_s \le K$. This treatment can be generalized to account for continuous $\sigma$.
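As a concrete illustration (not part of the original text), for the familiar bilinear interaction $\varepsilon(\sigma,\sigma') = \sigma\,\sigma'$ the decomposition of eqn. 6.263 contains a single term, with $A_1 = 1$ and $\lambda_1(\sigma) = \sigma$. A quick entrywise check for the $K = 3$ case $s_\alpha \in \{-1, 0, +1\}$:

```python
# eps(s, s') = s*s' is rank one: a single singular value A_1 = 1 with
# singular vector lambda_1(s) = s.  Verify the decomposition entrywise.
states = [-1.0, 0.0, +1.0]
A = [1.0]
lam = [lambda s: s]

for s in states:
    for sp in states:
        recon = sum(Ap * lp(s) * lp(sp) for Ap, lp in zip(A, lam))
        assert abs(s*sp - recon) < 1e-15
print("decomposition verified")
```

For a generic symmetric $\varepsilon$, the $A_p$ and $\lambda_p$ would instead come from a numerical eigendecomposition of the $K \times K$ matrix.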
6.13.1 Variational Density Matrix
The most general single-site variational density matrix is written

\varrho(\sigma) = \sum_{\alpha=1}^{K} x_\alpha\, \delta_{\sigma, s_\alpha} .   (6.264)

Thus, $x_\alpha$ is the probability for a given site to be in state $\alpha$, with $\sigma = s_\alpha$. The $\{x_\alpha\}$ are the $K$ variational parameters, subject to the single normalization constraint, $\sum_\alpha x_\alpha = 1$. We now have

f = \frac{1}{N \hat J(0)} \Big[ {\rm Tr}\,(\varrho \hat H) + k_{\rm B} T\, {\rm Tr}\,(\varrho \ln \varrho) \Big]
  = -\frac{1}{2} \sum_p \sum_{\alpha,\alpha'} A_p\, \lambda_p(s_\alpha)\, \lambda_p(s_{\alpha'})\, x_\alpha\, x_{\alpha'} - \sum_\alpha \varphi(s_\alpha)\, x_\alpha + \theta \sum_\alpha x_\alpha \ln x_\alpha ,   (6.265)

where $\varphi(\sigma) = \Phi(\sigma)/\hat J(0)$. We extremize in the usual way, introducing a Lagrange undetermined multiplier $\zeta$ to enforce the constraint. This means we extend the function $f\big(\{x_\alpha\}\big)$, writing

f^*(x_1, \ldots, x_K, \zeta) = f(x_1, \ldots, x_K) + \zeta \Big( \sum_{\alpha=1}^{K} x_\alpha - 1 \Big) ,   (6.266)

and freely extremizing with respect to the $(K+1)$ parameters $(x_1, \ldots, x_K, \zeta)$. This yields $K$ nonlinear equations,

0 = \frac{\partial f^*}{\partial x_\alpha} = -\sum_p \sum_{\alpha'} A_p\, \lambda_p(s_\alpha)\, \lambda_p(s_{\alpha'})\, x_{\alpha'} - \varphi(s_\alpha) + \theta \ln x_\alpha + \theta + \zeta ,   (6.267)

for each $\alpha$, and one linear equation, which is the normalization condition,

0 = \frac{\partial f^*}{\partial \zeta} = \sum_\alpha x_\alpha - 1 .   (6.268)

We cannot solve these nonlinear equations analytically, but they may be recast, by exponentiating them, as

x_\alpha = \frac{1}{Z} \exp\bigg\{ \frac{1}{\theta} \Big[ \sum_p \sum_{\alpha'} A_p\, \lambda_p(s_\alpha)\, \lambda_p(s_{\alpha'})\, x_{\alpha'} + \varphi(s_\alpha) \Big] \bigg\} ,   (6.269)

with

Z = e^{(\zeta/\theta) + 1} = \sum_\alpha \exp\bigg\{ \frac{1}{\theta} \Big[ \sum_p \sum_{\alpha'} A_p\, \lambda_p(s_\alpha)\, \lambda_p(s_{\alpha'})\, x_{\alpha'} + \varphi(s_\alpha) \Big] \bigg\} .   (6.270)

From the logarithm of $x_\alpha$, we may compute the entropy, and, finally, the free energy:

f(h,\theta) = \frac{1}{2} \sum_p \sum_{\alpha,\alpha'} A_p\, \lambda_p(s_\alpha)\, \lambda_p(s_{\alpha'})\, x_\alpha\, x_{\alpha'} - \theta \ln Z ,   (6.271)

which is to be evaluated at the solution of eqn. 6.267, $\big\{ x^*_\alpha(h,\theta) \big\}$.

$^{15}$It needn't be an equally spaced sequence, for example.
6.13.2 Mean Field Approximation
We now derive a mean field approximation in the spirit of that used in the Ising model above. We write

\lambda_p(\sigma) = \langle \lambda_p(\sigma) \rangle + \delta\lambda_p(\sigma) ,   (6.272)

and abbreviate $\bar\lambda_p = \langle \lambda_p(\sigma) \rangle$, the thermodynamic average of $\lambda_p(\sigma)$ on any given site. We then have

\lambda_p(\sigma)\, \lambda_p(\sigma') = \bar\lambda_p^2 + \bar\lambda_p\, \delta\lambda_p(\sigma) + \bar\lambda_p\, \delta\lambda_p(\sigma') + \delta\lambda_p(\sigma)\, \delta\lambda_p(\sigma')   (6.273)
 = -\bar\lambda_p^2 + \bar\lambda_p \big[ \lambda_p(\sigma) + \lambda_p(\sigma') \big] + \delta\lambda_p(\sigma)\, \delta\lambda_p(\sigma') .   (6.274)

The product $\delta\lambda_p(\sigma)\, \delta\lambda_p(\sigma')$ is of second order in fluctuations, and we neglect it. This leads us to the mean field Hamiltonian,

\hat H_{\rm MF} = +\frac{1}{2}\, N \hat J(0) \sum_p A_p\, \bar\lambda_p^2 - \sum_i \Big[ \hat J(0) \sum_p A_p\, \bar\lambda_p\, \lambda_p(\sigma_i) + \Phi(\sigma_i) \Big] .   (6.275)

The free energy is then

f\big(\{\bar\lambda_p\}, h, \theta\big) = \frac{1}{2} \sum_p A_p\, \bar\lambda_p^2 - \theta \ln \sum_\alpha \exp\bigg\{ \frac{1}{\theta} \Big[ \sum_p A_p\, \bar\lambda_p\, \lambda_p(s_\alpha) + \varphi(s_\alpha) \Big] \bigg\} .   (6.276)

The variational parameters are the mean field values $\big\{ \bar\lambda_p \big\}$.

The single site probabilities $\{x_\alpha\}$ are then

x_\alpha = \frac{1}{Z} \exp\bigg\{ \frac{1}{\theta} \Big[ \sum_p A_p\, \bar\lambda_p\, \lambda_p(s_\alpha) + \varphi(s_\alpha) \Big] \bigg\} ,   (6.277)

with $Z$ implied by the normalization $\sum_\alpha x_\alpha = 1$. These results reproduce exactly what we found in eqn. 6.267, since the mean field equation here, $\partial f/\partial \bar\lambda_p = 0$, yields

\bar\lambda_p = \sum_{\alpha=1}^{K} \lambda_p(s_\alpha)\, x_\alpha .   (6.278)

The free energy is immediately found to be

f(h,\theta) = \frac{1}{2} \sum_p A_p\, \bar\lambda_p^2 - \theta \ln Z ,   (6.279)

which again agrees with what we found using the variational density matrix.

Thus, whether one extremizes with respect to the set $\{x_1, \ldots, x_K, \zeta\}$ (with $\zeta$ the Lagrange multiplier), or with respect to the set $\{\bar\lambda_p\}$, the results are the same, in terms of all these parameters, as well as the free energy $f(h,\theta)$. Generically, both approaches may be termed 'mean field theory' since the variational density matrix corresponds to a mean field which acts on each site independently.
6.14 Appendix II : Blume-Capel Model
The Blume-Capel model provides a simple and convenient way to model systems with vacancies. The simplest version of the model is written

\hat H = -\frac{1}{2} \sum_{i,j} J_{ij}\, S_i\, S_j + \Delta \sum_i S_i^2 .   (6.280)

The spin variables $S_i$ range over the values $\{-1, 0, +1\}$, so this is an extension of the $S = 1$ Ising model. We explicitly separate out the diagonal terms, writing $J_{ii} \equiv 0$, and placing them in the second term on the RHS above. We say that site $i$ is occupied if $S_i = \pm 1$ and vacant if $S_i = 0$, and we identify $\Delta$ as the vacancy creation energy, which may be positive or negative, depending on whether vacancies are disfavored or favored in our system.

We make the mean field Ansatz, writing $S_i = m + \delta S_i$. This results in the mean field Hamiltonian,

\hat H_{\rm MF} = \frac{1}{2}\, N \hat J(0)\, m^2 - \hat J(0)\, m \sum_i S_i + \Delta \sum_i S_i^2 .   (6.281)

Once again, we adimensionalize, writing $f \equiv F/N \hat J(0)$, $\theta = k_{\rm B} T/\hat J(0)$, and $\delta = \Delta/\hat J(0)$. We assume $\hat J(0) > 0$. The free energy per site is then

f(\theta, \delta, m) = \frac{1}{2}\, m^2 - \theta \ln\Big( 1 + 2\, e^{-\delta/\theta} \cosh(m/\theta) \Big) .   (6.282)

Extremizing with respect to $m$, we obtain the mean field equation,

m = \frac{2 \sinh(m/\theta)}{\exp(\delta/\theta) + 2 \cosh(m/\theta)} .   (6.283)

Note that $m = 0$ is always a solution. Finding the slope of the RHS at $m = 0$ and setting it to unity gives us the critical temperature:

\theta_c = \frac{2}{\exp(\delta/\theta_c) + 2} .   (6.284)

This is an implicit equation for $\theta_c$ in terms of the vacancy energy $\delta$.
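Although implicit, eqn. 6.284 is easily solved by fixed-point iteration. A short sketch (not from the original text; the sample $\delta$ values are arbitrary), which also confirms two limits: $\theta_c = \frac{2}{3}$ at $\delta = 0$, and $\theta_c \to 1$ for large negative $\delta$, the two-state Ising limit discussed below.

```python
import math

def theta_c(delta, th=0.5, iters=1000):
    # fixed-point iteration of eqn 6.284: theta_c = 2/(exp(delta/theta_c) + 2)
    for _ in range(iters):
        th = 2.0/(math.exp(delta/th) + 2.0)
    return th

print(theta_c(0.0))     # 2/3, since exp(0) = 1 makes the fixed point exact
print(theta_c(-8.0))    # close to 1: vacancies suppressed, two-state Ising limit
```

The iteration converges because the map has slope of magnitude less than one at the fixed point throughout the second order portion of the phase boundary.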
Let's now expand the free energy in terms of the magnetization $m$. We find, to fourth order,

f = -\theta \ln\Big( 1 + 2\, e^{-\delta/\theta} \Big) + \frac{1}{2\theta} \bigg( \theta - \frac{2}{2 + \exp(\delta/\theta)} \bigg)\, m^2
  + \frac{1}{12\,\theta^3} \Big( 2 + \exp(\delta/\theta) \Big)^{-1} \bigg( \frac{6}{2 + \exp(\delta/\theta)} - 1 \bigg)\, m^4 + \ldots .   (6.285)

Note that setting the coefficient of the $m^2$ term to zero yields the equation for $\theta_c$. However, upon further examination, we see that the coefficient of the $m^4$ term can also vanish. As we have seen, when both the coefficients of the $m^2$ and the $m^4$ terms vanish, we have a tricritical point$^{16}$. Setting both coefficients to zero, we obtain

\theta_t = \frac{1}{3} \quad , \quad \delta_t = \frac{2}{3} \ln 2 .   (6.286)

$^{16}$We should really check that the coefficient of the sixth order term is positive, but that is left as an exercise to the eager student.
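The tricritical values of eqn. 6.286 can be checked directly against the expansion coefficients of eqn. 6.285. A sketch (not part of the original text):

```python
import math

th_t = 1.0/3.0
d_t = (2.0/3.0)*math.log(2.0)
x = math.exp(d_t/th_t)                  # exp(delta_t/theta_t) = 4

a2 = (th_t - 2.0/(2.0 + x))/(2.0*th_t)                 # coefficient of m^2
a4 = (6.0/(2.0 + x) - 1.0)/(12.0*th_t**3*(2.0 + x))    # coefficient of m^4
print(x, a2, a4)   # both coefficients vanish (up to roundoff)
```

At $\exp(\delta_t/\theta_t) = 4$ the quadratic coefficient gives $\theta_t = \frac{2}{6} = \frac{1}{3}$ and the quartic factor $6/(2+4) - 1$ vanishes simultaneously.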
Figure 6.19: Mean field phase diagram for the Blume-Capel model. The black dot signifies a tricritical point, where the coefficients of $m^2$ and $m^4$ in the Landau free energy expansion both vanish. The dashed curve denotes a first order transition, and the solid curve a second order transition. The thin dotted line is the continuation of the $\theta_c(\delta)$ relation to zero temperature.
At $\theta = 0$, it is easy to see we have a first order transition, simply by comparing the energies of the paramagnetic ($S_i = 0$) and ferromagnetic ($S_i = +1$ or $S_i = -1$) states. We have

\frac{E_{\rm MF}}{N \hat J(0)} = \begin{cases} 0 & {\rm if}\ m = 0 \\ \delta - \frac{1}{2} & {\rm if}\ m = \pm 1 . \end{cases}   (6.287)

These results are in fact exact, and not only valid for the mean field theory. Mean field theory is approximate because it neglects fluctuations, but at zero temperature, there are no fluctuations to neglect!

The phase diagram is shown in fig. 6.19. Note that for $\delta$ large and negative, vacancies are strongly disfavored, hence the only allowed states on each site have $S_i = \pm 1$, which is our old friend the two-state Ising model. Accordingly, the phase boundary there approaches the vertical line $\theta_c = 1$, which is the mean field transition temperature for the two-state Ising model.
6.15 Appendix III : Ising Antiferromagnet in an External Field
Consider the following model:

\hat H = J \sum_{\langle ij \rangle} \sigma_i\, \sigma_j - H \sum_i \sigma_i ,   (6.288)

with $J > 0$ and $\sigma_i = \pm 1$. We've solved for the mean field phase diagram of the Ising ferromagnet; what happens if the interactions are antiferromagnetic?

It turns out that under certain circumstances, the ferromagnet and the antiferromagnet behave exactly the same in terms of their phase diagram, response functions, etc. This occurs when $H = 0$, and when the interactions are between nearest neighbors on a bipartite lattice. A bipartite lattice is one which can be divided into two sublattices, which we call A and B, such that an A site has only B neighbors, and a B site has only A neighbors. The square, honeycomb, and body centered cubic (BCC) lattices are bipartite. The triangular and face centered cubic lattices are non-bipartite. Now if the lattice is bipartite and the interaction matrix $J_{ij}$ is nonzero only when $i$ and $j$ are from different sublattices (they needn't be nearest neighbors only), then we can simply redefine the spin variables such that

\sigma'_j = \begin{cases} +\sigma_j & {\rm if}\ j \in {\rm A} \\ -\sigma_j & {\rm if}\ j \in {\rm B} . \end{cases}   (6.289)

Then $\sigma_i\, \sigma_j = -\sigma'_i\, \sigma'_j$, and in terms of the new spin variables the exchange constant has reversed. The thermodynamic properties are invariant under such a redefinition of the spin variables.

We can see why this trick doesn't work in the presence of a magnetic field, because the field $H$ would have to be reversed on the B sublattice. In other words, the thermodynamics of an Ising ferromagnet on a bipartite lattice in a uniform applied field is identical to that of the Ising antiferromagnet, with the same exchange constant (in magnitude), in the presence of a staggered field $H_{\rm A} = +H$ and $H_{\rm B} = -H$.

We treat this problem using the variational density matrix method, using two independent variational parameters $m_{\rm A}$ and $m_{\rm B}$ for the two sublattices:

\varrho_{\rm A}(\sigma) = \frac{1 + m_{\rm A}}{2}\, \delta_{\sigma,1} + \frac{1 - m_{\rm A}}{2}\, \delta_{\sigma,-1}   (6.290)

\varrho_{\rm B}(\sigma) = \frac{1 + m_{\rm B}}{2}\, \delta_{\sigma,1} + \frac{1 - m_{\rm B}}{2}\, \delta_{\sigma,-1} .   (6.291)

With the usual adimensionalization, $f = F/N z J$, $\theta = k_{\rm B} T/zJ$, and $h = H/zJ$, we have the free energy

f(m_{\rm A}, m_{\rm B}) = \frac{1}{2}\, m_{\rm A}\, m_{\rm B} - \frac{1}{2}\, h\, (m_{\rm A} + m_{\rm B}) - \frac{1}{2}\, \theta\, s(m_{\rm A}) - \frac{1}{2}\, \theta\, s(m_{\rm B}) ,   (6.292)

where the entropy function is

s(m) = -\bigg[ \frac{1+m}{2} \ln\Big( \frac{1+m}{2} \Big) + \frac{1-m}{2} \ln\Big( \frac{1-m}{2} \Big) \bigg] .   (6.293)

Note that

\frac{ds}{dm} = -\frac{1}{2} \ln\Big( \frac{1+m}{1-m} \Big) \quad , \quad \frac{d^2\!s}{dm^2} = -\frac{1}{1 - m^2} .   (6.294)
Figure 6.20: Graphical solution to the mean field equations for the Ising antiferromagnet in an external field, here for $\theta = 0.6$. Clockwise from upper left: (a) $h = 0.1$, (b) $h = 0.5$, (c) $h = 1.1$, (d) $h = 1.4$.
Differentiating $f(m_{\rm A}, m_{\rm B})$ with respect to the variational parameters, we obtain two coupled mean field equations:

\frac{\partial f}{\partial m_{\rm A}} = 0 \quad \Longrightarrow \quad m_{\rm B} = h - \frac{\theta}{2} \ln\Big( \frac{1 + m_{\rm A}}{1 - m_{\rm A}} \Big)   (6.295)

\frac{\partial f}{\partial m_{\rm B}} = 0 \quad \Longrightarrow \quad m_{\rm A} = h - \frac{\theta}{2} \ln\Big( \frac{1 + m_{\rm B}}{1 - m_{\rm B}} \Big) .   (6.296)

Recognizing $\tanh^{-1}(x) = \frac{1}{2} \ln\big[ (1+x)/(1-x) \big]$, we may write these equations in an equivalent but perhaps more suggestive form:

m_{\rm A} = \tanh\Big( \frac{h - m_{\rm B}}{\theta} \Big) \quad , \quad m_{\rm B} = \tanh\Big( \frac{h - m_{\rm A}}{\theta} \Big) .   (6.297)

In other words, the A sublattice sites see an internal field $H_{\rm A,int} = -z J m_{\rm B}$ from their B neighbors, and the B sublattice sites see an internal field $H_{\rm B,int} = -z J m_{\rm A}$ from their A neighbors.
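The coupled equations 6.297 are also easy to iterate numerically. A minimal sketch (not from the original text; the values $\theta = 0.6$, $h = 0.5$ correspond to panel (b) of fig. 6.20, where broken symmetry solutions exist), using damped fixed-point iteration from an antiferromagnetic initial guess:

```python
import math

def solve(h, th, mA=0.9, mB=-0.9, iters=2000):
    # damped iteration of eqns 6.297 for the two sublattice magnetizations
    for _ in range(iters):
        mA_new = math.tanh((h - mB)/th)
        mB_new = math.tanh((h - mA)/th)
        mA = 0.5*(mA + mA_new)
        mB = 0.5*(mB + mB_new)
    return mA, mB

mA, mB = solve(h=0.5, th=0.6)
print(mA, mB)   # a broken symmetry solution with mA != mB
```

Starting instead from $(m_{\rm A}, m_{\rm B}) = (-0.9, +0.9)$ gives the degenerate partner solution with the sublattices exchanged.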
Figure 6.21: Mean field phase diagram for the Ising antiferromagnet in an external field. The phase diagram is symmetric under reflection in the $h = 0$ axis.
We can solve these equations graphically, as in fig. 6.20. Note that there is always a paramagnetic solution with $m_{\rm A} = m_{\rm B} = m$, where

m = h - \frac{\theta}{2} \ln\Big( \frac{1+m}{1-m} \Big) \quad \Longleftrightarrow \quad m = \tanh\Big( \frac{h - m}{\theta} \Big) .   (6.298)

However, we can see from the figure that there will be three solutions to the mean field equations provided that $\theta < 1 - m^2$ at the point of the solution where $m_{\rm A} = m_{\rm B} = m$. This gives us two equations with which to eliminate $m_{\rm A}$ and $m_{\rm B}$, resulting in the curve

h^*(\theta) = m + \frac{\theta}{2} \ln\Big( \frac{1+m}{1-m} \Big) \quad {\rm with} \quad m = \sqrt{1 - \theta} .   (6.299)

Thus, for $\theta < 1$ and $|h| < h^*(\theta)$ there are three solutions to the mean field equations. As is usually the case, the broken symmetry solutions, which mean those for which $m_{\rm A} \neq m_{\rm B}$ in our case, are of lower energy than the symmetric solution(s). We show the curve $h^*(\theta)$ in fig. 6.21.
We can make additional progress by defining the average and staggered magnetizations $m$ and $m_{\rm s}$,

m \equiv \frac{1}{2}\, (m_{\rm A} + m_{\rm B}) \quad , \quad m_{\rm s} \equiv \frac{1}{2}\, (m_{\rm A} - m_{\rm B}) .   (6.300)

We expand the free energy in terms of $m_{\rm s}$:

f(m, m_{\rm s}) = \frac{1}{2}\, m^2 - \frac{1}{2}\, m_{\rm s}^2 - hm - \frac{1}{2}\, \theta\, s(m + m_{\rm s}) - \frac{1}{2}\, \theta\, s(m - m_{\rm s})
 = \frac{1}{2}\, m^2 - hm - \theta\, s(m) - \frac{1}{2} \big[ 1 + \theta\, s''(m) \big]\, m_{\rm s}^2 - \frac{1}{24}\, \theta\, s''''(m)\, m_{\rm s}^4 + \ldots .   (6.301)
The term quadratic in $m_{\rm s}$ vanishes when $\theta\, s''(m) = -1$, i.e. when $m = \sqrt{1 - \theta}$. It is easy to obtain

\frac{d^3\!s}{dm^3} = -\frac{2m}{(1 - m^2)^2} \quad , \quad \frac{d^4\!s}{dm^4} = -\frac{2\,(1 + 3m^2)}{(1 - m^2)^3} ,   (6.302)

from which we learn that the coefficient of the quartic term, $-\frac{1}{24}\, \theta\, s''''(m)$, never vanishes. Therefore the transition remains second order down to $\theta = 0$, where it finally becomes first order.
We can confirm the $\theta \to 0$ limit directly. The two competing states are the ferromagnet, with $m_{\rm A} = m_{\rm B} = \pm 1$, and the antiferromagnet, with $m_{\rm A} = -m_{\rm B} = \pm 1$. The free energies of these states are

f_{\rm FM} = \frac{1}{2} - h \quad , \quad f_{\rm AFM} = -\frac{1}{2} .   (6.303)

There is a first order transition when $f_{\rm FM} = f_{\rm AFM}$, which yields $h = 1$.
6.16 Appendix IV : Canted Quantum Antiferromagnet
Consider the following model for quantum $S = \frac{1}{2}$ spins:

\hat H = \sum_{\langle ij \rangle} \Big[ -J \big( \sigma^x_i\, \sigma^x_j + \sigma^y_i\, \sigma^y_j \big) + \Delta\, \sigma^z_i\, \sigma^z_j \Big] + \frac{1}{4}\, K \sum_{\langle ijkl \rangle} \sigma^z_i\, \sigma^z_j\, \sigma^z_k\, \sigma^z_l ,   (6.304)

where $\sigma_i$ is the vector of Pauli matrices on site $i$. The spins live on a square lattice. The second sum is over all square plaquettes. All the constants $J$, $\Delta$, and $K$ are positive.

Let's take a look at the Hamiltonian for a moment. The $J$ term clearly wants the spins to align ferromagnetically in the $(x, y)$ plane (in internal spin space). The $\Delta$ term prefers antiferromagnetic alignment along the $z$ axis. The $K$ term discourages any kind of moment along $z$ and works against the $\Delta$ term. We'd like our mean field theory to capture the physics behind this competition.

Accordingly, we break up the square lattice into two interpenetrating $\sqrt{2} \times \sqrt{2}$ square sublattices (each rotated by $45^\circ$ with respect to the original), in order to be able to describe an antiferromagnetic state. In addition, we include a parameter $\alpha$ which describes the canting angle that the spins on these sublattices make with respect to the $x$-axis. That is, we write

\varrho_{\rm A} = \frac{1}{2} + \frac{1}{2}\, m\, \big( \sin\alpha\, \sigma^x + \cos\alpha\, \sigma^z \big)   (6.305)

\varrho_{\rm B} = \frac{1}{2} + \frac{1}{2}\, m\, \big( \sin\alpha\, \sigma^x - \cos\alpha\, \sigma^z \big) .   (6.306)

Note that ${\rm Tr}\, \varrho_{\rm A} = {\rm Tr}\, \varrho_{\rm B} = 1$, so these density matrices are normalized. Note also that the mean direction for a spin on the A and B sublattices is given by

m_{\rm A,B} = {\rm Tr}\, (\varrho_{\rm A,B}\, \sigma) = \pm\, m \cos\alpha\, \hat z + m \sin\alpha\, \hat x .   (6.307)
Thus, when $\alpha = 0$, the system is an antiferromagnet with its staggered moment lying along the $z$ axis. When $\alpha = \frac{\pi}{2}$, the system is a ferromagnet with its moment lying along the $x$ axis.

Finally, the eigenvalues of $\varrho_{\rm A,B}$ are still $\lambda_\pm = \frac{1}{2}(1 \pm m)$, hence

{\rm Tr}\, (\varrho_{\rm A} \ln \varrho_{\rm A}) = {\rm Tr}\, (\varrho_{\rm B} \ln \varrho_{\rm B}) = -s(m) ,   (6.308)

where

s(m) = -\bigg[ \frac{1+m}{2} \ln\Big( \frac{1+m}{2} \Big) + \frac{1-m}{2} \ln\Big( \frac{1-m}{2} \Big) \bigg] .   (6.309)

Note that we have taken $m_{\rm A} = m_{\rm B} = m$, unlike the case of the antiferromagnet in a uniform field. The reason is that there remains in our model a symmetry between the A and B sublattices.

The free energy is now easily calculated:

F = {\rm Tr}\, (\varrho \hat H) + k_{\rm B} T\, {\rm Tr}\, (\varrho \ln \varrho)
 = -2N \big( J \sin^2\!\alpha + \Delta \cos^2\!\alpha \big)\, m^2 + \frac{1}{4}\, N K\, m^4 \cos^4\!\alpha - N k_{\rm B} T\, s(m)   (6.310)

We can adimensionalize by defining $\delta \equiv \Delta/J$, $\kappa \equiv K/4J$, and $\theta \equiv k_{\rm B} T/4J$. Then the free energy per site $f \equiv F/4NJ$ is

f(m, \alpha) = -\frac{1}{2}\, m^2 + \frac{1}{2}\, \big( 1 - \delta \big)\, m^2 \cos^2\!\alpha + \frac{1}{4}\, \kappa\, m^4 \cos^4\!\alpha - \theta\, s(m) .   (6.311)

There are two variational parameters: $m$ and $\alpha$. We thus obtain two coupled mean field equations,

\frac{\partial f}{\partial m} = 0 = -m + \big( 1 - \delta \big)\, m \cos^2\!\alpha + \kappa\, m^3 \cos^4\!\alpha + \frac{\theta}{2} \ln\Big( \frac{1+m}{1-m} \Big)   (6.312)

\frac{\partial f}{\partial \alpha} = 0 = \big( \delta - 1 - \kappa\, m^2 \cos^2\!\alpha \big)\, m^2 \sin\alpha \cos\alpha .   (6.313)

Let's start with the second of the mean field equations. Assuming $m \neq 0$, it is clear from eqn. 6.313 that

\cos^2\!\alpha = \begin{cases} 0 & {\rm if}\ \delta < 1 \\ (\delta - 1)/\kappa m^2 & {\rm if}\ 1 \le \delta \le 1 + \kappa m^2 \\ 1 & {\rm if}\ \delta \ge 1 + \kappa m^2 . \end{cases}   (6.314)

Suppose $\delta < 1$. Then we have $\cos\alpha = 0$ and the first mean field equation yields the familiar result

m = \tanh\big( m/\theta \big) .   (6.315)
Figure 6.22: Mean field phase diagram for the model of eqn. 6.304 for the case $\kappa = 1$.
For $\delta < 1$, then, we have the usual ferromagnet-paramagnet transition at $\theta_c = 1$.

For $1 < \delta < 1 + \kappa m^2$ we have canting with an angle

\alpha = \alpha^*(m) = \cos^{-1} \sqrt{ \frac{\delta - 1}{\kappa m^2} } .   (6.316)

Substituting this into the first mean field equation, we once again obtain the relation $m = \tanh\big( m/\theta \big)$. However, eventually, as $\theta$ is increased, the magnetization will dip below the value $m_0 \equiv \sqrt{ (\delta - 1)/\kappa }$. This occurs at a dimensionless temperature

\theta_0 = \frac{m_0}{\tanh^{-1}(m_0)} < 1 \quad ; \quad m_0 = \sqrt{ \frac{\delta - 1}{\kappa} } .   (6.317)

For $\theta > \theta_0$, we have $\delta > 1 + \kappa m^2$, and we must take $\cos^2\!\alpha = 1$. The first mean field equation then becomes

\delta m - \kappa m^3 = \frac{\theta}{2} \ln\Big( \frac{1+m}{1-m} \Big) ,   (6.318)

or, equivalently, $m = \tanh\big( (\delta m - \kappa m^3)/\theta \big)$. A simple graphical analysis shows that a nontrivial solution exists provided $\theta < \delta$. Since $\cos\alpha = \pm 1$, this solution describes an antiferromagnet, with $m_{\rm A} = \pm m \hat z$ and $m_{\rm B} = \mp m \hat z$. The resulting mean field phase diagram is then as depicted in fig. 6.22.
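The canting regimes of eqn. 6.314 and the antiferromagnetic mean field equation 6.318 can be explored numerically. A sketch (not from the original text; the parameter point $\delta = 2$, $\kappa = 1$, $\theta = 1.2$ is an arbitrary choice with $\theta < \delta$):

```python
import math

def solve_m(delta, kappa, th, m=0.9, iters=500):
    # fixed-point iteration of m = tanh((delta*m - kappa*m^3)/theta), eqn 6.318
    for _ in range(iters):
        m = math.tanh((delta*m - kappa*m**3)/th)
    return m

def cos2_alpha(delta, kappa, m):
    # the three canting regimes of eqn 6.314
    if delta < 1.0:
        return 0.0
    if delta > 1.0 + kappa*m*m:
        return 1.0
    return (delta - 1.0)/(kappa*m*m)

delta, kappa, th = 2.0, 1.0, 1.2
m = solve_m(delta, kappa, th)
print(m, cos2_alpha(delta, kappa, m))
```

At this point the converged $m$ satisfies $\delta > 1 + \kappa m^2$, so $\cos^2\!\alpha = 1$ and the solution is the uncanted antiferromagnet.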
6.17 Appendix V : Coupled Order Parameters
Consider the Landau free energy

f(m, \phi) = \frac{1}{2}\, a_m\, m^2 + \frac{1}{4}\, b_m\, m^4 + \frac{1}{2}\, a_\phi\, \phi^2 + \frac{1}{4}\, b_\phi\, \phi^4 + \frac{1}{2}\, \Lambda\, m^2 \phi^2 .   (6.319)

We write

a_m \equiv \alpha_m\, \theta_m \quad , \quad a_\phi \equiv \alpha_\phi\, \theta_\phi ,   (6.320)

where

\theta_m = \frac{T - T_{c,m}}{T_0} \quad , \quad \theta_\phi = \frac{T - T_{c,\phi}}{T_0} ,   (6.321)

where $T_0$ is some temperature scale. We assume without loss of generality that $T_{c,m} > T_{c,\phi}$. We begin by rescaling:

m \equiv \Big( \frac{\alpha_m}{b_m} \Big)^{1/2} \tilde m \quad , \quad \phi \equiv \Big( \frac{\alpha_\phi}{b_\phi} \Big)^{1/2} \tilde\phi .   (6.322)

We then have

f = \varepsilon_0 \bigg[ r \Big( \frac{1}{2}\, \theta_m\, \tilde m^2 + \frac{1}{4}\, \tilde m^4 \Big) + r^{-1} \Big( \frac{1}{2}\, \theta_\phi\, \tilde\phi^2 + \frac{1}{4}\, \tilde\phi^4 \Big) + \frac{1}{2}\, \lambda\, \tilde m^2\, \tilde\phi^2 \bigg] ,   (6.323)

where

\varepsilon_0 = \frac{\alpha_m\, \alpha_\phi}{(b_m\, b_\phi)^{1/2}} \quad , \quad r = \frac{\alpha_m}{\alpha_\phi} \Big( \frac{b_\phi}{b_m} \Big)^{1/2} \quad , \quad \lambda = \frac{\Lambda}{(b_m\, b_\phi)^{1/2}} .   (6.324)

It proves convenient to perform one last rescaling, writing

\tilde m \equiv r^{-1/4}\, m \quad , \quad \tilde\phi \equiv r^{1/4}\, \phi .   (6.325)

Then

f = \varepsilon_0 \bigg[ \frac{1}{2}\, q\, \theta_m\, m^2 + \frac{1}{4}\, m^4 + \frac{1}{2}\, q^{-1}\, \theta_\phi\, \phi^2 + \frac{1}{4}\, \phi^4 + \frac{1}{2}\, \lambda\, m^2 \phi^2 \bigg] ,   (6.326)

where

q = \sqrt{r} = \Big( \frac{\alpha_m}{\alpha_\phi} \Big)^{1/2} \Big( \frac{b_\phi}{b_m} \Big)^{1/4} .   (6.327)
Note that we may write

f(m, \phi) = \frac{\varepsilon_0}{4} \begin{pmatrix} m^2 & \phi^2 \end{pmatrix} \begin{pmatrix} 1 & \lambda \\ \lambda & 1 \end{pmatrix} \begin{pmatrix} m^2 \\ \phi^2 \end{pmatrix} + \frac{\varepsilon_0}{2} \begin{pmatrix} m^2 & \phi^2 \end{pmatrix} \begin{pmatrix} q\, \theta_m \\ q^{-1}\, \theta_\phi \end{pmatrix} .

The eigenvalues of the above $2 \times 2$ matrix are $1 \pm \lambda$, with corresponding eigenvectors $\binom{1}{\pm 1}$. Since $m^2$ and $\phi^2$ are both nonnegative, we are only interested in the first eigenvector $\binom{1}{1}$, corresponding to the eigenvalue $1 + \lambda$. Clearly when $\lambda < -1$ the free energy is unbounded from below, which is unphysical.

We now set

\frac{\partial f}{\partial m} = 0 \quad , \quad \frac{\partial f}{\partial \phi} = 0 ,   (6.328)

and identify four possible phases:

Phase I : $m = 0$, $\phi = 0$. The free energy is $f_{\rm I} = 0$.
Phase II : $m \neq 0$ with $\phi = 0$. The free energy is

f = \frac{\varepsilon_0}{2} \Big( q\, \theta_m\, m^2 + \frac{1}{2}\, m^4 \Big) ,   (6.329)

hence we require $\theta_m < 0$ in this phase, in which case

m_{\rm II} = \sqrt{ -q\, \theta_m } \quad , \quad f_{\rm II} = -\frac{\varepsilon_0}{4}\, q^2\, \theta_m^2 .   (6.330)

Phase III : $m = 0$ with $\phi \neq 0$. The free energy is

f = \frac{\varepsilon_0}{2} \Big( q^{-1}\, \theta_\phi\, \phi^2 + \frac{1}{2}\, \phi^4 \Big) ,   (6.331)

hence we require $\theta_\phi < 0$ in this phase, in which case

\phi_{\rm III} = \sqrt{ -q^{-1}\, \theta_\phi } \quad , \quad f_{\rm III} = -\frac{\varepsilon_0}{4}\, q^{-2}\, \theta_\phi^2 .   (6.332)

Phase IV : $m \neq 0$ and $\phi \neq 0$. Varying $f$ yields

\begin{pmatrix} 1 & \lambda \\ \lambda & 1 \end{pmatrix} \begin{pmatrix} m^2 \\ \phi^2 \end{pmatrix} = \begin{pmatrix} -q\, \theta_m \\ -q^{-1}\, \theta_\phi \end{pmatrix} ,   (6.333)

with solution

m^2 = \frac{q\, \theta_m - \lambda\, q^{-1}\, \theta_\phi}{\lambda^2 - 1}   (6.334)

\phi^2 = \frac{q^{-1}\, \theta_\phi - \lambda\, q\, \theta_m}{\lambda^2 - 1} .   (6.335)

Since $m^2$ and $\phi^2$ must each be nonnegative, phase IV exists only over a yet-to-be-determined subset of the entire parameter space. The free energy is

f_{\rm IV} = \frac{\big( q^2\, \theta_m^2 + q^{-2}\, \theta_\phi^2 - 2\lambda\, \theta_m\, \theta_\phi \big)\, \varepsilon_0}{4\, (\lambda^2 - 1)} .   (6.336)
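The phase IV stationary point and the closed form of eqn. 6.336 can be checked against a brute force minimization of eqn. 6.326. A sketch in Python (not from the original text; the parameter point $\theta_m = -1$, $\theta_\phi = 0.5$, $q = 1$, $\lambda = -0.8$ is an arbitrary choice in regime (2) below, and $\varepsilon_0$ is set to 1):

```python
def f(m2, p2, thm, thp, q, lam):
    # eqn 6.326 with eps_0 = 1, viewed as a function of m^2 and phi^2
    return (0.5*q*thm*m2 + 0.25*m2**2
            + 0.5*thp*p2/q + 0.25*p2**2 + 0.5*lam*m2*p2)

thm, thp, q, lam = -1.0, 0.5, 1.0, -0.8

# phase IV solution, eqns 6.334-6.335, and the closed form of eqn 6.336
m2 = (q*thm - lam*thp/q)/(lam**2 - 1.0)
p2 = (thp/q - lam*q*thm)/(lam**2 - 1.0)
fIV = f(m2, p2, thm, thp, q, lam)
fIV_closed = (q**2*thm**2 + thp**2/q**2 - 2*lam*thm*thp)/(4*(lam**2 - 1.0))
fII = -0.25*q**2*thm**2

# brute force minimum of f over a grid in the physical quadrant m2, p2 >= 0
best = min(f(0.005*i, 0.005*j, thm, thp, q, lam)
           for i in range(601) for j in range(601))
print(m2, p2, fIV, fIV_closed, fII, best)
```

Here $f_{\rm IV} \approx -0.3125 < f_{\rm II} = -0.25$, the closed form agrees with direct evaluation, and the grid minimum reproduces the analytic stationary value.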
We now define $\theta \equiv \theta_m$ and $\tau \equiv \theta_\phi - \theta_m = (T_{c,m} - T_{c,\phi})/T_0$. Note that $\tau > 0$. There are three possible temperature ranges to consider.

(1) $\theta_\phi > \theta_m > 0$. The only possible phases are I and IV. For phase IV, we must impose the conditions $m^2 > 0$ and $\phi^2 > 0$. If $\lambda^2 > 1$, then the numerators in eqns. 6.334 and 6.335 must each be positive:

\lambda < \frac{q^2\, \theta_m}{\theta_\phi} \quad , \quad \lambda < \frac{\theta_\phi}{q^2\, \theta_m} \quad \Longrightarrow \quad \lambda < \min\bigg( \frac{q^2\, \theta_m}{\theta_\phi}\ ,\ \frac{\theta_\phi}{q^2\, \theta_m} \bigg) .   (6.337)

But since either $q^2\, \theta_m/\theta_\phi$ or its inverse must be less than or equal to unity, this requires $\lambda < -1$, which is unphysical.

If on the other hand we assume $\lambda^2 < 1$, the non-negativeness of $m^2$ and $\phi^2$ requires

\lambda > \frac{q^2\, \theta_m}{\theta_\phi} \quad , \quad \lambda > \frac{\theta_\phi}{q^2\, \theta_m} \quad \Longrightarrow \quad \lambda > \max\bigg( \frac{q^2\, \theta_m}{\theta_\phi}\ ,\ \frac{\theta_\phi}{q^2\, \theta_m} \bigg) > 1 .   (6.338)

Thus, $\lambda > 1$ and we have a contradiction.

Therefore, the only allowed phase for $\theta > 0$ is phase I.
(2) $\theta_\phi > 0 > \theta_m$. Now the possible phases are I, II, and IV. We can immediately rule out phase I because $f_{\rm II} < f_{\rm I}$. To compare phases II and IV, we compute

\Delta f = f_{\rm IV} - f_{\rm II} = \frac{\big( \lambda\, q\, \theta_m - q^{-1}\, \theta_\phi \big)^2\, \varepsilon_0}{4\, (\lambda^2 - 1)} .   (6.339)

Thus, phase II has the lower energy if $\lambda^2 > 1$. For $\lambda^2 < 1$, phase IV has the lower energy, but the conditions $m^2 > 0$ and $\phi^2 > 0$ then entail

\frac{q^2\, \theta_m}{\theta_\phi} < \lambda < \frac{\theta_\phi}{q^2\, \theta_m} \quad \Longrightarrow \quad q^2\, |\theta_m| > \theta_\phi > 0 .   (6.340)

Thus, $\lambda$ is restricted to the range

\lambda \in \bigg[ -1\ ,\ -\frac{\theta_\phi}{q^2\, |\theta_m|} \bigg] .   (6.341)

With $\theta_m \equiv \theta < 0$ and $\theta_\phi \equiv \theta + \tau > 0$, the condition $q^2\, |\theta_m| > \theta_\phi$ is found to be

-\tau < \theta < -\frac{\tau}{q^2 + 1} .   (6.342)

Thus, phase IV exists and has lower energy when

-\tau < \theta < -\frac{\tau}{r + 1} \qquad {\rm and} \qquad -1 < \lambda < \frac{\theta + \tau}{r\, \theta} ,   (6.343)

where $r = q^2$.
(3) $0 > \theta_\phi > \theta_m$. In this regime, any phase is possible, however once again phase I can be ruled out since phases II and III are of lower free energy. The condition that phase II have lower free energy than phase III is

f_{\rm II} - f_{\rm III} = -\frac{\varepsilon_0}{4} \Big( q^2\, \theta_m^2 - q^{-2}\, \theta_\phi^2 \Big) < 0 ,   (6.344)

i.e. $|\theta_\phi| < r\, |\theta_m|$, which means $r\, |\theta| > |\theta| - \tau$. If $r > 1$ this is true for all $\theta < 0$, while if $r < 1$ phase II is lower in energy only for $|\theta| < \tau/(1-r)$.
We next need to test whether phase IV has an even lower energy than the lower of phases II and III. We have

f_{\rm IV} - f_{\rm II} = \frac{\big( \lambda\, q\, \theta_m - q^{-1}\, \theta_\phi \big)^2\, \varepsilon_0}{4\, (\lambda^2 - 1)}   (6.345)

f_{\rm IV} - f_{\rm III} = \frac{\big( q\, \theta_m - \lambda\, q^{-1}\, \theta_\phi \big)^2\, \varepsilon_0}{4\, (\lambda^2 - 1)} .   (6.346)

In both cases, phase IV can only be the true thermodynamic phase if $\lambda^2 < 1$. We then require $m^2 > 0$ and $\phi^2 > 0$, which fixes

\lambda \in \bigg[ -1\ ,\ \min\bigg( \frac{q^2\, \theta_m}{\theta_\phi}\ ,\ \frac{\theta_\phi}{q^2\, \theta_m} \bigg) \bigg] .   (6.347)

The upper limit will be the first term inside the rounded brackets if $q^2\, |\theta_m| < |\theta_\phi|$, i.e. if $r\, |\theta| < |\theta| - \tau$. This is impossible if $r > 1$, hence the upper limit is given by the second term in the rounded brackets:

r > 1 \ : \quad \lambda \in \bigg( -1\ ,\ \frac{\theta + \tau}{r\, \theta} \bigg) \qquad \mbox{(condition for phase IV)} .   (6.348)

If $r < 1$, then the upper limit will be $q^2\, \theta_m/\theta_\phi = r\theta/(\theta + \tau)$ if $|\theta| > \tau/(1-r)$, and will be $\theta_\phi/(q^2\, \theta_m) = (\theta + \tau)/(r\theta)$ if $|\theta| < \tau/(1-r)$.

r < 1\ ,\ -\frac{\tau}{1-r} < \theta < -\tau \ : \quad \lambda \in \bigg( -1\ ,\ \frac{\theta + \tau}{r\, \theta} \bigg) \qquad \mbox{(phase IV)}   (6.349)

r < 1\ ,\ \theta < -\frac{\tau}{1-r} \ : \quad \lambda \in \bigg( -1\ ,\ \frac{r\, \theta}{\theta + \tau} \bigg) \qquad \mbox{(phase IV)} .   (6.350)

Representative phase diagrams for the cases $r > 1$ and $r < 1$ are shown in fig. 6.23.
Figure 6.23: Phase diagram for $\tau = 0.5$, $r = 1.5$ (top) and $\tau = 0.5$, $r = 0.25$ (bottom). The hatched purple region is unphysical, with a free energy unbounded from below. The blue lines denote second order transitions. The thick red line separating phases II and III is a first order line.
Chapter 7
Nonequilibrium Phenomena
7.1 References
H. Smith and H. H. Jensen, Transport Phenomena (Oxford, 1989)
An outstanding, thorough, and pellucid presentation of the theory of Boltzmann trans-
port in classical and quantum systems.
E. M. Lifshitz and L. P. Pitaevskii, Physical Kinetics (Pergamon, 1981)
Volume 10 in the famous Landau and Lifshitz Course of Theoretical Physics. Surpris-
ingly readable, and with many applications (some advanced).
F. Reif, Fundamentals of Statistical and Thermal Physics (McGraw-Hill, 1987)
This has been perhaps the most popular undergraduate text since it first appeared in
1967, and with good reason. The later chapters discuss transport phenomena at an
undergraduate level.
N. G. Van Kampen, Stochastic Processes in Physics and Chemistry (3rd edition, North-Holland, 2007)
This is a very readable and useful text. A relaxed but meaty presentation.
R. Balescu, Equilibrium and Nonequilibrium Statistical Mechanics (Wiley, 1975)
An advanced text, but one with a lot of physically motivated discussion. A large
fraction of the book is dedicated to nonequilibrium statistical mechanics.
M. Kardar, Statistical Physics of Particles (Cambridge, 2007)
A superb modern text, with many insightful presentations of key concepts.
L. E. Reichl, A Modern Course in Statistical Physics (2nd edition, Wiley, 1998)
A comprehensive graduate level text with an emphasis on nonequilibrium phenomena.
J. A. McLennan, Introduction to Non-equilibrium Statistical Mechanics (Prentice-
Hall, 1989)
A detailed modern text on the Boltzmann equation.
7.2 Equilibrium, Nonequilibrium and Local Equilibrium
Classical equilibrium statistical mechanics is described by the full $N$-body distribution,

f(x_1, \ldots, x_N\, ;\, p_1, \ldots, p_N) = \begin{cases} Z_N^{-1} \cdot \frac{1}{N!}\; e^{-\beta \hat H_N(p, x)} & {\rm OCE} \\ \Xi^{-1} \cdot \frac{1}{N!}\; e^{\beta \mu N}\, e^{-\beta \hat H_N(p, x)} & {\rm GCE} . \end{cases}   (7.1)

We assume a Hamiltonian of the form

\hat H_N = \sum_{i=1}^{N} \frac{p_i^2}{2m} + \sum_{i=1}^{N} v(x_i) + \sum_{i<j}^{N} u(x_i - x_j) ,   (7.2)

typically with $v = 0$, i.e. only two-body interactions. The quantity

f(x_1, \ldots, x_N\, ;\, p_1, \ldots, p_N)\; \frac{d^d\!x_1\, d^d\!p_1}{h^d} \cdots \frac{d^d\!x_N\, d^d\!p_N}{h^d}   (7.3)

is the probability of finding $N$ particles in the system, with particle \#1 lying within $d^d\!x_1$ of $x_1$ and having momentum within $d^d\!p_1$ of $p_1$, etc. The temperature $T$ and chemical potential $\mu$ are constants, independent of position. Note that $f(\{x_i\}, \{p_i\})$ is dimensionless.
Nonequilibrium statistical mechanics seeks to describe thermodynamic systems which are out of equilibrium, meaning that the distribution function is not given by the Boltzmann distribution above. For a general nonequilibrium setting, it is hopeless to make progress: we'd have to integrate the equations of motion for all the constituent particles. However, typically we are concerned with situations where external forces or constraints are imposed over some macroscopic scale. Examples would include the imposition of a voltage drop across a metal, or a temperature differential across any thermodynamic sample. In such cases, scattering at microscopic length and time scales described by the mean free path and the collision time work to establish local equilibrium throughout the system. A local equilibrium is a state described by a space and time varying temperature $T(r, t)$ and chemical potential $\mu(r, t)$. As we will see, the Boltzmann distribution with $T = T(r, t)$ and $\mu = \mu(r, t)$ will not be a solution to the evolution equation governing the distribution function. Rather, the distribution for systems slightly out of equilibrium will be of the form $f = f^0 + \delta f$, where $f^0$ describes a state of local equilibrium.
We will mainly be interested in the one-body distribution

f(r, p, t) = h^d \Big\langle \sum_{i=1}^{N} \delta(x_i - r)\; \delta(p_i - p) \Big\rangle   (7.4)
 = N \int \prod_{i=2}^{N} \frac{d^d\!x_i\, d^d\!p_i}{h^d}\; f(r, x_2, \ldots, x_N\, ;\, p, p_2, \ldots, p_N) .

This is also dimensionless. It is the density of particles in phase space, where phase space volumes are measured in units of $h^d$. In the GCE, we sum the RHS above over $N$. Assuming $v = 0$ so that there is no one-body potential to break translational symmetry, the equilibrium distribution is time-independent and space-independent:

f^0(r, p) = \begin{cases} n\, \lambda_T^3\; e^{-p^2/2m k_{\rm B} T} & {\rm OCE} \\ e^{\mu/k_{\rm B} T}\; e^{-p^2/2m k_{\rm B} T} & {\rm GCE} . \end{cases}   (7.5)

From the one-body distribution we can compute things like the particle current, $j$, and the energy current, $j_\varepsilon$:

j(r) = \int \frac{d^d\!p}{h^d}\; f(r, p)\, \frac{p}{m}   (7.6)

j_\varepsilon(r) = \int \frac{d^d\!p}{h^d}\; f(r, p)\, \varepsilon(p)\, \frac{p}{m} ,   (7.7)

where $\varepsilon(p) = p^2/2m$. Clearly these currents both vanish in equilibrium, when $f = f^0$, since $f^0(r, p)$ depends only on $p^2$ and not on the direction of $p$.
When the individual particles are not point particles, they possess angular momentum as well as linear momentum. Following Lifshitz and Pitaevskii, we abbreviate $\Gamma = (p, L)$ for these two variables for the case of diatomic molecules, and $\Gamma = (p, L, \hat n \cdot \hat L)$ in the case of spherical top molecules, where $\hat n$ is the symmetry axis of the top. We then have, in $d = 3$ dimensions,

d\Gamma = \begin{cases} h^{-3}\; d^3\!p & \mbox{point particles} \\ h^{-5}\; d^3\!p\; L\, dL\, d\Omega_L & \mbox{diatomic molecules} \\ h^{-6}\; d^3\!p\; L^2\, dL\, d\Omega_L\, d\cos\vartheta & \mbox{symmetric tops} , \end{cases}   (7.8)

where $\vartheta = \cos^{-1}(\hat n \cdot \hat L)$. We will call the set $\Gamma$ the 'kinematic variables'. The instantaneous number density at $r$ is then

n(r, t) = \int d\Gamma\; f(r, \Gamma, t) .   (7.9)
7.3 Boltzmann Equation
For simplicity of presentation, we assume point particles. Recall that

f(r, p, t)\; \frac{d^3\!r\, d^3\!p}{h^3} \equiv \left\{ \mbox{\# of particles with positions within $d^3\!r$ of $r$ and momenta within $d^3\!p$ of $p$ at time $t$.} \right\}   (7.10)

We now ask how the distribution function $f(r, p, t)$ evolves in time. It is clear that in the absence of collisions, the distribution function must satisfy the continuity equation,

\frac{\partial f}{\partial t} + \nabla \cdot (u f) = 0 .   (7.11)
This is just the condition of number conservation for particles. Take care to note that $\nabla$ and $u$ are six-dimensional phase space vectors:

u = \big( \dot x\, ,\, \dot y\, ,\, \dot z\, ,\, \dot p_x\, ,\, \dot p_y\, ,\, \dot p_z \big)   (7.12)

\nabla = \Big( \frac{\partial}{\partial x}\, ,\, \frac{\partial}{\partial y}\, ,\, \frac{\partial}{\partial z}\, ,\, \frac{\partial}{\partial p_x}\, ,\, \frac{\partial}{\partial p_y}\, ,\, \frac{\partial}{\partial p_z} \Big) .   (7.13)

The continuity equation describes a distribution in which each constituent particle evolves according to a prescribed dynamics, which for a mechanical system is specified by

\frac{dr}{dt} = \frac{\partial H}{\partial p} = v(p) \quad , \quad \frac{dp}{dt} = -\frac{\partial H}{\partial r} = F_{\rm ext} ,   (7.14)

where $F$ is an external applied force. Here,

H(p, r) = \varepsilon(p) + U_{\rm ext}(r) .   (7.15)

For example, if the particles are under the influence of gravity, then $U_{\rm ext}(r) = m g \cdot r$ and $F = -\nabla U_{\rm ext} = -m g$.
Note that as a consequence of the dynamics, we have $\nabla \cdot u = 0$, i.e. phase space flow is incompressible, provided that $\varepsilon(p)$ is a function of $p$ alone, and not of $r$. Thus, in the absence of collisions, we have

\frac{\partial f}{\partial t} + u \cdot \nabla f = 0 .   (7.16)

The differential operator $D_t \equiv \partial_t + u \cdot \nabla$ is sometimes called the 'convective derivative', because $D_t f$ is the time derivative of $f$ in a comoving frame of reference.

Next we must consider the effect of collisions, which are not accounted for by the semiclassical dynamics. In a collision process, a particle with momentum $p$ and one with momentum $p_1$ can instantaneously convert into a pair with momenta $p'$ and $p'_1$, provided total momentum is conserved: $p + p_1 = p' + p'_1$. This means that $D_t f \neq 0$. Rather, we should write

\frac{\partial f}{\partial t} + \dot r \cdot \frac{\partial f}{\partial r} + \dot p \cdot \frac{\partial f}{\partial p} = \bigg( \frac{\partial f}{\partial t} \bigg)_{\!\rm coll}   (7.17)

where the right side is known as the collision integral. The collision integral is in general a function of $r$, $p$, and $t$ and a functional of the distribution $f$.

After a trivial rearrangement of terms, we can write the Boltzmann equation as

\frac{\partial f}{\partial t} = \bigg( \frac{\partial f}{\partial t} \bigg)_{\!\rm str} + \bigg( \frac{\partial f}{\partial t} \bigg)_{\!\rm coll} ,   (7.18)

where

\bigg( \frac{\partial f}{\partial t} \bigg)_{\!\rm str} \equiv -\dot r \cdot \frac{\partial f}{\partial r} - \dot p \cdot \frac{\partial f}{\partial p}   (7.19)

is known as the streaming term. Thus, there are two contributions to $\partial f/\partial t$ : streaming and collisions.
7.3.1 Collisionless Boltzmann equation
In the absence of collisions, the Boltzmann equation is given by

\frac{\partial f}{\partial t} + \frac{\partial \varepsilon}{\partial p} \cdot \frac{\partial f}{\partial r} - \frac{\partial U_{\rm ext}}{\partial r} \cdot \frac{\partial f}{\partial p} = 0 .   (7.20)

In order to gain some intuition about how the streaming term affects the evolution of the distribution $f(r, p, t)$, consider a case where $F_{\rm ext} = 0$. We then have

\frac{\partial f}{\partial t} + \frac{p}{m} \cdot \frac{\partial f}{\partial r} = 0 .   (7.21)

Clearly, then, any function of the form

f(r, p, t) = \varphi\Big( r - \frac{p\, t}{m}\; ,\; p \Big)   (7.22)

will be a solution to the collisionless Boltzmann equation. One possible solution would be the Boltzmann distribution,

f(r, p, t) = e^{\mu/k_{\rm B} T}\; e^{-p^2/2m k_{\rm B} T} ,   (7.23)

which is time-independent$^1$.

For a slightly less trivial example, let the initial distribution be $\varphi(r, p) = A\, e^{-r^2/2\sigma^2}\, e^{-p^2/2\kappa^2}$, so that

f(r, p, t) = A\; e^{-\left( r - \frac{p t}{m} \right)^2/2\sigma^2}\; e^{-p^2/2\kappa^2} .   (7.24)

Consider the one-dimensional version, and rescale position, momentum, and time so that

f(\bar x, \bar p, \bar t) = A\; e^{-\frac{1}{2} (\bar x - \bar p\, \bar t)^2}\; e^{-\frac{1}{2} \bar p^2} .   (7.25)

Consider the level sets of $f$, where $f(\bar x, \bar p, \bar t) = A\, e^{-\frac{1}{2} \alpha^2}$. The equation for these sets is

\bar x = \bar p\, \bar t \pm \sqrt{ \alpha^2 - \bar p^2 } .   (7.26)

For fixed $\bar t$, these level sets describe the loci in phase space of equal probability densities, with the probability density decreasing exponentially in the parameter $\alpha$. For $\bar t = 0$, the initial distribution describes a Gaussian cloud of particles with a Gaussian momentum distribution. As $\bar t$ increases, the distribution widens in $\bar x$ but not in $\bar p$: each particle moves with a constant momentum, so the set of momentum values never changes. However, the level sets in the $(\bar x, \bar p)$ plane become elliptical, with a semimajor axis oriented at an angle $\theta = {\rm ctn}^{-1}(\bar t)$ with respect to the $\bar x$ axis. For $\bar t > 0$, the particles at the outer edges of the cloud are more likely to be moving away from the center. See the sketches in fig. 7.1.

Suppose we add in a constant external force $F_{\rm ext}$. Then it is easy to show (and left as an exercise to the reader to prove) that any function of the form

f(r, p, t) = A\, \varphi\Big( r - \frac{p\, t}{m} + \frac{F_{\rm ext}\, t^2}{2m}\; ,\; p - F_{\rm ext}\, t \Big)   (7.27)

satisfies the collisionless Boltzmann equation.

$^1$Indeed, any arbitrary function of $p$ alone would be a solution. Ultimately, we require some energy exchanging processes, such as collisions, in order for any initial nonequilibrium distribution to converge to the Boltzmann distribution.
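The exercise can also be checked numerically with finite differences. A sketch (not part of the original text; the sample values of $m$, $F_{\rm ext}$, and the probe point are arbitrary, and a Gaussian is used for the profile function), verifying in one dimension that eqn. 7.27 gives a vanishing residual for $\partial_t f + (p/m)\, \partial_x f + F\, \partial_p f$:

```python
import math

m, F, A = 2.0, 0.5, 1.0

def f(x, p, t):
    # 1D version of eqn 7.27 with a Gaussian profile for the function A(.,.)
    a = x - p*t/m + F*t*t/(2.0*m)
    b = p - F*t
    return A*math.exp(-0.5*a*a)*math.exp(-0.5*b*b)

x0, p0, t0, dx = 0.7, -0.4, 1.3, 1e-5
dfdt = (f(x0, p0, t0 + dx) - f(x0, p0, t0 - dx))/(2*dx)
dfdx = (f(x0 + dx, p0, t0) - f(x0 - dx, p0, t0))/(2*dx)
dfdp = (f(x0, p0 + dx, t0) - f(x0, p0 - dx, t0))/(2*dx)
res = dfdt + (p0/m)*dfdx + F*dfdp
print(res)   # vanishes up to finite difference error
```

Note that both arguments of $\varphi$ in eqn. 7.27 are constants of the single-particle motion under a constant force, which is why the composition solves the streaming equation.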
Figure 7.1: Level sets for a sample $f(\bar x, \bar p, \bar t) = A\, e^{-\frac{1}{2} (\bar x - \bar p \bar t)^2}\, e^{-\frac{1}{2} \bar p^2}$, for values $f = A\, e^{-\frac{1}{2} \alpha^2}$ with $\alpha$ in equally spaced intervals from $\alpha = 0.2$ (red) to $\alpha = 1.2$ (blue).
7.3.2 Collisional invariants
Consider a function $A(r, p)$ of position and momentum. Its average value at time $t$ is

\langle A(t) \rangle = \int \frac{d^3\!r\, d^3\!p}{h^3}\; A(r, p)\, f(r, p, t) .   (7.28)

Taking the time derivative,

\frac{d\langle A \rangle}{dt} = \int \frac{d^3\!r\, d^3\!p}{h^3}\; A(r, p)\, \frac{\partial f}{\partial t}
 = \int \frac{d^3\!r\, d^3\!p}{h^3}\; A(r, p) \bigg[ -\frac{\partial}{\partial r} \cdot (\dot r f) - \frac{\partial}{\partial p} \cdot (\dot p f) + \bigg( \frac{\partial f}{\partial t} \bigg)_{\!\rm coll} \bigg]
 = \int \frac{d^3\!r\, d^3\!p}{h^3} \bigg[ \bigg( \frac{\partial A}{\partial r} \cdot \frac{dr}{dt} + \frac{\partial A}{\partial p} \cdot \frac{dp}{dt} \bigg) f + A(r, p) \bigg( \frac{\partial f}{\partial t} \bigg)_{\!\rm coll} \bigg] .   (7.29)
Hence, if $A$ is preserved by the dynamics between collisions, then²
\[
\frac{dA}{dt} = \frac{\partial A}{\partial \mathbf{r}}\cdot\frac{d\mathbf{r}}{dt} + \frac{\partial A}{\partial \mathbf{p}}\cdot\frac{d\mathbf{p}}{dt} = 0\ . \tag{7.30}
\]
We therefore have that the rate of change of $\langle A\rangle$ is determined wholly by the collision integral:
\[
\frac{d\langle A\rangle}{dt} = \int\!\frac{d^3\!r\, d^3\!p}{h^3}\; A(\mathbf{r},\mathbf{p})\,\bigg(\frac{\partial f}{\partial t}\bigg)_{\!\rm coll}\ . \tag{7.31}
\]
Quantities which are conserved in the collisions satisfy $\dot{\langle A\rangle} = 0$. Such quantities are called collisional invariants. Examples of collisional invariants include the particle number ($A = 1$), the components of the total momentum ($A = p_\mu$, in the absence of broken translational invariance, due e.g. to the presence of walls), and the total energy ($A = \varepsilon(\mathbf{p})$).
7.3.3 Scattering processes

What sort of processes contribute to the collision integral? There are two broad classes to consider. The first involves potential scattering, where a particle in state $|\Gamma\rangle$ scatters, in the presence of an external potential, to a state $|\Gamma'\rangle$. Recall that $\Gamma$ is an abbreviation for the set of kinematic variables, e.g. $\Gamma = (\mathbf{p},\mathbf{L})$ in the case of a diatomic molecule. For point particles, $\Gamma = (p_x,p_y,p_z)$ and $d\Gamma = d^3\!p/h^3$.

We now define the function $w(\Gamma'\,|\,\Gamma)$ such that
\[
w(\Gamma'\,|\,\Gamma)\; f(\mathbf{r},\Gamma)\; d\Gamma' = \textrm{rate at which a particle at } (\mathbf{r},\Gamma) \textrm{ scatters } |\Gamma\rangle \to |\Gamma'\rangle \textrm{ within } d\Gamma' \textrm{ of } (\mathbf{r},\Gamma') \textrm{ at time } t. \tag{7.32}
\]
The units of $w$ are therefore $[w] = L^3/T$. The differential scattering cross section for particle scattering is then
\[
d\sigma = \frac{w(\Gamma'\,|\,\Gamma)}{|\mathbf{v}|}\; d\Gamma'\ , \tag{7.33}
\]
where $\mathbf{v} = \mathbf{p}/m$ is the particle's velocity.
The second class is that of two-particle scattering processes, i.e. $|\Gamma\,\Gamma_1\rangle \to |\Gamma'\,\Gamma'_1\rangle$. We define the scattering function $w(\Gamma'\Gamma'_1\,|\,\Gamma\,\Gamma_1)$ by
\[
w(\Gamma'\Gamma'_1\,|\,\Gamma\,\Gamma_1)\; f(\mathbf{r},\Gamma)\, f(\mathbf{r},\Gamma_1)\; d\Gamma_1\, d\Gamma'\, d\Gamma'_1 = \textrm{rate at which a particle at } (\mathbf{r},\Gamma) \textrm{ scatters with a particle within } d\Gamma_1 \textrm{ of } (\mathbf{r},\Gamma_1) \textrm{ into a region within } d\Gamma'\,d\Gamma'_1 \textrm{ of } |\Gamma'\,\Gamma'_1\rangle \textrm{ at time } t. \tag{7.34}
\]
² Recall from classical mechanics the definition of the Poisson bracket, $\{A,B\} = \frac{\partial A}{\partial \mathbf{r}}\cdot\frac{\partial B}{\partial \mathbf{p}} - \frac{\partial B}{\partial \mathbf{r}}\cdot\frac{\partial A}{\partial \mathbf{p}}$. Then from Hamilton's equations $\dot{\mathbf{r}} = \frac{\partial H}{\partial \mathbf{p}}$ and $\dot{\mathbf{p}} = -\frac{\partial H}{\partial \mathbf{r}}$, where $H(\mathbf{p},\mathbf{r},t)$ is the Hamiltonian, we have $\frac{dA}{dt} = \{A,H\}$. Invariants have zero Poisson bracket with the Hamiltonian.
Figure 7.2: Left: single-particle scattering process $|\Gamma\rangle \to |\Gamma'\rangle$. Right: two-particle scattering process $|\Gamma\,\Gamma_1\rangle \to |\Gamma'\,\Gamma'_1\rangle$.
The differential scattering cross section is then
\[
d\sigma = \frac{w(\Gamma'\Gamma'_1\,|\,\Gamma\,\Gamma_1)}{|\mathbf{v} - \mathbf{v}_1|}\; d\Gamma'_1\ . \tag{7.35}
\]
We assume, in both cases, that any scattering occurs locally, i.e. the particles attain their asymptotic kinematic states on distance scales small compared to the mean interparticle separation. In this case we can treat each scattering process independently. This assumption is particular to rarefied systems, i.e. gases, and is not appropriate for dense liquids. The two types of scattering processes are depicted in fig. 7.2.
In computing the collision integral for the state $|\mathbf{r},\Gamma\rangle$, we must take care to sum over contributions from transitions out of this state, i.e. $|\Gamma\rangle \to |\Gamma'\rangle$, which reduce $f(\mathbf{r},\Gamma)$, and transitions into this state, i.e. $|\Gamma'\rangle \to |\Gamma\rangle$, which increase $f(\mathbf{r},\Gamma)$. Thus, for one-body scattering, we have
\[
\frac{D}{Dt}\, f(\mathbf{r},\Gamma,t) = \bigg(\frac{\partial f}{\partial t}\bigg)_{\!\rm coll} = \int\! d\Gamma'\; \Big\{ w(\Gamma\,|\,\Gamma')\, f(\mathbf{r},\Gamma',t) - w(\Gamma'\,|\,\Gamma)\, f(\mathbf{r},\Gamma,t) \Big\}\ . \tag{7.36}
\]
For two-body scattering, we have
\[
\begin{aligned}
\frac{D}{Dt}\, f(\mathbf{r},\Gamma,t) = \bigg(\frac{\partial f}{\partial t}\bigg)_{\!\rm coll} = \int\! d\Gamma_1\!\int\! d\Gamma'\!\int\! d\Gamma'_1\; \Big\{ &\, w(\Gamma\,\Gamma_1\,|\,\Gamma'\Gamma'_1)\; f(\mathbf{r},\Gamma',t)\, f(\mathbf{r},\Gamma'_1,t)\\
&- w(\Gamma'\Gamma'_1\,|\,\Gamma\,\Gamma_1)\; f(\mathbf{r},\Gamma,t)\, f(\mathbf{r},\Gamma_1,t) \Big\}\ .
\end{aligned} \tag{7.37}
\]
7.3.4 Detailed balance

Classical mechanics places some restrictions on the form of the kernel $w(\Gamma\,\Gamma_1\,|\,\Gamma'\Gamma'_1)$. In particular, if $\Gamma^T = (-\mathbf{p},-\mathbf{L})$ denotes the kinematic variables under time reversal, then
\[
w(\Gamma'\Gamma'_1\,|\,\Gamma\,\Gamma_1) = w(\Gamma^T\Gamma_1^T\,|\,\Gamma'^T\Gamma_1'^T)\ . \tag{7.38}
\]
This is because the time reverse of the process $|\Gamma\,\Gamma_1\rangle \to |\Gamma'\,\Gamma'_1\rangle$ is $|\Gamma'^T\Gamma_1'^T\rangle \to |\Gamma^T\Gamma_1^T\rangle$.
In equilibrium, we must have
\[
w(\Gamma'\Gamma'_1\,|\,\Gamma\,\Gamma_1)\; f^0(\Gamma)\, f^0(\Gamma_1)\; d^4\Gamma = w(\Gamma^T\Gamma_1^T\,|\,\Gamma'^T\Gamma_1'^T)\; f^0(\Gamma'^T)\, f^0(\Gamma_1'^T)\; d^4\Gamma^T \tag{7.39}
\]
where
\[
d^4\Gamma \equiv d\Gamma\, d\Gamma_1\, d\Gamma'\, d\Gamma'_1\ ,\qquad d^4\Gamma^T \equiv d\Gamma^T\, d\Gamma_1^T\, d\Gamma'^T\, d\Gamma_1'^T\ . \tag{7.40}
\]
Since $d\Gamma = d\Gamma^T$ etc., we may cancel the differentials above, and after invoking eqn. 7.38 and suppressing the common $\mathbf{r}$ label, we find
\[
f^0(\Gamma)\, f^0(\Gamma_1) = f^0(\Gamma'^T)\, f^0(\Gamma_1'^T)\ . \tag{7.41}
\]
This is the condition of detailed balance. For the Boltzmann distribution, we have
\[
f^0(\Gamma) = A\, e^{-\varepsilon/k_B T}\ , \tag{7.42}
\]
where $A$ is a constant and where $\varepsilon = \varepsilon(\Gamma)$ is the kinetic energy, e.g. $\varepsilon(\Gamma) = \mathbf{p}^2/2m$ in the case of point particles. Note that $\varepsilon(\Gamma^T) = \varepsilon(\Gamma)$. Detailed balance is satisfied because the kinematics of the collision requires energy conservation:
\[
\varepsilon + \varepsilon_1 = \varepsilon' + \varepsilon'_1\ . \tag{7.43}
\]
Since momentum is also kinematically conserved, i.e.
\[
\mathbf{p} + \mathbf{p}_1 = \mathbf{p}' + \mathbf{p}'_1\ , \tag{7.44}
\]
any distribution of the form
\[
f^0(\Gamma) = A\, e^{-(\varepsilon - \mathbf{p}\cdot\mathbf{V})/k_B T} \tag{7.45}
\]
also satisfies detailed balance, for any velocity parameter $\mathbf{V}$. This distribution is appropriate for gases which are flowing with average particle velocity $\mathbf{V}$.
In addition to time-reversal, parity is also a symmetry of the microscopic mechanical laws. Under the parity operation $P$, we have $\mathbf{r} \to -\mathbf{r}$ and $\mathbf{p} \to -\mathbf{p}$. Note that a pseudovector such as $\mathbf{L} = \mathbf{r}\times\mathbf{p}$ is unchanged under $P$. Thus, $\Gamma^P = (-\mathbf{p},\mathbf{L})$. Under the combined operation of $C = PT$, we have $\Gamma^C = (\mathbf{p},-\mathbf{L})$. If the microscopic Hamiltonian is invariant under $C$, then we must have
\[
w(\Gamma'\Gamma'_1\,|\,\Gamma\,\Gamma_1) = w(\Gamma^C\Gamma_1^C\,|\,\Gamma'^C\Gamma_1'^C)\ . \tag{7.46}
\]
For point particles, invariance under $T$ and $P$ then means
\[
w(\mathbf{p}',\mathbf{p}'_1\,|\,\mathbf{p},\mathbf{p}_1) = w(\mathbf{p},\mathbf{p}_1\,|\,\mathbf{p}',\mathbf{p}'_1)\ , \tag{7.47}
\]
and therefore the collision integral takes the simplified form,
\[
\frac{D f(\mathbf{p})}{Dt} = \bigg(\frac{\partial f}{\partial t}\bigg)_{\!\rm coll} = \int\!\frac{d^3\!p_1}{h^3}\!\int\!\frac{d^3\!p'}{h^3}\!\int\!\frac{d^3\!p'_1}{h^3}\; w(\mathbf{p}',\mathbf{p}'_1\,|\,\mathbf{p},\mathbf{p}_1)\; \Big\{ f(\mathbf{p}')\, f(\mathbf{p}'_1) - f(\mathbf{p})\, f(\mathbf{p}_1) \Big\}\ , \tag{7.48}
\]
where we have suppressed both $\mathbf{r}$ and $t$ variables.
The most general statement of detailed balance is
\[
\frac{f^0(\Gamma')\, f^0(\Gamma'_1)}{f^0(\Gamma)\, f^0(\Gamma_1)} = \frac{w(\Gamma'\Gamma'_1\,|\,\Gamma\,\Gamma_1)}{w(\Gamma\,\Gamma_1\,|\,\Gamma'\Gamma'_1)}\ . \tag{7.49}
\]
Under this condition, the collision term vanishes for $f = f^0$, which is the equilibrium distribution.
7.4 H-Theorem

Let's consider the Boltzmann equation with two-particle collisions. We define the local (i.e. $\mathbf{r}$-dependent) quantity
\[
H_\varphi(\mathbf{r},t) \equiv \int\! d\Gamma\; \varphi(f)\, f\ . \tag{7.50}
\]
At this point, $\varphi(f)$ is arbitrary. Note that the $\varphi(f)$ factor has $\mathbf{r}$ and $t$ dependence through its dependence on $f$, which itself is a function of $\mathbf{r}$, $\Gamma$, and $t$. We now compute
\[
\begin{aligned}
\frac{\partial H_\varphi}{\partial t} &= \int\! d\Gamma\; \frac{\partial(\varphi f)}{\partial t} = \int\! d\Gamma\; \frac{d(\varphi f)}{df}\,\frac{\partial f}{\partial t}\\
&= -\nabla\!\cdot\!\int\! d\Gamma\; \mathbf{u}\,(\varphi f) + \int\! d\Gamma\; \frac{d(\varphi f)}{df}\,\bigg(\frac{\partial f}{\partial t}\bigg)_{\!\rm coll}\\
&= -\oint\! d\Sigma\; \hat{\mathbf{n}}\cdot(\mathbf{u}\,\varphi f) + \int\! d\Gamma\; \frac{d(\varphi f)}{df}\,\bigg(\frac{\partial f}{\partial t}\bigg)_{\!\rm coll}\ .
\end{aligned} \tag{7.51}
\]
The first term on the last line follows from the divergence theorem, and vanishes if we assume $f = 0$ for infinite values of the kinematic variables, which is the only physical possibility. Thus, the rate of change of $H_\varphi$ is entirely due to the collision term. Thus,
\[
\begin{aligned}
\frac{\partial H_\varphi}{\partial t} &= \int\! d\Gamma\!\int\! d\Gamma_1\!\int\! d\Gamma'\!\int\! d\Gamma'_1\; \Big\{ w(\Gamma\,\Gamma_1\,|\,\Gamma'\Gamma'_1)\, f'f'_1\,\chi - w(\Gamma'\Gamma'_1\,|\,\Gamma\,\Gamma_1)\, ff_1\,\chi \Big\}\\
&= \int\! d\Gamma\!\int\! d\Gamma_1\!\int\! d\Gamma'\!\int\! d\Gamma'_1\; w(\Gamma'\Gamma'_1\,|\,\Gamma\,\Gamma_1)\; ff_1\,\big(\chi' - \chi\big)\ ,
\end{aligned} \tag{7.52}
\]
where $f \equiv f(\Gamma)$, $f' \equiv f(\Gamma')$, $f_1 \equiv f(\Gamma_1)$, $f'_1 \equiv f(\Gamma'_1)$, and $\chi = \chi(\Gamma)$, with
\[
\chi = \frac{d(\varphi f)}{df} = \varphi + f\,\frac{d\varphi}{df}\ . \tag{7.53}
\]
We now invoke the symmetry
\[
w(\Gamma'\,\Gamma'_1\,|\,\Gamma\,\Gamma_1) = w(\Gamma'_1\,\Gamma'\,|\,\Gamma_1\,\Gamma)\ , \tag{7.54}
\]
which allows us to write
\[
\frac{\partial H_\varphi}{\partial t} = \tfrac{1}{2}\!\int\! d\Gamma\!\int\! d\Gamma_1\!\int\! d\Gamma'\!\int\! d\Gamma'_1\; w(\Gamma'\Gamma'_1\,|\,\Gamma\,\Gamma_1)\; ff_1\,\big( \chi' + \chi'_1 - \chi - \chi_1 \big)\ . \tag{7.55}
\]
Now let us consider $\varphi(f) = \ln f$. We define $H \equiv H_{\varphi=\ln f}$. We then have
\[
\frac{\partial H}{\partial t} = -\tfrac{1}{2}\!\int\! d\Gamma\!\int\! d\Gamma_1\!\int\! d\Gamma'\!\int\! d\Gamma'_1\; w\, f'f'_1\; x\ln x\ , \tag{7.56}
\]
where $w \equiv w(\Gamma'\Gamma'_1\,|\,\Gamma\,\Gamma_1)$ and $x \equiv ff_1/f'f'_1$. We next invoke the result
\[
\int\! d\Gamma'\!\int\! d\Gamma'_1\; w(\Gamma'\Gamma'_1\,|\,\Gamma\,\Gamma_1) = \int\! d\Gamma'\!\int\! d\Gamma'_1\; w(\Gamma\,\Gamma_1\,|\,\Gamma'\Gamma'_1)\ , \tag{7.57}
\]
which is a statement of unitarity of the scattering matrix³. Multiplying both sides by $f(\Gamma)\, f(\Gamma_1)$, then integrating over $\Gamma$ and $\Gamma_1$, and finally changing variables $(\Gamma,\Gamma_1) \leftrightarrow (\Gamma',\Gamma'_1)$, we find
\[
0 = \int\! d\Gamma\!\int\! d\Gamma_1\!\int\! d\Gamma'\!\int\! d\Gamma'_1\; w\,\big( ff_1 - f'f'_1 \big) = \int\! d\Gamma\!\int\! d\Gamma_1\!\int\! d\Gamma'\!\int\! d\Gamma'_1\; w\, f'f'_1\,(x - 1)\ . \tag{7.58}
\]
Multiplying this result by $\tfrac{1}{2}$ and adding it to the previous equation for $\partial H/\partial t$, we arrive at our final result,
\[
\frac{\partial H}{\partial t} = -\tfrac{1}{2}\!\int\! d\Gamma\!\int\! d\Gamma_1\!\int\! d\Gamma'\!\int\! d\Gamma'_1\; w\, f'f'_1\,\big( x\ln x - x + 1 \big)\ . \tag{7.59}
\]
Note that $w$, $f'$, and $f'_1$ are all nonnegative. It is then easy to prove that the function $g(x) = x\ln x - x + 1$ is nonnegative for all positive $x$ values⁴, which therefore entails the important result
\[
\frac{\partial H(\mathbf{r},t)}{\partial t} \le 0\ . \tag{7.60}
\]
Boltzmann's H function is the space integral of the H density: $\mathsf{H} = \int\! d^3\!r\, H$.
Thus, everywhere in space, the function $H(\mathbf{r},t)$ is monotonically decreasing or constant, due to collisions. In equilibrium, $\dot H = 0$ everywhere, which requires $x = 1$, i.e.
\[
f^0(\Gamma)\, f^0(\Gamma_1) = f^0(\Gamma')\, f^0(\Gamma'_1)\ , \tag{7.61}
\]
or, taking the logarithm,
\[
\ln f^0(\Gamma) + \ln f^0(\Gamma_1) = \ln f^0(\Gamma') + \ln f^0(\Gamma'_1)\ . \tag{7.62}
\]
³ See Lifshitz and Pitaevskii, Physical Kinetics, §2.
⁴ The function $g(x) = x\ln x - x + 1$ satisfies $g'(x) = \ln x$, hence $g'(x) < 0$ on the interval $x \in [0,1)$ and $g'(x) > 0$ on $x \in (1,\infty]$. Thus, $g(x)$ monotonically decreases from $g(0) = 1$ to $g(1) = 0$, and then monotonically increases to $g(\infty) = \infty$, never becoming negative.
But this means that $\ln f^0$ is itself a collisional invariant, and if $1$, $\mathbf{p}$, and $\varepsilon$ are the only collisional invariants, then $\ln f^0$ must be expressible in terms of them. Thus,
\[
\ln f^0 = \frac{\mu}{k_B T} + \frac{\mathbf{V}\cdot\mathbf{p}}{k_B T} - \frac{\varepsilon}{k_B T}\ , \tag{7.63}
\]
where $\mu$, $\mathbf{V}$, and $T$ are constants which parameterize the equilibrium distribution $f^0(\mathbf{p})$, corresponding to the chemical potential, flow velocity, and temperature, respectively.
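The monotone decrease of $H$ can be illustrated with a toy relaxation model in place of the full collision integral (this sketch is an addition to the text; the two-temperature initial condition and the grid parameters are arbitrary choices). Because the initial mixture below has the same density and energy as the target Maxwellian, $H = \int f\ln f\, dv$ decreases monotonically as $f$ relaxes:

```python
import math

# 1D toy model on a velocity grid, in units with m = k_B = 1.
M = 4001
vmax = 12.0
dv = 2 * vmax / (M - 1)
v = [-vmax + i * dv for i in range(M)]

def maxwellian(T):
    return [math.exp(-vi * vi / (2 * T)) / math.sqrt(2 * math.pi * T) for vi in v]

T1, T2 = 0.5, 2.0
f_init = [0.5 * (a + b) for a, b in zip(maxwellian(T1), maxwellian(T2))]
# Target Maxwellian with the same density and the same energy as f_init:
f_eq = maxwellian(0.5 * (T1 + T2))

def H(f):
    # H = int f ln f dv   (points with f = 0 contribute nothing)
    return dv * sum(fi * math.log(fi) for fi in f if fi > 0.0)

# Relaxation-time (BGK-style) evolution with tau = 1:
#   f(t) = f_eq + (f_init - f_eq) exp(-t)
ts = [0.0, 0.2, 0.5, 1.0, 2.0, 5.0]
Hs = [H([fe + (fi - fe) * math.exp(-t) for fi, fe in zip(f_init, f_eq)])
      for t in ts]
print(Hs)
```

Convexity of $f\ln f$, together with the fact that the deviation from $f_{\rm eq}$ carries no net particle number or energy, guarantees the decrease here.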
7.5 Weakly Inhomogeneous Gas

Consider a gas which is only weakly out of equilibrium. We follow the treatment in Lifshitz and Pitaevskii, §6. As the gas is only slightly out of equilibrium, we seek a solution to the Boltzmann equation of the form $f = f^0 + \delta f$, where $f^0$ describes a local equilibrium. Recall that such a distribution function is annihilated by the collision term in the Boltzmann equation but not by the streaming term, hence a correction $\delta f$ must be added in order to obtain a solution.

The most general form of local equilibrium is described by the distribution
\[
f^0(\mathbf{r},\mathbf{p}) = \exp\!\bigg( \frac{\mu - \varepsilon(\mathbf{p}) + \mathbf{V}\cdot\mathbf{p}}{k_B T} \bigg)\ , \tag{7.64}
\]
where $\mu = \mu(\mathbf{r},t)$, $T = T(\mathbf{r},t)$, and $\mathbf{V} = \mathbf{V}(\mathbf{r},t)$ vary in both space and time. Note that
\[
\begin{aligned}
df^0 &= \bigg( d\mu + \mathbf{p}\cdot d\mathbf{V} + (\varepsilon - \mu - \mathbf{V}\cdot\mathbf{p})\,\frac{dT}{T} - d\varepsilon \bigg) \bigg( -\frac{\partial f^0}{\partial \varepsilon} \bigg)\\
&= \bigg( \frac{1}{n}\, dp + \mathbf{p}\cdot d\mathbf{V} + (\varepsilon - h)\,\frac{dT}{T} - d\varepsilon \bigg) \bigg( -\frac{\partial f^0}{\partial \varepsilon} \bigg)
\end{aligned} \tag{7.65}
\]
where we have assumed $\mathbf{V} = 0$ on average, and used
\[
d\mu = \bigg(\frac{\partial \mu}{\partial T}\bigg)_{\!p}\, dT + \bigg(\frac{\partial \mu}{\partial p}\bigg)_{\!T}\, dp = -s\, dT + \frac{1}{n}\, dp\ , \tag{7.66}
\]
where $s$ is the entropy per particle and $n$ is the number density. We have further written $h = \mu + Ts$, which is the enthalpy per particle, given by $h = c_p T$ for an ideal gas. Here, $c_p$ is the heat capacity per particle at constant pressure⁵. Finally, note that when $f^0$ is the Maxwell-Boltzmann distribution, we have
\[
-\frac{\partial f^0}{\partial \varepsilon} = \frac{f^0}{k_B T}\ . \tag{7.67}
\]
⁵ In the chapter on thermodynamics, we adopted a slightly different definition of $c_p$, as the heat capacity per mole. In this chapter, $c_p$ is the heat capacity per particle.
The Boltzmann equation is written
\[
\bigg( \frac{\partial}{\partial t} + \frac{\mathbf{p}}{m}\cdot\frac{\partial}{\partial \mathbf{r}} + \mathbf{F}\cdot\frac{\partial}{\partial \mathbf{p}} \bigg) \big( f^0 + \delta f \big) = \bigg(\frac{\partial f}{\partial t}\bigg)_{\!\rm coll}\ . \tag{7.68}
\]
The RHS of this equation must be of order $\delta f$ because the local equilibrium distribution $f^0$ is annihilated by the collision integral. We therefore wish to evaluate one of the contributions to the LHS of this equation,
\[
\begin{aligned}
\frac{\partial f^0}{\partial t} + \frac{\mathbf{p}}{m}\cdot\frac{\partial f^0}{\partial \mathbf{r}} + \mathbf{F}\cdot\frac{\partial f^0}{\partial \mathbf{p}} = \bigg( -\frac{\partial f^0}{\partial \varepsilon} \bigg) \Bigg\{ &\,\frac{1}{n}\,\frac{\partial p}{\partial t} + \frac{\varepsilon - h}{T}\,\frac{\partial T}{\partial t} + m\, v_\alpha v_\beta\, \frac{\partial V_\beta}{\partial x_\alpha}\\
&+ \mathbf{v}\cdot\bigg( m\,\frac{\partial \mathbf{V}}{\partial t} + \frac{1}{n}\,\nabla p \bigg) + \frac{\varepsilon - h}{T}\, \mathbf{v}\cdot\nabla T - \mathbf{F}\cdot\mathbf{v} \Bigg\}\ .
\end{aligned} \tag{7.69}
\]
To simplify this, first note that Newton's laws applied to an ideal fluid give $\rho\,\dot{\mathbf{V}} = -\nabla p$, where $\rho = mn$ is the mass density. Corrections to this result, e.g. viscosity and nonlinearity in $\mathbf{V}$, are of higher order.

Next, continuity for particle number means $\dot n + \nabla\!\cdot(n\mathbf{V}) = 0$. We assume $\mathbf{V}$ is zero on average and that all derivatives are small, hence $\nabla\!\cdot(n\mathbf{V}) = \mathbf{V}\!\cdot\!\nabla n + n\,\nabla\!\cdot\mathbf{V} \approx n\,\nabla\!\cdot\mathbf{V}$. Thus,
\[
\frac{\partial \ln n}{\partial t} = \frac{\partial \ln p}{\partial t} - \frac{\partial \ln T}{\partial t} = -\nabla\!\cdot\mathbf{V}\ , \tag{7.70}
\]
where we have invoked the ideal gas law $n = p/k_B T$ above.
Next, we invoke conservation of entropy. If $s$ is the entropy per particle, then $ns$ is the entropy per unit volume, in which case we have the continuity equation
\[
\frac{\partial (ns)}{\partial t} + \nabla\!\cdot(ns\mathbf{V}) = n\,\bigg( \frac{\partial s}{\partial t} + \mathbf{V}\!\cdot\!\nabla s \bigg) + s\,\bigg( \frac{\partial n}{\partial t} + \nabla\!\cdot(n\mathbf{V}) \bigg) = 0\ . \tag{7.71}
\]
The second bracketed term on the RHS vanishes because of particle continuity, leaving us with $\dot s + \mathbf{V}\!\cdot\!\nabla s \approx \dot s = 0$ (since $\mathbf{V} = 0$ on average, and any gradient is first order in smallness).
Now thermodynamics says
\[
ds = \bigg(\frac{\partial s}{\partial T}\bigg)_{\!p}\, dT + \bigg(\frac{\partial s}{\partial p}\bigg)_{\!T}\, dp = \frac{c_p}{T}\, dT - \frac{k_B}{p}\, dp\ , \tag{7.72}
\]
since $T\big(\frac{\partial s}{\partial T}\big)_p = c_p$ and $\big(\frac{\partial s}{\partial p}\big)_T = -\big(\frac{\partial v}{\partial T}\big)_p$, where $v = V/N$. Thus,
\[
\frac{c_p}{k_B}\,\frac{\partial \ln T}{\partial t} - \frac{\partial \ln p}{\partial t} = 0\ . \tag{7.73}
\]
We now have in eqns. 7.70 and 7.73 two equations in the two unknowns $\frac{\partial \ln T}{\partial t}$ and $\frac{\partial \ln p}{\partial t}$, yielding
\[
\frac{\partial \ln T}{\partial t} = -\frac{k_B}{c_V}\,\nabla\!\cdot\mathbf{V} \tag{7.74}
\]
\[
\frac{\partial \ln p}{\partial t} = -\frac{c_p}{c_V}\,\nabla\!\cdot\mathbf{V}\ . \tag{7.75}
\]
Finally, invoking the ideal gas law $n = p/k_B T$ and $h = c_p T$, eqn. 7.69 becomes
\[
\frac{\partial f^0}{\partial t} + \frac{\mathbf{p}}{m}\cdot\frac{\partial f^0}{\partial \mathbf{r}} + \mathbf{F}\cdot\frac{\partial f^0}{\partial \mathbf{p}} = \bigg( -\frac{\partial f^0}{\partial \varepsilon} \bigg) \Bigg\{ \frac{\varepsilon(\mathbf{p}) - c_p T}{T}\; \mathbf{v}\cdot\nabla T + \bigg[ m\, v_\alpha v_\beta - \frac{\varepsilon(\mathbf{p})}{c_V/k_B}\,\delta_{\alpha\beta} \bigg]\, \mathcal{Q}_{\alpha\beta} - \mathbf{F}\cdot\mathbf{v} \Bigg\}\ , \tag{7.76}
\]
where
\[
\mathcal{Q}_{\alpha\beta} = \frac{1}{2}\,\bigg( \frac{\partial V_\alpha}{\partial x_\beta} + \frac{\partial V_\beta}{\partial x_\alpha} \bigg)\ . \tag{7.77}
\]
Finally, the Boltzmann equation takes the form
\[
\Bigg\{ \frac{\varepsilon(\mathbf{p}) - c_p T}{T}\; \mathbf{v}\cdot\nabla T + \bigg[ m\, v_\alpha v_\beta - \frac{\varepsilon(\mathbf{p})}{c_V/k_B}\,\delta_{\alpha\beta} \bigg]\, \mathcal{Q}_{\alpha\beta} - \mathbf{F}\cdot\mathbf{v} \Bigg\}\; \frac{f^0}{k_B T} + \frac{\partial\,\delta f}{\partial t} = \bigg(\frac{\partial f}{\partial t}\bigg)_{\!\rm coll}\ . \tag{7.78}
\]
Notice we have dropped the terms $\mathbf{v}\cdot\frac{\partial\,\delta f}{\partial \mathbf{r}}$ and $\mathbf{F}\cdot\frac{\partial\,\delta f}{\partial \mathbf{p}}$, since $\delta f$ must already be first order in smallness, and both the $\frac{\partial}{\partial \mathbf{r}}$ operator as well as $\mathbf{F}$ add a second order of smallness, which is negligible. Typically $\frac{\partial\,\delta f}{\partial t}$ is nonzero if the applied force $\mathbf{F}(t)$ is time-dependent. We use the convention of summing over repeated indices. Note that $\mathcal{Q}_{\alpha\alpha} = {\rm Tr}\,(\mathcal{Q}) = \nabla\!\cdot\mathbf{V}$.
7.6 Relaxation Time Approximation

We now consider a very simple model of the collision integral,
\[
\bigg(\frac{\partial f}{\partial t}\bigg)_{\!\rm coll} = -\,\frac{f - f^0}{\tau} = -\,\frac{\delta f}{\tau}\ . \tag{7.79}
\]
This model is known as the relaxation time approximation. Here, $f^0 = f^0(\mathbf{r},\mathbf{p},t)$ is a distribution function which describes a local equilibrium at each position $\mathbf{r}$ and time $t$. The quantity $\tau$ is the relaxation time, which can in principle be momentum-dependent, but which we shall first consider to be constant. In the absence of streaming terms, we have
\[
\frac{\partial\,\delta f}{\partial t} = -\,\frac{\delta f}{\tau} \quad\Longrightarrow\quad \delta f(\mathbf{r},\mathbf{p},t) = \delta f(\mathbf{r},\mathbf{p},0)\, e^{-t/\tau}\ . \tag{7.80}
\]
Figure 7.3: Graphic representation of the equation $n\,\sigma\,\bar v_{\rm rel}\,\tau = 1$, which yields the scattering time $\tau$ in terms of the number density $n$, average particle pair relative velocity $\bar v_{\rm rel}$, and two-particle total scattering cross section $\sigma$. The equation says that on average there must be one particle within the tube.
The distribution $f$ then relaxes to the equilibrium distribution $f^0$ on a time scale $\tau$. We note that this approximation is obviously flawed in that all quantities (even the collisional invariants) relax to their equilibrium values on the scale $\tau$. In the Appendix, we consider a model for the collision integral in which the collisional invariants are all preserved, but everything else relaxes to local equilibrium at a single rate.
7.6.1 Computation of the scattering time

Consider two particles with velocities $\mathbf{v}$ and $\mathbf{v}'$. The average of their relative speed is
\[
\langle\, |\mathbf{v} - \mathbf{v}'|\, \rangle = \int\! d^3\!v\!\int\! d^3\!v'\; P(\mathbf{v})\, P(\mathbf{v}')\; |\mathbf{v} - \mathbf{v}'|\ , \tag{7.81}
\]
where $P(\mathbf{v})$ is the Maxwell velocity distribution,
\[
P(\mathbf{v}) = \bigg(\frac{m}{2\pi k_B T}\bigg)^{\!3/2} \exp\!\bigg( -\frac{m\mathbf{v}^2}{2 k_B T} \bigg)\ , \tag{7.82}
\]
which follows from the Boltzmann form of the equilibrium distribution $f^0(\mathbf{p})$. It is left as an exercise for the student to verify that
\[
\bar v_{\rm rel} \equiv \langle\, |\mathbf{v} - \mathbf{v}'|\, \rangle = \frac{4}{\sqrt{\pi}}\, \bigg(\frac{k_B T}{m}\bigg)^{\!1/2}\ . \tag{7.83}
\]
Note that $\bar v_{\rm rel} = \sqrt{2}\,\bar v$, where $\bar v$ is the average particle speed. Let $\sigma$ be the total scattering cross section, which for hard spheres is $\sigma = \pi d^2$, where $d$ is the hard sphere diameter. Then the rate at which particles scatter is
\[
\frac{1}{\tau} = n\, \bar v_{\rm rel}\, \sigma\ . \tag{7.84}
\]
The particle mean free path is simply
\[
\ell = \bar v\, \tau = \frac{1}{\sqrt{2}\; n\sigma}\ . \tag{7.85}
\]
While the scattering length is not temperature-dependent within this formalism, the scattering time is $T$-dependent, with
\[
\tau(T) = \frac{1}{n\, \bar v_{\rm rel}\, \sigma} = \frac{\sqrt{\pi}}{4 n \sigma}\, \bigg(\frac{m}{k_B T}\bigg)^{\!1/2}\ . \tag{7.86}
\]
As $T \to 0$, the collision time diverges as $\tau \propto T^{-1/2}$, because the particles on average move more slowly at lower temperatures. The mean free path, however, is independent of $T$, and is given by $\ell = 1/\sqrt{2}\,n\sigma$.
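Eqn. 7.83 and the relation $\bar v_{\rm rel} = \sqrt{2}\,\bar v$ are easily checked by Monte Carlo sampling of the Maxwell distribution. A short sketch (added here; units with $k_B T/m = 1$, so each velocity component is a unit normal):

```python
import math
import random

random.seed(1)
N = 200_000

def speed():
    # |v| for one particle drawn from the Maxwell distribution, eq. 7.82
    return math.sqrt(sum(random.gauss(0.0, 1.0) ** 2 for _ in range(3)))

def rel_speed():
    # |v - v'| for two independent Maxwellian particles
    d = [random.gauss(0.0, 1.0) - random.gauss(0.0, 1.0) for _ in range(3)]
    return math.sqrt(sum(c * c for c in d))

vbar = sum(speed() for _ in range(N)) / N
vrel = sum(rel_speed() for _ in range(N)) / N
# Expect vbar ~ sqrt(8/pi) ~ 1.596, vrel ~ 4/sqrt(pi) ~ 2.257, ratio sqrt(2)
print(vbar, vrel, vrel / vbar)
```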
7.6.2 Thermal conductivity

We consider a system with a temperature gradient $\nabla T$ and seek a steady state (i.e. time-independent) solution to the Boltzmann equation. We assume $F_\alpha = \mathcal{Q}_{\alpha\beta} = 0$. Appealing to eqn. 7.78, and using the relaxation time approximation for the collision integral, we have
\[
\delta f = -\,\frac{\tau\,(\varepsilon - c_p T)}{k_B T^2}\; (\mathbf{v}\cdot\nabla T)\; f^0\ . \tag{7.87}
\]
We are now ready to compute the energy and particle currents. In order to compute the local density of any quantity $A(\mathbf{r},\mathbf{p})$, we multiply by the distribution $f(\mathbf{r},\mathbf{p})$ and integrate over momentum:
\[
\rho_A(\mathbf{r},t) = \int\!\frac{d^3\!p}{h^3}\; A(\mathbf{r},\mathbf{p})\, f(\mathbf{r},\mathbf{p},t)\ . \tag{7.88}
\]
For the energy (thermal) current, we let $A = \varepsilon\, v^\alpha = \varepsilon\, p^\alpha/m$, in which case $\rho_A = j^\alpha_\varepsilon$. Note that $\int\! d^3\!p\; \mathbf{p}\, f^0 = 0$ since $f^0$ is isotropic in $\mathbf{p}$ even when $\mu$ and $T$ depend on $\mathbf{r}$. Thus, only $\delta f$
enters into the calculation of the various currents. Thus, the energy (thermal) current is
\[
j^\alpha_\varepsilon(\mathbf{r}) = \int\!\frac{d^3\!p}{h^3}\; \varepsilon\, v^\alpha\; \delta f = -\,\frac{n\tau}{k_B T^2}\; \big\langle\, v^\alpha v^\beta\, \varepsilon\,(\varepsilon - c_p T)\, \big\rangle\; \frac{\partial T}{\partial x^\beta}\ , \tag{7.89}
\]
where the repeated index $\beta$ is summed over, and where momentum averages are defined relative to the equilibrium distribution, i.e.
\[
\langle\, \phi(\mathbf{p})\, \rangle = \int\!\frac{d^3\!p}{h^3}\; \phi(\mathbf{p})\, f^0(\mathbf{p}) \bigg/ \int\!\frac{d^3\!p}{h^3}\; f^0(\mathbf{p}) = \int\! d^3\!v\; P(\mathbf{v})\; \phi(m\mathbf{v})\ . \tag{7.90}
\]
In this context, it is useful to point out the identity
\[
\frac{d^3\!p}{h^3}\; f^0(\mathbf{p}) = n\; d^3\!v\; P(\mathbf{v})\ , \tag{7.91}
\]
where
\[
P(\mathbf{v}) = \bigg(\frac{m}{2\pi k_B T}\bigg)^{\!3/2}\, e^{-m(\mathbf{v}-\mathbf{V})^2/2 k_B T} \tag{7.92}
\]
is the Maxwell velocity distribution.

Note that if $\phi = \phi(\varepsilon)$ is a function of the energy, and if $\mathbf{V} = 0$, then
\[
\frac{d^3\!p}{h^3}\; f^0(\mathbf{p}) = n\; d^3\!v\; P(\mathbf{v}) = n\; \tilde P(\varepsilon)\, d\varepsilon\ , \tag{7.93}
\]
where
\[
\tilde P(\varepsilon) = \frac{2}{\sqrt{\pi}}\, (k_B T)^{-3/2}\, \varepsilon^{1/2}\, e^{-\varepsilon/k_B T}\ , \tag{7.94}
\]
is the Maxwellian distribution of single particle energies, which is normalized: $\int_0^\infty\! d\varepsilon\; \tilde P(\varepsilon) = 1$.
Averages with respect to this distribution are given by
\[
\langle\, \phi(\varepsilon)\, \rangle = \int_0^\infty\! d\varepsilon\; \phi(\varepsilon)\, \tilde P(\varepsilon) = \frac{2}{\sqrt{\pi}}\, (k_B T)^{-3/2} \int_0^\infty\! d\varepsilon\; \varepsilon^{1/2}\, \phi(\varepsilon)\, e^{-\varepsilon/k_B T}\ . \tag{7.95}
\]
If $\phi(\varepsilon)$ is homogeneous, then for any $\alpha$ we have
\[
\langle\, \varepsilon^\alpha\, \rangle = \frac{2}{\sqrt{\pi}}\; \Gamma\big(\alpha + \tfrac{3}{2}\big)\, (k_B T)^\alpha\ . \tag{7.96}
\]
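Eqn. 7.96 can be verified by direct quadrature of the energy distribution $\tilde P(\varepsilon)$. A numerical sketch (added here), with $k_B T$ set to unity:

```python
import math

kT = 1.0

def avg_eps_power(alpha, n_steps=200_000, emax=60.0):
    # Midpoint-rule quadrature of <eps^alpha> with
    # Ptilde(eps) = (2/sqrt(pi)) (kT)^(-3/2) eps^(1/2) exp(-eps/kT)   (eq. 7.94)
    de = emax / n_steps
    total = 0.0
    for i in range(n_steps):
        e = (i + 0.5) * de
        total += e ** alpha * math.sqrt(e) * math.exp(-e / kT)
    return (2.0 / math.sqrt(math.pi)) * kT ** -1.5 * total * de

def gamma_formula(alpha):
    # Eq. 7.96: <eps^alpha> = (2/sqrt(pi)) Gamma(alpha + 3/2) (kT)^alpha
    return (2.0 / math.sqrt(math.pi)) * math.gamma(alpha + 1.5) * kT ** alpha

# <eps> = 3/2, <eps^2> = 15/4, <eps^3> = 105/8 in units of (kT)^alpha
print([(avg_eps_power(a), gamma_formula(a)) for a in (1, 2, 3)])
```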
Due to spatial isotropy, it is clear that we can replace
\[
v^\alpha v^\beta \to \tfrac{1}{3}\, \mathbf{v}^2\, \delta^{\alpha\beta} = \frac{2\varepsilon}{3m}\, \delta^{\alpha\beta} \tag{7.97}
\]
in eqn. 7.89. We then have $\mathbf{j}_\varepsilon = -\kappa\,\nabla T$, with
\[
\kappa = \frac{2 n \tau}{3 m k_B T^2}\; \big\langle\, \varepsilon^2\,(\varepsilon - c_p T)\, \big\rangle = \frac{5 n \tau k_B^2 T}{2m} = \frac{5\pi}{16}\; n \ell \bar v\, k_B\ , \tag{7.98}
\]
where we have used $c_p = \frac{5}{2}\, k_B$ and $\bar v^2 = \frac{8 k_B T}{\pi m}$. The quantity $\kappa$ is called the thermal conductivity. Note that $\kappa \propto T^{1/2}$.
7.6.3 Viscosity

Consider the situation depicted in fig. 7.4. A fluid filling the space between two large flat plates at $z = 0$ and $z = d$ is set in motion by a force $\mathbf{F} = F\hat{\mathbf x}$ applied to the upper plate; the lower plate is fixed. It is assumed that the fluid's velocity locally matches that of the plates. Fluid particles at the top have an average $x$-component of their momentum $\langle p_x\rangle = mV$. As these particles move downward toward lower $z$ values, they bring their $x$-momenta with them. Therefore there is a downward ($-\hat{\mathbf z}$-directed) flow of $\langle p_x\rangle$. Since $x$-momentum is constantly being drawn away from the $z = d$ plane, this means that there is a $-\hat{\mathbf x}$-directed viscous drag on the upper plate. The viscous drag force per unit area is given by $F_{\rm drag}/A = -\eta V/d$, where $V/d = \partial V_x/\partial z$ is the velocity gradient and $\eta$ is the shear viscosity. In steady state, the applied force balances the drag force, i.e. $F + F_{\rm drag} = 0$. Clearly in the steady state the net momentum density of the fluid does not change, and is given by $\frac{1}{2}\rho V \hat{\mathbf x}$, where $\rho$ is the fluid mass density. The momentum per unit time injected into the fluid by the upper plate at $z = d$ is then extracted by the lower plate at $z = 0$. The momentum flux density $\Pi_{xz} = n\,\langle p_x\, v_z\rangle$ is the drag force on the upper surface per unit area: $\Pi_{xz} = -\eta\,\frac{\partial V_x}{\partial z}$. The units of viscosity are $[\eta] = M/LT$.

Figure 7.4: Gedankenexperiment to measure shear viscosity in a fluid. The lower plate is fixed. The viscous drag force per unit area on the upper plate is $F_{\rm drag}/A = -\eta V/d$. This must be balanced by an applied force $F$.
We now provide some formal definitions of viscosity. As we shall see presently, there is in fact a second type of viscosity, called second viscosity or bulk viscosity, which is measurable although not by the type of experiment depicted in fig. 7.4.

The momentum flux tensor $\Pi_{\alpha\beta} = n\,\langle p_\alpha\, v_\beta\rangle$ is defined to be the current of momentum component $p_\alpha$ in the direction of increasing $x_\beta$. For a gas in motion with average velocity $\mathbf{V}$, we have
\[
\begin{aligned}
\Pi_{\alpha\beta} &= nm\, \big\langle\, (V_\alpha + v'_\alpha)(V_\beta + v'_\beta)\, \big\rangle\\
&= nm\, V_\alpha V_\beta + nm\,\langle v'_\alpha v'_\beta\rangle\\
&= nm\, V_\alpha V_\beta + \tfrac{1}{3}\, nm\,\langle \mathbf{v}'^2\rangle\, \delta_{\alpha\beta}\\
&= \rho\, V_\alpha V_\beta + p\, \delta_{\alpha\beta}\ ,
\end{aligned} \tag{7.99}
\]
where $\mathbf{v}'$ is the particle velocity in a frame moving with velocity $\mathbf{V}$, and where we have invoked the ideal gas law $p = n k_B T$. The mass density is $\rho = nm$.

When $\mathbf{V}$ is spatially varying,
\[
\Pi_{\alpha\beta} = p\, \delta_{\alpha\beta} + \rho\, V_\alpha V_\beta - \tilde\sigma_{\alpha\beta}\ , \tag{7.100}
\]
where $\tilde\sigma_{\alpha\beta}$ is the viscosity stress tensor. Any symmetric tensor, such as $\tilde\sigma_{\alpha\beta}$, can be decomposed into a sum of (i) a traceless component, and (ii) a component proportional to the identity matrix. Since $\tilde\sigma_{\alpha\beta}$ should be, to first order, linear in the spatial derivatives of the
components of the velocity field $\mathbf{V}$, there is a unique two-parameter decomposition:
\[
\begin{aligned}
\tilde\sigma_{\alpha\beta} &= \eta\,\bigg( \frac{\partial V_\alpha}{\partial x_\beta} + \frac{\partial V_\beta}{\partial x_\alpha} - \tfrac{2}{3}\,\nabla\!\cdot\mathbf{V}\,\delta_{\alpha\beta} \bigg) + \zeta\,\nabla\!\cdot\mathbf{V}\,\delta_{\alpha\beta}\\
&= 2\eta\,\big( \mathcal{Q}_{\alpha\beta} - \tfrac{1}{3}\,{\rm Tr}\,(\mathcal{Q})\,\delta_{\alpha\beta} \big) + \zeta\,{\rm Tr}\,(\mathcal{Q})\,\delta_{\alpha\beta}\ .
\end{aligned} \tag{7.101}
\]
The coefficient of the traceless component is $\eta$, known as the shear viscosity. The coefficient of the component proportional to the identity is $\zeta$, known as the bulk viscosity. The full stress tensor $\sigma_{\alpha\beta}$ contains a contribution from the pressure:
\[
\sigma_{\alpha\beta} = -p\,\delta_{\alpha\beta} + \tilde\sigma_{\alpha\beta}\ . \tag{7.102}
\]
The differential force $dF_\alpha$ that a fluid exerts on a surface element $\hat{\mathbf n}\, dA$ is
\[
dF_\alpha = \sigma_{\alpha\beta}\, n_\beta\, dA\ , \tag{7.103}
\]
where we are using the Einstein summation convention and summing over the repeated index $\beta$. We will now compute the shear viscosity $\eta$ using the Boltzmann equation in the relaxation time approximation.
Appealing again to eqn. 7.78, with $\mathbf{F} = 0$, we find
\[
\delta f = -\,\frac{\tau}{k_B T}\,\Bigg\{ m\, v_\alpha v_\beta\, \mathcal{Q}_{\alpha\beta} + \frac{\varepsilon - c_p T}{T}\, \mathbf{v}\cdot\nabla T - \frac{\varepsilon}{c_V/k_B}\,\nabla\!\cdot\mathbf{V} \Bigg\}\, f^0\ . \tag{7.104}
\]
We assume $\nabla T = \nabla\!\cdot\mathbf{V} = 0$, and we compute the momentum flux:
\[
\begin{aligned}
\Pi_{xz} &= \int\!\frac{d^3\!p}{h^3}\; p_x\, v_z\; \delta f\\
&= -\,\frac{n m^2 \tau}{k_B T}\; \mathcal{Q}_{\alpha\beta}\, \langle\, v_x\, v_z\, v_\alpha v_\beta\, \rangle\\
&= -\,\frac{n\tau}{k_B T}\,\bigg( \frac{\partial V_x}{\partial z} + \frac{\partial V_z}{\partial x} \bigg)\, \big\langle\, m v_x^2\cdot m v_z^2\, \big\rangle\\
&= -\,n\tau\, k_B T\,\bigg( \frac{\partial V_z}{\partial x} + \frac{\partial V_x}{\partial z} \bigg)\ .
\end{aligned} \tag{7.105}
\]
Thus, if $V_x = V_x(z)$, we have
\[
\Pi_{xz} = -\,n\tau\, k_B T\; \frac{\partial V_x}{\partial z}\ , \tag{7.106}
\]
from which we read off the viscosity,
\[
\eta = n k_B T\, \tau = \frac{\pi}{8}\; n m \ell\, \bar v\ . \tag{7.107}
\]
Note that $\eta(T) \propto T^{1/2}$.
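Eqn. 7.107 gives the right order of magnitude for a real gas. A sketch (added here) for argon at 293 K, in which the hard-sphere diameter $d = 3.6$ Å is an assumed effective value, not a fitted parameter; note that $n$ and $\ell$ cancel, since $\ell = 1/\sqrt{2}\,n\sigma$:

```python
import math

kB = 1.380649e-23          # J/K
T = 293.0
m = 39.948 * 1.66054e-27   # argon atomic mass (kg)
d = 3.6e-10                # assumed hard-sphere diameter (m)
sigma = math.pi * d * d

vbar = math.sqrt(8 * kB * T / (math.pi * m))          # mean speed
eta = (math.pi / 8) * m * vbar / (math.sqrt(2) * sigma)  # eq. 7.107

# eta comes out around 1.8e-5 Pa s; Table 7.1 lists 22.3e-6 Pa s for Ar,
# so the hard-sphere estimate is good to tens of percent.
print(vbar, eta)
```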
7.6.4 Quick and Dirty Treatment of Transport

Suppose we have some averaged intensive quantity $\phi$ which is spatially dependent through $T(\mathbf{r})$ or $\mu(\mathbf{r})$ or $\mathbf{V}(\mathbf{r})$. For simplicity we will write $\phi = \phi(z)$. We wish to compute the current of $\phi$ across some surface whose equation is $dz = 0$. If the mean free path is $\ell$, then the value of $\phi$ for particles crossing this surface in the $+\hat{\mathbf z}$ direction is $\phi(z - \ell\cos\theta)$, where $\theta$ is the angle the particle's velocity makes with respect to $+\hat{\mathbf z}$, i.e. $\cos\theta = v_z/v$. We perform the same analysis for particles moving in the $-\hat{\mathbf z}$ direction, for which $\phi = \phi(z + \ell\cos\theta)$. The current of $\phi$ through this surface is then
\[
\begin{aligned}
\mathbf{j}_\phi &= n\hat{\mathbf z}\!\!\int\limits_{v_z>0}\!\! d^3\!v\; P(\mathbf{v})\, v_z\, \phi(z - \ell\cos\theta) + n\hat{\mathbf z}\!\!\int\limits_{v_z<0}\!\! d^3\!v\; P(\mathbf{v})\, v_z\, \phi(z + \ell\cos\theta)\\
&= -n\ell\, \frac{\partial \phi}{\partial z}\, \hat{\mathbf z}\!\int\! d^3\!v\; P(\mathbf{v})\, \frac{v_z^2}{v} = -\tfrac{1}{3}\, n \bar v \ell\, \frac{\partial \phi}{\partial z}\, \hat{\mathbf z}\ , 
\end{aligned} \tag{7.108}
\]
where $\bar v = \sqrt{\frac{8 k_B T}{\pi m}}$ is the average particle speed. If the $z$-dependence of $\phi$ comes through the dependence of $\phi$ on the local temperature $T$, then we have
\[
\mathbf{j}_\phi = -\tfrac{1}{3}\, n \bar v \ell\, \frac{\partial \phi}{\partial T}\, \nabla T \equiv -K\, \nabla T\ , \tag{7.109}
\]
where
\[
K = \tfrac{1}{3}\, n \bar v \ell\, \frac{\partial \phi}{\partial T} \tag{7.110}
\]
is the transport coefficient. If $\phi = \langle \varepsilon\rangle$, then $\frac{\partial \phi}{\partial T} = c_p$, where $c_p$ is the heat capacity per particle at constant pressure. We then find $\mathbf{j}_\varepsilon = -\kappa\,\nabla T$ with thermal conductivity
\[
\kappa = \tfrac{1}{3}\, n \bar v \ell\, c_p\ . \tag{7.111}
\]
Since $c_p = \frac{5}{2}\, k_B$ is the heat capacity per particle for a monatomic gas, we have $\kappa = \frac{5}{6}\, n \bar v \ell\, k_B$. Our earlier calculation using the Boltzmann equation in the relaxation time approximation gave the same expression but with a numerical prefactor $\frac{5\pi}{16}$ rather than $\frac{5}{6}$.
We can make a similar argument for the viscosity. In this case $\phi = \langle p_x\rangle$ is spatially varying through its dependence on the flow velocity $\mathbf{V}(\mathbf{r})$. Clearly $\partial\phi/\partial V_x = m$, hence
\[
j^z_{p_x} = \Pi_{xz} = -\tfrac{1}{3}\, n m \bar v \ell\, \frac{\partial V_x}{\partial z}\ , \tag{7.112}
\]
from which we identify the viscosity, $\eta = \frac{1}{3}\, n m \bar v \ell$. Once again, this agrees in its functional dependences with the Boltzmann equation calculation in the relaxation time approximation. Only the coefficients differ. The ratio of the coefficients is $K_{\rm QDC}/K_{\rm BRT} = \frac{8}{3\pi} = 0.849$ in both cases⁶.
⁶ Here we abbreviate QDC for "quick and dirty calculation" and BRT for "Boltzmann equation in the relaxation time approximation".
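The stated ratio of prefactors is simple arithmetic, but worth a check (a trivial added sketch): $(5/6)\big/(5\pi/16) = (1/3)\big/(\pi/8) = 8/3\pi$, the same for $\kappa$ and for $\eta$:

```python
import math

# Prefactors of kappa / (n ell vbar k_B):
qdc_kappa = 5.0 / 6.0              # quick-and-dirty, eq. 7.111 with c_p = (5/2) k_B
brt_kappa = 5.0 * math.pi / 16.0   # relaxation-time result, eq. 7.98

# Prefactors of eta / (n m ell vbar):
qdc_eta = 1.0 / 3.0                # eq. 7.112
brt_eta = math.pi / 8.0            # eq. 7.107

ratio_kappa = qdc_kappa / brt_kappa
ratio_eta = qdc_eta / brt_eta
print(ratio_kappa, ratio_eta)   # both equal 8/(3 pi) ~ 0.849
```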
Gas    η (μPa·s)   κ (mW/m·K)   c_p/k_B   Pr
He     19.5        149          2.50      0.682
Ar     22.3        17.4         2.50      0.666
Xe     22.7        5.46         2.50      0.659
H₂     8.67        179          3.47      0.693
N₂     17.6        25.5         3.53      0.721
O₂     20.3        26.0         3.50      0.711
CH₄    11.2        33.5         4.29      0.74
CO₂    14.8        18.1         4.47      0.71
NH₃    10.1        24.6         4.50      0.90

Table 7.1: Viscosities, thermal conductivities, and Prandtl numbers for some common gases at T = 293 K and p = 1 atm. (Source: Table 1.1 of Smith and Jensen, with data for triatomic gases added.)
7.6.5 Thermal diffusivity, kinematic viscosity, and Prandtl number

Suppose, under conditions of constant pressure, we add heat $q$ per unit volume to an ideal gas. We know from thermodynamics that its temperature will then increase by an amount $\Delta T = q/n c_p$. If a heat current $\mathbf{j}_q$ flows, then the continuity equation for energy flow requires
\[
n c_p\, \frac{\partial T}{\partial t} + \nabla\!\cdot\mathbf{j}_q = 0\ . \tag{7.113}
\]
In a system where there is no net particle current, the heat current $\mathbf{j}_q$ is the same as the energy current $\mathbf{j}_\varepsilon$, and since $\mathbf{j}_\varepsilon = -\kappa\,\nabla T$, we obtain a diffusion equation for temperature,
\[
\frac{\partial T}{\partial t} = \frac{\kappa}{n c_p}\, \nabla^2 T\ . \tag{7.114}
\]
The combination
\[
a \equiv \frac{\kappa}{n c_p} \tag{7.115}
\]
is known as the thermal diffusivity. Our Boltzmann equation calculation in the relaxation time approximation yielded the result $\kappa = n k_B T \tau c_p/m$. Thus, we find $a = k_B T \tau/m$ via this method. Note that the dimensions of $a$ are the same as for any diffusion constant $D$, namely $[a] = L^2/T$.
Another quantity with dimensions of $L^2/T$ is the kinematic viscosity, $\nu = \eta/\rho$, where $\rho = nm$ is the mass density. We found $\eta = n k_B T \tau$ from the relaxation time approximation calculation, hence $\nu = k_B T \tau/m$. The ratio $\nu/a$, called the Prandtl number, ${\rm Pr} = \eta\, c_p/m\kappa$, is dimensionless. According to our calculations, ${\rm Pr} = 1$. According to table 7.1, most monatomic gases have ${\rm Pr} \approx \frac{2}{3}$.
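The Prandtl numbers of table 7.1 can be reproduced from the tabulated $\eta$, $\kappa$, and $c_p$ via ${\rm Pr} = \eta\, c_p/m\kappa$. A sketch (added here) for three of the entries; the atomic masses are the only inputs not taken from the table:

```python
kB = 1.380649e-23    # J/K
amu = 1.66054e-27    # kg

# (molar mass [amu], eta [uPa s], kappa [mW/m K], c_p/k_B) from Table 7.1
gases = {
    "He": (4.0026, 19.5, 149.0, 2.50),
    "Ar": (39.948, 22.3, 17.4, 2.50),
    "N2": (28.014, 17.6, 25.5, 3.53),
}

def prandtl(mass_amu, eta_uPas, kappa_mW, cp_over_kB):
    # Pr = nu / a = eta c_p / (m kappa), with everything per particle
    m = mass_amu * amu
    return (eta_uPas * 1e-6) * (cp_over_kB * kB) / (m * kappa_mW * 1e-3)

pr = {g: prandtl(*vals) for g, vals in gases.items()}
print(pr)
```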
7.6.6 Oscillating external force

Suppose a uniform oscillating external force $\mathbf{F}_{\rm ext}(t) = \mathbf{F}\, e^{-i\omega t}$ is applied. For a system of charged particles, this force would arise from an external electric field $\mathbf{F}_{\rm ext} = q\mathbf{E}\, e^{-i\omega t}$, where $q$ is the charge of each particle. We'll assume $\nabla T = 0$. The Boltzmann equation is then written
\[
\frac{\partial f}{\partial t} + \frac{\mathbf{p}}{m}\cdot\frac{\partial f}{\partial \mathbf{r}} + \mathbf{F}\, e^{-i\omega t}\cdot\frac{\partial f}{\partial \mathbf{p}} = -\,\frac{f - f^0}{\tau}\ . \tag{7.116}
\]
We again write $f = f^0 + \delta f$, and we assume $\delta f$ is spatially constant. Thus,
\[
\frac{\partial\,\delta f}{\partial t} + \mathbf{F}\, e^{-i\omega t}\cdot\mathbf{v}\; \frac{\partial f^0}{\partial \varepsilon} = -\,\frac{\delta f}{\tau}\ . \tag{7.117}
\]
If we assume $\delta f(t) = \delta f(\omega)\, e^{-i\omega t}$ then the above differential equation is converted to an algebraic equation, with solution
\[
\delta f(t) = -\,\frac{\tau\, e^{-i\omega t}}{1 - i\omega\tau}\; \frac{\partial f^0}{\partial \varepsilon}\; \mathbf{F}\cdot\mathbf{v}\ . \tag{7.118}
\]
We now compute the particle current:
\[
\begin{aligned}
j_\alpha(\mathbf{r},t) &= \int\!\frac{d^3\!p}{h^3}\; v_\alpha\; \delta f\\
&= \frac{\tau\, e^{-i\omega t}}{1 - i\omega\tau}\cdot\frac{F_\beta}{k_B T} \int\!\frac{d^3\!p}{h^3}\; f^0(\mathbf{p})\; v_\alpha\, v_\beta\\
&= \frac{\tau\, e^{-i\omega t}}{1 - i\omega\tau}\cdot\frac{n F_\alpha}{3 k_B T} \int\! d^3\!v\; P(\mathbf{v})\; \mathbf{v}^2\\
&= \frac{n\tau}{m}\cdot\frac{F_\alpha\, e^{-i\omega t}}{1 - i\omega\tau}\ .
\end{aligned} \tag{7.119}
\]
If the particles are electrons, with charge $q = -e$, then the electrical current is $(-e)$ times the particle current. We then obtain
\[
j^{\rm (elec)}_\alpha(t) = \frac{n e^2 \tau}{m}\cdot\frac{E_\alpha\, e^{-i\omega t}}{1 - i\omega\tau} \equiv \sigma_{\alpha\beta}(\omega)\; E_\beta\, e^{-i\omega t}\ , \tag{7.120}
\]
where
\[
\sigma_{\alpha\beta}(\omega) = \frac{n e^2 \tau}{m}\cdot\frac{1}{1 - i\omega\tau}\; \delta_{\alpha\beta} \tag{7.121}
\]
is the frequency-dependent electrical conductivity tensor. Of course for fermions such as electrons, we should be using the Fermi distribution in place of the Maxwell-Boltzmann distribution for $f^0(\mathbf{p})$. This affects the relation between $n$ and $\mu$ only, and the final result for the conductivity tensor $\sigma_{\alpha\beta}(\omega)$ is unchanged.
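Eqn. 7.121 is the familiar Drude result. A small numerical sketch (added here; the copper-like values of $n$ and $\tau$ are illustrative assumptions) shows the DC limit and the falloff at $\omega\tau = 1$, where $|\sigma| = \sigma(0)/\sqrt{2}$:

```python
import math

def sigma(omega, n, tau, m, e=1.602176634e-19):
    # Isotropic frequency-dependent Drude conductivity, eq. 7.121
    return (n * e * e * tau / m) * (1.0 / (1.0 - 1j * omega * tau))

# Assumed copper-like parameters: n ~ 8.5e28 m^-3, tau ~ 2.5e-14 s.
n, tau, m_e = 8.5e28, 2.5e-14, 9.109e-31
s0 = sigma(0.0, n, tau, m_e)         # DC conductivity, ~6e7 S/m
s1 = sigma(1.0 / tau, n, tau, m_e)   # omega tau = 1

print(abs(s0), abs(s1) / abs(s0))
```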
7.7 Nonequilibrium Quantum Transport

Almost everything we have derived thus far can be applied, mutatis mutandis, to quantum systems. The main difference is that the distribution $f^0$ corresponding to local equilibrium is no longer of the Maxwell-Boltzmann form, but rather of the Bose-Einstein or Fermi-Dirac form,
\[
f^0(\mathbf{r},\mathbf{k},t) = \Bigg\{ \exp\!\bigg( \frac{\varepsilon(\mathbf{k}) - \mu(\mathbf{r},t)}{k_B T(\mathbf{r},t)} \bigg) \mp 1 \Bigg\}^{-1}\ , \tag{7.122}
\]
where the top sign applies to bosons and the bottom sign to fermions. Here we shift to the more common notation for quantum systems in which we write the distribution in terms of the wavevector $\mathbf{k} = \mathbf{p}/\hbar$ rather than the momentum $\mathbf{p}$. The quantum distributions satisfy detailed balance with respect to the quantum collision integral
\[
\bigg(\frac{\partial f}{\partial t}\bigg)_{\!\rm coll} = \int\!\frac{d^3\!k_1}{(2\pi)^3}\!\int\!\frac{d^3\!k'}{(2\pi)^3}\!\int\!\frac{d^3\!k'_1}{(2\pi)^3}\; w\,\Big\{ f'f'_1\,(1\pm f)(1\pm f_1) - ff_1\,(1\pm f')(1\pm f'_1) \Big\} \tag{7.123}
\]
where $w = w(\mathbf{k},\mathbf{k}_1\,|\,\mathbf{k}',\mathbf{k}'_1)$, $f = f(\mathbf{k})$, $f_1 = f(\mathbf{k}_1)$, $f' = f(\mathbf{k}')$, and $f'_1 = f(\mathbf{k}'_1)$, and where we have assumed time-reversal and parity symmetry. Detailed balance requires
\[
\frac{f}{1\pm f}\cdot\frac{f_1}{1\pm f_1} = \frac{f'}{1\pm f'}\cdot\frac{f'_1}{1\pm f'_1}\ , \tag{7.124}
\]
where $f = f^0$ is the equilibrium distribution. One can check that
\[
f = \frac{1}{e^{\beta(\varepsilon-\mu)} \mp 1} \quad\Longrightarrow\quad \frac{f}{1\pm f} = e^{-\beta(\varepsilon-\mu)}\ , \tag{7.125}
\]
which is the Boltzmann distribution, which we have already shown to satisfy detailed balance. For the streaming term, we have
\[
\begin{aligned}
df^0 &= k_B T\, \frac{\partial f^0}{\partial \varepsilon}\; d\bigg( \frac{\varepsilon - \mu}{k_B T} \bigg)\\
&= k_B T\, \frac{\partial f^0}{\partial \varepsilon}\,\bigg\{ -\frac{d\mu}{k_B T} - \frac{(\varepsilon - \mu)\, dT}{k_B T^2} + \frac{d\varepsilon}{k_B T} \bigg\}\\
&= -\frac{\partial f^0}{\partial \varepsilon}\,\bigg\{ \frac{\partial \mu}{\partial \mathbf{r}}\cdot d\mathbf{r} + \frac{\varepsilon - \mu}{T}\,\frac{\partial T}{\partial \mathbf{r}}\cdot d\mathbf{r} - \frac{\partial \varepsilon}{\partial \mathbf{k}}\cdot d\mathbf{k} \bigg\}\ ,
\end{aligned} \tag{7.126}
\]
from which we read off
\[
\frac{\partial f^0}{\partial \mathbf{r}} = -\frac{\partial f^0}{\partial \varepsilon}\,\bigg\{ \frac{\partial \mu}{\partial \mathbf{r}} + \frac{\varepsilon - \mu}{T}\,\frac{\partial T}{\partial \mathbf{r}} \bigg\} \tag{7.127}
\]
\[
\frac{\partial f^0}{\partial \mathbf{k}} = \hbar \mathbf{v}\; \frac{\partial f^0}{\partial \varepsilon}\ . \tag{7.128}
\]
The most important application is to the theory of electron transport in metals and semiconductors, in which case $f^0$ is the Fermi distribution. In this case, the quantum collision
integral also receives a contribution from one-body scattering in the presence of an external potential $U(\mathbf{r})$, which is given by Fermi's Golden Rule:
\[
\begin{aligned}
\bigg(\frac{\partial f(\mathbf{k})}{\partial t}\bigg)'_{\rm coll} &= \frac{2\pi}{\hbar} \sum_{\mathbf{k}'} |\langle \mathbf{k}'|\, U\, |\mathbf{k}\rangle|^2\, \big( f(\mathbf{k}') - f(\mathbf{k}) \big)\, \delta\big( \varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}') \big)\\
&= \frac{2\pi}{\hbar V} \int\limits_{\hat\Omega}\!\frac{d^3\!k'}{(2\pi)^3}\; |\hat U(\mathbf{k} - \mathbf{k}')|^2\, \big( f(\mathbf{k}') - f(\mathbf{k}) \big)\, \delta\big( \varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}') \big)\ .
\end{aligned} \tag{7.129}
\]
The wavevectors are now restricted to the first Brillouin zone, and the dispersion $\varepsilon(\mathbf{k})$ is no longer the ballistic form $\varepsilon = \hbar^2\mathbf{k}^2/2m$ but rather the dispersion for electrons in a particular energy band (typically the valence band) of a solid⁷. Note that $f = f^0$ satisfies detailed balance with respect to one-body collisions as well⁸.
In the presence of a weak electric field $\mathbf{E}$ and a (not necessarily weak) magnetic field $\mathbf{B}$, we have, within the relaxation time approximation,
\[
\frac{\partial\,\delta f}{\partial t} - \frac{e}{\hbar c}\; \mathbf{v}\times\mathbf{B}\cdot\frac{\partial\,\delta f}{\partial \mathbf{k}} - \mathbf{v}\cdot\bigg( e\,\boldsymbol{\mathcal{E}} + \frac{\varepsilon - \mu}{T}\,\nabla T \bigg)\, \frac{\partial f^0}{\partial \varepsilon} = -\,\frac{\delta f}{\tau}\ , \tag{7.130}
\]
where $\boldsymbol{\mathcal{E}} = -\nabla(\phi - \mu/e) = \mathbf{E} + e^{-1}\nabla\mu$ is the gradient of the electrochemical potential $\phi - e^{-1}\mu$. In deriving the above equation, we have worked to lowest order in small quantities. This entails dropping terms like $\mathbf{v}\cdot\frac{\partial\,\delta f}{\partial \mathbf{r}}$ (higher order in spatial derivatives) and $\mathbf{E}\cdot\frac{\partial\,\delta f}{\partial \mathbf{k}}$ (both $\mathbf{E}$ and $\delta f$ are assumed small). Typically $\tau$ is energy-dependent, i.e. $\tau = \tau\big(\varepsilon(\mathbf{k})\big)$.
We can use eqn. 7.130 to compute the electrical current $\mathbf{j}$ and the thermal current $\mathbf{j}_q$,
\[
\mathbf{j} = -2e \int\limits_{\hat\Omega}\!\frac{d^3\!k}{(2\pi)^3}\; \mathbf{v}\; \delta f \tag{7.131}
\]
\[
\mathbf{j}_q = 2 \int\limits_{\hat\Omega}\!\frac{d^3\!k}{(2\pi)^3}\; (\varepsilon - \mu)\, \mathbf{v}\; \delta f\ . \tag{7.132}
\]
Here the factor of 2 is from spin degeneracy of the electrons (we neglect Zeeman splitting).

We shall not carry out these integrals, which are best left to a course on solid state physics. However, it should be clear that the resulting calculations will lead to a set of linear relations of the form
\[
\boldsymbol{\mathcal{E}} = \rho\,\mathbf{j} + \mathcal{R}\,\mathbf{j}\times\mathbf{B} + Q\,\nabla T + \mathcal{N}\,\nabla T\times\mathbf{B} \tag{7.133}
\]
\[
\mathbf{j}_q = \Pi\,\mathbf{j} + \mathcal{B}\,\mathbf{j}\times\mathbf{B} - \kappa\,\nabla T - \mathcal{L}\,\nabla T\times\mathbf{B}\ . \tag{7.134}
\]
These equations describe a wealth of transport phenomena:
⁷ We neglect interband scattering here, which can be important in practical applications, but which is beyond the scope of these notes.
⁸ The transition rate from $|\mathbf{k}'\rangle$ to $|\mathbf{k}\rangle$ is proportional to the matrix element and to the product $f'(1-f)$. The reverse process is proportional to $f(1-f')$. Subtracting these factors, one obtains $f' - f$, and therefore the nonlinear terms felicitously cancel in eqn. 7.129.
• Electrical resistance ($\nabla T = \mathbf{B} = 0$): An electrical current $\mathbf{j}$ will generate an electric field $\boldsymbol{\mathcal{E}} = \rho\,\mathbf{j}$, where $\rho$ is the electrical resistivity.

• Peltier effect ($\nabla T = \mathbf{B} = 0$): An electrical current $\mathbf{j}$ will generate a heat current $\mathbf{j}_q = \Pi\,\mathbf{j}$, where $\Pi$ is the Peltier coefficient.

• Thermal conduction ($\mathbf{j} = \mathbf{B} = 0$): A temperature gradient $\nabla T$ gives rise to a heat current $\mathbf{j}_q = -\kappa\,\nabla T$, where $\kappa$ is the thermal conductivity.

• Seebeck effect ($\mathbf{j} = \mathbf{B} = 0$): A temperature gradient $\nabla T$ gives rise to an electric field $\boldsymbol{\mathcal{E}} = Q\,\nabla T$, where $Q$ is the Seebeck coefficient.

In the presence of a magnetic field $\mathbf{B}$,

• Hall effect ($\frac{\partial T}{\partial x} = \frac{\partial T}{\partial y} = j_y = 0$): An electrical current $\mathbf{j} = j_x\,\hat{\mathbf x}$ and a field $\mathbf{B} = B_z\,\hat{\mathbf z}$ yield an electric field $\boldsymbol{\mathcal{E}}$. The Hall coefficient is $R_H = \mathcal{E}_y/j_x B_z$.

• Ettingshausen effect ($\frac{\partial T}{\partial x} = j_y = j_{q,y} = 0$): An electrical current $\mathbf{j} = j_x\,\hat{\mathbf x}$ and a field $\mathbf{B} = B_z\,\hat{\mathbf z}$ yield a temperature gradient $\frac{\partial T}{\partial y}$. The Ettingshausen coefficient is $P = \frac{\partial T}{\partial y}\big/ j_x B_z$.

• Nernst effect ($j_x = j_y = \frac{\partial T}{\partial y} = 0$): A temperature gradient $\nabla T = \frac{\partial T}{\partial x}\,\hat{\mathbf x}$ and a field $\mathbf{B} = B_z\,\hat{\mathbf z}$ yield an electric field $\boldsymbol{\mathcal{E}}$. The Nernst coefficient is $\Lambda = \mathcal{E}_y\big/ \frac{\partial T}{\partial x} B_z$.

• Righi-Leduc effect ($j_x = j_y = \mathcal{E}_y = 0$): A temperature gradient $\nabla T = \frac{\partial T}{\partial x}\,\hat{\mathbf x}$ and a field $\mathbf{B} = B_z\,\hat{\mathbf z}$ yield an orthogonal temperature gradient $\frac{\partial T}{\partial y}$. The Righi-Leduc coefficient is $L = \frac{\partial T}{\partial y}\big/ \frac{\partial T}{\partial x} B_z$.
7.8 Linearized Boltzmann Equation

We now return to the classical Boltzmann equation and consider a more formal treatment of the collision term in the linear approximation. We will assume time-reversal symmetry, in which case
\[
\bigg(\frac{\partial f}{\partial t}\bigg)_{\!\rm coll} = \int\!\frac{d^3\!p_1}{h^3}\!\int\!\frac{d^3\!p'}{h^3}\!\int\!\frac{d^3\!p'_1}{h^3}\; w(\mathbf{p}',\mathbf{p}'_1\,|\,\mathbf{p},\mathbf{p}_1)\;\Big\{ f(\mathbf{p}')\,f(\mathbf{p}'_1) - f(\mathbf{p})\,f(\mathbf{p}_1) \Big\}\ . \tag{7.135}
\]
The collision integral is nonlinear in the distribution $f$. We linearize by writing
\[
f(\mathbf{p}) = f^0(\mathbf{p}) + f^0(\mathbf{p})\,\psi(\mathbf{p})\ , \tag{7.136}
\]
where we assume $\psi(\mathbf{p})$ is small. We then have, to first order in $\psi$,
\[
\bigg(\frac{\partial f}{\partial t}\bigg)_{\!\rm coll} = f^0(\mathbf{p})\, L\psi + \mathcal{O}(\psi^2)\ , \tag{7.137}
\]
where the action of the linearized collision operator is given by
\[
L\psi = \int\!\frac{d^3\!p_1}{h^3}\!\int\!\frac{d^3\!p'}{h^3}\!\int\!\frac{d^3\!p'_1}{h^3}\; w(\mathbf{p}',\mathbf{p}'_1\,|\,\mathbf{p},\mathbf{p}_1)\; f^0(\mathbf{p}_1)\;\Big\{ \psi(\mathbf{p}') + \psi(\mathbf{p}'_1) - \psi(\mathbf{p}) - \psi(\mathbf{p}_1) \Big\}\ . \tag{7.138}
\]
In deriving the above result, we have made use of the detailed balance relation,
\[
f^0(\mathbf{p})\, f^0(\mathbf{p}_1) = f^0(\mathbf{p}')\, f^0(\mathbf{p}'_1)\ . \tag{7.139}
\]
7.8.1 Linear algebraic properties of L

Although L is an integral operator, it shares many properties with other linear operators with which you are familiar, such as matrices and differential operators. We can define an inner product$^9$,

$$\langle \psi_1 \,|\, \psi_2 \rangle \equiv \int\!\frac{d^3p}{h^3}\; f^0(p)\,\psi_1(p)\,\psi_2(p)\,. \qquad (7.140)$$

Note that this is not the usual Hilbert space inner product from quantum mechanics, since the factor $f^0(p)$ is included in the metric. This is necessary in order that L be self-adjoint:

$$\langle \psi_1 \,|\, L\psi_2 \rangle = \langle L\psi_1 \,|\, \psi_2 \rangle\,. \qquad (7.141)$$

We can now define the spectrum of normalized eigenfunctions of L, which we write as $\phi_n(p)$. The eigenfunctions satisfy the eigenvalue equation,

$$L\phi_n = -\lambda_n\,\phi_n\,, \qquad (7.142)$$

and may be chosen to be orthonormal,

$$\langle \phi_m \,|\, \phi_n \rangle = \delta_{mn}\,. \qquad (7.143)$$

Of course, in order to obtain the eigenfunctions $\phi_n$ we must have detailed knowledge of the function $w(p',p'_1\,|\,p,p_1)$.
Recall that there are five collisional invariants, which are the particle number, the three components of the total particle momentum, and the particle energy. To each collisional invariant, there is an associated eigenfunction $\phi_n$ with eigenvalue $\lambda_n = 0$. One can check that these normalized eigenfunctions are

$$\phi_n(p) = \frac{1}{\sqrt{n}} \qquad (7.144)$$

$$\phi_{p_\alpha}(p) = \frac{p_\alpha}{\sqrt{n m k_B T}} \qquad (7.145)$$

$$\phi_E(p) = \sqrt{\frac{2}{3n}}\left(\frac{\varepsilon}{k_B T} - \frac{3}{2}\right). \qquad (7.146)$$

$^9$The requirements of an inner product $\langle f|g\rangle$ are symmetry, linearity, and non-negative definiteness.
If there are no temperature, chemical potential, or bulk velocity gradients, and there are no external forces, then the only changes to the distribution are from collisions. The linearized Boltzmann equation becomes

$$\frac{\partial \psi}{\partial t} = L\psi\,. \qquad (7.147)$$

We can therefore write the most general solution in the form

$$\psi(p,t) = {\sum_n}'\; C_n\, \phi_n(p)\, e^{-\lambda_n t}\,, \qquad (7.148)$$

where the prime on the sum reminds us that collisional invariants are to be excluded. All the eigenvalues $\lambda_n$, aside from the five zero eigenvalues for the collisional invariants, must be positive. Any negative eigenvalue would cause $\psi(p,t)$ to increase without bound, and an initial nonequilibrium distribution would not relax to the equilibrium $f^0(p)$, which we regard as unphysical. Henceforth we will drop the prime on the sum but remember that $C_n = 0$ for the five collisional invariants.
7.8.2 Currents

The particle current is

$$j = \int\!\frac{d^3p}{h^3}\; v\,\delta f(p) = \int\!\frac{d^3p}{h^3}\; f^0(p)\, v\, \psi(p) = \langle\, v \,|\, \psi \,\rangle\,. \qquad (7.149)$$

The energy current is

$$j_\varepsilon = \int\!\frac{d^3p}{h^3}\; \varepsilon\, v\,\delta f(p) = \int\!\frac{d^3p}{h^3}\; f^0(p)\, \varepsilon\, v\, \psi(p) = \langle\, \varepsilon\, v \,|\, \psi \,\rangle\,. \qquad (7.150)$$

Consider now the earlier case of a temperature gradient with $\nabla p = 0$. The steady state linearized Boltzmann equation is

$$\frac{\varepsilon(p) - h}{k_B T^2}\; v \cdot \nabla T = L\psi\,. \qquad (7.151)$$

This is an inhomogeneous linear equation for $\psi$. In general, if we have

$$L\psi = Y \qquad (7.152)$$

then we can expand $\psi$ in the eigenfunctions $\phi_n$ and write $\psi = \sum_n C_n\,\phi_n$. Applying L and taking the inner product with $\phi_j$, we have

$$C_j = -\frac{1}{\lambda_j}\, \langle\, Y \,|\, \phi_j \,\rangle\,. \qquad (7.153)$$

Thus, the formal solution to the linearized Boltzmann equation is

$$\psi(p) = -\sum_n \frac{1}{\lambda_n}\, \langle\, Y \,|\, \phi_n \,\rangle\; \phi_n(p)\,. \qquad (7.154)$$
7.9 Stochastic Processes

A stochastic process is one which is partially random, i.e. it is not wholly deterministic. Typically the randomness is due to phenomena at the microscale, such as the effect of fluid molecules on a small particle, such as a piece of dust in the air. The resulting motion (called Brownian motion in the case of particles moving in a fluid) can be described only in a statistical sense. That is, the full motion of the system is a functional of one or more independent random variables. The motion is then described by its averages with respect to the various random distributions.
7.9.1 Langevin equation and Brownian motion

Consider a particle of mass M subjected to dissipative and random forcing. We'll examine this system in one dimension to gain an understanding of the essential physics. We write

$$\dot{p} + \gamma p = F + \eta(t)\,. \qquad (7.155)$$

Here, $\gamma$ is the damping rate due to friction, F is a constant external force, and $\eta(t)$ is a stochastic random force. This equation, known as the Langevin equation, describes a ballistic particle being buffeted by random forcing events. Think of a particle of dust as it moves in the atmosphere; F would then represent the external force due to gravity and $\eta(t)$ the random forcing due to interaction with the air molecules. For a sphere of radius a moving with velocity v in a fluid, the Stokes drag is given by $F_{\rm drag} = -6\pi\eta a v$, where a is the radius. Thus,

$$\gamma_{\rm Stokes} = \frac{6\pi\eta a}{M}\,, \qquad (7.156)$$

where M is the mass of the particle. It is illustrative to compute $\gamma$ in some setting. Consider a micron sized droplet ($a = 10^{-4}$ cm) of some liquid of density $\rho \approx 1.0$ g/cm$^3$ moving in air at $T = 20^\circ$C. The viscosity of air at this temperature is $\eta = 1.8\times 10^{-4}$ g/cm$\cdot$s.$^{10}$ If the droplet density is constant, then $\gamma = 9\eta/2\rho a^2 = 8.1\times 10^4\,{\rm s}^{-1}$, hence the time scale for viscous relaxation of the particle is $\tau = \gamma^{-1} = 12\,\mu$s. We should stress that the viscous damping on the particle is of course due to the fluid molecules, in some average coarse-grained sense.

$^{10}$The cgs unit of viscosity is the Poise (P). 1 P = 1 g/cm$\cdot$s.
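A minimal Python sketch (not from the notes; all numerical values are the ones assumed in the worked example above, in cgs units) reproduces these numbers two ways, through the droplet mass and through the equivalent $9\eta/2\rho a^2$ form:

```python
import math

# cgs units throughout; assumed values from the text's worked example
# (water-like droplet in air at 20 C).
eta = 1.8e-4      # viscosity of air [g/(cm s)] = [P]
a   = 1.0e-4      # droplet radius: one micron [cm]
rho = 1.0         # droplet mass density [g/cm^3]

M     = (4.0 / 3.0) * math.pi * a**3 * rho   # droplet mass [g]
gamma = 6.0 * math.pi * eta * a / M          # Stokes damping rate [1/s]
# Equivalent closed form, gamma = 9 eta / (2 rho a^2):
gamma_alt = 9.0 * eta / (2.0 * rho * a**2)
tau = 1.0 / gamma                            # viscous relaxation time [s]
```

Both routes give $\gamma \approx 8.1\times 10^4\,{\rm s}^{-1}$ and $\tau \approx 12\,\mu$s.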
The random component to the force $\eta(t)$ would then represent the fluctuations with respect to this average.

We can easily integrate this equation:

$$\frac{d}{dt}\Big( p\, e^{\gamma t} \Big) = F\, e^{\gamma t} + \eta(t)\, e^{\gamma t} \quad \Longrightarrow \quad p(t) = p(0)\, e^{-\gamma t} + \frac{F}{\gamma}\Big(1 - e^{-\gamma t}\Big) + \int_0^t\! ds\; \eta(s)\, e^{\gamma(s-t)} \qquad (7.157)$$

Note that p(t) is indeed a functional of the random function $\eta(t)$. We can therefore only compute averages in order to describe the motion of the system.

The first average we will compute is that of p itself. In so doing, we assume that $\eta(t)$ has zero mean: $\big\langle \eta(t) \big\rangle = 0$. Then

$$\big\langle p(t) \big\rangle = p(0)\, e^{-\gamma t} + \frac{F}{\gamma}\Big( 1 - e^{-\gamma t} \Big)\,. \qquad (7.158)$$

On the time scale $\gamma^{-1}$, the initial conditions p(0) are effectively forgotten, and asymptotically for $t \gg \gamma^{-1}$ we have $\big\langle p(t) \big\rangle \to F/\gamma$, which is the terminal momentum.
Next, consider

$$\big\langle p^2(t) \big\rangle = \big\langle p(t) \big\rangle^2 + \int_0^t\! ds_1 \int_0^t\! ds_2\; e^{\gamma(s_1 - t)}\, e^{\gamma(s_2 - t)}\, \big\langle \eta(s_1)\, \eta(s_2) \big\rangle\,. \qquad (7.159)$$

We now need to know the two-time correlator $\big\langle \eta(s_1)\,\eta(s_2) \big\rangle$. We assume that the correlator is a function only of the time difference $s = s_1 - s_2$, so that the random force $\eta(s)$ satisfies

$$\big\langle \eta(s) \big\rangle = 0 \qquad (7.160)$$
$$\big\langle \eta(s_1)\, \eta(s_2) \big\rangle = \phi(s_1 - s_2)\,. \qquad (7.161)$$

The function $\phi(s)$ is the autocorrelation function of the random force. A macroscopic object moving in a fluid is constantly buffeted by fluid particles over its entire perimeter. These different fluid particles are almost completely uncorrelated, hence $\phi(s)$ is basically nonzero except on a very small time scale $\tau_\phi$, which is the time a single fluid particle spends interacting with the object. We can take $\tau_\phi \to 0$ and approximate

$$\phi(s) \approx \Gamma\, \delta(s)\,. \qquad (7.162)$$
We shall determine the value of $\Gamma$ from equilibrium thermodynamic considerations below. With this form for $\phi(s)$, we can easily calculate the equal time momentum autocorrelation:

$$\big\langle p^2(t) \big\rangle = \big\langle p(t) \big\rangle^2 + \Gamma \int_0^t\! ds\; e^{2\gamma(s-t)} = \big\langle p(t) \big\rangle^2 + \frac{\Gamma}{2\gamma}\Big( 1 - e^{-2\gamma t} \Big)\,. \qquad (7.163)$$
Consider the case where F = 0 and the limit $t \gg \gamma^{-1}$. We demand that the object thermalize at temperature T. Thus, we impose the condition

$$\left\langle \frac{p^2(t)}{2M} \right\rangle = \tfrac{1}{2}\, k_B T \quad \Longrightarrow \quad \Gamma = 2\gamma M k_B T\,, \qquad (7.164)$$

where M is the particle's mass. This determines the value of $\Gamma$.
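The thermalization condition can be checked by direct simulation. The sketch below (Python, not from the notes; natural units $M = k_B T = 1$ are an assumption of the example) integrates the F = 0 Langevin equation with an Euler-Maruyama step and verifies that $\langle p^2 \rangle$ relaxes to $M k_B T$ when the noise strength is chosen as $\Gamma = 2\gamma M k_B T$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Natural units: M = k_B T = 1, so equilibrium demands <p^2> = M k_B T = 1.
gamma, M, kT = 1.0, 1.0, 1.0
Gamma = 2.0 * gamma * M * kT          # noise strength fixed by thermalization
dt, nstep, nwalker = 2e-2, 2000, 10000

p = np.zeros(nwalker)                 # start far from equilibrium (p = 0)
for _ in range(nstep):
    # Euler-Maruyama step for dp = -gamma p dt + sqrt(Gamma dt) * N(0,1)
    p += -gamma * p * dt + np.sqrt(Gamma * dt) * rng.normal(size=nwalker)

p2 = np.mean(p**2)                    # should approach M k_B T = 1
```

After many relaxation times $\gamma^{-1}$, the ensemble average sits at unity up to statistical noise and a small $O(\gamma\,dt)$ discretization bias.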
We can now compute the general momentum autocorrelator:

$$\big\langle p(t)\, p(t') \big\rangle - \big\langle p(t) \big\rangle \big\langle p(t') \big\rangle = \int_0^t\! ds \int_0^{t'}\! ds'\; e^{\gamma(s-t)}\, e^{\gamma(s'-t')}\, \big\langle \eta(s)\, \eta(s') \big\rangle \qquad (7.165)$$
$$= \Gamma\, e^{-\gamma(t+t')} \int_0^{t_{\rm min}}\! ds\; e^{2\gamma s} = M k_B T\, \Big( e^{-\gamma|t-t'|} - e^{-\gamma(t+t')} \Big)\,,$$

where

$$t_{\rm min} = \min(t,t') = \begin{cases} t & \text{if } t < t' \\ t' & \text{if } t' < t \end{cases} \qquad (7.166)$$

is the lesser of t and t'. Here we have used the result

$$\int_0^t\! ds \int_0^{t'}\! ds'\; e^{\gamma(s+s')}\, \delta(s - s') = \int_0^{t_{\rm min}}\! ds \int_0^{t_{\rm min}}\! ds'\; e^{\gamma(s+s')}\, \delta(s - s') = \int_0^{t_{\rm min}}\! ds\; e^{2\gamma s} = \frac{1}{2\gamma}\Big( e^{2\gamma t_{\rm min}} - 1 \Big)\,. \qquad (7.167)$$

One way to intuitively understand this result is as follows. The double integral over s and s' is over a rectangle of dimensions $t \times t'$. Since the $\delta$-function can only be satisfied when $s = s'$, there can be no contribution to the integral from regions where $s > t'$ or $s' > t$. Thus, the only contributions can arise from integration over the square of dimensions $t_{\rm min} \times t_{\rm min}$. Note also

$$t + t' - 2\min(t,t') = |t - t'|\,. \qquad (7.168)$$
Let's now compute the position x(t). We have

$$x(t) = x(0) + \int_0^t\! ds\; v(s)$$
$$= x(0) + \int_0^t\! ds \left[ \left( v(0) - \frac{F}{\gamma M} \right) e^{-\gamma s} + \frac{F}{\gamma M} \right] + \frac{1}{M} \int_0^t\! ds \int_0^s\! ds_1\; \eta(s_1)\, e^{\gamma(s_1 - s)}$$
$$= \big\langle x(t) \big\rangle + \frac{1}{M} \int_0^t\! ds \int_0^s\! ds_1\; \eta(s_1)\, e^{\gamma(s_1 - s)}\,, \qquad (7.169)$$
Figure 7.5: Regions for some of the double integrals encountered in the text.
where, since $\big\langle \eta(t) \big\rangle = 0$,

$$\big\langle x(t) \big\rangle = x(0) + \int_0^t\! ds \left[ \left( v(0) - \frac{F}{\gamma M} \right) e^{-\gamma s} + \frac{F}{\gamma M} \right] = x(0) + \frac{F t}{\gamma M} + \frac{1}{\gamma} \left( v(0) - \frac{F}{\gamma M} \right) \Big( 1 - e^{-\gamma t} \Big)\,. \qquad (7.170)$$

Note that for $\gamma t \ll 1$ we have $\big\langle x(t) \big\rangle = x(0) + v(0)\, t + \frac{1}{2} M^{-1} F t^2 + O(t^3)$, as is appropriate for ballistic particles moving under the influence of a constant force. The long time limit of course agrees with our earlier evaluation for the terminal velocity, $v_\infty = \big\langle p(\infty) \big\rangle / M = F/\gamma M$.
We next compute the position autocorrelation:

$$\big\langle x(t)\, x(t') \big\rangle - \big\langle x(t) \big\rangle \big\langle x(t') \big\rangle = \frac{1}{M^2} \int_0^t\! ds \int_0^{t'}\! ds'\; e^{-\gamma(s+s')} \int_0^s\! ds_1 \int_0^{s'}\! ds'_1\; e^{\gamma(s_1 + s'_1)}\, \big\langle \eta(s_1)\, \eta(s'_1) \big\rangle$$
$$= \frac{\Gamma}{2\gamma M^2} \int_0^t\! ds \int_0^{t'}\! ds'\; \Big( e^{-\gamma|s - s'|} - e^{-\gamma(s+s')} \Big) \qquad (7.171)$$

We have to be careful in computing the double integral of the first term in brackets on the RHS. We can assume, without loss of generality, that $t \geq t'$. Then

$$\int_0^t\! ds \int_0^{t'}\! ds'\; e^{-\gamma|s-s'|} = \int_0^{t'}\! ds'\; e^{\gamma s'} \int_{s'}^t\! ds\; e^{-\gamma s} + \int_0^{t'}\! ds'\; e^{-\gamma s'} \int_0^{s'}\! ds\; e^{\gamma s}$$
$$= 2\gamma^{-1}\, t' + \gamma^{-2} \Big( e^{-\gamma t} + e^{-\gamma t'} - 1 - e^{-\gamma(t - t')} \Big)\,. \qquad (7.172)$$
We then find, for $t > t'$,

$$\big\langle x(t)\, x(t') \big\rangle - \big\langle x(t) \big\rangle \big\langle x(t') \big\rangle = \frac{2 k_B T}{\gamma M}\, t' + \frac{k_B T}{\gamma^2 M} \Big( 2\, e^{-\gamma t} + 2\, e^{-\gamma t'} - 2 - e^{-\gamma(t - t')} - e^{-\gamma(t + t')} \Big)\,. \qquad (7.173)$$

In particular, the equal time autocorrelator is

$$\big\langle x^2(t) \big\rangle - \big\langle x(t) \big\rangle^2 = \frac{2 k_B T}{\gamma M}\, t + \frac{k_B T}{\gamma^2 M} \Big( 4\, e^{-\gamma t} - 3 - e^{-2\gamma t} \Big)\,. \qquad (7.174)$$

We see that for long times

$$\big\langle x^2(t) \big\rangle - \big\langle x(t) \big\rangle^2 \simeq 2 D t\,, \qquad (7.175)$$
where

$$D = \frac{k_B T}{\gamma M} \qquad (7.176)$$

is the diffusion constant. For a liquid droplet of radius $a = 1\,\mu$m moving in air at T = 293 K, for which $\eta = 1.8\times 10^{-4}$ P, we have

$$D = \frac{k_B T}{6\pi \eta a} = \frac{(1.38\times 10^{-16}\,{\rm erg/K})\,(293\,{\rm K})}{6\pi\,(1.8\times 10^{-4}\,{\rm P})\,(10^{-4}\,{\rm cm})} = 1.19\times 10^{-7}\,{\rm cm^2/s}\,. \qquad (7.177)$$

This result presumes that the droplet is large enough compared to the intermolecular distance in the fluid that one can adopt a continuum approach and use the Navier-Stokes equations, and then assuming a laminar flow.
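The Stokes-Einstein number above is a one-liner to verify (Python sketch, not from the notes, cgs units as quoted in the text):

```python
import math

# cgs units; the numerical inputs are the ones quoted in eqn 7.177.
kB  = 1.38e-16    # Boltzmann constant [erg/K]
T   = 293.0       # [K]
eta = 1.8e-4      # viscosity of air [P]
a   = 1.0e-4      # droplet radius: one micron [cm]

# Stokes-Einstein diffusion constant, D = kB T / (6 pi eta a)  [cm^2/s]
D = kB * T / (6.0 * math.pi * eta * a)
```

This reproduces $D \approx 1.19\times 10^{-7}\,{\rm cm^2/s}$.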
If we consider molecular diffusion, the situation is quite a bit different. As we shall derive below in §7.9.4, the molecular diffusion constant is $D = \ell^2/2\tau$, where $\ell$ is the mean free path and $\tau$ is the collision time. As we found in eqn. 7.84, the mean free path $\ell$, collision time $\tau$, number density n, and total scattering cross section $\sigma$ are related by

$$\ell = \bar{v}\,\tau = \frac{1}{\sqrt{2}\; n \sigma}\,, \qquad (7.178)$$

where $\bar{v} = \sqrt{8 k_B T/\pi m}$ is the average particle speed. Approximating the particles as hard spheres, we have $\sigma = 4\pi a^2$, where a is the hard sphere radius. At T = 293 K, and p = 1 atm, we have $n = p/k_B T = 2.51\times 10^{19}\,{\rm cm}^{-3}$. Since air is predominantly composed of N$_2$ molecules, we take $a = 1.90\times 10^{-8}$ cm and $m = 28.0\,{\rm amu} = 4.65\times 10^{-23}$ g, which are appropriate for N$_2$. We find an average speed of $\bar{v} = 471$ m/s and a mean free path of $\ell = 6.21\times 10^{-6}$ cm. Thus, $D = \frac{1}{2}\,\ell\, \bar{v} = 0.146\,{\rm cm^2/s}$. Though much larger than the diffusion constant for large droplets, this is still too small to explain common experiences. Suppose we set the characteristic distance scale at d = 10 cm and we ask how much time a point source would take to diffuse out to this radius. The answer is $t = d^2/2D = 343$ s, which is between five and six minutes. Yet if a perfumed lady passes directly by, or if someone in the next chair farts, you sense the odor on the order of a second. What this tells us is that diffusion isn't the only transport process involved in these and like phenomena. More important are convection currents which distribute the scent much more rapidly.
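The kinetic-theory chain of numbers for N$_2$ is easy to rerun (Python sketch, not from the notes; all input values are those assumed in the paragraph above, cgs units):

```python
import math

# Kinetic-theory estimate of the diffusion constant of air (N2), cgs units,
# with the hard-sphere cross section sigma = 4 pi a^2.
kB = 1.38e-16                 # [erg/K]
T  = 293.0                    # [K]
p  = 1.013e6                  # 1 atm [dyn/cm^2]
a  = 1.90e-8                  # N2 hard-sphere radius [cm]
m  = 4.65e-23                 # N2 molecular mass [g]

n     = p / (kB * T)                              # number density [1/cm^3]
sigma = 4.0 * math.pi * a**2                      # total cross section [cm^2]
vbar  = math.sqrt(8.0 * kB * T / (math.pi * m))   # mean speed [cm/s]
ell   = 1.0 / (math.sqrt(2.0) * n * sigma)        # mean free path [cm]
D     = 0.5 * ell * vbar                          # D = ell^2/2tau = ell vbar/2
t10   = 10.0**2 / (2.0 * D)                       # time to diffuse ~10 cm [s]
```

The outputs match the quoted values: $n \approx 2.51\times 10^{19}\,{\rm cm^{-3}}$, $\bar v \approx 471$ m/s, $\ell \approx 6.21\times 10^{-6}$ cm, $D \approx 0.146\,{\rm cm^2/s}$, and a diffusion time of five to six minutes over 10 cm.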
7.9.2 Langevin equation for a particle in a harmonic well

Consider next the equation

$$M \ddot{X} + \gamma M \dot{X} + M \omega_0^2\, X = F_0 + \eta(t)\,, \qquad (7.179)$$

where $F_0$ is a constant force. We write $X = \frac{F_0}{M\omega_0^2} + x$ and measure x relative to the potential minimum, yielding

$$\ddot{x} + \gamma\,\dot{x} + \omega_0^2\, x = \frac{1}{M}\,\eta(t)\,. \qquad (7.180)$$

At this point there are several ways to proceed.

Perhaps the most straightforward is by use of the Laplace transform. Recall:

$$\hat{x}(\nu) = \int_0^\infty\! dt\; e^{-\nu t}\, x(t) \qquad (7.181)$$
$$x(t) = \int_{\mathcal C}\! \frac{d\nu}{2\pi i}\; e^{+\nu t}\, \hat{x}(\nu)\,, \qquad (7.182)$$
where the contour $\mathcal C$ proceeds from $a - i\infty$ to $a + i\infty$ such that all poles of the integrand lie to the left of $\mathcal C$. We then have

$$\frac{1}{M} \int_0^\infty\! dt\; e^{-\nu t}\, \eta(t) = \int_0^\infty\! dt\; e^{-\nu t} \Big( \ddot{x} + \gamma\,\dot{x} + \omega_0^2\, x \Big)$$
$$= -(\nu + \gamma)\, x(0) - \dot{x}(0) + \Big( \nu^2 + \gamma \nu + \omega_0^2 \Big)\, \hat{x}(\nu)\,. \qquad (7.183)$$

Thus, we have

$$\hat{x}(\nu) = \frac{(\nu + \gamma)\, x(0) + \dot{x}(0)}{\nu^2 + \gamma \nu + \omega_0^2} + \frac{1}{M}\, \frac{1}{\nu^2 + \gamma \nu + \omega_0^2} \int_0^\infty\! dt\; e^{-\nu t}\, \eta(t)\,. \qquad (7.184)$$

Now we may write

$$\nu^2 + \gamma \nu + \omega_0^2 = (\nu - \nu_+)(\nu - \nu_-)\,, \qquad (7.185)$$

where

$$\nu_\pm = -\tfrac{1}{2}\gamma \pm \sqrt{\tfrac{1}{4}\gamma^2 - \omega_0^2}\,. \qquad (7.186)$$

Note that ${\rm Re}\,(\nu_\pm) \leq 0$ and that $\nu_+ + \nu_- = -\gamma$.
Performing the inverse Laplace transform, we obtain

$$x(t) = \frac{\dot{x}(0) - \nu_-\, x(0)}{\nu_+ - \nu_-}\; e^{\nu_+ t} - \frac{\dot{x}(0) - \nu_+\, x(0)}{\nu_+ - \nu_-}\; e^{\nu_- t} + \int_0^\infty\! ds\; K(t - s)\, \eta(s)\,, \qquad (7.187)$$

where

$$K(t - s) = \frac{\Theta(t - s)}{M\,(\nu_+ - \nu_-)}\, \Big( e^{\nu_+(t-s)} - e^{\nu_-(t-s)} \Big) \qquad (7.188)$$

is the response kernel and $\Theta(t - s)$ is the step function which is unity for $t > s$ and zero otherwise. The response is causal, i.e. x(t) depends on $\eta(s)$ for all previous times $s < t$, but not for future times $s > t$. Note that $K(\tau)$ decays exponentially for $\tau \to \infty$, if ${\rm Re}\,(\nu_\pm) < 0$. The marginal case where $\omega_0 = 0$ and $\nu_+ = 0$ corresponds to the diffusion calculation we performed in the previous section.
7.9.3 General Linear Autonomous Inhomogeneous ODEs

We can also solve general autonomous linear inhomogeneous ODEs of the form

$$\frac{d^n x}{dt^n} + a_{n-1}\, \frac{d^{n-1} x}{dt^{n-1}} + \ldots + a_1\, \frac{dx}{dt} + a_0\, x = \xi(t)\,. \qquad (7.189)$$

We can write this as

$$\mathcal{L}_t\, x(t) = \xi(t)\,, \qquad (7.190)$$

where $\mathcal{L}_t$ is the $n^{\rm th}$ order differential operator

$$\mathcal{L}_t = \frac{d^n}{dt^n} + a_{n-1}\, \frac{d^{n-1}}{dt^{n-1}} + \ldots + a_1\, \frac{d}{dt} + a_0\,. \qquad (7.191)$$

The general solution to the inhomogeneous equation is given by

$$x(t) = x_h(t) + \int_{-\infty}^\infty\! dt'\; G(t,t')\, \xi(t')\,, \qquad (7.192)$$

where $G(t,t')$ is the Green's function. Note that $\mathcal{L}_t\, x_h(t) = 0$. Thus, in order for eqns. 7.190 and 7.201 to be true, we must have

$$\mathcal{L}_t\, x(t) = \overbrace{\mathcal{L}_t\, x_h(t)}^{\text{this vanishes}} + \int_{-\infty}^\infty\! dt'\; \mathcal{L}_t\, G(t,t')\, \xi(t') = \xi(t)\,, \qquad (7.193)$$

which means that

$$\mathcal{L}_t\, G(t,t') = \delta(t - t')\,, \qquad (7.194)$$

where $\delta(t - t')$ is the Dirac $\delta$-function.

If the differential equation $\mathcal{L}_t\, x(t) = \xi(t)$ is defined over some finite or semi-infinite t interval with prescribed boundary conditions on x(t) at the endpoints, then $G(t,t')$ will depend on t and t' separately. For the case we are now considering, let the interval be the entire real line $t \in (-\infty, \infty)$. Then $G(t,t') = G(t - t')$ is a function of the single variable $t - t'$.
Note that $\mathcal{L}_t = \mathcal{L}\big(\frac{d}{dt}\big)$ may be considered a function of the differential operator $\frac{d}{dt}$. If we now Fourier transform the equation $\mathcal{L}_t\, x(t) = \xi(t)$, we obtain

$$\int_{-\infty}^\infty\! dt\; e^{i\omega t}\, \xi(t) = \int_{-\infty}^\infty\! dt\; e^{i\omega t} \left\{ \frac{d^n}{dt^n} + a_{n-1}\, \frac{d^{n-1}}{dt^{n-1}} + \ldots + a_1\, \frac{d}{dt} + a_0 \right\} x(t) \qquad (7.195)$$
$$= \int_{-\infty}^\infty\! dt\; e^{i\omega t} \left\{ (-i\omega)^n + a_{n-1}\, (-i\omega)^{n-1} + \ldots + a_1\, (-i\omega) + a_0 \right\} x(t)\,.$$

Thus, if we define

$$\hat{\mathcal{L}}(\omega) = \sum_{k=0}^n a_k\, (-i\omega)^k\,, \qquad (7.196)$$

then we have

$$\hat{\mathcal{L}}(\omega)\; \hat{x}(\omega) = \hat{\xi}(\omega)\,, \qquad (7.197)$$

where $a_n \equiv 1$. According to the Fundamental Theorem of Algebra, the $n^{\rm th}$ degree polynomial $\hat{\mathcal{L}}(\omega)$ may be uniquely factored over the complex plane into a product over n roots:

$$\hat{\mathcal{L}}(\omega) = (-i)^n\, (\omega - \omega_1)(\omega - \omega_2) \cdots (\omega - \omega_n)\,. \qquad (7.198)$$

If the $a_k$ are all real, then $\big[ \hat{\mathcal{L}}(\omega) \big]^* = \hat{\mathcal{L}}(-\omega^*)$, hence if $\omega$ is a root then so is $-\omega^*$. Thus, the roots appear in pairs which are symmetric about the imaginary axis. I.e. if $\omega = a + ib$ is a root, then so is $-\omega^* = -a + ib$.
The general solution to the homogeneous equation is

$$x_h(t) = \sum_{\sigma=1}^n A_\sigma\; e^{-i\omega_\sigma t}\,, \qquad (7.199)$$

which involves n arbitrary complex constants $A_\sigma$. The susceptibility, or Green's function in Fourier space, $\hat{G}(\omega)$ is then

$$\hat{G}(\omega) = \frac{1}{\hat{\mathcal{L}}(\omega)} = \frac{i^n}{(\omega - \omega_1)(\omega - \omega_2) \cdots (\omega - \omega_n)}\,. \qquad (7.200)$$

Note that $\big[ \hat{G}(\omega) \big]^* = \hat{G}(-\omega)$, which is equivalent to the statement that $G(t - t')$ is a real function of its argument. The general solution to the inhomogeneous equation is then

$$x(t) = x_h(t) + \int_{-\infty}^\infty\! dt'\; G(t - t')\, \xi(t')\,, \qquad (7.201)$$
where $x_h(t)$ is the solution to the homogeneous equation, i.e. with zero forcing, and where

$$G(t - t') = \int_{-\infty}^\infty\! \frac{d\omega}{2\pi}\; e^{-i\omega(t - t')}\; \hat{G}(\omega)$$
$$= i^n \int_{-\infty}^\infty\! \frac{d\omega}{2\pi}\; \frac{e^{-i\omega(t - t')}}{(\omega - \omega_1)(\omega - \omega_2) \cdots (\omega - \omega_n)}$$
$$= \sum_{\sigma=1}^n \frac{e^{-i\omega_\sigma (t - t')}}{i\, \hat{\mathcal{L}}'(\omega_\sigma)}\; \Theta(t - t')\,, \qquad (7.202)$$

where we assume that ${\rm Im}\,\omega_\sigma < 0$ for all $\sigma$. This guarantees causality: the response x(t) to the influence $\xi(t')$ is nonzero only for $t > t'$.
As an example, consider the familiar case

$$\hat{\mathcal{L}}(\omega) = -\omega^2 - i\gamma\omega + \omega_0^2 = -(\omega - \omega_+)\,(\omega - \omega_-)\,, \qquad (7.203)$$

with $\omega_\pm = -\frac{i}{2}\gamma \pm \beta$, and $\beta = \sqrt{\omega_0^2 - \frac{1}{4}\gamma^2}$. This yields

$$\hat{\mathcal{L}}'(\omega_\pm) = \mp(\omega_+ - \omega_-) = \mp 2\beta\,. \qquad (7.204)$$

Then according to equation 7.202,

$$G(s) = \left[ \frac{e^{-i\omega_+ s}}{i\,\hat{\mathcal{L}}'(\omega_+)} + \frac{e^{-i\omega_- s}}{i\,\hat{\mathcal{L}}'(\omega_-)} \right] \Theta(s)$$
$$= \left[ \frac{e^{-\gamma s/2}\, e^{-i\beta s}}{-2i\beta} + \frac{e^{-\gamma s/2}\, e^{i\beta s}}{2i\beta} \right] \Theta(s)$$
$$= \frac{1}{\beta}\; e^{-\gamma s/2}\, \sin(\beta s)\; \Theta(s)\,. \qquad (7.205)$$
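As a quick sanity check (a Python sketch with assumed parameter values $\gamma = 0.6$, $\omega_0 = 2$, not from the notes), one can verify by finite differences that this underdamped kernel solves the homogeneous equation for $s > 0$ and carries the delta-function matching conditions $G(0) = 0$, $G'(0^+) = 1$:

```python
import math

# Underdamped kernel G(s) = exp(-gamma s/2) sin(beta s)/beta; check that
# G'' + gamma G' + omega0^2 G = 0 for s > 0, with G(0)=0 and G'(0+)=1.
gamma, omega0 = 0.6, 2.0              # assumed example values
beta = math.sqrt(omega0**2 - 0.25 * gamma**2)

def G(s):
    return math.exp(-0.5 * gamma * s) * math.sin(beta * s) / beta

h = 1e-5
s = 1.3                               # arbitrary interior point s > 0
G1 = (G(s + h) - G(s - h)) / (2 * h)            # numerical G'
G2 = (G(s + h) - 2 * G(s) + G(s - h)) / h**2    # numerical G''
ode_residual = G2 + gamma * G1 + omega0**2 * G(s)

Gp0 = (G(h) - G(0.0)) / h                        # slope at 0+
```

The residual vanishes to finite-difference accuracy, and the unit slope at $s = 0^+$ is exactly the jump produced by the $\delta$-function source.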
Now let us evaluate the two-point correlation function $\big\langle x(t)\, x(t') \big\rangle$, assuming the noise is correlated according to $\big\langle \eta(s)\, \eta(s') \big\rangle = \phi(s - s')$. We assume $t, t' \to \infty$ so the transient contribution $x_h$ is negligible. We then have

$$\big\langle x(t)\, x(t') \big\rangle = \int_{-\infty}^\infty\! ds \int_{-\infty}^\infty\! ds'\; G(t - s)\, G(t' - s')\, \big\langle \eta(s)\, \eta(s') \big\rangle \qquad (7.206)$$
$$= \int_{-\infty}^\infty\! \frac{d\omega}{2\pi}\; \hat{\phi}(\omega)\, \big| \hat{G}(\omega) \big|^2\, e^{-i\omega(t - t')}\,. \qquad (7.207)$$
7.9.4 Discrete random walk

Consider an object moving on a one-dimensional lattice in such a way that every time step it moves either one unit to the right or left, at random. If the lattice spacing is $\ell$, then after n time steps the position will be

$$x_n = \ell \sum_{j=1}^n \sigma_j\,, \qquad (7.208)$$

where

$$\sigma_j = \begin{cases} +1 & \text{if motion is one unit to right at time step } j \\ -1 & \text{if motion is one unit to left at time step } j\,. \end{cases} \qquad (7.209)$$

Clearly $\langle \sigma_j \rangle = 0$, so $\langle x_n \rangle = 0$. Now let us compute

$$\big\langle x_n^2 \big\rangle = \ell^2 \sum_{j=1}^n \sum_{j'=1}^n \big\langle \sigma_j\, \sigma_{j'} \big\rangle = n\,\ell^2\,, \qquad (7.210)$$

where we invoke

$$\big\langle \sigma_j\, \sigma_{j'} \big\rangle = \delta_{jj'}\,. \qquad (7.211)$$

If the length of each time step is $\tau$, then we have, with $t = n\tau$,

$$\big\langle x^2(t) \big\rangle = \frac{\ell^2}{\tau}\; t\,, \qquad (7.212)$$

and we identify the diffusion constant

$$D = \frac{\ell^2}{2\tau}\,. \qquad (7.213)$$

Suppose, however, the random walk is biased, so that the probability for each independent step is given by

$$P(\sigma) = p\; \delta_{\sigma,1} + q\; \delta_{\sigma,-1}\,, \qquad (7.214)$$

where $p + q = 1$. Then

$$\langle \sigma_j \rangle = p - q = 2p - 1 \qquad (7.215)$$
$$\langle \sigma_j\, \sigma_{j'} \rangle = (p - q)^2 \Big( 1 - \delta_{jj'} \Big) + \delta_{jj'} = (2p - 1)^2 + 4\, p\, (1 - p)\, \delta_{jj'}\,. \qquad (7.216)$$

Then

$$\langle x_n \rangle = (2p - 1)\, \ell\, n \qquad (7.217)$$
$$\big\langle x_n^2 \big\rangle - \big\langle x_n \big\rangle^2 = 4\, p\, (1 - p)\, \ell^2\, n\,. \qquad (7.218)$$
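These two moments are simple to confirm by Monte Carlo (a Python sketch, not from the notes; the bias value p = 0.7 is an arbitrary choice for illustration, with unit lattice spacing $\ell = 1$):

```python
import numpy as np

rng = np.random.default_rng(2)

# Monte Carlo check of the biased-walk mean and variance (ell = 1).
p, nstep, nwalk = 0.7, 400, 50000
steps = np.where(rng.random((nwalk, nstep)) < p, 1, -1)
x = steps.sum(axis=1)                     # final positions of all walkers

mean_exact = (2 * p - 1) * nstep          # (2p - 1) n
var_exact  = 4 * p * (1 - p) * nstep      # 4 p (1 - p) n
mean_mc, var_mc = x.mean(), x.var()
```

With $5\times 10^4$ walkers the sampled mean and variance land on $(2p-1)\,n = 160$ and $4p(1-p)\,n = 336$ to within a few statistical standard errors.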
7.9.5 Fokker-Planck equation

Suppose x(t) is a stochastic variable. We define the quantity

$$\delta x(t) \equiv x(t + \delta t) - x(t)\,, \qquad (7.219)$$

and we assume

$$\big\langle \delta x(t) \big\rangle = F_1\big( x(t) \big)\; \delta t \qquad (7.220)$$
$$\big\langle \big[ \delta x(t) \big]^2 \big\rangle = F_2\big( x(t) \big)\; \delta t \qquad (7.221)$$

but $\big\langle \big[ \delta x(t) \big]^n \big\rangle = O\big( (\delta t)^2 \big)$ for $n > 2$. The n = 1 term is due to drift and the n = 2 term is due to diffusion. Now consider the conditional probability density, $P(x, t \,|\, x_0, t_0)$, defined to be the probability distribution for $x \equiv x(t)$ given that $x(t_0) = x_0$. The conditional probability density satisfies the composition rule,

$$P(x_2, t_2 \,|\, x_0, t_0) = \int_{-\infty}^\infty\! dx_1\; P(x_2, t_2 \,|\, x_1, t_1)\; P(x_1, t_1 \,|\, x_0, t_0)\,, \qquad (7.222)$$

for any value of $t_1$. This is also known as the Chapman-Kolmogorov equation. In words, what it says is that the probability density for being at $x_2$ at time $t_2$, given $x_0$ at time $t_0$, is given by the product of the probability density for $x_2$ at time $t_2$ given $x_1$ at $t_1$, multiplied by that for $x_1$ at $t_1$ given $x_0$ at $t_0$, integrated over $x_1$. This should be intuitively obvious, since if we pick any time $t_1 \in [t_0, t_2]$, then the particle had to be somewhere at that time. Indeed, one wonders how Chapman and Kolmogorov got their names attached to a result that is so obvious. At any rate, a picture is worth a thousand words: see fig. 7.6.
Proceeding, we may write

$$P(x, t + \delta t \,|\, x_0, t_0) = \int_{-\infty}^\infty\! dx'\; P(x, t + \delta t \,|\, x', t)\; P(x', t \,|\, x_0, t_0)\,. \qquad (7.223)$$

Now

$$P(x, t + \delta t \,|\, x', t) = \Big\langle \delta\big( x - \delta x(t) - x' \big) \Big\rangle \qquad (7.224)$$
$$= \left\{ 1 + \big\langle \delta x(t) \big\rangle\, \frac{d}{dx'} + \tfrac{1}{2}\, \big\langle \big[ \delta x(t) \big]^2 \big\rangle\, \frac{d^2}{dx'^2} + \ldots \right\} \delta(x - x')$$
$$= \delta(x - x') + F_1(x')\, \frac{d\,\delta(x - x')}{dx'}\; \delta t + \tfrac{1}{2}\, F_2(x')\, \frac{d^2\,\delta(x - x')}{dx'^2}\; \delta t + O\big( (\delta t)^2 \big)\,,$$

where the average is over the random variables. We now insert this result into eqn. 7.223, integrate by parts, divide by $\delta t$, and then take the limit $\delta t \to 0$. The result is the Fokker-Planck equation,

$$\frac{\partial P}{\partial t} = -\frac{\partial}{\partial x} \Big[ F_1(x)\, P(x,t) \Big] + \tfrac{1}{2}\, \frac{\partial^2}{\partial x^2} \Big[ F_2(x)\, P(x,t) \Big]\,. \qquad (7.225)$$
Figure 7.6: Interpretive sketch of the mathematics behind the Chapman-Kolmogorov equation.
7.9.6 Brownian motion redux

Let's apply our Fokker-Planck equation to a description of Brownian motion. From our earlier results, we have

$$F_1(x) = \frac{F}{\gamma M}\,, \qquad F_2(x) = 2D\,. \qquad (7.226)$$

A formal proof of these results is left as an exercise for the reader. The Fokker-Planck equation is then

$$\frac{\partial P}{\partial t} = -u\, \frac{\partial P}{\partial x} + D\, \frac{\partial^2 P}{\partial x^2}\,, \qquad (7.227)$$

where $u = F/\gamma M$ is the average terminal velocity. If we make a Galilean transformation and define

$$y = x - ut\,, \qquad s = t \qquad (7.228)$$

then our Fokker-Planck equation takes the form

$$\frac{\partial P}{\partial s} = D\, \frac{\partial^2 P}{\partial y^2}\,. \qquad (7.229)$$

This is known as the diffusion equation. Eqn. 7.227 is also a diffusion equation, rendered in a moving frame.
While the Galilean transformation is illuminating, we can easily solve eqn. 7.227 without it. Let's take a look at this equation after Fourier transforming from x to q:

$$P(x,t) = \int_{-\infty}^\infty\! \frac{dq}{2\pi}\; e^{iqx}\; \hat{P}(q,t) \qquad (7.230)$$
$$\hat{P}(q,t) = \int_{-\infty}^\infty\! dx\; e^{-iqx}\; P(x,t)\,. \qquad (7.231)$$

Then as should be well known to you by now, we can replace the operator $\frac{\partial}{\partial x}$ with multiplication by $iq$, resulting in

$$\frac{\partial}{\partial t}\, \hat{P}(q,t) = -\big( D q^2 + i q u \big)\, \hat{P}(q,t)\,, \qquad (7.232)$$

with solution

$$\hat{P}(q,t) = e^{-Dq^2 t}\, e^{-iqut}\; \hat{P}(q,0)\,. \qquad (7.233)$$

We now apply the inverse transform to get back to x-space:

$$P(x,t) = \int_{-\infty}^\infty\! \frac{dq}{2\pi}\; e^{iqx}\, e^{-Dq^2 t}\, e^{-iqut} \int_{-\infty}^\infty\! dx'\; e^{-iqx'}\; P(x', 0)$$
$$= \int_{-\infty}^\infty\! dx'\; P(x', 0) \int_{-\infty}^\infty\! \frac{dq}{2\pi}\; e^{-Dq^2 t}\, e^{iq(x - ut - x')}$$
$$= \int_{-\infty}^\infty\! dx'\; K(x - x', t)\; P(x', 0)\,, \qquad (7.234)$$

where

$$K(x,t) = \frac{1}{\sqrt{4\pi D t}}\; e^{-(x - ut)^2/4Dt} \qquad (7.235)$$

is the diffusion kernel. We now have a recipe for obtaining P(x,t) given the initial conditions P(x,0). If $P(x,0) = \delta(x)$, describing a particle confined to an infinitesimal region about the origin, then P(x,t) = K(x,t) is the probability distribution for finding the particle at x at time t. There are two aspects to K(x,t) which merit comment. The first is that the center of the distribution moves with velocity u. This is due to the presence of the external force. The second is that the standard deviation $\sigma = \sqrt{2Dt}$ is increasing in time, so the distribution is not only shifting its center but it is also getting broader as time evolves. This movement of the center and broadening are what we have called drift and diffusion, respectively.
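Both features of the kernel can be confirmed numerically. The Python sketch below (not from the notes; the values of D, u, and t are arbitrary assumptions) evaluates K(x,t) on a grid and checks that it is normalized with mean $ut$ (drift) and variance $2Dt$ (diffusion):

```python
import numpy as np

# Drift-diffusion kernel: check normalization and its first two moments,
# mean u*t from drift and variance 2*D*t from diffusion.
D, u, t = 0.5, 1.5, 2.0               # assumed example values
x = np.linspace(-30.0, 30.0, 20001)
dx = x[1] - x[0]
K = np.exp(-(x - u * t)**2 / (4 * D * t)) / np.sqrt(4 * np.pi * D * t)

norm = K.sum() * dx                   # should be 1
mean = (x * K).sum() * dx             # should be u t
var  = ((x - mean)**2 * K).sum() * dx # should be 2 D t
```

The Riemann sums converge to 1, $ut$, and $2Dt$ essentially to machine precision, since the Gaussian is fully contained in the grid.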
7.10 Appendix I : Example Problem (advanced)

Problem : The linearized Boltzmann operator L is a complicated functional. Suppose we replace L by $\tilde{L}$, where

$$\tilde{L}\psi = -\gamma\, \psi(v,t) + \gamma \left( \frac{m}{2\pi k_B T} \right)^{3/2} \int\! d^3u\; \exp\!\left( -\frac{m u^2}{2 k_B T} \right) \qquad (7.236)$$
$$\times \left\{ 1 + \frac{m}{k_B T}\; u \cdot v + \frac{2}{3} \left( \frac{m u^2}{2 k_B T} - \frac{3}{2} \right) \left( \frac{m v^2}{2 k_B T} - \frac{3}{2} \right) \right\} \psi(u,t)\,.$$

Show that $\tilde{L}$ shares all the important properties of L. What is the meaning of $\gamma$? Expand $\psi(v,t)$ in spherical harmonics and Sonine polynomials,

$$\psi(v,t) = \sum_{r\ell m} a_{r\ell m}(t)\; S^r_{\ell + \frac{1}{2}}(x)\; x^{\ell/2}\; Y_\ell^m(\hat{n})\,, \qquad (7.237)$$

with $x = m v^2/2 k_B T$, and thus express the action of the linearized Boltzmann operator algebraically on the expansion coefficients $a_{r\ell m}(t)$.
The Sonine polynomials $S^n_\alpha(x)$ are a complete, orthogonal set which are convenient to use in the calculation of transport coefficients. They are defined as

$$S^n_\alpha(x) = \sum_{m=0}^n \frac{\Gamma(\alpha + n + 1)\, (-x)^m}{\Gamma(\alpha + m + 1)\, (n - m)!\; m!}\,, \qquad (7.238)$$

and satisfy the generalized orthogonality relation

$$\int_0^\infty\! dx\; e^{-x}\, x^\alpha\; S^n_\alpha(x)\, S^{n'}_\alpha(x) = \frac{\Gamma(\alpha + n + 1)}{n!}\; \delta_{nn'}\,. \qquad (7.239)$$
Solution : The important properties of L are that it annihilates the five collisional invariants, i.e. 1, v, and $v^2$, and that all other eigenvalues are negative. That this is true for $\tilde{L}$ can be verified by an explicit calculation.
Plugging the conveniently parameterized form of $\psi(v,t)$ into $\tilde{L}$, we have

$$\tilde{L}\psi = -\gamma \sum_{r\ell m} a_{r\ell m}(t)\; S^r_{\ell + \frac{1}{2}}(x)\; x^{\ell/2}\; Y_\ell^m(\hat{n}) + \frac{\gamma}{2\pi^{3/2}} \sum_{r\ell m} a_{r\ell m}(t) \int_0^\infty\! dx_1\; x_1^{1/2}\, e^{-x_1} \int\! d\hat{n}_1$$
$$\times \left[ 1 + 2\, x^{1/2}\, x_1^{1/2}\; \hat{n} \cdot \hat{n}_1 + \frac{2}{3} \left( x - \frac{3}{2} \right) \left( x_1 - \frac{3}{2} \right) \right] S^r_{\ell + \frac{1}{2}}(x_1)\; x_1^{\ell/2}\; Y_\ell^m(\hat{n}_1)\,, \qquad (7.240)$$

where we've used

$$u = \sqrt{\frac{2 k_B T}{m}}\; x_1^{1/2}\,, \qquad du = \sqrt{\frac{k_B T}{2m}}\; x_1^{-1/2}\, dx_1\,. \qquad (7.241)$$
Now recall $Y_0^0(\hat{n}) = \frac{1}{\sqrt{4\pi}}$ and

$$Y_1^1(\hat{n}) = -\sqrt{\frac{3}{8\pi}}\, \sin\theta\; e^{i\varphi} \qquad Y_1^0(\hat{n}) = \sqrt{\frac{3}{4\pi}}\, \cos\theta \qquad Y_1^{-1}(\hat{n}) = +\sqrt{\frac{3}{8\pi}}\, \sin\theta\; e^{-i\varphi}$$
$$S^0_{1/2}(x) = 1 \qquad S^0_{3/2}(x) = 1 \qquad S^1_{1/2}(x) = \frac{3}{2} - x\,,$$
which allows us to write

$$1 = 4\pi\; Y_0^0(\hat{n})\; Y_0^{0*}(\hat{n}_1) \qquad (7.242)$$
$$\hat{n} \cdot \hat{n}_1 = \frac{4\pi}{3} \Big[ Y_1^0(\hat{n})\, Y_1^{0*}(\hat{n}_1) + Y_1^1(\hat{n})\, Y_1^{1*}(\hat{n}_1) + Y_1^{-1}(\hat{n})\, Y_1^{-1\,*}(\hat{n}_1) \Big]\,. \qquad (7.243)$$
We can do the integrals by appealing to the orthogonality relations for the spherical harmonics and Sonine polynomials:

$$\int\! d\hat{n}\; Y_\ell^m(\hat{n})\; Y_{l'}^{m'\,*}(\hat{n}) = \delta_{\ell l'}\; \delta_{mm'} \qquad (7.244)$$
$$\int_0^\infty\! dx\; e^{-x}\, x^\alpha\; S^n_\alpha(x)\, S^{n'}_\alpha(x) = \frac{\Gamma(n + \alpha + 1)}{\Gamma(n + 1)}\; \delta_{nn'}\,. \qquad (7.245)$$
. (7.245)
Integrating rst over the direction vector n
1
,
L =

rm
a
rm
(t) S
r
+
1
2
(x) x
/2
Y

m
( n) (7.246)
+
2

rm
a
rm
(t)

_
0
dx
1
x
1/2
1
e
x
1
_
d n
1
_
Y
0
0
( n) Y
0
0

( n
1
) S
0
1/2
(x) S
0
1/2
(x
1
)
+
2
3
x
1/2
x
1/2
1
1

=1
Y
1
m
( n) Y
1
m

( n
1
) S
0
3/2
(x) S
0
3/2
(x
1
)
+
2
3
Y
0
0
( n) Y
0
0

( n
1
) S
1
1/2
(x) S
1
1/2
(x
1
)
_
S
r
+
1
2
(x
1
) x
/2
1
Y

m
( n
1
) ,
we obtain the intermediate result

$$\tilde{L}\psi = -\gamma \sum_{r\ell m} a_{r\ell m}(t)\; S^r_{\ell + \frac{1}{2}}(x)\; x^{\ell/2}\; Y_\ell^m(\hat{n})$$
$$+ \frac{2\gamma}{\sqrt{\pi}} \sum_{r\ell m} a_{r\ell m}(t) \int_0^\infty\! dx_1\; x_1^{1/2}\, e^{-x_1} \bigg[ Y_0^0(\hat{n})\; \delta_{\ell 0}\, \delta_{m 0}\; S^0_{1/2}(x)\, S^0_{1/2}(x_1)$$
$$+ \frac{2}{3}\; x^{1/2}\, x_1^{1/2}\; Y_1^m(\hat{n})\; \delta_{\ell 1}\; S^0_{3/2}(x)\, S^0_{3/2}(x_1)$$
$$+ \frac{2}{3}\; Y_0^0(\hat{n})\; \delta_{\ell 0}\, \delta_{m 0}\; S^1_{1/2}(x)\, S^1_{1/2}(x_1) \bigg]\; S^r_{\ell + \frac{1}{2}}(x_1)\; x_1^{\ell/2}\,. \qquad (7.247)$$
Appealing now to the orthogonality of the Sonine polynomials, and recalling that

$$\Gamma\big( \tfrac{1}{2} \big) = \sqrt{\pi}\,, \qquad \Gamma(1) = 1\,, \qquad \Gamma(z + 1) = z\, \Gamma(z)\,, \qquad (7.248)$$

we integrate over $x_1$. For the first term in brackets, we invoke the orthogonality relation with $n = 0$ and $\alpha = \frac{1}{2}$, giving $\Gamma\big( \frac{3}{2} \big) = \frac{1}{2}\sqrt{\pi}$. For the second bracketed term, we have $n = 0$ but $\alpha = \frac{3}{2}$, and we obtain $\Gamma\big( \frac{5}{2} \big) = \frac{3}{2}\, \Gamma\big( \frac{3}{2} \big)$, while the third bracketed term leads to $n = 1$ and $\alpha = \frac{1}{2}$, also yielding $\Gamma\big( \frac{5}{2} \big) = \frac{3}{2}\, \Gamma\big( \frac{3}{2} \big)$. Thus, we obtain the simple and pleasing result
$$\tilde{L}\psi = -\gamma\; {\sum_{r\ell m}}'\; a_{r\ell m}(t)\; S^r_{\ell + \frac{1}{2}}(x)\; x^{\ell/2}\; Y_\ell^m(\hat{n})\,, \qquad (7.249)$$

where the prime on the sum indicates that the set

$${\rm CI} = \Big\{ (0,0,0)\,,\; (1,0,0)\,,\; (0,1,1)\,,\; (0,1,0)\,,\; (0,1,-1) \Big\} \qquad (7.250)$$

are to be excluded from the sum. But these are just the functions which correspond to the five collisional invariants! Thus, we learn that

$$\phi_{r\ell m}(v) = \mathcal{N}_{r\ell m}\; S^r_{\ell + \frac{1}{2}}(x)\; x^{\ell/2}\; Y_\ell^m(\hat{n}) \qquad (7.251)$$

is an eigenfunction of $\tilde{L}$ with eigenvalue $-\gamma$ if $(r, \ell, m)$ does not correspond to one of the five collisional invariants. In the latter case, the eigenvalue is zero. Thus, the algebraic action of $\tilde{L}$ on the coefficients $a_{r\ell m}$ is

$$(\tilde{L} a)_{r\ell m} = \begin{cases} -\gamma\; a_{r\ell m} & \text{if } (r,\ell,m) \notin {\rm CI} \\ 0 & \text{if } (r,\ell,m) \in {\rm CI} \end{cases} \qquad (7.252)$$

The quantity $\tau = \gamma^{-1}$ is the relaxation time.
It is pretty obvious that $\tilde{L}$ is self-adjoint, since

$$\langle\, \phi \,|\, \tilde{L}\psi \,\rangle \equiv \int\! d^3v\; f^0(v)\; \phi(v)\; \tilde{L}[\psi(v)]$$
$$= -\gamma\, n \left( \frac{m}{2\pi k_B T} \right)^{3/2} \int\! d^3v\; \exp\!\left( -\frac{m v^2}{2 k_B T} \right) \phi(v)\, \psi(v)$$
$$+ \gamma\, n \left( \frac{m}{2\pi k_B T} \right)^3 \int\! d^3v \int\! d^3u\; \exp\!\left( -\frac{m u^2}{2 k_B T} \right) \exp\!\left( -\frac{m v^2}{2 k_B T} \right)$$
$$\times\; \phi(v) \left\{ 1 + \frac{m}{k_B T}\; u \cdot v + \frac{2}{3} \left( \frac{m u^2}{2 k_B T} - \frac{3}{2} \right) \left( \frac{m v^2}{2 k_B T} - \frac{3}{2} \right) \right\} \psi(u)$$
$$= \langle\, \tilde{L}\phi \,|\, \psi \,\rangle\,, \qquad (7.253)$$

where n is the bulk number density and $f^0(v)$ is the Maxwellian velocity distribution.
7.11 Appendix II : Distributions and Functionals

Let $x \in \mathbb{R}$ be a random variable, and P(x) a probability distribution for x. The average of any function $\phi(x)$ is then

$$\big\langle \phi(x) \big\rangle = \int_{-\infty}^\infty\! dx\; P(x)\, \phi(x) \bigg/ \int_{-\infty}^\infty\! dx\; P(x)\,. \qquad (7.254)$$

Let $\eta(t)$ be a random function of t, with $\eta(t) \in \mathbb{R}$, and let $P\big[ \eta(t) \big]$ be the probability distribution functional for $\eta(t)$. Then if $\Phi\big[ \eta(t) \big]$ is a functional of $\eta(t)$, the average of $\Phi$ is given by

$$\int\! D\eta\; P\big[ \eta(t) \big]\; \Phi\big[ \eta(t) \big] \bigg/ \int\! D\eta\; P\big[ \eta(t) \big] \qquad (7.255)$$

The expression $\int\! D\eta\; P[\eta]\; \Phi[\eta]$ is a functional integral. A functional integral is a continuum limit of a multivariable integral. Suppose $\eta(t)$ were defined on a set of t values $t_n = n\tau$. A functional of $\eta(t)$ becomes a multivariable function of the values $\eta_n \equiv \eta(t_n)$. The metric then becomes

$$D\eta \longrightarrow \prod_n d\eta_n\,. \qquad (7.256)$$

In fact, for our purposes we will not need to know any details about the functional measure $D\eta$; we will finesse this delicate issue$^{11}$. Consider the generating functional,
$$Z\big[ J(t) \big] = \int\! D\eta\; P[\eta]\; \exp\!\left( \int_{-\infty}^\infty\! dt\; J(t)\, \eta(t) \right)\,. \qquad (7.257)$$

It is clear that

$$\frac{1}{Z[J]}\; \frac{\delta^n Z[J]}{\delta J(t_1) \cdots \delta J(t_n)} \Bigg|_{J(t)=0} = \big\langle \eta(t_1) \cdots \eta(t_n) \big\rangle\,. \qquad (7.258)$$

The function J(t) is an arbitrary source function. We differentiate with respect to it in order to find the $\eta$-field correlators.
Let's compute the generating function for a class of distributions of the Gaussian form,

$$P[\eta] = \exp\!\left\{ -\frac{1}{2\Gamma} \int_{-\infty}^\infty\! dt\; \Big( \tau^2\, \dot{\eta}^2 + \eta^2 \Big) \right\} \qquad (7.259)$$
$$= \exp\!\left\{ -\frac{1}{2\Gamma} \int_{-\infty}^\infty\! \frac{d\omega}{2\pi}\; \Big( 1 + \omega^2 \tau^2 \Big)\, \big| \hat{\eta}(\omega) \big|^2 \right\}\,. \qquad (7.260)$$

Then Fourier transforming the source function J(t), it is easy to see that

$$Z[J] = Z[0]\; \exp\!\left\{ \frac{\Gamma}{2} \int_{-\infty}^\infty\! \frac{d\omega}{2\pi}\; \frac{\big| \hat{J}(\omega) \big|^2}{1 + \omega^2 \tau^2} \right\}\,. \qquad (7.261)$$

Note that with $\eta(t) \in \mathbb{R}$ and $J(t) \in \mathbb{R}$ we have $\hat{\eta}^*(\omega) = \hat{\eta}(-\omega)$ and $\hat{J}^*(\omega) = \hat{J}(-\omega)$. Transforming back to real time, we have

$$Z[J] = Z[0]\; \exp\!\left\{ \frac{1}{2} \int_{-\infty}^\infty\! dt \int_{-\infty}^\infty\! dt'\; J(t)\; G(t - t')\; J(t') \right\}\,, \qquad (7.262)$$
$^{11}$A discussion of measure for functional integrals is found in R. P. Feynman and A. R. Hibbs, Quantum Mechanics and Path Integrals.
Figure 7.7: Discretization of a continuous function $\eta(t)$. Upon discretization, a functional $\Phi\big[ \eta(t) \big]$ becomes an ordinary multivariable function $\Phi(\{\eta_j\})$.
where

$$G(s) = \frac{\Gamma}{2\tau}\; e^{-|s|/\tau}\,, \qquad \hat{G}(\omega) = \frac{\Gamma}{1 + \omega^2 \tau^2} \qquad (7.263)$$

is the Green's function, in real and Fourier space. Note that

$$\int_{-\infty}^\infty\! ds\; G(s) = \hat{G}(0) = \Gamma\,. \qquad (7.264)$$
We can now compute

$$\big\langle \eta(t_1)\, \eta(t_2) \big\rangle = G(t_1 - t_2) \qquad (7.265)$$
$$\big\langle \eta(t_1)\, \eta(t_2)\, \eta(t_3)\, \eta(t_4) \big\rangle = G(t_1 - t_2)\, G(t_3 - t_4) + G(t_1 - t_3)\, G(t_2 - t_4) \qquad (7.266)$$
$$+ G(t_1 - t_4)\, G(t_2 - t_3)\,.$$

The generalization is now easy to prove, and is known as Wick's theorem:

$$\big\langle \eta(t_1) \cdots \eta(t_{2n}) \big\rangle = \sum_{\rm contractions} G(t_{i_1} - t_{i_2}) \cdots G(t_{i_{2n-1}} - t_{i_{2n}})\,, \qquad (7.267)$$

where the sum is over all distinct contractions of the sequence 1-2-3 $\cdots$ 2n into products of pairs. How many terms are there? Some simple combinatorics answers this question. Choose the index 1. There are (2n - 1) other time indices with which it can be contracted. Now choose another index. There are (2n - 3) indices with which that index can be contracted. And so on. We thus obtain

$$C(n) \equiv \text{\# of contractions of } 1\text{-}2\text{-}3 \cdots 2n = (2n-1)(2n-3) \cdots 3 \cdot 1 = \frac{(2n)!}{2^n\, n!}\,. \qquad (7.268)$$
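The counting argument is easy to confirm by brute force (a Python sketch, not from the notes), enumerating all pairings of 2n indices and comparing with the closed form:

```python
from math import factorial

# Count distinct pairings of 2n indices by brute force and compare with
# the closed form C(n) = (2n)! / (2^n n!).
def pairings(indices):
    if not indices:
        return [[]]
    first, rest = indices[0], indices[1:]
    out = []
    for i, j in enumerate(rest):
        # Pair "first" with each remaining index, then recurse on the rest.
        for tail in pairings(rest[:i] + rest[i + 1:]):
            out.append([(first, j)] + tail)
    return out

for n in range(1, 6):
    brute = len(pairings(list(range(2 * n))))
    closed = factorial(2 * n) // (2**n * factorial(n))
    assert brute == closed

C2 = factorial(4) // (2**2 * factorial(2))   # n = 2: three terms, as in eqn 7.266
```

For n = 2 the count is 3, matching the three pair products written out above Wick's theorem.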
7.12 Appendix III : More on Inhomogeneous Autonomous Linear ODEs

Note that any $n^{\rm th}$ order ODE, of the general form

$$\frac{d^n x}{dt^n} = F\!\left( x, \frac{dx}{dt}, \ldots, \frac{d^{n-1} x}{dt^{n-1}} \right)\,, \qquad (7.269)$$

may be represented by the first order system $\dot{\varphi} = V(\varphi)$. To see this, define $\varphi_k = d^{k-1} x/dt^{k-1}$, with $k = 1, \ldots, n$. Thus, for $k < n$ we have $\dot{\varphi}_k = \varphi_{k+1}$, and $\dot{\varphi}_n = F$. In other words,

$$\frac{d}{dt} \overbrace{\begin{pmatrix} \varphi_1 \\ \vdots \\ \varphi_{n-1} \\ \varphi_n \end{pmatrix}}^{\dot{\varphi}} = \overbrace{\begin{pmatrix} \varphi_2 \\ \vdots \\ \varphi_n \\ F\big( \varphi_1, \ldots, \varphi_n \big) \end{pmatrix}}^{V(\varphi)}\,. \qquad (7.270)$$
An inhomogeneous linear $n^{\rm th}$ order ODE,

$$\frac{d^n x}{dt^n} + a_{n-1}\, \frac{d^{n-1} x}{dt^{n-1}} + \ldots + a_1\, \frac{dx}{dt} + a_0\, x = \xi(t) \qquad (7.271)$$

may be written in matrix form, as

$$\frac{d}{dt} \begin{pmatrix} \varphi_1 \\ \varphi_2 \\ \vdots \\ \varphi_n \end{pmatrix} = \overbrace{\begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ -a_0 & -a_1 & -a_2 & \cdots & -a_{n-1} \end{pmatrix}}^{Q} \begin{pmatrix} \varphi_1 \\ \varphi_2 \\ \vdots \\ \varphi_n \end{pmatrix} + \overbrace{\begin{pmatrix} 0 \\ 0 \\ \vdots \\ \xi(t) \end{pmatrix}}^{\xi}\,. \qquad (7.272)$$

Thus,

$$\dot{\varphi} = Q\,\varphi + \xi\,, \qquad (7.273)$$

and the matrix Q is constant in time provided the coefficients $a_k$ are time-independent, i.e. provided the ODE is autonomous.
For the homogeneous case where $\xi(t) = 0$, the solution is obtained by exponentiating the constant matrix Qt:

$$\varphi(t) = \exp(Qt)\; \varphi(0)\,; \qquad (7.274)$$

the exponential of a matrix may be given meaning by its Taylor series expansion. If the ODE is not autonomous, then Q = Q(t) is time-dependent, and the solution is given by the path-ordered exponential,

$$\varphi(t) = \mathcal{P} \exp\!\left\{ \int_0^t\! dt'\; Q(t') \right\} \varphi(0)\,, \qquad (7.275)$$
where $\mathcal{P}$ is the path ordering operator which places earlier times to the right. As defined, the equation $\dot{\varphi} = V(\varphi)$ is autonomous, since the t-advance mapping $g^t$ depends only on t and on no other time variable. However, by extending the phase space from $\mathcal{M}$ to $\mathcal{M} \times \mathbb{R}$, which is of dimension n + 1, one can describe arbitrary time-dependent ODEs.

In general, path ordered exponentials are difficult to compute analytically. We will henceforth consider the autonomous case where Q is a constant matrix in time. We will assume the matrix Q is real, but other than that it has no helpful symmetries. We can however decompose it into left and right eigenvectors:

$$Q_{ij} = \sum_{\sigma=1}^n \nu_\sigma\; R_{\sigma,i}\; L_{\sigma,j}\,. \qquad (7.276)$$

Or, in bra-ket notation, $Q = \sum_\sigma \nu_\sigma\, |R_\sigma\rangle\langle L_\sigma|$. The normalization condition we use is

$$\big\langle L_\sigma \,\big|\, R_{\sigma'} \big\rangle = \delta_{\sigma\sigma'}\,, \qquad (7.277)$$

where $\big\{ \nu_\sigma \big\}$ are the eigenvalues of Q. The eigenvalues may be real or complex. Since the characteristic polynomial $P(\nu) = \det(\nu\, I - Q)$ has real coefficients, we know that the eigenvalues of Q are either real or come in complex conjugate pairs.
Consider, for example, the n = 2 system we studied earlier. Then

$$Q = \begin{pmatrix} 0 & 1 \\ -\omega_0^2 & -\gamma \end{pmatrix}\,. \qquad (7.278)$$

The eigenvalues are as before: $\nu_\pm = -\tfrac{1}{2}\gamma \pm \sqrt{\tfrac{1}{4}\gamma^2 - \omega_0^2}$. The left and right eigenvectors are

$$L_\pm = \frac{\begin{pmatrix} -\nu_\mp & 1 \end{pmatrix}}{\nu_\pm - \nu_\mp}\,, \qquad R_\pm = \begin{pmatrix} 1 \\ \nu_\pm \end{pmatrix}\,. \qquad (7.279)$$
The utility of working in a left-right eigenbasis is apparent once we reflect upon the result

$$f(Q) = \sum_{\sigma=1}^n f(\nu_\sigma)\; \big| R_\sigma \big\rangle \big\langle L_\sigma \big| \qquad (7.280)$$

for any function f. Thus, the solution to the general autonomous homogeneous case is

$$\big| \varphi(t) \big\rangle = \sum_{\sigma=1}^n e^{\nu_\sigma t}\; \big| R_\sigma \big\rangle \big\langle L_\sigma \,\big|\, \varphi(0) \big\rangle \qquad (7.281)$$
$$\varphi_i(t) = \sum_{\sigma=1}^n e^{\nu_\sigma t}\; R_{\sigma,i} \sum_{j=1}^n L_{\sigma,j}\; \varphi_j(0)\,. \qquad (7.282)$$

If ${\rm Re}\,(\nu_\sigma) \leq 0$ for all $\sigma$, then the initial conditions $\varphi(0)$ are forgotten on time scales $\tau_\sigma = \nu_\sigma^{-1}$. Physicality demands that this is the case.
414 CHAPTER 7. NONEQUILIBRIUM PHENOMENA
Now let's consider the inhomogeneous case where $\xi(t) \ne 0$. We begin by recasting eqn. 7.273 in the form
$$\frac{d}{dt}\Big(e^{-Qt}\,\varphi\Big) = e^{-Qt}\,\xi(t)\ . \qquad (7.283)$$
We can integrate this directly:
$$\varphi(t) = e^{Qt}\,\varphi(0) + \int_0^t\! ds\, e^{Q(t-s)}\,\xi(s)\ . \qquad (7.284)$$
In component notation,
$$\varphi_i(t) = \sum_{\alpha=1}^{n} e^{\lambda_\alpha t}\, R_{\alpha,i}\, \big\langle L_\alpha \big| \varphi(0)\big\rangle + \sum_{\alpha=1}^{n} R_{\alpha,i} \int_0^t\! ds\, e^{\lambda_\alpha (t-s)}\, \big\langle L_\alpha \big| \xi(s)\big\rangle\ . \qquad (7.285)$$
Note that the first term on the RHS is the solution to the homogeneous equation, as must be the case when $\xi(s) = 0$.
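As a numerical sanity check of eqn. 7.284 (a sketch, not from the notes; the forcing $\xi(s) = \sin 2s$ and all parameter values are arbitrary choices), one can compare the integral formula against direct Runge-Kutta integration of $\dot\varphi = Q\varphi + \xi$:

```python
import numpy as np
from scipy.linalg import expm

g, w0 = 0.5, 1.0
Q = np.array([[0.0, 1.0], [-w0**2, -g]])
xi = lambda s: np.array([0.0, np.sin(2.0 * s)])   # forcing in last row only

phi0 = np.array([1.0, 0.0])
t, N = 3.0, 3000
ds = t / N

# eqn. 7.284 evaluated by midpoint quadrature
phi = expm(Q * t) @ phi0
for k in range(N):
    s = (k + 0.5) * ds
    phi = phi + expm(Q * (t - s)) @ xi(s) * ds

# direct RK4 integration of phi' = Q phi + xi(t) for comparison
f = lambda u, s: Q @ u + xi(s)
u = phi0.copy()
for k in range(N):
    s = k * ds
    k1 = f(u, s)
    k2 = f(u + 0.5 * ds * k1, s + 0.5 * ds)
    k3 = f(u + 0.5 * ds * k2, s + 0.5 * ds)
    k4 = f(u + ds * k3, s + ds)
    u = u + ds * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

assert np.allclose(phi, u, atol=1e-4)
```

The two results agree to quadrature accuracy, confirming the variation-of-constants formula.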
The solution in eqn. 7.285 holds for general $Q$ and $\xi(s)$. For the particular form of $Q$ and $\xi(s)$ in eqn. 7.272, we can proceed further. For starters, $\langle L_\alpha | \xi(s)\rangle = L_{\alpha,n}\, \xi(s)$. We can further exploit a special feature of the $Q$ matrix to analytically determine all its left and right eigenvectors. Applying $Q$ to the right eigenvector $|R_\alpha\rangle$, we obtain
$$R_{\alpha,j} = \lambda_\alpha\, R_{\alpha,j-1} \qquad (j > 1)\ . \qquad (7.286)$$
We are free to choose $R_{\alpha,1} = 1$ for all $\alpha$ and defer the issue of normalization to the derivation of the left eigenvectors. Thus, we obtain the pleasingly simple result,
$$R_{\alpha,k} = \lambda_\alpha^{\,k-1}\ . \qquad (7.287)$$
Applying $Q$ to the left eigenvector $\langle L_\alpha|$, we obtain
$$-a_0\, L_{\alpha,n} = \lambda_\alpha\, L_{\alpha,1} \qquad (7.288)$$
$$L_{\alpha,j-1} - a_{j-1}\, L_{\alpha,n} = \lambda_\alpha\, L_{\alpha,j} \qquad (j > 1)\ . \qquad (7.289)$$
From these equations we may derive
$$L_{\alpha,k} = -L_{\alpha,n}\sum_{j=0}^{k-1} a_j\, \lambda_\alpha^{\,j-k} = L_{\alpha,n}\sum_{j=k}^{n} a_j\, \lambda_\alpha^{\,j-k}\ .$$
The second equality above is derived using the result $P(\lambda_\alpha) = \sum_{j=0}^{n} a_j\, \lambda_\alpha^{\,j} = 0$. Recall also that $a_n \equiv 1$. We now impose the normalization condition,
$$\sum_{k=1}^{n} L_{\alpha,k}\, R_{\alpha,k} = 1\ . \qquad (7.290)$$
This condition determines our last remaining unknown quantity (for a given $\alpha$), $L_{\alpha,n}$:
$$1 = \big\langle L_\alpha \big| R_\alpha \big\rangle = L_{\alpha,n}\sum_{k=1}^{n} k\, a_k\, \lambda_\alpha^{\,k-1} = P'(\lambda_\alpha)\, L_{\alpha,n}\ , \qquad (7.291)$$
where $P'(\lambda)$ is the first derivative of the characteristic polynomial. Thus, we obtain another neat result,
$$L_{\alpha,n} = \frac{1}{P'(\lambda_\alpha)}\ . \qquad (7.292)$$
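These closed forms for the eigenvectors of a companion matrix — eqn. 7.287, eqn. 7.292, and $L_{\alpha,k} = L_{\alpha,n}\sum_{j=k}^{n} a_j \lambda_\alpha^{\,j-k}$ — can be checked against a concrete example (a sketch; the cubic $P(\lambda) = (\lambda+1)(\lambda+2)(\lambda+4)$ is an arbitrary choice):

```python
import numpy as np

# arbitrary cubic: P(lam) = (lam+1)(lam+2)(lam+4) = lam^3 + 7 lam^2 + 14 lam + 8
a = np.array([8.0, 14.0, 7.0])           # (a_0, a_1, a_2); a_n = a_3 = 1
n = 3

Q = np.zeros((n, n))
Q[:-1, 1:] = np.eye(n - 1)               # superdiagonal of ones
Q[-1, :] = -a                            # last row: -a_0, ..., -a_{n-1}

coeffs = np.concatenate(([1.0], a[::-1]))   # numpy order: highest power first
lams = np.roots(coeffs)
dP = np.polyder(np.poly1d(coeffs))          # P'(lam)

aj = coeffs[::-1]                           # aj[j] = a_j, with a_n = 1
for lam in lams:
    R = lam ** np.arange(n)                 # R_{a,k} = lam^{k-1}   (eqn. 7.287)
    Ln = 1.0 / dP(lam)                      # L_{a,n} = 1/P'(lam)   (eqn. 7.292)
    L = Ln * np.array([sum(aj[j] * lam**(j - k) for j in range(k, n + 1))
                       for k in range(1, n + 1)])
    assert np.allclose(Q @ R, lam * R)      # right eigenvector of Q
    assert np.allclose(L @ Q, lam * L)      # left eigenvector of Q
    assert np.isclose(L @ R, 1.0)           # normalization, eqn. 7.290
```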
Now let us evaluate the general two-point correlation function,
$$C_{jj'}(t,t') \equiv \big\langle \varphi_j(t)\, \varphi_{j'}(t') \big\rangle - \big\langle \varphi_j(t)\big\rangle\, \big\langle \varphi_{j'}(t')\big\rangle\ . \qquad (7.293)$$
We write
$$\big\langle \xi(s)\, \xi(s') \big\rangle = \phi(s-s') = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, \hat\phi(\omega)\, e^{i\omega(s-s')}\ . \qquad (7.294)$$
When $\hat\phi(\omega)$ is constant, we have $\langle \xi(s)\, \xi(s')\rangle = \hat\phi\,\delta(s-s')$. This is the case of so-called white noise, when all frequencies contribute equally. The more general case, when $\hat\phi(\omega)$ is frequency-dependent, is known as colored noise. Appealing to eqn. 7.285, we have
$$C_{jj'}(t,t') = \sum_{\alpha,\beta} L_{\alpha,n}\, L_{\beta,n}\, \lambda_\alpha^{\,j-1}\, \lambda_\beta^{\,j'-1} \int_0^t\! ds\, e^{\lambda_\alpha(t-s)} \int_0^{t'}\! ds'\, e^{\lambda_\beta(t'-s')}\, \phi(s-s') \qquad (7.295)$$
$$\hphantom{C_{jj'}(t,t')} = \sum_{\alpha,\beta} L_{\alpha,n}\, L_{\beta,n}\, \lambda_\alpha^{\,j-1}\, \lambda_\beta^{\,j'-1} \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, \hat\phi(\omega)\, \frac{\big(e^{i\omega t} - e^{\lambda_\alpha t}\big)\big(e^{-i\omega t'} - e^{\lambda_\beta t'}\big)}{(\omega + i\lambda_\alpha)(\omega - i\lambda_\beta)}\ . \qquad (7.296)$$
In the limit $t, t' \to \infty$, assuming $\mathrm{Re}\,(\lambda_\alpha) < 0$ for all $\alpha$ (i.e. no diffusion), the exponentials $e^{\lambda_\alpha t}$ and $e^{\lambda_\beta t'}$ may be neglected, and we then have
$$C_{jj'}(t,t') = \sum_{\alpha,\beta} L_{\alpha,n}\, L_{\beta,n}\, \lambda_\alpha^{\,j-1}\, \lambda_\beta^{\,j'-1} \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, \frac{\hat\phi(\omega)\, e^{i\omega(t-t')}}{(\omega + i\lambda_\alpha)(\omega - i\lambda_\beta)}\ . \qquad (7.297)$$
7.13 Appendix IV : Kramers-Kronig Relations
Suppose $\hat\chi(\omega) \equiv \hat{G}(\omega)$ is analytic in the UHP$^{12}$. Then for all $\nu$, we must have
$$\int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, \frac{\hat\chi(\omega)}{\omega - \nu + i\epsilon} = 0\ , \qquad (7.298)$$
where $\epsilon$ is a positive infinitesimal. The reason is simple: just close the contour in the UHP, assuming $\hat\chi(\omega)$ vanishes sufficiently rapidly that Jordan's lemma can be applied. Clearly this is an extremely weak restriction on $\hat\chi(\omega)$, given the fact that the denominator already causes the integrand to vanish as $|\omega|^{-1}$.
$^{12}$ In this section, we use the notation $\hat\chi(\omega)$ for the susceptibility, rather than $\hat{G}(\omega)$.
Let us examine the function
$$\frac{1}{\omega - \nu + i\epsilon} = \frac{\omega - \nu}{(\omega-\nu)^2 + \epsilon^2} - \frac{i\epsilon}{(\omega-\nu)^2 + \epsilon^2}\ , \qquad (7.299)$$
which we have separated into real and imaginary parts. Under an integral sign, the first term, in the limit $\epsilon \to 0$, is equivalent to taking a principal part of the integral. That is, for any function $F(\omega)$ which is regular at $\omega = \nu$,
$$\lim_{\epsilon \to 0} \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, \frac{\omega - \nu}{(\omega-\nu)^2 + \epsilon^2}\, F(\omega) \equiv \mathcal{P}\!\int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, \frac{F(\omega)}{\omega - \nu}\ . \qquad (7.300)$$
The principal part symbol $\mathcal{P}$ means that the singularity at $\omega = \nu$ is elided, either by smoothing out the function $1/(\omega - \nu)$ as above, or by simply cutting out a region of integration of width $\epsilon$ on either side of $\omega = \nu$.
The imaginary part is more interesting. Let us write
$$h(u) \equiv \frac{\epsilon}{u^2 + \epsilon^2}\ . \qquad (7.301)$$
For $|u| \gg \epsilon$, $h(u) \simeq \epsilon/u^2$, which vanishes as $\epsilon \to 0$. For $u = 0$, $h(0) = 1/\epsilon$, which diverges as $\epsilon \to 0$. Thus, $h(u)$ has a huge peak at $u = 0$ and rapidly decays to zero as one moves off the peak in either direction a distance greater than $\epsilon$. Finally, note that
$$\int_{-\infty}^{\infty}\! du\, h(u) = \pi\ , \qquad (7.302)$$
a result which itself is easy to show using contour integration. Putting it all together, this tells us that
$$\lim_{\epsilon \to 0}\, \frac{\epsilon}{u^2 + \epsilon^2} = \pi\, \delta(u)\ . \qquad (7.303)$$
Thus, for positive infinitesimal $\epsilon$,
$$\frac{1}{u + i\epsilon} = \mathcal{P}\,\frac{1}{u} - i\pi\,\delta(u)\ , \qquad (7.304)$$
a most useful result.
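Eqn. 7.304 is easy to test numerically (a sketch; the Gaussian test function and the values of $\epsilon$ are arbitrary). With $F(u) = e^{-u^2}$, the principal-value piece vanishes by symmetry, so $\int F(u)\,du/(u + i\epsilon)$ should tend to $-i\pi F(0)$ as $\epsilon \to 0$:

```python
import numpy as np

F = lambda u: np.exp(-u**2)         # arbitrary smooth, even test function
N, U = 2_000_000, 20.0
du = 2.0 * U / N
u = -U + (np.arange(N) + 0.5) * du  # midpoint grid, symmetric about u = 0

for eps in (1e-2, 1e-3):
    I = np.sum(F(u) / (u + 1j * eps)) * du
    assert abs(I.real) < 1e-6                     # P int F(u)/u du = 0 (odd integrand)
    assert abs(I.imag + np.pi * F(0.0)) < 0.05    # -> -i pi F(0) as eps -> 0
```

The imaginary part converges to $-\pi F(0)$ linearly in $\epsilon$, as expected for a Lorentzian smearing of the delta function.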
We now return to our initial result, eqn. 7.298, and we separate $\hat\chi(\omega)$ into real and imaginary parts:
$$\hat\chi(\omega) = \hat\chi'(\omega) + i\,\hat\chi''(\omega)\ . \qquad (7.305)$$
(In this equation, the primes do not indicate differentiation with respect to argument.) We therefore have, for every real value of $\nu$,
$$0 = \int_{-\infty}^{\infty} \frac{d\omega}{2\pi}\, \Big[\hat\chi'(\omega) + i\,\hat\chi''(\omega)\Big]\, \Big[\mathcal{P}\,\frac{1}{\omega - \nu} - i\pi\,\delta(\omega - \nu)\Big]\ . \qquad (7.306)$$
Taking the real and imaginary parts of this equation, we derive the Kramers-Kronig relations:
$$\hat\chi'(\nu) = +\mathcal{P}\!\int_{-\infty}^{\infty} \frac{d\omega}{\pi}\, \frac{\hat\chi''(\omega)}{\omega - \nu} \qquad (7.307)$$
$$\hat\chi''(\nu) = -\mathcal{P}\!\int_{-\infty}^{\infty} \frac{d\omega}{\pi}\, \frac{\hat\chi'(\omega)}{\omega - \nu}\ . \qquad (7.308)$$
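The relations are easy to verify for a concrete response function (a numerical sketch, not from the notes): the damped-oscillator susceptibility $\hat\chi(\omega) = 1/(\omega_0^2 - \omega^2 - i\gamma\omega)$ has both poles in the LHP, hence is analytic in the UHP, and its real part at any $\nu$ should be reproduced from its imaginary part via eqn. 7.307. The parameters $\omega_0$, $\gamma$, $\nu$, and the cutoff below are arbitrary choices; the principal value is handled by subtracting the singularity:

```python
import numpy as np

w0, g = 1.0, 0.4                     # arbitrary oscillator parameters
chi = lambda w: 1.0 / (w0**2 - w**2 - 1j * g * w)   # analytic in the UHP

N, W = 2_000_000, 500.0              # grid points and frequency cutoff
dw = 2.0 * W / N
w = -W + (np.arange(N) + 0.5) * dw

nu = 0.73                            # arbitrary test frequency
chi2 = chi(w).imag                   # chi''(omega)
chi2_nu = chi(nu).imag

# P int dw chi''(w)/(w - nu): subtract the singular constant; the subtracted
# piece integrates to chi''(nu) * log((W - nu)/(W + nu)) over [-W, W]
pv = (np.sum((chi2 - chi2_nu) / (w - nu)) * dw
      + chi2_nu * np.log((W - nu) / (W + nu)))

chi1_kk = pv / np.pi                 # eqn. 7.307
assert abs(chi1_kk - chi(nu).real) < 1e-3
```

The reconstructed $\hat\chi'(\nu)$ agrees with the exact real part to the accuracy of the quadrature and cutoff.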