
Chapter Twelve

Atoms

12.1 INTRODUCTION

By the nineteenth century, enough evidence had accumulated in
favour of the atomic hypothesis of matter. In 1897, the experiments on
electric discharge through gases carried out by the English physicist J.
J. Thomson (1856–1940) revealed that atoms of different elements
contain negatively charged constituents (electrons) that are identical
for all atoms. However, atoms as a whole are electrically neutral.
Therefore, an atom must also contain some positive charge to
neutralise the negative charge of the electrons. But what is the
arrangement of the positive charge and the electrons inside the atom?
In other words, what is the structure of an atom?

The first model of the atom was proposed by J. J. Thomson in 1898.
According to this model, the positive charge of the atom is uniformly
distributed throughout the volume of the atom and the negatively
charged electrons are embedded in it like seeds in a watermelon. This
model was picturesquely called the plum pudding model of the atom.
However, subsequent studies on atoms, as described in this chapter,
showed that the distribution of the electrons and positive charges is
very different from that proposed in this model.
We know that condensed matter (solids and liquids) and dense gases
at all temperatures emit electromagnetic radiation in which a
continuous distribution of several wavelengths is present, though with
different intensities. This radiation is considered to be due to
oscillations of atoms and molecules, governed by the interaction of
each atom or molecule with its neighbours. In contrast, light emitted
from rarefied gases heated in a flame, or excited electrically in a glow
tube such as the familiar neon sign or mercury vapour light has only
certain discrete wavelengths. The spectrum appears as a series of
bright lines. In such gases, the average spacing between atoms is
large. Hence, the radiation emitted can be considered due to
individual atoms rather than because of interactions between atoms or
molecules.

Ernest Rutherford (1871–1937) British physicist who did pioneering work on
radioactive radiation. He discovered alpha-rays and beta-rays. Along with
Frederick Soddy, he created the modern theory of radioactivity. He studied the
emanation of thorium and discovered a new noble gas, an isotope of radon,
now known as thoron. By scattering alpha-rays from metal foils, he
discovered the atomic nucleus and proposed the planetary model of the atom.
He also estimated the approximate size of the nucleus.

In the early nineteenth century it was also established that each
element is associated with a characteristic spectrum of radiation; for
example, hydrogen always gives a set of lines with fixed relative
positions between the lines. This fact suggested an intimate
relationship between the internal structure of an atom and the
spectrum of radiation emitted by it. In 1885, Johann Jakob Balmer
(1825–1898) obtained a simple empirical formula which gave the
wavelengths of a group of lines emitted by atomic hydrogen. Since
hydrogen is the simplest of the elements known, we shall consider its
spectrum in detail in this chapter.
Ernest Rutherford (1871–1937), a former research student of J. J.
Thomson, was engaged in experiments on $\alpha$-particles emitted by
some radioactive elements. In 1906, he proposed a classic
experiment of scattering of these $\alpha$-particles by atoms to investigate
the atomic structure. This experiment was later performed around
1911 by Hans Geiger (1882–1945) and Ernest Marsden (1889–1970,
who was a 20-year-old student and had not yet earned his bachelor's
degree). The details are discussed in Section 12.2. The explanation of
the results led to the birth of Rutherford's planetary model of the atom
(also called the nuclear model of the atom). According to this model, the
entire positive charge and most of the mass of the atom are
concentrated in a small volume called the nucleus, with electrons
revolving around the nucleus just as planets revolve around the sun.
Rutherford's nuclear model was a major step towards how we see the
atom today. However, it could not explain why atoms emit light of only
discrete wavelengths. How could an atom as simple as hydrogen,
consisting of a single electron and a single proton, emit a complex
spectrum of specific wavelengths? In the classical picture of an atom,
the electron revolves round the nucleus much like the way a planet
revolves round the sun. However, we shall see that there are some
serious difficulties in accepting such a model.

12.2 ALPHA-PARTICLE SCATTERING AND RUTHERFORD'S NUCLEAR MODEL OF ATOM

At the suggestion of Ernest Rutherford, in 1911, H. Geiger and E.
Marsden performed some experiments. In one of their experiments, as
shown in Fig. 12.1, they directed a beam of 5.5 MeV $\alpha$-particles
emitted from a radioactive source at a thin metal foil made of gold.
Figure 12.2 shows a schematic diagram of this experiment.
Alpha-particles emitted by a radioactive source were collimated into a
narrow beam by their passage through lead bricks. The beam was
allowed to fall on a thin foil of gold of thickness $2.1 \times 10^{-7}$ m.
The scattered alpha-particles were observed through a rotatable
detector consisting of a zinc sulphide screen and a microscope. The
scattered alpha-particles, on striking the screen, produced brief light
flashes or scintillations. These flashes may be viewed through a
microscope and the distribution of the number of scattered particles
may be studied as a function of the angle of scattering.
Figure 12.1 Geiger-Marsden scattering experiment. The entire apparatus is placed
in a vacuum chamber (not shown in this figure).
Figure 12.2 Schematic arrangement of the Geiger-Marsden experiment.

A typical graph of the total number of $\alpha$-particles scattered at different
angles, in a given interval of time, is shown in Fig. 12.3. The dots in
this figure represent the data points and the solid curve is the
theoretical prediction based on the assumption that the target atom
has a small, dense, positively charged nucleus. Many of the
$\alpha$-particles pass through the foil, which means that they do not suffer
any collisions. Only about 0.14% of the incident $\alpha$-particles scatter by
more than 1°, and about 1 in 8000 deflect by more than 90°. Rutherford
argued that, to deflect the $\alpha$-particle backwards, it must experience a
large repulsive force. This force could be provided if the greater part of
the mass of the atom and its positive charge were concentrated tightly
at its centre. Then the incoming $\alpha$-particle could get very close to the
positive charge without penetrating it, and such a close encounter
would result in a large deflection. The agreement between the
measured distribution and the prediction of this picture supported the
hypothesis of the nuclear atom. This is why Rutherford is credited with
the discovery of the nucleus.

In Rutherford's nuclear model of the atom, the entire positive charge
and most of the mass of the atom are concentrated in the nucleus with
the electrons some distance away. The electrons would be moving in
orbits about the nucleus just as the planets do around the sun.
Rutherford's experiments suggested the size of the nucleus to be
about $10^{-15}$ m to $10^{-14}$ m. From kinetic theory, the size of an atom
was known to be $10^{-10}$ m, about 10,000 to 100,000 times larger than
the size of the nucleus (see Chapter 11, Section 11.6 in Class XI
Physics textbook). Thus, the electrons would seem to be at a distance
from the nucleus of about 10,000 to 100,000 times the size of the
nucleus itself. Thus, most of an atom is empty space. With the atom
being largely empty space, it is easy to see why most $\alpha$-particles go
right through a thin metal foil. However, when an $\alpha$-particle happens to
come near a nucleus, the intense electric field there scatters it through
a large angle. The atomic electrons, being so light, do not appreciably
affect the $\alpha$-particles.

Figure 12.3 Experimental data points (shown by dots) on scattering of $\alpha$-particles
by a thin foil at different angles obtained by Geiger and Marsden using the setup
shown in Figs. 12.1 and 12.2. Rutherford's nuclear model predicts the solid curve
which is seen to be in good agreement with experiment.

The scattering data shown in Fig. 12.3 can be analysed by employing
Rutherford's nuclear model of the atom. As the gold foil is very thin, it
can be assumed that $\alpha$-particles will suffer not more than one
scattering during their passage through it. Therefore, computation of
the trajectory of an alpha-particle scattered by a single nucleus is
enough. Alpha-particles are nuclei of helium atoms and, therefore,
carry two units, 2e, of positive charge and have the mass of the
helium atom. The charge of the gold nucleus is Ze, where Z is the
atomic number of the atom; for gold Z = 79. Since the nucleus of gold
is about 50 times heavier than an $\alpha$-particle, it is reasonable to
assume that it remains stationary throughout the scattering process.
Under these assumptions, the trajectory of an alpha-particle can be
computed employing Newton's second law of motion and
Coulomb's law for the electrostatic force of repulsion between the
alpha-particle and the positively charged nucleus.
The magnitude of this force is

$$F = \frac{1}{4\pi\varepsilon_0}\,\frac{(2e)(Ze)}{r^2} \tag{12.1}$$

where r is the distance between the $\alpha$-particle and the nucleus. The
force is directed along the line joining the $\alpha$-particle and the nucleus.
The magnitude and direction of the force on an $\alpha$-particle change
continuously as it approaches the nucleus and recedes away from it.

12.2.1 Alpha-particle trajectory

The trajectory traced by an $\alpha$-particle depends on the impact
parameter, b, of the collision. The impact parameter is the perpendicular
distance of the initial velocity vector of the $\alpha$-particle from the centre of
the nucleus (Fig. 12.4). A given beam of $\alpha$-particles has a distribution
of impact parameters b, so that the beam is scattered in various
directions with different probabilities (Fig. 12.4). (In a beam, all
particles have nearly the same kinetic energy.) It is seen that an $\alpha$-particle
close to the nucleus (small impact parameter) suffers large scattering.
In the case of a head-on collision, the impact parameter is minimum and the
$\alpha$-particle rebounds back ($\theta \cong \pi$). For a large impact parameter, the
$\alpha$-particle goes nearly undeviated and has a small deflection ($\theta \cong 0$).
The fact that only a small fraction of the number of incident particles
rebound back indicates that the number of $\alpha$-particles undergoing
head-on collision is small. This, in turn, implies that the mass of the
atom is concentrated in a small volume. Rutherford scattering,
therefore, is a powerful way to determine an upper limit to the size of
the nucleus.

Figure 12.4 Trajectory of $\alpha$-particles in the coulomb field of a target nucleus. The
impact parameter, b, and scattering angle $\theta$ are also depicted.
Example 12.1 In Rutherford's nuclear model of the atom, the nucleus
(radius about $10^{-15}$ m) is analogous to the sun about which the electron moves
in orbit (radius $\approx 10^{-10}$ m) like the earth orbits around the sun. If the
dimensions of the solar system had the same proportions as those of the
atom, would the earth be closer to or farther away from the sun than it actually
is? The radius of earth's orbit is about $1.5 \times 10^{11}$ m. The radius of sun is taken
as $7 \times 10^{8}$ m.
Solution The ratio of the radius of electron's orbit to the radius of nucleus is
($10^{-10}$ m)/($10^{-15}$ m) = $10^{5}$, that is, the radius of the electron's orbit is $10^{5}$
times larger than the radius of nucleus. If the radius of the earth's orbit around
the sun were $10^{5}$ times larger than the radius of the sun, the radius of the
earth's orbit would be $10^{5} \times 7 \times 10^{8}$ m = $7 \times 10^{13}$ m.
This is more than 100 times greater than the actual orbital radius
of earth. Thus, the earth would be much farther away from the sun.
It implies that an atom contains a much greater fraction of empty space than
our solar system does.
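
The proportion argument of Example 12.1 can be checked with a few lines of arithmetic. The Python sketch below is only an illustration: the variable names are our own, and the input values are those quoted in the example.

```python
# Rescale the solar system to atomic proportions (values from Example 12.1).
R_NUCLEUS = 1e-15          # m, radius of the nucleus
R_ELECTRON_ORBIT = 1e-10   # m, radius of the electron orbit
R_SUN = 7e8                # m, radius of the sun
R_EARTH_ORBIT = 1.5e11     # m, actual radius of the earth's orbit

# Ratio of orbit radius to central-body radius inside the atom.
ratio = R_ELECTRON_ORBIT / R_NUCLEUS

# Radius the earth's orbit would have with the same proportions.
scaled_orbit = ratio * R_SUN

# How many times larger than the actual orbit that would be.
enlargement = scaled_orbit / R_EARTH_ORBIT

print(f"orbit-to-nucleus ratio: {ratio:.1e}")
print(f"scaled earth orbit:     {scaled_orbit:.1e} m")
print(f"enlargement factor:     {enlargement:.0f}")
```

The enlargement factor comes out at several hundred, consistent with the statement that the scaled orbit is more than 100 times the actual one.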
Example 12.2 In a Geiger-Marsden experiment, what is the distance of closest
approach to the nucleus of a 7.7 MeV $\alpha$-particle before it comes momentarily
to rest and reverses its direction?
Solution The key idea here is that throughout the scattering process, the total
mechanical energy of the system consisting of an $\alpha$-particle and a gold nucleus
is conserved. The system's initial mechanical energy is $E_i$, before the particle
and nucleus interact, and it is equal to its mechanical energy $E_f$ when the
$\alpha$-particle momentarily stops. The initial energy $E_i$ is just the kinetic energy K of
the incoming $\alpha$-particle. The final energy $E_f$ is just the electric potential energy
U of the system. The potential energy U can be calculated from Eq. (12.1).
Let d be the centre-to-centre distance between the $\alpha$-particle and the gold
nucleus when the $\alpha$-particle is at its stopping point. Then we can write the
conservation of energy $E_i = E_f$ as

$$K = \frac{1}{4\pi\varepsilon_0}\,\frac{(2e)(Ze)}{d} = \frac{2Ze^2}{4\pi\varepsilon_0 d}$$

Thus the distance of closest approach d is given by

$$d = \frac{2Ze^2}{4\pi\varepsilon_0 K}$$

The maximum kinetic energy found in $\alpha$-particles of natural origin is 7.7 MeV or
$1.2 \times 10^{-12}$ J. Since $1/4\pi\varepsilon_0 = 9.0 \times 10^{9}$ N m²/C², with $e = 1.6 \times 10^{-19}$
C, we have

$$d = \frac{(2)(9.0 \times 10^{9}\ \mathrm{N\,m^2/C^2})(1.6 \times 10^{-19}\ \mathrm{C})^2\, Z}{1.2 \times 10^{-12}\ \mathrm{J}} = 3.84 \times 10^{-16}\, Z\ \mathrm{m}$$

The atomic number of foil material gold is Z = 79, so that
d(Au) = $3.0 \times 10^{-14}$ m = 30 fm. (1 fm (i.e. fermi) = $10^{-15}$ m.)
The radius of gold nucleus is, therefore, less than $3.0 \times 10^{-14}$ m. This is not in
very good agreement with the observed result as the actual radius of gold
nucleus is 6 fm. The cause of discrepancy is that the distance of closest
approach is considerably larger than the sum of the radii of the gold nucleus
and the $\alpha$-particle. Thus, the $\alpha$-particle reverses its motion without ever actually
touching the gold nucleus.
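
The energy-conservation argument of Example 12.2 can be sketched numerically as follows. This is a minimal illustration using the rounded constants of the example; the function name `closest_approach` and its parameters are our own and not part of the text.

```python
# Distance of closest approach in a head-on collision, from K = U at the
# turning point: d = (1 / 4 pi eps0) * (z_projectile e)(z_target e) / K.
E_CHARGE = 1.6e-19    # C, elementary charge (rounded, as in the example)
COULOMB_K = 9.0e9     # N m^2 / C^2, value of 1 / (4 pi eps0)
MEV_TO_J = 1.6e-13    # J per MeV

def closest_approach(kinetic_energy_mev, z_target, z_projectile=2):
    """Return the closest-approach distance in metres."""
    kinetic_energy_j = kinetic_energy_mev * MEV_TO_J
    return COULOMB_K * z_projectile * z_target * E_CHARGE**2 / kinetic_energy_j

d_gold = closest_approach(7.7, 79)   # 7.7 MeV alpha-particle on gold
print(f"d(Au) = {d_gold:.2e} m = {d_gold / 1e-15:.0f} fm")
```

Because the kinetic energy is not rounded to $1.2 \times 10^{-12}$ J here, the result differs slightly in the third significant figure from the worked value, but it still rounds to about 30 fm.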

12.2.2 Electron orbits


The Rutherford nuclear model of the atom, which involves classical
concepts, pictures the atom as an electrically neutral sphere
consisting of a very small, massive and positively charged nucleus at
the centre surrounded by the revolving electrons in their respective
dynamically stable orbits. The electrostatic force of attraction, $F_e$,
between the revolving electrons and the nucleus provides the requisite
centripetal force ($F_c$) to keep them in their orbits. Thus, for a
dynamically stable orbit in a hydrogen atom

$$F_e = F_c$$

$$\frac{1}{4\pi\varepsilon_0}\,\frac{e^2}{r^2} = \frac{mv^2}{r} \tag{12.2}$$

Thus the relation between the orbit radius and the electron velocity is

$$r = \frac{e^2}{4\pi\varepsilon_0 m v^2} \tag{12.3}$$

The kinetic energy (K) and electrostatic potential energy (U) of the
electron in hydrogen atom are

$$K = \frac{1}{2}mv^2 = \frac{e^2}{8\pi\varepsilon_0 r} \quad\text{and}\quad U = -\frac{e^2}{4\pi\varepsilon_0 r}$$

(The negative sign in U signifies that the electrostatic force is in the $-r$
direction.) Thus the total energy E of the electron in a hydrogen atom
is

$$E = K + U = \frac{e^2}{8\pi\varepsilon_0 r} - \frac{e^2}{4\pi\varepsilon_0 r} = -\frac{e^2}{8\pi\varepsilon_0 r} \tag{12.4}$$

The total energy of the electron is negative. This implies that
the electron is bound to the nucleus. If E were positive, the electron would
not follow a closed orbit around the nucleus.

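Example 12.3 It is found experimentally that 13.6 eV energy is required to
separate a hydrogen atom into a proton and an electron. Compute the orbital
radius and the velocity of the electron in a hydrogen atom.
Solution Total energy of the electron in hydrogen atom is $-13.6$ eV = $-13.6 \times
1.6 \times 10^{-19}$ J = $-2.2 \times 10^{-18}$ J. Thus from Eq. (12.4), we have

$$-\frac{e^2}{8\pi\varepsilon_0 r} = -2.2 \times 10^{-18}\ \mathrm{J}$$

This gives the orbital radius

$$r = \frac{e^2}{8\pi\varepsilon_0\,(2.2 \times 10^{-18}\ \mathrm{J})} = \frac{(9 \times 10^{9}\ \mathrm{N\,m^2/C^2})(1.6 \times 10^{-19}\ \mathrm{C})^2}{(2)(2.2 \times 10^{-18}\ \mathrm{J})} = 5.3 \times 10^{-11}\ \mathrm{m}.$$

The velocity of the revolving electron can be computed from Eq. (12.3) with m
= $9.1 \times 10^{-31}$ kg,

$$v = \frac{e}{\sqrt{4\pi\varepsilon_0 m r}} = 2.2 \times 10^{6}\ \mathrm{m/s}.$$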
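
As a check on Example 12.3, the following sketch evaluates the orbital radius from Eq. (12.4) and the speed from Eq. (12.3). The function names are our own, and the constants are the rounded values used in the example.

```python
import math

E_CHARGE = 1.6e-19     # C, elementary charge
COULOMB_K = 9.0e9      # N m^2 / C^2, value of 1 / (4 pi eps0)
M_ELECTRON = 9.1e-31   # kg, electron mass

def orbit_radius(total_energy_ev):
    """Radius from E = -k e^2 / (2 r), Eq. (12.4); energy must be negative."""
    energy_j = total_energy_ev * E_CHARGE
    return -COULOMB_K * E_CHARGE**2 / (2 * energy_j)

def orbital_speed(radius):
    """Speed from r = k e^2 / (m v^2), Eq. (12.3)."""
    return E_CHARGE * math.sqrt(COULOMB_K / (M_ELECTRON * radius))

r = orbit_radius(-13.6)
v = orbital_speed(r)
print(f"r = {r:.2e} m, v = {v:.2e} m/s")
```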

12.3 Atomic Spectra

As mentioned in Section 12.1, each element has a characteristic
spectrum of radiation, which it emits. When an atomic gas or vapour is
excited at low pressure, usually by passing an electric current through
it, the emitted radiation has a spectrum which contains certain specific
wavelengths only. A spectrum of this kind is termed an emission line
spectrum and it consists of bright lines on a dark background. The
spectrum emitted by atomic hydrogen is shown in Fig. 12.5. Study of
emission line spectra of a material can therefore serve as a type of
fingerprint for identification of the gas. When white light passes
through a gas and we analyse the transmitted light using a
spectrometer, we find some dark lines in the spectrum. These dark
lines correspond precisely to those wavelengths which were found in
the emission line spectrum of the gas. This is called the absorption
spectrum of the material of the gas.

12.3.1 Spectral series

We might expect that the frequencies of the light emitted by a
particular element would exhibit some regular pattern. Hydrogen is the
simplest atom and, therefore, has the simplest spectrum. In the
observed spectrum, however, at first sight, there does not seem to be
any semblance of order or regularity in spectral lines. But the
spacing between lines within certain sets of the hydrogen spectrum
decreases in a regular way (Fig. 12.5). Each of these sets is called a
spectral series. In 1885, the first such series was observed by a
Swiss school teacher Johann Jakob Balmer (1825–1898) in the
visible region of the hydrogen spectrum. This series is called the Balmer
series (Fig. 12.6). The line with the longest wavelength, 656.3 nm in
the red, is called $H_\alpha$; the next line, with wavelength 486.1 nm in the
blue-green, is called $H_\beta$; the third line, 434.1 nm in the violet, is called
$H_\gamma$; and so on.
Figure 12.5 Emission lines in the spectrum of hydrogen.

As the wavelength decreases, the lines appear closer together and
are weaker in intensity. Balmer found a simple empirical formula for
the observed wavelengths

$$\frac{1}{\lambda} = R\left(\frac{1}{2^2} - \frac{1}{n^2}\right) \tag{12.5}$$

where $\lambda$ is the wavelength, R is a constant called the Rydberg
constant, and n may have integral values 3, 4, 5, etc. The value of R is
$1.097 \times 10^{7}\ \mathrm{m^{-1}}$. This equation is also called the Balmer formula.
Taking n = 3 in Eq. (12.5), one obtains the wavelength of the $H_\alpha$ line:

$$\frac{1}{\lambda} = 1.097 \times 10^{7}\left(\frac{1}{2^2} - \frac{1}{3^2}\right)\ \mathrm{m^{-1}} = 1.524 \times 10^{6}\ \mathrm{m^{-1}}$$

i.e., $\lambda$ = 656.3 nm
For n = 4, one obtains the wavelength of the $H_\beta$ line, etc. For $n = \infty$, one
obtains the limit of the series, at $\lambda$ = 364.6 nm. This is the shortest
wavelength in the Balmer series. Beyond this limit, no further distinct
lines appear; instead only a faint continuous spectrum is seen.

Figure 12.6 Balmer series in the emission spectrum of hydrogen.

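Other series of spectra for hydrogen were subsequently discovered.
These are known, after their discoverers, as Lyman, Paschen,
Brackett, and Pfund series. These are represented by the formulae:
Lyman series:

$$\frac{1}{\lambda} = R\left(\frac{1}{1^2} - \frac{1}{n^2}\right) \qquad n = 2, 3, 4, \ldots \tag{12.6}$$

Paschen series:

$$\frac{1}{\lambda} = R\left(\frac{1}{3^2} - \frac{1}{n^2}\right) \qquad n = 4, 5, 6, \ldots \tag{12.7}$$

Brackett series:

$$\frac{1}{\lambda} = R\left(\frac{1}{4^2} - \frac{1}{n^2}\right) \qquad n = 5, 6, 7, \ldots \tag{12.8}$$

Pfund series:

$$\frac{1}{\lambda} = R\left(\frac{1}{5^2} - \frac{1}{n^2}\right) \qquad n = 6, 7, 8, \ldots \tag{12.9}$$

The Lyman series is in the ultraviolet, and the Paschen and Brackett
series are in the infrared region.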
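The Balmer formula Eq. (12.5) may be written in terms of frequency of
the light, recalling that

$$c = \nu\lambda \quad\text{or}\quad \frac{1}{\lambda} = \frac{\nu}{c}$$

Thus, Eq. (12.5) becomes

$$\nu = Rc\left(\frac{1}{2^2} - \frac{1}{n^2}\right) \tag{12.10}$$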

There are only a few elements (hydrogen, singly ionised helium, and
doubly ionised lithium) whose spectra can be represented by simple
formulae like Eqs. (12.5)–(12.9).
Equations (12.5)–(12.9) are useful as they give the wavelengths that
hydrogen atoms radiate or absorb. However, these results are
empirical and do not give any reasoning why only certain frequencies
are observed in the hydrogen spectrum.
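
The empirical formulae (12.5)–(12.9) share one form, differing only in the lower integer. The Python sketch below evaluates that common form; the names `wavelength_nm`, `series_limit_nm` and `SERIES` are our own, and R is the value quoted in the text.

```python
R = 1.097e7  # Rydberg constant in m^-1, as quoted in the text

# Lower integer of each series, read off Eqs. (12.5)-(12.9).
SERIES = {"Lyman": 1, "Balmer": 2, "Paschen": 3, "Brackett": 4, "Pfund": 5}

def wavelength_nm(n_lower, n):
    """Wavelength in nm from 1/lambda = R (1/n_lower^2 - 1/n^2), n > n_lower."""
    if n <= n_lower:
        raise ValueError("n must be greater than n_lower")
    return 1e9 / (R * (1 / n_lower**2 - 1 / n**2))

def series_limit_nm(n_lower):
    """Shortest wavelength of a series in nm, reached as n tends to infinity."""
    return 1e9 * n_lower**2 / R

for name, n_lower in SERIES.items():
    longest = wavelength_nm(n_lower, n_lower + 1)   # first line of the series
    shortest = series_limit_nm(n_lower)             # series limit
    print(f"{name:9s} {shortest:8.1f} nm to {longest:8.1f} nm")
```

For the Balmer series this gives a first line near 656 nm and a limit near 364.6 nm, matching the values worked out above; the Lyman lines fall below these wavelengths and the Paschen, Brackett and Pfund lines well above them.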

12.4 Bohr Model of the Hydrogen Atom

The model of the atom proposed by Rutherford assumes that the
atom, consisting of a central nucleus and revolving electron, is stable
much like the sun-planet system which the model imitates. However, there
are some fundamental differences between the two situations. While
the planetary system is held by gravitational force, the
nucleus-electron system, being charged objects, interacts by Coulomb's law of
force. We know that an object which moves in a circle is being
constantly accelerated, the acceleration being centripetal in nature.
According to classical electromagnetic theory, an accelerating
charged particle emits radiation in the form of electromagnetic waves.
The energy of an accelerating electron should, therefore, continuously
decrease. The electron would spiral inward and eventually fall into the
nucleus (Fig. 12.7). Thus, such an atom cannot be stable. Further,
according to the classical electromagnetic theory, the frequency of the
electromagnetic waves emitted by the revolving electrons is equal to
the frequency of revolution. As the electrons spiral inwards, their
angular velocities and hence their frequencies would change
continuously, and so would the frequency of the light emitted. Thus, they
would emit a continuous spectrum, in contradiction to the line
spectrum actually observed. Clearly the Rutherford model tells only a part
of the story, implying that the classical ideas are not sufficient to
explain the atomic structure.
Figure 12.7 An accelerated atomic electron must spiral into the nucleus as it loses
energy.

Niels Henrik David Bohr (1885–1962) Danish physicist who explained the
spectrum of hydrogen atom based on quantum ideas. He gave a theory of
nuclear fission based on the liquid-drop model of nucleus. Bohr contributed to
the clarification of conceptual problems in quantum mechanics, in particular by
proposing the complementarity principle.
Example 12.4 According to the classical electromagnetic theory, calculate the
initial frequency of the light emitted by the electron revolving around a proton in
hydrogen atom.
Solution From Example 12.3 we know that the velocity of the electron moving around
a proton in hydrogen atom in an orbit of radius $5.3 \times 10^{-11}$ m is $2.2 \times 10^{6}$ m/s.
Thus, the frequency of the electron moving around the proton is

$$\nu = \frac{v}{2\pi r} = \frac{2.2 \times 10^{6}\ \mathrm{m\,s^{-1}}}{2\pi\,(5.3 \times 10^{-11}\ \mathrm{m})} \approx 6.6 \times 10^{15}\ \mathrm{Hz}.$$

According to the classical electromagnetic theory we know that the frequency
of the electromagnetic waves emitted by the revolving electrons is equal to the
frequency of its revolution around the nucleus. Thus the initial frequency of the
light emitted is $6.6 \times 10^{15}$ Hz.
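
The arithmetic of Example 12.4 is a single division. A short sketch, with a function name of our own choosing and the inputs taken from Example 12.3, is:

```python
import math

V_ELECTRON = 2.2e6   # m/s, orbital speed from Example 12.3
R_ORBIT = 5.3e-11    # m, orbital radius from Example 12.3

def revolution_frequency(speed, radius):
    """Revolutions per second for uniform motion on a circle of given radius."""
    return speed / (2 * math.pi * radius)

nu = revolution_frequency(V_ELECTRON, R_ORBIT)
print(f"classical emission frequency = {nu:.1e} Hz")
```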

It was Niels Bohr (1885–1962) who made certain modifications in this
model by adding the ideas of the newly developing quantum
hypothesis. Niels Bohr studied in Rutherford's laboratory for several
months in 1912 and he was convinced about the validity of Rutherford's
nuclear model. Faced with the dilemma as discussed above, Bohr, in
1913, concluded that in spite of the success of electromagnetic theory
in explaining large-scale phenomena, it could not be applied to the
processes at the atomic scale. It became clear that a fairly radical
departure from the established principles of classical mechanics and
electromagnetism would be needed to understand the structure of
atoms and the relation of atomic structure to atomic spectra. Bohr
combined classical and early quantum concepts and gave his theory
in the form of three postulates. These are:
(i) Bohr's first postulate was that an electron in an atom could revolve
in certain stable orbits without the emission of radiant energy, contrary
to the predictions of electromagnetic theory. According to this
postulate, each atom has certain definite stable states in which it can
exist, and each possible state has definite total energy. These are
called the stationary states of the atom.
(ii) Bohr's second postulate defines these stable orbits. This postulate
states that the electron revolves around the nucleus only in those
orbits for which the angular momentum is some integral multiple of
$h/2\pi$, where h is Planck's constant (= $6.6 \times 10^{-34}$ J s). Thus the
angular momentum (L) of the orbiting electron is quantised. That is

$$L = \frac{nh}{2\pi} \tag{12.11}$$

(iii) Bohr's third postulate incorporated into atomic theory the early
quantum concepts that had been developed by Planck and Einstein. It
states that an electron might make a transition from one of its
specified non-radiating orbits to another of lower energy. When it does
so, a photon is emitted having energy equal to the energy difference
between the initial and final states. The frequency of the emitted
photon is then given by

$$h\nu = E_i - E_f \tag{12.12}$$

where $E_i$ and $E_f$ are the energies of the initial and final states and $E_i > E_f$.
For a hydrogen atom, Eq. (12.4) gives the expression to determine the
energies of different energy states. But then this equation requires the
radius r of the electron orbit. To calculate r, Bohr's second postulate
about the angular momentum of the electron, the quantisation
condition, is used. The angular momentum L is given by

$$L = mvr$$

Bohr's second postulate of quantisation [Eq. (12.11)] says that the
allowed values of angular momentum are integral multiples of $h/2\pi$.

$$L_n = m v_n r_n = \frac{nh}{2\pi} \tag{12.13}$$

where n is an integer, $r_n$ is the radius of the nth possible orbit and $v_n$ is the
speed of the moving electron in the nth orbit. The allowed orbits are
numbered 1, 2, 3, ..., according to the values of n, which is called the
principal quantum number of the orbit.
From Eq. (12.3), the relation between $v_n$ and $r_n$ is

$$v_n = \frac{e}{\sqrt{4\pi\varepsilon_0 m r_n}}$$

Combining it with Eq. (12.13), we get the following expressions for $v_n$
and $r_n$,

$$v_n = \frac{1}{n}\,\frac{e^2}{4\pi\varepsilon_0}\,\frac{1}{(h/2\pi)} \tag{12.14}$$

and

$$r_n = \left(\frac{n^2}{m}\right)\left(\frac{h}{2\pi}\right)^2 \frac{4\pi\varepsilon_0}{e^2} \tag{12.15}$$

Eq. (12.14) depicts that the orbital speed in the nth orbit falls by a
factor of n. Using Eq. (12.15), the size of the innermost orbit (n = 1)
can be obtained as

$$r_1 = \frac{h^2\varepsilon_0}{\pi m e^2}$$

This is called the Bohr radius, represented by the symbol $a_0$. Thus,

$$a_0 = \frac{h^2\varepsilon_0}{\pi m e^2} \tag{12.16}$$

Substitution of values of h, m, $\varepsilon_0$ and e gives $a_0 = 5.29 \times 10^{-11}$ m.
From Eq. (12.15), it can also be seen that the radii of the orbits
increase as $n^2$.
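
The substitution leading to the Bohr radius, and the scaling of radius and speed with n, can be sketched as below. The constants are standard four-figure values that we have assumed (the text quotes only h to two figures), and the function names are our own.

```python
import math

# Assumed four-figure values of the constants (not quoted in the text).
H = 6.626e-34         # J s, Planck's constant
M_E = 9.109e-31       # kg, electron mass
EPS0 = 8.854e-12      # C^2 / (N m^2), permittivity of free space
E_CHARGE = 1.602e-19  # C, elementary charge

def bohr_radius():
    """a0 = h^2 eps0 / (pi m e^2), Eq. (12.16)."""
    return H**2 * EPS0 / (math.pi * M_E * E_CHARGE**2)

def orbit_radius(n):
    """r_n = n^2 a0, which is Eq. (12.15) rewritten using a0."""
    return n**2 * bohr_radius()

def orbit_speed(n):
    """v_n = e^2 / (2 eps0 h n), which is Eq. (12.14) simplified."""
    return E_CHARGE**2 / (2 * EPS0 * H * n)

for n in (1, 2, 3):
    # m v r should equal n h / (2 pi), the quantisation condition (12.13).
    angular_momentum = M_E * orbit_speed(n) * orbit_radius(n)
    print(n, f"{orbit_radius(n):.3e} m", f"{orbit_speed(n):.3e} m/s",
          f"L/(h/2pi) = {angular_momentum / (H / (2 * math.pi)):.3f}")
```

The last column is a consistency check: the ratio of $m v_n r_n$ to $h/2\pi$ should come out equal to n for every orbit.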

The total energy of the electron in the stationary states of the
hydrogen atom can be obtained by substituting the value of orbital
radius in Eq. (12.4) as

$$E_n = -\left(\frac{e^2}{8\pi\varepsilon_0}\right)\left(\frac{m}{n^2}\right)\left(\frac{2\pi}{h}\right)^2\left(\frac{e^2}{4\pi\varepsilon_0}\right)$$

or

$$E_n = -\frac{me^4}{8n^2\varepsilon_0^2h^2} \tag{12.17}$$

Substituting values, Eq. (12.17) yields

$$E_n = -\frac{2.18 \times 10^{-18}}{n^2}\ \mathrm{J} \tag{12.18}$$

Atomic energies are often expressed in electron volts (eV) rather than
joules. Since 1 eV = $1.6 \times 10^{-19}$ J, Eq. (12.18) can be rewritten as

$$E_n = -\frac{13.6}{n^2}\ \mathrm{eV} \tag{12.19}$$

The negative sign of the total energy of an electron moving in an orbit
means that the electron is bound with the nucleus. Energy will thus be
required to remove the electron from the hydrogen atom to a distance
infinitely far away from its nucleus (or proton in hydrogen atom).
The derivation of Eqs. (12.17)–(12.19) involves the assumption that
the electronic orbits are circular, though orbits under inverse square
force are, in general, elliptical. (Planets move in elliptical orbits under
the inverse square gravitational force of the sun.) However, it was
shown by the German physicist Arnold Sommerfeld (1868–1951)
that, when the restriction of circular orbit is relaxed, these equations
continue to hold even for elliptic orbits.
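
The step from Eq. (12.17) to Eqs. (12.18) and (12.19) is a substitution of constants, which can be sketched as follows. The four-figure constants are assumed values of our own (the text does not list them), as are the function names.

```python
# Assumed four-figure values of the constants (not quoted in the text).
H = 6.626e-34         # J s, Planck's constant
M_E = 9.109e-31       # kg, electron mass
EPS0 = 8.854e-12      # C^2 / (N m^2), permittivity of free space
E_CHARGE = 1.602e-19  # C, elementary charge (also J per eV)

def energy_joule(n):
    """E_n = -m e^4 / (8 n^2 eps0^2 h^2), Eq. (12.17), in joules."""
    return -M_E * E_CHARGE**4 / (8 * n**2 * EPS0**2 * H**2)

def energy_ev(n):
    """The same energy converted to electron volts."""
    return energy_joule(n) / E_CHARGE

for n in (1, 2, 3):
    print(n, f"{energy_joule(n):.3e} J", f"{energy_ev(n):.2f} eV")
```

With these inputs the n = 1 value is close to the coefficients $2.18 \times 10^{-18}$ J and 13.6 eV appearing in Eqs. (12.18) and (12.19).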

Example 12.5 A 10 kg satellite circles earth once every 2 h in an orbit having a
radius of 8000 km. Assuming that Bohr's angular momentum postulate applies
to satellites just as it does to an electron in the hydrogen atom, find the
quantum number of the orbit of the satellite.
Solution
From Eq. (12.13), we have

$$m v_n r_n = \frac{nh}{2\pi}$$

Here m = 10 kg and $r_n = 8 \times 10^{6}$ m. We have the time period T of the circling
satellite as 2 h. That is T = 7200 s.
Thus the velocity $v_n = 2\pi r_n/T$.
The quantum number of the orbit of satellite

$$n = \frac{(2\pi r_n)^2\, m}{T\, h}.$$

Substituting the values,

$$n = \frac{(2\pi \times 8 \times 10^{6}\ \mathrm{m})^2 \times 10\ \mathrm{kg}}{7200\ \mathrm{s} \times 6.64 \times 10^{-34}\ \mathrm{J\,s}} = 5.3 \times 10^{45}$$

Note that the quantum number for the satellite motion is extremely large! In
fact for such large quantum numbers the results of quantisation conditions
tend to those of classical physics.
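
The substitution in Example 12.5 can be reproduced with the short sketch below. The function name and argument names are our own; the value of h is the one used in the example.

```python
import math

PLANCK_H = 6.64e-34  # J s, value of h used in Example 12.5

def orbit_quantum_number(mass, radius, period):
    """Quantum number n from m v r = n h / (2 pi) for a circular orbit."""
    speed = 2 * math.pi * radius / period        # v = 2 pi r / T
    angular_momentum = mass * speed * radius     # L = m v r
    return angular_momentum * 2 * math.pi / PLANCK_H

# 10 kg satellite, orbit radius 8000 km, period 2 h = 7200 s.
n_satellite = orbit_quantum_number(10.0, 8e6, 7200.0)
print(f"n = {n_satellite:.1e}")
```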
12.4.1 Energy levels

The energy of an atom is the least (largest negative value) when its
electron is revolving in an orbit closest to the nucleus, i.e., the one for
which n = 1. For n = 2, 3, ... the absolute value of the energy E is
smaller, hence the energy is progressively larger in the outer orbits.
The lowest state of the atom, called the ground state, is that of the
lowest energy, with the electron revolving in the orbit of smallest
radius, the Bohr radius, $a_0$. The energy of this state (n = 1), $E_1$, is
$-13.6$ eV. Therefore, the minimum energy required to free the electron
from the ground state of the hydrogen atom is 13.6 eV. It is called the
ionisation energy of the hydrogen atom. This prediction of Bohr's
model is in excellent agreement with the experimental value of
ionisation energy.

ORBIT VS STATE (ORBITAL PICTURE) OF ELECTRON IN ATOM

We are introduced to the Bohr Model of atom one time or the other in the
course of physics. This model has its place in the history of quantum
mechanics and particularly in explaining the structure of an atom. It has
become a milestone since Bohr introduced the revolutionary idea of definite
energy orbits for the electrons, contrary to the classical picture requiring an
accelerating particle to radiate. Bohr also introduced the idea of quantisation of
angular momentum of electrons moving in definite orbits. Thus it was a semi-
classical picture of the structure of atom.
Now with the development of quantum mechanics, we have a better
understanding of the structure of atom. Solutions of the Schrödinger wave
equation assign a wave-like description to the electrons bound in an atom due
to attractive forces of the protons.
An orbit of the electron in the Bohr model is the circular path of motion of an
electron around the nucleus. But according to quantum mechanics, we cannot
associate a definite path with the motion of the electrons in an atom. We can
only talk about the probability of finding an electron in a certain region of space
around the nucleus. This probability can be inferred from the one-electron
wave function called the orbital. This function depends only on the coordinates
of the electron.
It is therefore essential that we understand the subtle differences that exist in
the two models:

• Bohr model is valid for only one-electron atoms/ions; an energy value,
assigned to each orbit, depends on the principal quantum number n in
this model. We know that energy associated with a stationary state of an
electron depends on n only, for one-electron atoms/ions. For a
multi-electron atom/ion, this is not true.
• The solution of the Schrödinger wave equation, obtained for
hydrogen-like atoms/ions, called the wave function, gives information
about the probability of finding an electron in various regions around the
nucleus. This orbital has no resemblance whatsoever with the orbit
defined for an electron in the Bohr model.

At room temperature, most of the hydrogen atoms are in the ground state.
When a hydrogen atom receives energy by processes such as
electron collisions, the atom may acquire sufficient energy to raise the
electron to higher energy states. The atom is then said to be in an
excited state. From Eq. (12.19), for n = 2, the energy $E_2$ is $-3.40$ eV. It
means that the energy required to excite an electron in hydrogen atom
to its first excited state is $E_2 - E_1 = -3.40\ \mathrm{eV} - (-13.6)\ \mathrm{eV} = 10.2$ eV.
Similarly, $E_3 = -1.51$ eV and $E_3 - E_1 = 12.09$ eV;
that is, to excite the hydrogen atom from its ground state (n = 1) to the second
excited state (n = 3), 12.09 eV energy is required, and so on. From
these excited states the electron can then fall back to a state of lower
energy, emitting a photon in the process. Thus, as the excitation of
hydrogen atom increases (that is, as n increases) the value of
minimum energy required to free the electron from the excited atom
decreases.
The energy level diagram* for the stationary states of a hydrogen
atom, computed from Eq. (12.19), is given in Fig. 12.8. The principal
quantum number n labels the stationary states in the ascending order
of energy. In this diagram, the highest energy state corresponds to
$n = \infty$ in Eq. (12.19) and has an energy of 0 eV. This is the energy of the
atom when the electron is completely removed ($r = \infty$) from the
nucleus and is at rest. Observe how the energies of the excited states
come closer and closer together as n increases.
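
The excitation energies quoted above follow directly from Eq. (12.19). The sketch below, with function names of our own, also shows how the energy needed to free the electron falls as n increases.

```python
E1_EV = -13.6  # eV, ground-state energy from Eq. (12.19)

def level_energy(n):
    """Energy of the nth stationary state in eV, E_n = -13.6 / n^2."""
    return E1_EV / n**2

def excitation_energy(n_low, n_high):
    """Energy in eV needed to raise the atom from level n_low to n_high."""
    return level_energy(n_high) - level_energy(n_low)

def ionisation_energy(n):
    """Minimum energy in eV needed to free the electron from level n."""
    return -level_energy(n)

print(f"1 -> 2: {excitation_energy(1, 2):.2f} eV")
print(f"1 -> 3: {excitation_energy(1, 3):.2f} eV")
for n in (1, 2, 3, 4):
    print(f"ionisation from n = {n}: {ionisation_energy(n):.2f} eV")
```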

12.5 The Line Spectra of the Hydrogen Atom

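According to the third postulate of Bohr's model, when an atom makes
a transition from the higher energy state with quantum number $n_i$ to
the lower energy state with quantum number $n_f$ ($n_f < n_i$), the difference
of energy is carried away by a photon of frequency $\nu_{if}$ such that

$$h\nu_{if} = E_{n_i} - E_{n_f} \tag{12.20}$$

Using Eq. (12.17), for $E_{n_f}$ and $E_{n_i}$, we get

$$h\nu_{if} = \frac{me^4}{8\varepsilon_0^2h^2}\left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right) \tag{12.21}$$

or

$$\nu_{if} = \frac{me^4}{8\varepsilon_0^2h^3}\left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right) \tag{12.22}$$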
* An electron can have any total energy above E = 0 eV. In such
situations the electron is free. Thus there is a continuum of energy
states above E = 0 eV, as shown in Fig. 12.8.
Figure 12.8 The energy level diagram for the hydrogen atom. The electron in a
hydrogen atom at room temperature spends most of its time in the ground state. To
ionise a hydrogen atom from the ground state, 13.6 eV of energy must
be supplied. (The horizontal lines specify the presence of allowed energy states.)

Equation (12.21) is the Rydberg formula for the spectrum of the
hydrogen atom. In this relation, if we take $n_f = 2$ and $n_i = 3, 4, 5, \ldots$, it
reduces to a form similar to Eq. (12.10) for the Balmer series. The
Rydberg constant R is readily identified to be

$R = \frac{me^4}{8\varepsilon_0^2 h^3 c}$ (12.23)
If we insert the values of various constants in Eq. (12.23), we get
$R = 1.03 \times 10^7\ \mathrm{m^{-1}}$
This is a value very close to the value ($1.097 \times 10^7\ \mathrm{m^{-1}}$) obtained from
the empirical Balmer formula. This agreement between the theoretical
and experimental values of the Rydberg constant provided a direct
and striking confirmation of Bohr's model.
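As a numerical check on Eq. (12.23), the sketch below evaluates $me^4/(8\varepsilon_0^2 h^3 c)$ directly. The constant values are standard SI figures assumed for this example (the text does not list the ones it used), and the function name is hypothetical. With these inputs the expression comes out at roughly $1.097 \times 10^7\ \mathrm{m^{-1}}$, so the rounded figure quoted in the text depends on the precision of the constants inserted.

```python
# Evaluate the Rydberg constant R = m e^4 / (8 eps0^2 h^3 c), Eq. (12.23).
# Constant values are assumed standard SI figures, not taken from the text.
ELECTRON_MASS = 9.1093837015e-31        # kg
ELEMENTARY_CHARGE = 1.602176634e-19     # C
VACUUM_PERMITTIVITY = 8.8541878128e-12  # F/m
PLANCK_CONSTANT = 6.62607015e-34        # J s
SPEED_OF_LIGHT = 2.99792458e8           # m/s


def rydberg_constant() -> float:
    """Return R in m^-1 from the Bohr-model expression."""
    numerator = ELECTRON_MASS * ELEMENTARY_CHARGE**4
    denominator = (8 * VACUUM_PERMITTIVITY**2
                   * PLANCK_CONSTANT**3 * SPEED_OF_LIGHT)
    return numerator / denominator


print(f"R = {rydberg_constant():.4e} m^-1")
```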

Since both $n_f$ and $n_i$ are integers, this immediately shows that in
transitions between different atomic levels, light is radiated at various
discrete frequencies. For the hydrogen spectrum, the Balmer formula
corresponds to $n_f = 2$ and $n_i = 3, 4, 5$, etc. The results of Bohr's
model suggested the presence of other series spectra for the hydrogen
atom: those corresponding to transitions resulting from $n_f = 1$ and $n_i =
2, 3$, etc.; $n_f = 3$ and $n_i = 4, 5$, etc., and so on. Such series were
identified in the course of spectroscopic investigations and are known
as the Lyman, Balmer, Paschen, Brackett, and Pfund series. The
electronic transitions corresponding to these series are shown in Fig.
12.9.
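The five series can be tabulated with a short Python sketch. It assumes the wavelength form of the Rydberg formula, $1/\lambda = R\,(1/n_f^2 - 1/n_i^2)$, which follows from Eq. (12.22) with $\nu = c/\lambda$, and it uses the empirical value $R = 1.097 \times 10^7\ \mathrm{m^{-1}}$. The dictionary and function names are illustrative only.

```python
# Wavelengths of hydrogen spectral series from 1/lambda = R (1/nf^2 - 1/ni^2).
RYDBERG = 1.097e7  # m^-1, empirical value quoted in the text

# Final quantum number n_f that defines each series.
SERIES_FINAL_LEVEL = {
    "Lyman": 1,
    "Balmer": 2,
    "Paschen": 3,
    "Brackett": 4,
    "Pfund": 5,
}


def wavelength_nm(n_f: int, n_i: int) -> float:
    """Wavelength (nm) of the line emitted in the transition n_i -> n_f."""
    if n_i <= n_f:
        raise ValueError("emission requires n_i > n_f")
    inverse_wavelength = RYDBERG * (1 / n_f**2 - 1 / n_i**2)  # m^-1
    return 1e9 / inverse_wavelength


def series_limit_nm(n_f: int) -> float:
    """Shortest wavelength (nm) of a series, reached as n_i -> infinity."""
    return 1e9 * n_f**2 / RYDBERG


for name, n_f in SERIES_FINAL_LEVEL.items():
    first_line = wavelength_nm(n_f, n_f + 1)
    limit = series_limit_nm(n_f)
    print(f"{name:9s} first line {first_line:8.1f} nm, limit {limit:8.1f} nm")
```

Only the Balmer lines fall in the visible range; the Lyman series lies in the ultraviolet and the remaining series in the infrared.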

FRANCK-HERTZ EXPERIMENT

The existence of discrete energy levels in an atom was directly verified in 1914
by James Franck and Gustav Hertz. They studied the spectrum of mercury
vapour when electrons having different kinetic energies passed through the
vapour. The electron energy was varied by subjecting the electrons to electric
fields of varying strength. The electrons collide with the mercury atoms and
can transfer energy to the mercury atoms. This can only happen when the
energy of the electron is higher than the energy difference between an energy
level of Hg occupied by an electron and a higher unoccupied level (see
Figure). For instance, the difference between an occupied energy level of Hg
and a higher unoccupied level is 4.9 eV. If an electron having an energy of
4.9 eV or more passes through mercury, an electron in a mercury atom can
absorb energy from the bombarding electron and get excited to the higher
level [Fig. (a)]. The colliding electron's kinetic energy would reduce by this
amount.

The excited electron would subsequently fall back to the ground state by
emission of radiation [Fig. (b)]. The wavelength of emitted radiation is:

$\lambda = hc/E$ = 253 nm, with E = 4.9 eV
By direct measurement, Franck and Hertz found that the emission spectrum of
mercury has a line corresponding to this wavelength. For this experimental
verification of Bohr's basic ideas of discrete energy levels in atoms and the
process of photon emission, Franck and Hertz were awarded the Nobel Prize in
1925.
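The 253 nm figure can be reproduced with a minimal Python sketch of $\lambda = hc/E$. The rounded constant values and the function name are assumptions made for this illustration.

```python
# Wavelength of the photon emitted when an atom de-excites by a given energy.
PLANCK_CONSTANT = 6.626e-34  # J s (rounded, assumed)
SPEED_OF_LIGHT = 3.0e8       # m/s (rounded, assumed)
JOULES_PER_EV = 1.602e-19    # J per electronvolt (rounded, assumed)


def photon_wavelength_nm(energy_ev: float) -> float:
    """Return lambda = h c / E in nanometres for a photon of energy E (eV)."""
    if energy_ev <= 0:
        raise ValueError("photon energy must be positive")
    energy_joule = energy_ev * JOULES_PER_EV
    return PLANCK_CONSTANT * SPEED_OF_LIGHT / energy_joule * 1e9


# The 4.9 eV level spacing of mercury quoted in the text.
print(round(photon_wavelength_nm(4.9)), "nm")
```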

The various lines in the atomic spectra are produced when electrons
jump from a higher energy state to a lower energy state and photons are
emitted. These spectral lines are called emission lines. But when an
atom absorbs a photon that has precisely the same energy needed by
the electron in a lower energy state to make transitions to a higher
energy state, the process is called absorption. Thus if photons with a
continuous range of frequencies pass through a rarefied gas and then
are analysed with a spectrometer, a series of dark spectral absorption
lines appear in the continuous spectrum. The dark lines indicate the
frequencies that have been absorbed by the atoms of the gas.
The explanation of the hydrogen atom spectrum provided by Bohr's
model was a brilliant achievement, which greatly stimulated progress
towards the modern quantum theory. In 1922, Bohr was awarded the
Nobel Prize in Physics.
Figure 12.9 Line spectra originate in transitions between energy levels.

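Example 12.6 Using the Rydberg formula, calculate the wavelengths
of the first four spectral lines in the Lyman series of the hydrogen
spectrum.
Solution The Rydberg formula is

$hc/\lambda_{if} = \frac{me^4}{8\varepsilon_0^2 h^2}\left(\frac{1}{n_f^2} - \frac{1}{n_i^2}\right)$
The wavelengths of the first four lines in the Lyman series correspond
to transitions from $n_i = 2, 3, 4, 5$ to $n_f = 1$. We know that

$\frac{me^4}{8\varepsilon_0^2 h^2} = 13.6\ \mathrm{eV} = 21.76 \times 10^{-19}\ \mathrm{J}$

Therefore,

$\lambda_{i1} = \frac{hc}{21.76 \times 10^{-19}\left(1 - \frac{1}{n_i^2}\right)}\ \mathrm{m} = \frac{6.625 \times 10^{-34} \times 3 \times 10^{8} \times n_i^2}{21.76 \times 10^{-19} \times (n_i^2 - 1)}\ \mathrm{m}$
$= 913.4\, n_i^2/(n_i^2 - 1)$ Å

Substituting $n_i = 2, 3, 4, 5$, we get $\lambda_{21}$ = 1218 Å, $\lambda_{31}$ = 1028 Å, $\lambda_{41}$ =
974.3 Å, and $\lambda_{51}$ = 951.4 Å.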

12.6 DE BROGLIE'S EXPLANATION OF BOHR'S SECOND POSTULATE OF QUANTISATION

Of all the postulates Bohr made in his model of the atom, perhaps the
most puzzling is his second postulate. It states that the angular
momentum of the electron orbiting around the nucleus is quantised
(that is, $L_n = nh/2\pi$; n = 1, 2, 3, ...). Why should the angular
momentum have only those values that are integral multiples of $h/2\pi$?
The French physicist Louis de Broglie explained this puzzle in 1923,
ten years after Bohr proposed his model.
We studied, in Chapter 11, de Broglie's hypothesis that
material particles, such as electrons, also have a wave nature. C. J.
Davisson and L. H. Germer later experimentally verified the wave
nature of electrons in 1927. Louis de Broglie argued that the electron
in its circular orbit, as proposed by Bohr, must be seen as a particle
wave. In analogy to waves travelling on a string, particle waves too
can lead to standing waves under resonant conditions. From Chapter
15 of Class XI Physics textbook, we know that when a string is
plucked, a vast number of wavelengths are excited. However only
those wavelengths survive which have nodes at the ends and form the
standing wave in the string. It means that in a string, standing waves
are formed when the total distance travelled by a wave down the
string and back is one wavelength, two wavelengths, or any integral
number of wavelengths. Waves with other wavelengths interfere with
themselves upon reflection and their amplitudes quickly drop to zero.
For an electron moving in the nth circular orbit of radius $r_n$, the total
distance is the circumference of the orbit, $2\pi r_n$. Thus
$2\pi r_n = n\lambda$, n = 1, 2, 3, ... (12.24)
Figure 12.10 illustrates a standing particle wave on a circular orbit for
n = 4, i.e., $2\pi r_n = 4\lambda$, where $\lambda$ is the de Broglie wavelength of the
electron moving in the nth orbit. From Chapter 11, we have $\lambda = h/p$, where
p is the magnitude of the electron's momentum. If the speed of the
electron is much less than the speed of light, the momentum is $mv_n$.
Thus, $\lambda = h/mv_n$. From Eq. (12.24), we have
$2\pi r_n = nh/mv_n$ or $mv_n r_n = nh/2\pi$
This is the quantum condition proposed by Bohr for the angular
momentum of the electron [Eq. (12.13)]. In Section 12.5, we saw that
this equation is the basis of explaining the discrete orbits and energy
levels in the hydrogen atom. Thus de Broglie's hypothesis provided an
explanation for Bohr's second postulate for the quantisation of angular
momentum of the orbiting electron. The quantised electron orbits and
energy states are due to the wave nature of the electron and only
resonant standing waves can persist.
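The standing-wave condition can be checked numerically. The sketch below assumes the standard Bohr-model results for the orbit radius, $r_n = n^2 h^2 \varepsilon_0/(\pi m e^2)$, and the orbital speed, $v_n = e^2/(2\varepsilon_0 h n)$, together with standard SI constant values; the function names are hypothetical. It counts how many de Broglie wavelengths fit into the circumference of the nth orbit.

```python
import math

# Standard SI values, assumed for this illustration.
ELECTRON_MASS = 9.1093837015e-31        # kg
ELEMENTARY_CHARGE = 1.602176634e-19     # C
VACUUM_PERMITTIVITY = 8.8541878128e-12  # F/m
PLANCK_CONSTANT = 6.62607015e-34        # J s


def orbit_radius(n: int) -> float:
    """Bohr radius of the nth orbit: r_n = n^2 h^2 eps0 / (pi m e^2), in m."""
    return (n**2 * PLANCK_CONSTANT**2 * VACUUM_PERMITTIVITY
            / (math.pi * ELECTRON_MASS * ELEMENTARY_CHARGE**2))


def orbit_speed(n: int) -> float:
    """Electron speed in the nth orbit: v_n = e^2 / (2 eps0 h n), in m/s."""
    return ELEMENTARY_CHARGE**2 / (2 * VACUUM_PERMITTIVITY * PLANCK_CONSTANT * n)


def de_broglie_wavelength(n: int) -> float:
    """lambda = h / (m v_n) for the electron in the nth orbit, in m."""
    return PLANCK_CONSTANT / (ELECTRON_MASS * orbit_speed(n))


def wavelengths_in_orbit(n: int) -> float:
    """Number of de Broglie wavelengths in the circumference 2 pi r_n."""
    return 2 * math.pi * orbit_radius(n) / de_broglie_wavelength(n)


for n in range(1, 5):
    print(n, round(wavelengths_in_orbit(n), 6))
```

The ratio equals n for every orbit, which is Eq. (12.24) restated: the allowed orbits are exactly those whose circumference holds a whole number of wavelengths.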
Bohr's model, involving a classical trajectory picture (planet-like electron
orbiting the nucleus), correctly predicts the gross features of the
hydrogenic atoms*, in particular, the frequencies of the radiation
emitted or selectively absorbed. This model however has many
limitations.
Some are:

(i) The Bohr model is applicable to hydrogenic atoms. It cannot be
extended even to mere two-electron atoms such as helium. The
analysis of atoms with more than one electron was attempted on the
lines of Bohr's model for hydrogenic atoms but did not meet with any
success. The difficulty lies in the fact that each electron interacts not only
with the positively charged nucleus but also with all other electrons.
The formulation of the Bohr model involves the electrical force between
the positively charged nucleus and the electron. It does not include the
electrical forces between electrons which necessarily appear in
multi-electron atoms.

Figure 12.10 A standing wave is shown on a circular orbit where four de Broglie
wavelengths fit into the circumference of the orbit.

(ii) While Bohr's model correctly predicts the frequencies of the
light emitted by hydrogenic atoms, the model is unable to explain the
relative intensities of the frequencies in the spectrum. In the emission
spectrum of hydrogen, some of the visible frequencies have weak
intensity, others strong. Why? Experimental observations show that
some transitions are more favoured than others. Bohr's model is
unable to account for the intensity variations.
Bohr's model presents an elegant picture of an atom but cannot be
generalised to complex atoms. For complex atoms we have to use a
new and radical theory based on Quantum Mechanics, which provides
a more complete picture of the atomic structure.

* Hydrogenic atoms are the atoms consisting of a nucleus with
positive charge +Ze and a single electron, where Z is the proton
number. Examples are hydrogen atom, singly ionised helium, doubly
ionised lithium, and so forth. In these atoms more complex
electron-electron interactions are nonexistent.

LASER LIGHT

Imagine a crowded market place or a railway platform with people entering a
gate and going in all directions. Their footsteps are random and there is
no phase correlation between them. On the other hand, think of a large
number of soldiers in a regulated march. Their footsteps are very well
correlated. See figure here.

This is similar to the difference between light emitted by an ordinary source like
a candle or a bulb and that emitted by a laser. The acronym LASER stands for
Light Amplification by Stimulated Emission of Radiation. Since its development
in 1960, it has entered into all areas of science and technology. It has found
applications in physics, chemistry, biology, medicine, surgery, engineering, etc.
There are low power lasers, with a power of 0.5 mW, called pencil lasers,
which serve as pointers. There are also lasers of different power, suitable for
delicate surgery of eye or glands in the stomach. Finally, there are lasers
which can cut or weld steel.
Light is emitted from a source in the form of packets of waves. Light coming
out from an ordinary source contains a mixture of many wavelengths. There is
also no phase relation between the various waves. Therefore, such light, even
if it is passed through an aperture, spreads very fast and the beam size
increases rapidly with distance. In the case of laser light, the wavelength of
each packet is almost the same. Also the average length of the packet of
waves is much larger. This means that there is better phase correlation over a
longer duration of time. This results in reducing the divergence of a laser beam
substantially.
If there are N atoms in a source, each emitting light with intensity I, then the
total intensity produced by an ordinary source is proportional to NI, whereas in
a laser source, it is proportional to $N^2 I$. Considering that N is very large, we
see that the light from a laser can be much stronger than that from an ordinary
source.
When astronauts of the Apollo missions visited the moon, they placed a mirror
on its surface, facing the earth. Then scientists on the earth sent a strong laser
beam, which was reflected by the mirror on the moon and received back on
the earth. The size of the reflected laser beam and the time taken for the round
trip were measured. This allowed a very accurate determination of (a) the
extremely small divergence of a laser beam and (b) the distance of the moon
from the earth.

Summary

1. An atom, as a whole, is electrically neutral and therefore contains equal amounts
of positive and negative charge.
2. In Thomson's model, an atom is a spherical cloud of positive charges with
electrons embedded in it.
3. In Rutherford's model, most of the mass of the atom and all its positive
charge are concentrated in a tiny nucleus (typically about one ten-thousandth the
size of an atom), and the electrons revolve around it.
4. Rutherford's nuclear model has two main difficulties in explaining the structure
of atom: (a) It predicts that atoms are unstable because the accelerated
electrons revolving around the nucleus must spiral into the nucleus. This
contradicts the stability of matter. (b) It cannot explain the characteristic line
spectra of atoms of different elements.
5. Atoms of each element are stable and emit a characteristic spectrum. The
spectrum consists of a set of isolated parallel lines termed a line spectrum. It
provides useful information about the atomic structure.
6. Atomic hydrogen emits a line spectrum consisting of various series. The
frequency of any line in a series can be expressed as a difference of two
terms:

Lyman series: $\nu = Rc\left(\frac{1}{1^2} - \frac{1}{n^2}\right)$; n = 2, 3, 4,...

Balmer series: $\nu = Rc\left(\frac{1}{2^2} - \frac{1}{n^2}\right)$; n = 3, 4, 5,...

Paschen series: $\nu = Rc\left(\frac{1}{3^2} - \frac{1}{n^2}\right)$; n = 4, 5, 6,...

Brackett series: $\nu = Rc\left(\frac{1}{4^2} - \frac{1}{n^2}\right)$; n = 5, 6, 7,...

Pfund series: $\nu = Rc\left(\frac{1}{5^2} - \frac{1}{n^2}\right)$; n = 6, 7, 8,...


7. To explain the line spectra emitted by atoms, as well as the stability of
atoms, Niels Bohr proposed a model for hydrogenic (single electron) atoms. He
introduced three postulates and laid the foundations of quantum mechanics:
(a) In a hydrogen atom, an electron revolves in certain stable orbits (called
stationary orbits) without the emission of radiant energy.
(b) The stationary orbits are those for which the angular momentum is some
integral multiple of $h/2\pi$. (Bohr's quantisation condition.) That is, $L = nh/2\pi$,
where n is an integer called a quantum number.
(c) The third postulate states that an electron might make a transition from one
of its specified non-radiating orbits to another of lower energy. When it does
so, a photon is emitted having energy equal to the energy difference between
the initial and final states. The frequency ($\nu$) of the emitted photon is then given
by
$h\nu = E_i - E_f$
An atom absorbs radiation of the same frequency the atom emits, in which
case the electron is transferred to an orbit with a higher value of n.
$E_i + h\nu = E_f$
8. As a result of the quantisation condition of angular momentum, the electron
orbits the nucleus at only specific radii. For a hydrogen atom it is given by

$r_n = \left(\frac{n^2}{m}\right)\left(\frac{h}{2\pi}\right)^2 \frac{4\pi\varepsilon_0}{e^2}$
The total energy is also quantised:

$E_n = -\frac{me^4}{8n^2\varepsilon_0^2 h^2} = -13.6\ \mathrm{eV}/n^2$
The n = 1 state is called the ground state. In the hydrogen atom the ground state
energy is −13.6 eV. Higher values of n correspond to excited states (n > 1).
Atoms are excited to these higher states by collisions with other atoms or
electrons or by absorption of a photon of right frequency.

9. de Broglie's hypothesis that electrons have a wavelength $\lambda = h/mv$ gave an
explanation for Bohr's quantised orbits by bringing in the wave-particle duality.
The orbits correspond to circular standing waves in which the circumference of
the orbit equals a whole number of wavelengths.
10. Bohr's model is applicable only to hydrogenic (single electron) atoms. It
cannot be extended to even two-electron atoms such as helium. This model is
also unable to explain the relative intensities of the frequencies emitted
even by hydrogenic atoms.
POINTS TO PONDER

1. Both Thomson's and Rutherford's models constitute an
unstable system. Thomson's model is unstable electrostatically, while
Rutherford's model is unstable because of electromagnetic radiation of orbiting
electrons.
2. What made Bohr quantise angular momentum (second postulate) and not
some other quantity? Note, h has dimensions of angular momentum, and for
circular orbits, angular momentum is a very relevant quantity. The second
postulate is then so natural!
3. The orbital picture in Bohr's model of the hydrogen atom was inconsistent
with the uncertainty principle. It was replaced by modern quantum mechanics
in which Bohr's orbits are regions where the electron may be found with large
probability.
4. Unlike the situation in the solar system, where planet-planet gravitational
forces are very small as compared to the gravitational force of the sun on each
planet (because the mass of the sun is so much greater than the mass of any
of the planets), the electron-electron electric force interaction is comparable in
magnitude to the electron-nucleus electrical force, because the charges and
distances are of the same order of magnitude. This is the reason why
Bohr's model with its planet-like electron is not applicable to many-electron
atoms.
5. Bohr laid the foundation of the quantum theory by postulating specific orbits
in which electrons do not radiate. Bohr's model includes only one quantum
number n. The new theory called quantum mechanics supports Bohr's
postulate. However, in quantum mechanics (more generally accepted), a given
energy level may not correspond to just one quantum state. For example, a
state is characterised by four quantum numbers (n, l, m, and s), but for a pure
Coulomb potential (as in hydrogen atom) the energy depends only on n.
6. In Bohr's model, contrary to ordinary classical expectation, the frequency of
revolution of an electron in its orbit is not connected to the frequency of
the spectral line. The latter is the difference between two orbital energies divided
by h. For transitions between large quantum numbers (n to n − 1, n very large),
however, the two coincide as expected.
7. Bohr's semiclassical model, based on some aspects of classical physics and
some aspects of modern physics, also does not provide a true picture of the
simplest hydrogenic atoms. The true picture is a quantum mechanical affair
which differs from Bohr's model in a number of fundamental ways. But then if
the Bohr model is not strictly correct, why do we bother about it? The reasons
which make Bohr's model still useful are:
(iii) The model demonstrates how a theoretical physicist occasionally must
quite literally ignore certain problems of approach in hopes of being able to
make some predictions. If the predictions of the theory or model agree with
experiment, a theoretician then must somehow hope to explain away or
rationalise the problems that were ignored along the way.

EXERCISES
12.1 Choose the correct alternative from the clues given at the end
of each statement:
(a) The size of the atom in Thomson's model is .......... the atomic
size in Rutherford's model. (much greater than/no different
from/much less than.)
(b) In the ground state of .......... electrons are in stable equilibrium,
while in .......... electrons always experience a net force.
(Thomson's model/ Rutherford's model.)
(c) A classical atom based on .......... is doomed to collapse.
(Thomson's model/ Rutherford's model.)
(d) An atom has a nearly continuous mass distribution in a ..........
but has a highly non-uniform mass distribution in ..........
(Thomson's model/ Rutherford's model.)
(e) The positively charged part of the atom possesses most of the
mass in .......... (Rutherford's model/both the models.)
12.2 Suppose you are given a chance to repeat the alpha-particle
scattering experiment using a thin sheet of solid hydrogen in place
of the gold foil. (Hydrogen is a solid at temperatures below 14 K.)
What results do you expect?
12.3 What is the shortest wavelength present in the Paschen
series of spectral lines?
12.4 A difference of 2.3 eV separates two energy levels in an
atom. What is the frequency of radiation emitted when the atom
makes a transition from the upper level to the lower level?
12.5 The ground state energy of hydrogen atom is −13.6 eV. What
are the kinetic and potential energies of the electron in this state?
12.6 A hydrogen atom initially in the ground level absorbs a
photon, which excites it to the n = 4 level. Determine the
wavelength and frequency of photon.
12.7 (a) Using Bohr's model, calculate the speed of the
electron in a hydrogen atom in the n = 1, 2, and 3 levels. (b)
Calculate the orbital period in each of these levels.
12.8 The radius of the innermost electron orbit of a hydrogen atom
is $5.3 \times 10^{-11}$ m. What are the radii of the n = 2 and n = 3 orbits?

12.9 A 12.5 eV electron beam is used to bombard gaseous
hydrogen at room temperature. What series of wavelengths will be
emitted?
12.10 In accordance with Bohr's model, find the quantum
number that characterises the earth's revolution around the sun in
an orbit of radius $1.5 \times 10^{11}$ m with orbital speed $3 \times 10^{4}$ m/s.
(Mass of earth = $6.0 \times 10^{24}$ kg.)
Additional Exercises

12.11 Answer the following questions, which help you understand
the difference between Thomson's model and Rutherford's model
better.
(a) Is the average angle of deflection of α-particles by a thin gold
foil predicted by Thomson's model much less, about the same, or
much greater than that predicted by Rutherford's model?
(b) Is the probability of backward scattering (i.e., scattering of
α-particles at angles greater than 90°) predicted by Thomson's
model much less, about the same, or much greater than that
predicted by Rutherford's model?
(c) Keeping other factors fixed, it is found experimentally that for
small thickness t, the number of α-particles scattered at moderate
angles is proportional to t. What clue does this linear dependence
on t provide?
(d) In which model is it completely wrong to ignore multiple
scattering for the calculation of average angle of scattering of
α-particles by a thin foil?
12.12 The gravitational attraction between electron and proton in a
hydrogen atom is weaker than the coulomb attraction by a factor
of about $10^{-40}$. An alternative way of looking at this fact is to
estimate the radius of the first Bohr orbit of a hydrogen atom if the
electron and proton were bound by gravitational attraction. You will
find the answer interesting.
12.13 Obtain an expression for the frequency of radiation emitted
when a hydrogen atom de-excites from level n to level (n − 1). For
large n, show that this frequency equals the classical frequency of
revolution of the electron in the orbit.
12.14 Classically, an electron can be in any orbit around the
nucleus of an atom. Then what determines the typical atomic size?
Why is an atom not, say, thousand times bigger than its typical
size? The question had greatly puzzled Bohr before he arrived at
his famous model of the atom that you have learnt in the text. To
simulate what he might well have done before his discovery, let us
play as follows with the basic constants of nature and see if we
can get a quantity with the dimensions of length that is roughly
equal to the known size of an atom (~ $10^{-10}$ m).
(a) Construct a quantity with the dimensions of length from the
fundamental constants e, me, and c. Determine its numerical
value.
(b) You will find that the length obtained in (a) is many orders of
magnitude smaller than the atomic dimensions. Further, it involves
c. But energies of atoms are mostly in non-relativistic domain
where c is not expected to play any role. This is what may have
prompted Bohr to discard c and look for something else to get
the right atomic size. Now, Planck's constant h had already
made its appearance elsewhere. Bohr's great insight lay in
recognising that h, $m_e$, and e will yield the right atomic size.
Construct a quantity with the dimension of length from h, me, and
e and confirm that its numerical value has indeed the correct order
of magnitude.
12.15 The total energy of an electron in the first excited state of
the hydrogen atom is about −3.4 eV.
(a) What is the kinetic energy of the electron in this state?
(b) What is the potential energy of the electron in this state?
(c) Which of the answers above would change if the choice of the
zero of potential energy is changed?

12.16 If Bohr's quantisation postulate (angular momentum =
$nh/2\pi$) is a basic law of nature, it should be equally valid for the
case of planetary motion also. Why then do we never speak of
quantisation of orbits of planets around the sun?
12.17 Obtain the first Bohr radius and the ground state energy of
a muonic hydrogen atom [i.e., an atom in which a negatively
charged muon ($\mu^-$) of mass about $207\,m_e$ orbits around a proton].
Chapter Thirteen

NUCLEI

13.1 Introduction

In the previous chapter, we have learnt that in every atom, the positive
charge and mass are densely concentrated at the centre of the atom
forming its nucleus. The overall dimensions of a nucleus are much
smaller than those of an atom. Experiments on scattering of
α-particles demonstrated that the radius of a nucleus was smaller than
the radius of an atom by a factor of about $10^4$. This means the volume
of a nucleus is about $10^{-12}$ times the volume of the atom. In other
words, an atom is almost empty. If an atom is enlarged to the size of a
classroom, the nucleus would be of the size of pinhead. Nevertheless,
the nucleus contains most (more than 99.9%) of the mass of an atom.
Does the nucleus have a structure, just as the atom does? If so, what
are the constituents of the nucleus? How are these held together? In
this chapter, we shall look for answers to such questions. We shall
discuss various properties of nuclei such as their size, mass and
stability, and also associated nuclear phenomena such as
radioactivity, fission and fusion.

13.2 Atomic Masses and Composition of Nucleus


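The mass of an atom is very small, compared to a kilogram; for
example, the mass of a carbon atom, $^{12}\mathrm{C}$, is $1.992647 \times 10^{-26}$ kg.
The kilogram is not a very convenient unit to measure such small
quantities. Therefore, a different mass unit is used for expressing
atomic masses. This unit is the atomic mass unit (u), defined as 1/12th
of the mass of the carbon ($^{12}\mathrm{C}$) atom. According to this definition

$1\ \mathrm{u} = \frac{\text{mass of one } ^{12}\mathrm{C} \text{ atom}}{12} = \frac{1.992647 \times 10^{-26}\ \mathrm{kg}}{12} = 1.660539 \times 10^{-27}\ \mathrm{kg}$ (13.1)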
The atomic masses of various elements expressed in atomic mass
unit (u) are close to being integral multiples of the mass of a hydrogen
atom. There are, however, many striking exceptions to this rule. For
example, the atomic mass of chlorine atom is 35.46 u.

Accurate measurement of atomic masses is carried out with a mass
spectrometer. The measurement of atomic masses reveals the
existence of different types of atoms of the same element, which
exhibit the same chemical properties, but differ in mass. Such atomic
species of the same element differing in mass are called isotopes. (In
Greek, isotope means the same place, i.e. they occur in the same
place in the periodic table of elements.) It was found that practically
every element consists of a mixture of several isotopes. The relative
abundance of different isotopes differs from element to element.
Chlorine, for example, has two isotopes having masses 34.98 u and
36.98 u, which are nearly integral multiples of the mass of a hydrogen
atom. The relative abundances of these isotopes are 75.4 and 24.6
per cent, respectively. Thus, the average mass of a chlorine atom is
obtained by the weighted average of the masses of the two isotopes,
which works out to be

$\frac{75.4 \times 34.98 + 24.6 \times 36.98}{100}$
= 35.47 u
which agrees with the atomic mass of chlorine.
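The weighted average can be written as a short Python sketch. The function name and the list-of-pairs layout are choices made for this illustration; the masses and abundances are the chlorine figures quoted above.

```python
# Abundance-weighted average atomic mass of an element.
def average_atomic_mass(isotopes):
    """isotopes: list of (mass in u, relative abundance in per cent) pairs."""
    total_abundance = sum(abundance for _, abundance in isotopes)
    if total_abundance <= 0:
        raise ValueError("abundances must sum to a positive value")
    weighted_sum = sum(mass * abundance for mass, abundance in isotopes)
    return weighted_sum / total_abundance


# Chlorine: two isotopes with the masses and abundances given in the text.
chlorine = [(34.98, 75.4), (36.98, 24.6)]
print(round(average_atomic_mass(chlorine), 2), "u")
```

Dividing by the summed abundance rather than by 100 keeps the result correct even when the listed percentages do not add up to exactly 100.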
Even the lightest element, hydrogen, has three isotopes having
masses 1.0078 u, 2.0141 u, and 3.0160 u. The nucleus of the lightest
atom of hydrogen, which has a relative abundance of 99.985%, is
called the proton. The mass of a proton is

$m_p = 1.00727\ \mathrm{u} = 1.67262 \times 10^{-27}\ \mathrm{kg}$ (13.2)
This is equal to the mass of the hydrogen atom (= 1.00783 u), minus
the mass of a single electron ($m_e$ = 0.00055 u). The other two
isotopes of hydrogen are called deuterium and tritium. Tritium nuclei,
being unstable, do not occur naturally and are produced artificially in
laboratories.
The positive charge in the nucleus is that of the protons. A proton
carries one unit of fundamental charge and is stable. It was earlier
thought that the nucleus may contain electrons, but this was ruled out
later using arguments based on quantum theory. All the electrons of
an atom are outside the nucleus. We know that the number of these
electrons outside the nucleus of the atom is Z, the atomic number.
The total charge of the atomic electrons is thus (−Ze), and since the
atom is neutral, the charge of the nucleus is (+Ze). The number of
protons in the nucleus of the atom is, therefore, exactly Z, the atomic
number.

Discovery of Neutron

Since the nuclei of deuterium and tritium are isotopes of hydrogen,
they must contain only one proton each. But the masses of the nuclei
of hydrogen, deuterium and tritium are in the ratio of 1:2:3. Therefore,
the nuclei of deuterium and tritium must contain, in addition to a
proton, some neutral matter. The amount of neutral matter present in
the nuclei of these isotopes, expressed in units of mass of a proton, is
approximately equal to one and two, respectively. This fact indicates
that the nuclei of atoms contain, in addition to protons, neutral matter
in multiples of a basic unit. This hypothesis was verified in 1932 by
James Chadwick who observed emission of neutral radiation when
beryllium nuclei were bombarded with alpha-particles. (α-particles are
helium nuclei, to be discussed in a later section). It was found that this
neutral radiation could knock out protons from light nuclei such as
those of helium, carbon and nitrogen. The only neutral radiation
known at that time was photons (electromagnetic radiation).
Application of the principles of conservation of energy and momentum
showed that if the neutral radiation consisted of photons, the energy of
photons would have to be much higher than is available from the
bombardment of beryllium nuclei with α-particles. The clue to this
puzzle, which Chadwick satisfactorily solved, was to assume that the
neutral radiation consists of a new type of neutral particles called
neutrons. From conservation of energy and momentum, he was able
to determine the mass of the new particle as very nearly the same as
the mass of the proton.
The mass of a neutron is now known to a high degree of accuracy. It
is

$m_n = 1.00866\ \mathrm{u} = 1.6749 \times 10^{-27}\ \mathrm{kg}$ (13.3)
Chadwick was awarded the 1935 Nobel Prize in Physics for his
discovery of the neutron.
A free neutron, unlike a free proton, is unstable. It decays into a
proton, an electron and an antineutrino (another elementary particle),
and has a mean life of about 1000 s. It is, however, stable inside the
nucleus.
The composition of a nucleus can now be described using the
following terms and symbols:

Z - atomic number = number of protons [13.4(a)]
N - neutron number = number of neutrons [13.4(b)]
A - mass number = Z + N = total number of protons and neutrons [13.4(c)]
One also uses the term nucleon for a proton or a neutron. Thus the
number of nucleons in an atom is its mass number A.

Nuclear species or nuclides are shown by the notation $^{A}_{Z}\mathrm{X}$ where X is
the chemical symbol of the species. For example, the nucleus of gold
is denoted by $^{197}_{79}\mathrm{Au}$. It contains 197 nucleons, of which 79 are
protons and the remaining 118 are neutrons.
The composition of isotopes of an element can now be readily
explained. The nuclei of isotopes of a given element contain the same
number of protons, but differ from each other in their number of
neutrons. Deuterium, $^{2}_{1}\mathrm{H}$, which is an isotope of hydrogen, contains
one proton and one neutron. Its other isotope tritium, $^{3}_{1}\mathrm{H}$, contains
one proton and two neutrons. The element gold has 32 isotopes,
ranging from A = 173 to A = 204. We have already mentioned that
chemical properties of elements depend on their electronic structure.
As the atoms of isotopes have identical electronic structure they have
identical chemical behaviour and are placed in the same location in
the periodic table.

All nuclides with same mass number A are called isobars. For
example, the nuclides $^{3}_{1}\mathrm{H}$ and $^{3}_{2}\mathrm{He}$ are isobars. Nuclides with same
neutron number N but different atomic number Z, for example
$^{198}_{80}\mathrm{Hg}$ and $^{197}_{79}\mathrm{Au}$, are called isotones.
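The three definitions can be made concrete with a small Python sketch. The `Nuclide` class and the `relation` function are hypothetical names introduced for this illustration, and the sample nuclides (deuterium, tritium, helium-3, gold-197 and mercury-198) are chosen as familiar examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nuclide:
    """A nuclear species described by its atomic number Z and mass number A."""
    symbol: str
    Z: int  # number of protons
    A: int  # number of nucleons

    @property
    def N(self) -> int:
        """Neutron number, from A = Z + N."""
        return self.A - self.Z


def relation(a: Nuclide, b: Nuclide) -> str:
    """Classify a pair of distinct nuclides as isotopes, isobars or isotones."""
    if a.Z == b.Z and a.A != b.A:
        return "isotopes"   # same proton number, different mass number
    if a.A == b.A and a.Z != b.Z:
        return "isobars"    # same mass number
    if a.N == b.N and a.Z != b.Z:
        return "isotones"   # same neutron number
    return "none"


deuterium = Nuclide("H", 1, 2)
tritium = Nuclide("H", 1, 3)
helium3 = Nuclide("He", 2, 3)
gold197 = Nuclide("Au", 79, 197)
mercury198 = Nuclide("Hg", 80, 198)

print(gold197.N)                      # neutrons in gold-197
print(relation(deuterium, tritium))
print(relation(tritium, helium3))
print(relation(mercury198, gold197))
```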

13.3 SIZE OF THE NUCLEUS


As we have seen in Chapter 12, Rutherford was the pioneer who
postulated and established the existence of the atomic nucleus. At
Rutherfords suggestion, Geiger and Marsden performed their classic
experiment: on the scattering of -particles from thin gold foils. Their
experiments revealed that the distance of closest approach to a gold
nucleus of an -particle of kinetic energy 5.5 MeV is about 4.0 1014
m. The scattering of -particle by the gold sheet could be understood
by Rutherford by assuming that the coulomb repulsive force was
solely responsible for scattering. Since the positive charge is confined
to the nucleus, the actual size of the nucleus has to be less than 4.0
1014 m.
If we use $\alpha$-particles of higher energies than 5.5 MeV, the distance of
closest approach to the gold nucleus will be smaller, and at some point
the scattering will begin to be affected by the short-range nuclear
forces and differ from Rutherford's calculations. Rutherford's
calculations are based on pure Coulomb repulsion between the
positive charges of the $\alpha$-particle and the gold nucleus. From the
distance at which deviations set in, nuclear sizes can be inferred.
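
The dependence of the closest-approach distance on energy can be checked with a short calculation. The sketch below is illustrative only. It assumes a head-on collision with a fixed point-like nucleus, so that the kinetic energy K of the $\alpha$-particle is entirely converted to Coulomb potential energy at $d = 2Ze^2/(4\pi\varepsilon_0 K)$, and it uses the rounded standard value $e^2/(4\pi\varepsilon_0) \approx 1.44$ MeV fm. The function name is hypothetical.

```python
COULOMB_MEV_FM = 1.44  # e^2/(4*pi*eps0) in MeV fm (rounded standard value)

def closest_approach_m(kinetic_energy_mev, z_target=79, z_projectile=2):
    """Head-on distance of closest approach in metres.

    Assumes point charges and a target nucleus that does not recoil.
    Defaults describe an alpha-particle (Z = 2) hitting gold (Z = 79).
    """
    d_fm = z_projectile * z_target * COULOMB_MEV_FM / kinetic_energy_mev
    return d_fm * 1e-15  # convert femtometres to metres

# Higher energies bring the alpha-particle closer to the nucleus.
for energy_mev in (5.5, 10.0, 20.0):
    print(energy_mev, "MeV ->", closest_approach_m(energy_mev), "m")
```

For 5.5 MeV this gives a distance of a few times $10^{-14}$ m, in line with the value quoted above.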

By performing scattering experiments in which fast electrons, instead
of $\alpha$-particles, are projectiles that bombard targets made up of various
elements, the sizes of nuclei of various elements have been
accurately measured.
It has been found that a nucleus of mass number A has a radius
$R = R_0 A^{1/3}$ (13.5)
where $R_0 = 1.2 \times 10^{-15}$ m. This means the volume of the nucleus,
which is proportional to $R^3$, is proportional to A. Thus the density of
a nucleus is a constant, independent of A, for all nuclei. Different nuclei
are like drops of liquid of constant density. The density of nuclear
matter is approximately $2.3 \times 10^{17}$ kg m$^{-3}$. This density is very large
compared to that of ordinary matter, say water, which is $10^{3}$ kg m$^{-3}$. This is
understandable, as we have already seen that most of the atom is
empty. Ordinary matter consisting of atoms has a large amount of
empty space.
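
The claim that nuclear density does not depend on A follows directly from Eq. (13.5), and a few lines of code make this concrete. This is a minimal sketch under stated assumptions: the nucleus is treated as a uniform sphere, its mass is approximated as A atomic mass units (an assumption, since real nuclear masses differ slightly), and the function names are hypothetical.

```python
import math

R0 = 1.2e-15       # m, constant in Eq. (13.5)
U_KG = 1.6605e-27  # kg, one atomic mass unit

def nuclear_radius(A):
    """Radius in metres from R = R0 * A**(1/3)."""
    return R0 * A ** (1.0 / 3.0)

def nuclear_density(A):
    """Mass / volume in kg m^-3, with the mass approximated as A u (assumption)."""
    mass_kg = A * U_KG
    volume = (4.0 / 3.0) * math.pi * nuclear_radius(A) ** 3
    return mass_kg / volume

# The radius grows with A, but the density stays the same.
for A in (16, 56, 197, 238):
    print(A, nuclear_radius(A), nuclear_density(A))
```

Because the volume scales as A and the mass also scales as A, the factor A cancels and every nucleus comes out at roughly $2.3 \times 10^{17}$ kg m$^{-3}$.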

Example 13.1 Given the mass of the iron nucleus as 55.85 u and A = 56, find the
nuclear density.
Solution

$m_{\mathrm{Fe}} = 55.85\ \mathrm{u} = 9.27 \times 10^{-26}$ kg

Nuclear density = mass/volume = $\dfrac{9.27 \times 10^{-26}}{(4\pi/3)(1.2 \times 10^{-15})^{3}} \times \dfrac{1}{56}$

= $2.29 \times 10^{17}$ kg m$^{-3}$
The density of matter in neutron stars (an astrophysical object) is comparable
to this density. This shows that matter in these objects has been compressed
to such an extent that they resemble a big nucleus.

13.4 MASS-ENERGY AND NUCLEAR BINDING ENERGY

13.4.1 Mass-Energy


Einstein showed from his theory of special relativity that it is necessary
to treat mass as another form of energy. Before the advent of this
theory of special relativity it was presumed that mass and energy were
conserved separately in a reaction. However, Einstein showed that
mass is another form of energy and one can convert mass-energy into
other forms of energy, say kinetic energy and vice-versa.

Einstein gave the famous mass-energy equivalence relation

$E = mc^2$ (13.6)
Here E is the energy equivalent of mass m, and c is the velocity of
light in vacuum, approximately equal to $3 \times 10^{8}$ m s$^{-1}$.

Example 13.2 Calculate the energy equivalent of 1 g of substance.

Solution

Energy, $E = 10^{-3} \times (3 \times 10^{8})^2$ J

$E = 10^{-3} \times 9 \times 10^{16} = 9 \times 10^{13}$ J

Thus, if one gram of matter is converted to energy, there is a release of an
enormous amount of energy.

Experimental verification of Einstein's mass-energy relation has
been achieved in the study of nuclear reactions amongst nucleons,
nuclei, electrons and other more recently discovered particles. In a
reaction the conservation law of energy states that the initial energy
and the final energy are equal provided the energy associated with
mass is also included. This concept is important in understanding
nuclear masses and the interaction of nuclei with one another. They
form the subject matter of the next few sections.

13.4.2 Nuclear binding energy

In Section 13.2 we have seen that the nucleus is made up of neutrons
and protons. Therefore it may be expected that the mass of the
nucleus is equal to the total mass of its individual protons and
neutrons. However, the nuclear mass M is found to be always less
than this. For example, let us consider $^{16}_{8}\mathrm{O}$, a nucleus which has 8
neutrons and 8 protons. We have
Mass of 8 neutrons = $8 \times 1.00866$ u
Mass of 8 protons = $8 \times 1.00727$ u
Mass of 8 electrons = $8 \times 0.00055$ u

Therefore the expected mass of the $^{16}_{8}\mathrm{O}$ nucleus

= $8 \times 2.01593$ u = 16.12744 u.

The atomic mass of $^{16}_{8}\mathrm{O}$ found from mass spectroscopy experiments
is seen to be 15.99493 u. Subtracting the mass of 8 electrons
($8 \times 0.00055$ u) from this, we get the experimental mass of the $^{16}_{8}\mathrm{O}$ nucleus to
be 15.99053 u.

Thus, we find that the mass of the $^{16}_{8}\mathrm{O}$ nucleus is less than the total
mass of its constituents by 0.13691 u. The difference in mass of a
nucleus and its constituents, $\Delta M$, is called the mass defect, and is
given by

$\Delta M = [Z m_p + (A - Z) m_n] - M$ (13.7)
What is the meaning of the mass defect? It is here that Einstein's
equivalence of mass and energy plays a role. Since the mass of the
oxygen nucleus is less than the sum of the masses of its constituents
(8 protons and 8 neutrons, in the unbound state), the equivalent
energy of the oxygen nucleus is less than that of the sum of the
equivalent energies of its constituents. If one wants to break the
oxygen nucleus into 8 protons and 8 neutrons, this extra energy
$\Delta M c^2$ has to be supplied. This energy required, $E_b$, is related to the mass
defect by
$E_b = \Delta M c^2$ (13.8)

Example 13.3 Find the energy equivalent of one atomic mass unit, first in
joules and then in MeV. Using this, express the mass defect of $^{16}_{8}\mathrm{O}$ in
MeV/$c^2$.
Solution

1 u = $1.6605 \times 10^{-27}$ kg
To convert it into energy units, we multiply it by $c^2$ and find that energy
equivalent = $1.6605 \times 10^{-27} \times (2.9979 \times 10^{8})^2$ kg m$^2$/s$^2$

= $1.4924 \times 10^{-10}$ J

= $\dfrac{1.4924 \times 10^{-10}}{1.602 \times 10^{-19}}$ eV
= $0.9315 \times 10^{9}$ eV
= 931.5 MeV

or, 1 u = 931.5 MeV/$c^2$

For $^{16}_{8}\mathrm{O}$, $\Delta M$ = 0.13691 u = $0.13691 \times 931.5$ MeV/$c^2$

= 127.5 MeV/$c^2$

The energy needed to separate $^{16}_{8}\mathrm{O}$ into its constituents is thus
127.5 MeV.

If a certain number of neutrons and protons are brought together to
form a nucleus of a certain charge and mass, an energy $E_b$ will be
released in the process. The energy $E_b$ is called the binding energy of
the nucleus. If we separate a nucleus into its nucleons, we would have
to supply a total energy equal to $E_b$ to those particles. Although we
cannot tear apart a nucleus in this way, the nuclear binding energy is
still a convenient measure of how well a nucleus is held together. A
more useful measure of the binding between the constituents of the
nucleus is the binding energy per nucleon, $E_{bn}$, which is the ratio of
the binding energy $E_b$ of a nucleus to the number of the nucleons, A,
in that nucleus:
$E_{bn} = E_b / A$ (13.9)
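
The oxygen calculation above can be repeated in code for any nuclide. This is a minimal sketch that uses only the particle masses and the conversion 1 u = 931.5 MeV/$c^2$ quoted in this section. The function names are hypothetical, and the atomic mass passed in is assumed to include the Z electrons, as in the worked example.

```python
M_NEUTRON_U = 1.00866   # u
M_PROTON_U = 1.00727    # u
M_ELECTRON_U = 0.00055  # u
U_TO_MEV = 931.5        # MeV per u (energy equivalent of 1 u)

def mass_defect_u(Z, A, atomic_mass_u):
    """Mass defect [Z m_p + (A - Z) m_n] - M in atomic mass units."""
    nuclear_mass = atomic_mass_u - Z * M_ELECTRON_U  # strip the electrons
    constituents = Z * M_PROTON_U + (A - Z) * M_NEUTRON_U
    return constituents - nuclear_mass

def binding_energy_mev(Z, A, atomic_mass_u):
    """Binding energy E_b = (mass defect) * c^2, expressed in MeV."""
    return mass_defect_u(Z, A, atomic_mass_u) * U_TO_MEV

def binding_energy_per_nucleon_mev(Z, A, atomic_mass_u):
    """E_bn = E_b / A in MeV."""
    return binding_energy_mev(Z, A, atomic_mass_u) / A

# Oxygen-16 with the atomic mass 15.99493 u used in the text.
print(mass_defect_u(8, 16, 15.99493))
print(binding_energy_mev(8, 16, 15.99493))
print(binding_energy_per_nucleon_mev(8, 16, 15.99493))
```

For oxygen-16 the binding energy per nucleon comes out close to 8 MeV, which is the typical value for nuclei of middle mass number.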
Figure 13.1 The binding energy per nucleon as a function of mass number.

We can think of binding energy per nucleon as the average energy per
nucleon needed to separate a nucleus into its individual nucleons.
Figure 13.1 is a plot of the binding energy per nucleon $E_{bn}$ versus the
mass number A for a large number of nuclei. We notice the following
main features of the plot:
(i) the binding energy per nucleon, $E_{bn}$, is practically constant, i.e.
practically independent of the atomic number for nuclei of middle
mass number (30 < A < 170). The curve has a maximum of about
8.75 MeV for A = 56 and has a value of 7.6 MeV for A = 238.
(ii) $E_{bn}$ is lower for both light nuclei (A < 30) and heavy nuclei (A > 170).
We can draw some conclusions from these two observations:
(i) The force is attractive and sufficiently strong to produce a binding
energy of a few MeV per nucleon.

(ii) The constancy of the binding energy in the range 30 < A < 170 is a
consequence of the fact that the nuclear force is short-ranged.
Consider a particular nucleon inside a sufficiently large nucleus. It will
be under the influence of only some of its neighbours, which come
within the range of the nuclear force. If any other nucleon is at a
distance more than the range of the nuclear force from the particular
nucleon it will have no influence on the binding energy of the nucleon
under consideration. If a nucleon can have a maximum of p
neighbours within the range of nuclear force, its binding energy would
be proportional to p. Let the binding energy of this nucleon be pk,
where k is a constant having the dimensions of energy. If we increase
A by adding nucleons they will not change the binding energy of a
nucleon inside. Since most of the nucleons in a large nucleus reside
inside it and not on the surface, the change in binding energy per
nucleon would be small. The binding energy per nucleon is a constant
and is approximately equal to pk. The property that a given nucleon
influences only nucleons close to it is also referred to as saturation
property of the nuclear force.
(iii) A very heavy nucleus, say A = 240, has lower binding energy per
nucleon compared to that of a nucleus with A = 120. Thus if a nucleus
with A = 240 breaks into two A = 120 nuclei, nucleons get more tightly
bound. This implies energy would be released in the process. It has
very important implications for energy production through fission, to be
discussed later in Section 13.7.1.

(iv) Consider two very light nuclei ($A \leq 10$) joining to form a heavier
nucleus. The binding energy per nucleon of the fused heavier nucleus is
more than the binding energy per nucleon of the lighter nuclei. This
means that the final system is more tightly bound than the initial
system. Again energy would be released in such a process of fusion.
This is the energy source of the Sun, to be discussed later in Section
13.7.3.

13.5 NUCLEAR FORCE

Figure 13.2 Potential energy of a pair of nucleons as a function of their separation.
For a separation greater than $r_0$, the force is attractive and for separations
less than $r_0$, the force is strongly repulsive.
The force that determines the motion of atomic electrons is the familiar
Coulomb force. In Section 13.4, we have seen that for average mass
nuclei the binding energy per nucleon is approximately 8 MeV, which
is much larger than the binding energy in atoms. Therefore, to bind a
nucleus together there must be a strong attractive force of a totally
different kind. It must be strong enough to overcome the repulsion
between the (positively charged) protons and to bind both protons and
neutrons into the tiny nuclear volume. We have already seen that the
constancy of binding energy per nucleon can be understood in terms
of its short-range. Many features of the nuclear binding force are
summarised below. These are obtained from a variety of experiments
carried out during 1930 to 1950.
(i) The nuclear force is much stronger than the Coulomb force acting
between charges or the gravitational forces between masses. The
nuclear binding force has to dominate over the Coulomb repulsive
force between protons inside the nucleus. This happens only because
the nuclear force is much stronger than the Coulomb force. The
gravitational force is much weaker than even the Coulomb force.
(ii) The nuclear force between two nucleons falls rapidly to zero as
their separation increases beyond a few femtometres. This leads to saturation
of forces in a medium or a large-sized nucleus, which is the reason for
the constancy of the binding energy per nucleon.
A rough plot of the potential energy between two nucleons as a
function of distance is shown in the Fig. 13.2. The potential energy is a
minimum at a distance r0 of about 0.8 fm. This means that the force is
attractive for distances larger than 0.8 fm and repulsive if they are
separated by distances less than 0.8 fm.
(iii) The nuclear force between neutron-neutron, proton-neutron and
proton-proton is approximately the same. The nuclear force does not
depend on the electric charge.
Unlike Coulomb's law or Newton's law of gravitation, there is no
simple mathematical form of the nuclear force.

13.6 RADIOACTIVITY

A. H. Becquerel discovered radioactivity in 1896 purely by accident.
While studying the fluorescence and phosphorescence of compounds
irradiated with visible light, Becquerel observed an interesting
phenomenon. After illuminating some pieces of uranium-potassium
sulphate with visible light, he wrapped them in black paper and
separated the package from a photographic plate by a piece of silver.
When, after several hours of exposure, the photographic plate was
developed, it showed blackening due to something that must have
been emitted by the compound and was able to penetrate both black
paper and the silver.

Experiments performed subsequently showed that radioactivity was a
nuclear phenomenon in which an unstable nucleus undergoes a
decay. This is referred to as radioactive decay. Three types of
radioactive decay occur in nature:

(i) $\alpha$-decay in which a helium nucleus is emitted;
(ii) $\beta$-decay in which electrons or positrons (particles with the same
mass as electrons, but with a charge exactly opposite to that of
electron) are emitted;
(iii) $\gamma$-decay in which high energy (hundreds of keV or more) photons
are emitted.

Each of these decays will be considered in subsequent sub-sections.

13.6.1 Law of radioactive decay

In any radioactive sample which undergoes $\alpha$, $\beta$ or $\gamma$-decay, it is found
that the number of nuclei undergoing the decay per unit time is
proportional to the total number of nuclei in the sample. If N is the
number of nuclei in the sample and $\Delta N$ undergo decay in time $\Delta t$, then

$\dfrac{\Delta N}{\Delta t} \propto N$

or, $\dfrac{\Delta N}{\Delta t} = \lambda N$, (13.10)

where $\lambda$ is called the radioactive decay constant or disintegration
constant.
The change in the number of nuclei in the sample* is $dN = -\Delta N$ in
time $\Delta t$. Thus the rate of change of N is (in the limit $\Delta t \to 0$)

$\dfrac{dN}{dt} = -\lambda N$

or, $\dfrac{dN}{N} = -\lambda\, dt$

* $\Delta N$ is the number of nuclei that decay, and hence is always positive.
$dN$ is the change in N, which may have either sign. Here it is negative,
because out of the original N nuclei, $\Delta N$ have decayed, leaving $(N - \Delta N)$
nuclei.

Now, integrating both sides of the above equation, we get

$\int_{N_0}^{N} \dfrac{dN}{N} = -\lambda \int_{t_0}^{t} dt$ (13.11)
or, $\ln N - \ln N_0 = -\lambda (t - t_0)$ (13.12)

Here $N_0$ is the number of radioactive nuclei in the sample at some
arbitrary time $t_0$ and N is the number of radioactive nuclei at any
subsequent time t. Setting $t_0 = 0$ and rearranging Eq. (13.12) gives us

$\ln \dfrac{N}{N_0} = -\lambda t$ (13.13)
which gives
$N(t) = N_0 e^{-\lambda t}$ (13.14)
Note that light bulbs, for example, follow no such exponential decay
law. If we test 1000 bulbs for their life (time span before they burn out
or fuse), we expect that they will decay (that is, burn out) at more or
less the same time. The decay of radionuclides follows quite a
different law, the law of radioactive decay represented by Eq. (13.14).
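
The link between the rate equation $dN/dt = -\lambda N$ and the exponential law of Eq. (13.14) can be seen numerically. The sketch below is illustrative: it steps the rate equation forward in many small time steps (a simple Euler method, chosen here only for transparency) and compares the result with the closed form. The function names and sample numbers are hypothetical.

```python
import math

def remaining_nuclei(n0, decay_constant, t):
    """Closed-form law of radioactive decay, N(t) = N0 * exp(-lambda * t)."""
    return n0 * math.exp(-decay_constant * t)

def remaining_nuclei_stepwise(n0, decay_constant, t, steps=100000):
    """Approximate N(t) by repeatedly applying dN = -lambda * N * dt."""
    n = float(n0)
    dt = t / steps
    for _ in range(steps):
        n -= decay_constant * n * dt  # nuclei lost during this small step
    return n

# Hypothetical sample: 1000 nuclei, lambda = 0.1 per second, observed for 10 s.
print(remaining_nuclei(1000, 0.1, 10.0))
print(remaining_nuclei_stepwise(1000, 0.1, 10.0))
```

The two numbers agree to within the step error, which shrinks as the number of steps is increased.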
The total decay rate R of a sample is the number of nuclei
disintegrating per unit time. Suppose in a time interval dt, the decay
count measured is $\Delta N$. Then $dN = -\Delta N$.
The positive quantity R is then defined as
$R = -\dfrac{dN}{dt}$

Differentiating Eq. (13.14), we get

$R = \lambda N_0 e^{-\lambda t}$
or, $R = R_0 e^{-\lambda t}$ (13.15)

Figure 13.3 Exponential decay of a radioactive species. After a lapse of $T_{1/2}$,
the population of the given species drops by a factor of 2.

This is equivalent to the law of radioactive decay, since you can
integrate Eq. (13.15) to get back Eq. (13.14). Clearly, $R_0 = \lambda N_0$ is the
decay rate at t = 0. The decay rate R at a certain time t and the
number of undecayed nuclei N at the same time are related by
$R = \lambda N$ (13.16)
The decay rate of a sample, rather than the number of radioactive
nuclei, is a more direct experimentally measurable quantity and is
given a specific name: activity. The SI unit for activity is becquerel,
named after the discoverer of radioactivity, Henri Becquerel.

Marie Sklodowska Curie (1867-1934) Born in Poland. She is recognised both
as a physicist and as a chemist. The discovery of radioactivity by Henri
Becquerel in 1896 inspired Marie and her husband Pierre Curie in their
researches and analyses, which led to the isolation of the elements radium
and polonium. She was the first person to be awarded two Nobel Prizes: for
Physics in 1903 and for Chemistry in 1911.

1 becquerel is simply equal to 1 disintegration or decay per second.
There is also another unit named curie that is widely used and is
related to the SI unit as:
1 curie = 1 Ci = $3.7 \times 10^{10}$ decays per second
= $3.7 \times 10^{10}$ Bq
Different radionuclides differ greatly in their rate of decay. A common
way to characterize this feature is through the notion of half-life. The
half-life of a radionuclide (denoted by $T_{1/2}$) is the time it takes for a sample
that initially has, say, $N_0$ radionuclei to reduce to $N_0/2$. Putting
$N = N_0/2$ and $t = T_{1/2}$ in Eq. (13.14), we get

$T_{1/2} = \dfrac{\ln 2}{\lambda} = \dfrac{0.693}{\lambda}$ (13.17)
Clearly if $N_0$ reduces to half its value in time $T_{1/2}$, $R_0$ will also reduce
to half its value in the same time according to Eq. (13.16).
Another related measure is the average or mean life $\tau$. This again can
be obtained from Eq. (13.14). The number of nuclei which decay in the
time interval t to $t + \Delta t$ is $R(t)\Delta t$ (= $\lambda N_0 e^{-\lambda t} \Delta t$). Each of them has lived
for time t. Thus the total life of all these nuclei would be $t \lambda N_0 e^{-\lambda t} \Delta t$. It
is clear that some nuclei may live for a short time while others may live
longer. Therefore to obtain the mean life, we have to sum (or
integrate) this expression over all times from 0 to $\infty$, and divide by the
total number $N_0$ of nuclei at t = 0. Thus,

$\tau = \dfrac{\lambda N_0 \int_0^{\infty} t e^{-\lambda t}\, dt}{N_0} = \lambda \int_0^{\infty} t e^{-\lambda t}\, dt$

One can show by performing this integral that

$\tau = 1/\lambda$
We summarise these results with the following:

$T_{1/2} = \dfrac{\ln 2}{\lambda} = \tau \ln 2$ (13.18)
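
The relations in Eq. (13.18) can be checked without doing the integral by hand. The sketch below is illustrative only: it evaluates $\lambda \int_0^{\infty} t e^{-\lambda t}\, dt$ numerically with a simple midpoint rule, truncating the integral at 40 mean lives (an assumption that is harmless because the integrand is negligible beyond that), and compares it with $1/\lambda$. All names are hypothetical.

```python
import math

def half_life(decay_constant):
    """T_1/2 = ln 2 / lambda."""
    return math.log(2) / decay_constant

def mean_life(decay_constant):
    """tau = 1 / lambda."""
    return 1.0 / decay_constant

def mean_life_numeric(decay_constant, n_steps=200000, cutoff_in_mean_lives=40.0):
    """Midpoint-rule estimate of lambda * integral of t * exp(-lambda t) dt."""
    t_max = cutoff_in_mean_lives / decay_constant
    dt = t_max / n_steps
    total = 0.0
    for i in range(n_steps):
        t = (i + 0.5) * dt  # midpoint of the i-th slice
        total += t * math.exp(-decay_constant * t) * dt
    return decay_constant * total

lam = 0.5  # hypothetical decay constant, per second
print(mean_life(lam), mean_life_numeric(lam))
print(half_life(lam), mean_life(lam) * math.log(2))
```

The half-life is always shorter than the mean life, by the factor ln 2 (about 0.693).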
Radioactive elements (e.g., tritium, plutonium) which are short-lived,
i.e., have half-lives much less than the age of the universe (about 15
billion years), have obviously decayed long ago and are not found in
nature. They can, however, be produced artificially in nuclear
reactions.

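Example 13.4 The half-life of $^{238}_{92}\mathrm{U}$ undergoing $\alpha$-decay is $4.5 \times 10^{9}$ years.
What is the activity of a 1 g sample of $^{238}_{92}\mathrm{U}$?

Solution

$T_{1/2} = 4.5 \times 10^{9}$ y

= $4.5 \times 10^{9}$ y $\times$ $3.16 \times 10^{7}$ s/y

= $1.42 \times 10^{17}$ s
One kmol of any isotope contains Avogadro's number of atoms, and so 1 g of
$^{238}_{92}\mathrm{U}$ contains

$\dfrac{10^{-3}}{238}$ kmol $\times$ $6.025 \times 10^{26}$ atoms/kmol

= $25.3 \times 10^{20}$ atoms.

The decay rate R is
$R = \lambda N$

= $\dfrac{0.693}{T_{1/2}} N$ = $\dfrac{0.693 \times 25.3 \times 10^{20}}{1.42 \times 10^{17}}$ s$^{-1}$

= $1.23 \times 10^{4}$ s$^{-1}$

= $1.23 \times 10^{4}$ Bq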
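
The steps of this example generalise to any sample. A minimal sketch follows, using the same rounded constants as the worked solution; the function name is hypothetical and the molar mass is approximated by the mass number (an assumption).

```python
import math

AVOGADRO_PER_KMOL = 6.025e26  # atoms per kmol, value used in the example
SECONDS_PER_YEAR = 3.16e7     # s/y, value used in the example
BQ_PER_CURIE = 3.7e10         # decays per second in one curie

def activity_bq(mass_g, mass_number, half_life_years):
    """Activity R = lambda * N of a pure sample, in becquerel."""
    # mass in kg divided by (mass_number kg per kmol) gives kmol
    n_atoms = (mass_g * 1e-3 / mass_number) * AVOGADRO_PER_KMOL
    decay_constant = math.log(2) / (half_life_years * SECONDS_PER_YEAR)
    return decay_constant * n_atoms

r = activity_bq(1.0, 238, 4.5e9)
print(r, "Bq")
print(r / BQ_PER_CURIE, "Ci")
```

For 1 g of uranium-238 this reproduces an activity of about $1.2 \times 10^{4}$ Bq, a tiny fraction of a curie.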
Example 13.5 Tritium has a half-life of 12.5 y undergoing beta decay. What
fraction of a sample of pure tritium will remain undecayed
after 25 y?
Solution
By definition of half-life, half of the initial sample will remain undecayed after
12.5 y. In the next 12.5 y, one-half of these nuclei would have decayed.
Hence, one fourth of the sample of the initial pure tritium will remain
undecayed.
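
The halving argument extends to times that are not whole multiples of the half-life. A small sketch, assuming only Eq. (13.14) rewritten as $N/N_0 = (1/2)^{t/T_{1/2}}$; the function name is hypothetical.

```python
def fraction_remaining(elapsed, half_life):
    """Fraction N/N0 left after time `elapsed` (same units as half_life)."""
    return 0.5 ** (elapsed / half_life)

print(fraction_remaining(25.0, 12.5))  # two half-lives of tritium
print(fraction_remaining(6.25, 12.5))  # half of one half-life
```

After two half-lives one fourth remains, as found above; after half of one half-life about 71% remains, not 75%, because the decay is exponential rather than linear.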

13.6.2 Alpha decay

A well-known example of alpha decay is the decay of uranium $^{238}_{92}\mathrm{U}$ to
thorium $^{234}_{90}\mathrm{Th}$ with the emission of a helium nucleus $^{4}_{2}\mathrm{He}$

$^{238}_{92}\mathrm{U} \rightarrow {}^{234}_{90}\mathrm{Th} + {}^{4}_{2}\mathrm{He}$ ($\alpha$-decay) (13.19)
In $\alpha$-decay, the mass number of the product nucleus (daughter nucleus) is four
less than that of the decaying nucleus (parent nucleus), while the atomic
number decreases by two. In general, $\alpha$-decay of a parent nucleus $^{A}_{Z}\mathrm{X}$
results in a daughter nucleus $^{A-4}_{Z-2}\mathrm{Y}$

$^{A}_{Z}\mathrm{X} \rightarrow {}^{A-4}_{Z-2}\mathrm{Y} + {}^{4}_{2}\mathrm{He}$ (13.20)
From Einstein's mass-energy equivalence relation [Eq. (13.6)] and energy
conservation, it is clear that this spontaneous decay is possible only when the
total mass of the decay products is less than the mass of the initial nucleus.
This difference in mass appears as kinetic energy of the products. By referring
to a table of nuclear masses, one can check that the total mass of $^{234}_{90}\mathrm{Th}$
and $^{4}_{2}\mathrm{He}$ is indeed less than that of $^{238}_{92}\mathrm{U}$.

The disintegration energy or the Q-value of a nuclear reaction is the difference
between the initial mass energy and the total mass energy of the decay
products. For $\alpha$-decay
$Q = (m_X - m_Y - m_{He}) c^2$ (13.21)
Q is also the net kinetic energy gained in the process or, if the initial nucleus X
is at rest, the kinetic energy of the products. Clearly, Q > 0 for exothermic
processes such as $\alpha$-decay.

Example 13.6 We are given the following atomic masses:

$^{238}_{92}\mathrm{U}$ = 238.05079 u    $^{4}_{2}\mathrm{He}$ = 4.00260 u

$^{234}_{90}\mathrm{Th}$ = 234.04363 u    $^{1}_{1}\mathrm{H}$ = 1.00783 u

$^{237}_{91}\mathrm{Pa}$ = 237.05121 u
Here the symbol Pa is for the element protactinium (Z = 91).

(a) Calculate the energy released during the alpha decay of $^{238}_{92}\mathrm{U}$.

(b) Show that $^{238}_{92}\mathrm{U}$ cannot spontaneously emit a proton.

Solution

(a) The alpha decay of $^{238}_{92}\mathrm{U}$ is given by Eq. (13.20). The energy released
in this process is given by

$Q = (M_U - M_{Th} - M_{He}) c^2$

Substituting the atomic masses as given in the data, we find

$Q = (238.05079 - 234.04363 - 4.00260)\ \mathrm{u} \times c^2$

= $(0.00456\ \mathrm{u})\, c^2$
= (0.00456 u) (931.5 MeV/u)
= 4.25 MeV.

(b) If $^{238}_{92}\mathrm{U}$ spontaneously emits a proton, the decay process would be

$^{238}_{92}\mathrm{U} \rightarrow {}^{237}_{91}\mathrm{Pa} + {}^{1}_{1}\mathrm{H}$
The Q for this process to happen is

$Q = (M_U - M_{Pa} - M_H) c^2$

= $(238.05079 - 237.05121 - 1.00783)\ \mathrm{u} \times c^2$

= $(-0.00825\ \mathrm{u})\, c^2$
= $-$(0.00825 u)(931.5 MeV/u)
= $-7.68$ MeV
Thus, the Q of the process is negative and therefore it cannot proceed
spontaneously. We will have to supply an energy of 7.68 MeV to a $^{238}_{92}\mathrm{U}$
nucleus to make it emit a proton.
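
Both parts of this example apply the same rule, so they can be wrapped in one helper. The following is a minimal sketch using the atomic masses listed in the example and 931.5 MeV per u; the function names are hypothetical.

```python
U_TO_MEV = 931.5  # MeV per atomic mass unit

def q_value_mev(parent_mass_u, *product_masses_u):
    """Q = (parent mass - total mass of products) * c^2, in MeV."""
    return (parent_mass_u - sum(product_masses_u)) * U_TO_MEV

def is_spontaneous(parent_mass_u, *product_masses_u):
    """A decay can proceed on its own only if Q is positive."""
    return q_value_mev(parent_mass_u, *product_masses_u) > 0

# (a) U-238 -> Th-234 + He-4
print(q_value_mev(238.05079, 234.04363, 4.00260))
# (b) U-238 -> Pa-237 + H-1
print(q_value_mev(238.05079, 237.05121, 1.00783))
```

The first Q-value is positive (alpha decay is allowed) and the second is negative (proton emission would need energy to be supplied).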

13.6.3 Beta decay


In beta decay, a nucleus spontaneously emits an electron ($\beta^-$ decay) or a
positron ($\beta^+$ decay). A common example of $\beta^-$ decay is

$^{32}_{15}\mathrm{P} \rightarrow {}^{32}_{16}\mathrm{S} + e^- + \bar{\nu}$ (13.22)

and that of $\beta^+$ decay is

$^{22}_{11}\mathrm{Na} \rightarrow {}^{22}_{10}\mathrm{Ne} + e^+ + \nu$ (13.23)
The decays are governed by Eqs. (13.14) and (13.15), so that one can
never predict which nucleus will undergo decay, but one can characterize the
decay by a half-life $T_{1/2}$. For example, $T_{1/2}$ for the decays above is
respectively 14.3 d and 2.6 y. The emission of an electron in $\beta^-$ decay is
accompanied by the emission of an antineutrino ($\bar{\nu}$); in $\beta^+$ decay, instead, a
neutrino ($\nu$) is generated. Neutrinos are neutral particles with very small
(possibly, even zero) mass compared to electrons. They have only weak
interaction with other particles. They are, therefore, very difficult to detect,
since they can penetrate a large quantity of matter (even the earth) without any
interaction.
In both $\beta^-$ and $\beta^+$ decay, the mass number A remains unchanged. In $\beta^-$ decay,
the atomic number Z of the nucleus goes up by 1, while in $\beta^+$ decay Z goes
down by 1. The basic nuclear process underlying $\beta^-$ decay is the conversion of
a neutron to a proton

$n \rightarrow p + e^- + \bar{\nu}$ (13.24)

while for $\beta^+$ decay, it is the conversion of a proton into a neutron

$p \rightarrow n + e^+ + \nu$ (13.25)
Note that while a free neutron decays to a proton, the decay of a proton to a neutron
[Eq. (13.25)] is possible only inside the nucleus, since the proton has smaller mass
than the neutron.

13.6.4 Gamma decay

Like an atom, a nucleus also has discrete energy levels - the ground state and
excited states. The scale of energy is, however, very different. Atomic energy
level spacings are of the order of eV, while the difference in nuclear energy
levels is of the order of MeV. When a nucleus in an excited state
spontaneously decays to its ground state (or to a lower energy state), a photon
is emitted with energy equal to the difference in the two energy levels of the
nucleus. This is the so-called gamma decay. The energy (MeV) corresponds to
radiation of extremely short wavelength, shorter than the hard X-ray region.
Figure 13.4 $\beta$-decay of $^{60}_{27}\mathrm{Co}$ nucleus followed by emission of two $\gamma$ rays from
deexcitation of the daughter nucleus $^{60}_{28}\mathrm{Ni}$.

Typically, a gamma ray is emitted when an $\alpha$ or $\beta$ decay results in a daughter
nucleus in an excited state. This then returns to the ground state by a single
photon transition or successive transitions involving more than one photon. A
familiar example is the successive emission of gamma rays of energies 1.17
MeV and 1.33 MeV from the deexcitation of $^{60}_{28}\mathrm{Ni}$ nuclei formed from the
$\beta$ decay of $^{60}_{27}\mathrm{Co}$.

13.7 NUCLEAR ENERGY


The curve of binding energy per nucleon Ebn, given in Fig. 13.1, has a long flat
middle region between A = 30 and A = 170. In this region the binding energy
per nucleon is nearly constant (8.0 MeV). For the lighter nuclei region, A < 30,
and for the heavier nuclei region, A > 170, the binding energy per nucleon is
less than 8.0 MeV, as we have noted earlier. Now, the greater the binding
energy, the less is the total mass of a bound system, such as a nucleus.
Consequently, if nuclei with less total binding energy transform to nuclei with
greater binding energy, there will be a net energy release. This is what
happens when a heavy nucleus decays into two or more intermediate mass
fragments (fission) or when light nuclei fuse into a heavier nucleus (fusion).
Exothermic chemical reactions underlie conventional energy sources such as
coal or petroleum. Here the energies involved are in the range of electron
volts. On the other hand, in a nuclear reaction, the energy release is of the
order of MeV. Thus for the same quantity of matter, nuclear sources produce a
million times more energy than a chemical source. Fission of 1 kg of uranium,
for example, generates $10^{14}$ J of energy; compare it with burning of 1 kg of
coal that gives $10^{7}$ J.

13.7.1 Fission
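New possibilities emerge when we go beyond natural radioactive decays and
study nuclear reactions by bombarding nuclei with other nuclear particles such
as proton, neutron, $\alpha$-particle, etc.
A most important neutron-induced nuclear reaction is fission. An example of
fission is when a uranium isotope $^{235}_{92}\mathrm{U}$ bombarded with a neutron breaks
into two intermediate mass nuclear fragments

$^{1}_{0}n + {}^{235}_{92}\mathrm{U} \rightarrow {}^{236}_{92}\mathrm{U} \rightarrow {}^{144}_{56}\mathrm{Ba} + {}^{89}_{36}\mathrm{Kr} + 3\,{}^{1}_{0}n$ (13.26)
The same reaction can produce other pairs of intermediate mass fragments

$^{1}_{0}n + {}^{235}_{92}\mathrm{U} \rightarrow {}^{236}_{92}\mathrm{U} \rightarrow {}^{133}_{51}\mathrm{Sb} + {}^{99}_{41}\mathrm{Nb} + 4\,{}^{1}_{0}n$ (13.27)
Or, as another example,

$^{1}_{0}n + {}^{235}_{92}\mathrm{U} \rightarrow {}^{140}_{54}\mathrm{Xe} + {}^{94}_{38}\mathrm{Sr} + 2\,{}^{1}_{0}n$ (13.28)

The fragment products are radioactive nuclei; they emit $\beta$ particles in
succession to achieve stable end products.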
The energy released (the Q value) in the fission reaction of nuclei like uranium
is of the order of 200 MeV per fissioning nucleus. This is estimated as follows:
Let us take a nucleus with A = 240 breaking into two fragments each of A =
120. Then
$E_{bn}$ for the A = 240 nucleus is about 7.6 MeV,
$E_{bn}$ for the two A = 120 fragment nuclei is about 8.5 MeV.
Gain in binding energy per nucleon is about 0.9 MeV.
Hence the total gain in binding energy is $240 \times 0.9$ or 216 MeV.
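
The same estimate can be scaled up from one nucleus to one kilogram of fuel, which connects it to the figure of about $10^{14}$ J per kg quoted in Section 13.7. The sketch below is a rough order-of-magnitude check, not a reactor calculation: it assumes every nucleus in the sample fissions, approximates the molar mass by the mass number, and uses hypothetical function names.

```python
AVOGADRO_PER_KMOL = 6.025e26  # atoms per kmol, value used earlier in the chapter
MEV_TO_J = 1.602e-13          # joules per MeV

def fission_energy_per_nucleus_mev(A=240, ebn_parent=7.6, ebn_fragment=8.5):
    """Total gain in binding energy: A * (E_bn of fragments - E_bn of parent)."""
    return A * (ebn_fragment - ebn_parent)

def fission_energy_per_kg_j(A=240):
    """Energy in joules if every nucleus in 1 kg of material undergoes fission."""
    nuclei_per_kg = AVOGADRO_PER_KMOL / A  # A kg per kmol (approximation)
    return nuclei_per_kg * fission_energy_per_nucleus_mev(A) * MEV_TO_J

print(fission_energy_per_nucleus_mev())  # about 216 MeV
print(fission_energy_per_kg_j())         # of the order of 1e14 J
```

Dividing the per-kilogram result by the $10^{7}$ J obtained from burning 1 kg of coal gives a ratio of several million, consistent with the "million times" comparison made earlier.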
The disintegration energy in fission events first appears as the kinetic energy
of the fragments and neutrons. Eventually it is transferred to the surrounding
matter appearing as heat. The source of energy in nuclear reactors, which
produce electricity, is nuclear fission. The enormous energy released in an
atom bomb comes from uncontrolled nuclear fission. In the next section we discuss
in some detail how a nuclear reactor functions.
13.7.2 Nuclear reactor

Notice one fact of great importance in the fission reactions given in Eqs.
(13.26) to (13.28). There is a release of extra neutron(s) in the fission process.
On average, about 2.5 neutrons are released per fission of a uranium nucleus. It is a
fraction since in some fission events 2 neutrons are produced, in some 3, etc.
The extra neutrons in turn can initiate fission processes, producing still more
neutrons, and so on. This leads to the possibility of a chain reaction, as was
first suggested by Enrico Fermi. If the chain reaction is controlled suitably, we
can get a steady energy output. This is what happens in a nuclear reactor. If
the chain reaction is uncontrolled, it leads to explosive energy output, as in a
nuclear bomb.

INDIA'S ATOMIC ENERGY PROGRAMME

The atomic energy programme in India was launched around the time of
independence under the leadership of Homi J. Bhabha (1909-1966). An
early historic achievement was the design and construction of the first
nuclear reactor in India (named Apsara) which went critical on August 4,
1956. It used enriched uranium as fuel and water as moderator. Following
this was another notable landmark: the construction of CIRUS (Canada
India Research U.S.) reactor in 1960. This 40 MW reactor used natural
uranium as fuel and heavy water as moderator. Apsara and CIRUS
spurred research in a wide range of areas of basic and applied nuclear
science. An important milestone in the first two decades of the programme
was the indigenous design and construction of the plutonium plant at
Trombay, which ushered in the technology of fuel reprocessing (separating
useful fissile and fertile nuclear materials from the spent fuel of a reactor)
in India. Research reactors that have been subsequently commissioned
include ZERLINA, PURNIMA (I, II and III), DHRUVA and KAMINI. KAMINI
is the country's first large research reactor that uses U-233 as fuel. As the
name suggests, the primary objective of a research reactor is not
generation of power but to provide a facility for research on different
aspects of nuclear science and technology. Research reactors are also an
excellent source for production of a variety of radioactive isotopes that find
application in diverse fields: industry, medicine and agriculture.
The main objectives of the Indian Atomic Energy programme are to
provide safe and reliable electric power for the country's social and
economic progress and to be self-reliant in all aspects of nuclear
technology. Exploration of atomic minerals in India undertaken since the
early fifties has indicated that India has limited reserves of uranium, but
fairly abundant reserves of thorium. Accordingly, our country has adopted
a three-stage strategy of nuclear power generation. The first stage involves
the use of natural uranium as a fuel, with heavy water as moderator. The
Plutonium-239 obtained from reprocessing of the discharged fuel from the
reactors then serves as a fuel for the second stage, the fast breeder
reactors. They are so called because they use fast neutrons for sustaining
the chain reaction (hence no moderator is needed) and, besides
generating power, also breed more fissile species (plutonium) than they
consume. The third stage, most significant in the long term, involves using
fast breeder reactors to produce fissile Uranium-233 from Thorium-232
and to build power reactors based on them.
India is currently well into the second stage of the programme and
considerable work has also been done on the third, the thorium
utilisation stage. The country has mastered the complex technologies of
mineral exploration and mining, fuel fabrication, heavy water production,
reactor design, construction and operation, fuel reprocessing, etc.
Pressurised Heavy Water Reactors (PHWRs) built at different sites in the
country mark the accomplishment of the first stage of the programme.
India is now more than self-sufficient in heavy water production. Elaborate
safety measures both in the design and operation of reactors, as also
adhering to stringent standards of radiological protection are the hallmark
of the Indian Atomic Energy Programme.

There is, however, a hurdle in sustaining a chain reaction, as described here. It
is known experimentally that slow neutrons (thermal neutrons) are much more
likely to cause fission in $^{235}_{92}\mathrm{U}$ than fast
neutrons. Also, fast neutrons liberated in fission would escape instead of
causing another fission reaction.

The average energy of a neutron produced in fission of $^{235}_{92}\mathrm{U}$ is 2 MeV.
Unless these neutrons are slowed down, they will escape from the reactor without
interacting with the uranium nuclei, unless a very large amount of fissionable
material is used for sustaining the chain reaction. What one needs to do is to
slow down the fast neutrons by elastic scattering with light nuclei. In fact,
Chadwick's experiments showed that in an elastic collision with hydrogen the
neutron almost comes to rest and the proton carries away the energy. This is the
same situation as when a marble hits head-on an identical marble at rest.
Therefore, in reactors, light nuclei called moderators are provided along with
the fissionable nuclei for slowing down fast neutrons. The moderators
commonly used are water, heavy water (D2O) and graphite. The Apsara
reactor at the Bhabha Atomic Research Centre (BARC), Mumbai, uses water
as moderator. The other Indian reactors, which are used for power production,
use heavy water as moderator.
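
Why light nuclei make good moderators can be illustrated with the standard result for a head-on elastic collision. Conservation of momentum and kinetic energy gives the fraction of the neutron's kinetic energy handed to a nucleus initially at rest as $4 m M/(m + M)^2$. The sketch below assumes the nucleus has mass A times the neutron mass (an approximation), considers only head-on collisions (real collisions occur at all angles and transfer less on average), and uses a hypothetical function name.

```python
def energy_fraction_transferred(A):
    """Fraction of neutron kinetic energy given to a nucleus of mass number A.

    Head-on elastic collision with the nucleus at rest; nuclear mass is
    approximated as A neutron masses, so the result is 4A / (1 + A)^2.
    """
    return 4.0 * A / (1.0 + A) ** 2

# Hydrogen, deuterium, carbon (graphite) and uranium-238 for comparison.
for name, A in (("H", 1), ("D", 2), ("C", 12), ("U", 238)):
    print(name, energy_fraction_transferred(A))
```

For hydrogen the fraction is 1, matching the identical-marbles picture, while for a heavy nucleus such as uranium the neutron loses under 2% of its energy per collision.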
Because of the use of a moderator, it is possible that the ratio, K, of the number of
fissions produced by a given generation of neutrons to the number of fissions of
the preceding generation may be greater than one. This ratio is called the
multiplication factor; it is the measure of the growth rate of the neutrons in the
reactor. For K = 1, the operation of the reactor is said to be critical, which is
what we wish it to be for steady power operation. If K becomes greater than
one, the reaction rate and the reactor power increase exponentially. Unless
the factor K is brought down very close to unity, the reactor will become
supercritical and can even explode. The explosion of the Chernobyl reactor in
Ukraine in 1986 is a sad reminder that accidents in a nuclear reactor can be
catastrophic.
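
The effect of the multiplication factor can be made concrete by following the number of fissions from one generation to the next. This is a deliberately simplified sketch: it assumes K stays constant from generation to generation and ignores timing, delayed neutrons and feedback, all of which matter in a real reactor. The function name and the sample values of K are hypothetical.

```python
def fissions_per_generation(initial_fissions, k, generations):
    """List of fission counts, each generation being K times the previous one."""
    counts = [float(initial_fissions)]
    for _ in range(generations):
        counts.append(counts[-1] * k)
    return counts

# Critical (K = 1), slightly supercritical and slightly subcritical cases.
for k in (1.0, 1.01, 0.99):
    history = fissions_per_generation(1000, k, 100)
    print(k, history[-1])
```

With K = 1 the count stays at its initial value, with K = 1.01 it nearly triples in 100 generations, and with K = 0.99 it falls to about a third, which is why K must be held very close to unity.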
The reaction rate is controlled through control-rods made out of neutron-
absorbing material such as cadmium. In addition to control rods, reactors are
provided with safety rods which, when required, can be inserted into the
reactor and K can be reduced rapidly to less than unity.

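The more abundant isotope $^{238}_{92}\mathrm{U}$ in naturally occurring uranium is
non-fissionable. When it captures a neutron, it produces the highly radioactive
plutonium through these reactions

$^{238}_{92}\mathrm{U} + n \rightarrow {}^{239}_{92}\mathrm{U} \rightarrow {}^{239}_{93}\mathrm{Np} + e^- + \bar{\nu}$
$^{239}_{93}\mathrm{Np} \rightarrow {}^{239}_{94}\mathrm{Pu} + e^- + \bar{\nu}$ (13.29)
Plutonium undergoes fission with slow neutrons.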
Figure 13.5 shows the schematic diagram of a nuclear reactor based on
thermal neutron fission. The core of the reactor is the site of nuclear fission. It
contains the fuel elements in suitably fabricated form. The fuel may be, say,
enriched uranium (i.e., one that has greater abundance of $^{235}_{92}\mathrm{U}$ than
naturally occurring uranium). The core contains a moderator to slow down the
neutrons. The core is surrounded by a reflector to reduce leakage. The energy
(heat) released in fission is continuously removed by a suitable coolant. A
containment vessel prevents the escape of radioactive fission products. The
whole assembly is shielded to check harmful radiation from coming out. The
reactor can be shut down by means of rods (made of, for example, cadmium)
that have high absorption of neutrons. The coolant transfers heat to a working
fluid which in turn may produce steam. The steam drives turbines and
generates electricity.

Figure 13.5 Schematic diagram of a nuclear reactor based on thermal neutron fission.

Like any power reactor, nuclear reactors generate considerable waste
products. But nuclear wastes need special care for treatment since they are
radioactive and hazardous. Elaborate safety measures, both for reactor
operation as well as handling and reprocessing the spent fuel, are required.
These safety measures are a distinguishing feature of the Indian Atomic
Energy programme. An appropriate plan is being evolved to study the
possibility of converting radioactive waste into less active and short-lived
material.

13.7.3 Nuclear fusion energy generation in stars

When two light nuclei fuse to form a larger nucleus, energy is released, since
the larger nucleus is more tightly bound, as seen from the binding energy
curve in Fig.13.1. Some examples of such energy liberating nuclear fusion
reactions are :

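$^{1}_{1}\mathrm{H} + {}^{1}_{1}\mathrm{H} \rightarrow {}^{2}_{1}\mathrm{H} + e^{+} + \nu + 0.42\ \mathrm{MeV}$    [13.29(a)]

$^{2}_{1}\mathrm{H} + {}^{2}_{1}\mathrm{H} \rightarrow {}^{3}_{2}\mathrm{He} + n + 3.27\ \mathrm{MeV}$    [13.29(b)]

$^{2}_{1}\mathrm{H} + {}^{2}_{1}\mathrm{H} \rightarrow {}^{3}_{1}\mathrm{H} + {}^{1}_{1}\mathrm{H} + 4.03\ \mathrm{MeV}$    [13.29(c)]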


In the first reaction, two protons combine to form a deuteron and a positron
with a release of 0.42 MeV energy. In reaction [13.29(b)], two deuterons
combine to form the light isotope of helium. In reaction [13.29(c)], two deuterons
combine to form a triton and a proton. For fusion to take place, the two nuclei
must come close enough so that attractive short-range nuclear force is able to
affect them. However, since they are both positively charged particles, they
experience coulomb repulsion. They, therefore, must have enough energy to
overcome this coulomb barrier. The height of the barrier depends on the
charges and radii of the two interacting nuclei. It can be shown, for example,
that the barrier height for two protons is ~ 400 keV, and is higher for nuclei with
higher charges. We can estimate the temperature at which two protons in a
proton gas would, on average, have enough energy to overcome the coulomb
barrier:
$(3/2)kT = K \approx 400$ keV, which gives $T \sim 3 \times 10^{9}$ K.
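The arithmetic behind this estimate can be checked with a short sketch. It assumes the values of the Boltzmann constant and of 1 MeV in joules listed with the chapter's exercise data; the function name `temperature_for_mean_energy` is illustrative.

```python
BOLTZMANN_K = 1.381e-23   # J/K, value from the chapter's exercise data
JOULE_PER_MEV = 1.6e-13   # J per MeV, value from the chapter's exercise data
SUN_CORE_T = 1.5e7        # K, core temperature of the sun quoted in the text


def temperature_for_mean_energy(energy_kev):
    """Temperature at which the mean kinetic energy (3/2)kT equals energy_kev."""
    energy_joule = energy_kev * 1e-3 * JOULE_PER_MEV
    return 2.0 * energy_joule / (3.0 * BOLTZMANN_K)


t_barrier = temperature_for_mean_energy(400.0)  # 400 keV proton-proton barrier
print(f"T for a 400 keV barrier: {t_barrier:.2e} K")
print(f"Ratio to the sun's core temperature: {t_barrier / SUN_CORE_T:.0f}")
```

The result is about two hundred times the sun's core temperature, which is why only protons far above the average energy can take part in fusion there.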
When fusion is achieved by raising the temperature of the system so that
particles have enough kinetic energy to overcome the coulomb repulsive
behaviour, it is called thermonuclear fusion.
Thermonuclear fusion is the source of energy output in the interior of stars.
The interior of the sun has a temperature of $1.5 \times 10^{7}$ K, which is considerably
less than the estimated temperature required for fusion of particles of average
energy. Clearly, fusion in the sun involves protons whose energies are much
above the average energy.
The fusion reaction in the sun is a multi-step process in which the hydrogen is
burned into helium. Thus, the fuel in the sun is the hydrogen in its core. The
proton-proton (p, p) cycle by which this occurs is represented by the following
sets of reactions:

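$^{1}_{1}\mathrm{H} + {}^{1}_{1}\mathrm{H} \rightarrow {}^{2}_{1}\mathrm{H} + e^{+} + \nu + 0.42\ \mathrm{MeV}$    (i)

$e^{+} + e^{-} \rightarrow \gamma + \gamma + 1.02\ \mathrm{MeV}$    (ii)

$^{2}_{1}\mathrm{H} + {}^{1}_{1}\mathrm{H} \rightarrow {}^{3}_{2}\mathrm{He} + \gamma + 5.49\ \mathrm{MeV}$    (iii)

$^{3}_{2}\mathrm{He} + {}^{3}_{2}\mathrm{He} \rightarrow {}^{4}_{2}\mathrm{He} + {}^{1}_{1}\mathrm{H} + {}^{1}_{1}\mathrm{H} + 12.86\ \mathrm{MeV}$    (iv)

(13.30)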
For the fourth reaction to occur, the first three reactions must occur twice, in
which case two light helium nuclei unite to form ordinary helium nucleus. If we
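consider the combination 2(i) + 2(ii) + 2(iii) + (iv), the net effect is

$4\,{}^{1}_{1}\mathrm{H} + 2e^{-} \rightarrow {}^{4}_{2}\mathrm{He} + 2\nu + 6\gamma + 26.7\ \mathrm{MeV}$
or
$(4\,{}^{1}_{1}\mathrm{H} + 4e^{-}) \rightarrow ({}^{4}_{2}\mathrm{He} + 2e^{-}) + 2\nu + 6\gamma + 26.7\ \mathrm{MeV}$    (13.31)

Thus, four hydrogen atoms combine to form an $^{4}_{2}\mathrm{He}$ atom with a release of
26.7 MeV of energy.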
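The energy bookkeeping for the combination 2(i) + 2(ii) + 2(iii) + (iv) can be verified with a short sketch. The energies are those quoted for steps (i) to (iv); the dictionary names are illustrative.

```python
# Energy released in each step of the proton-proton cycle, in MeV.
Q_MEV = {"i": 0.42, "ii": 1.02, "iii": 5.49, "iv": 12.86}

# Steps (i), (ii) and (iii) must each occur twice before step (iv) can occur once.
TIMES = {"i": 2, "ii": 2, "iii": 2, "iv": 1}

total_q = sum(Q_MEV[step] * TIMES[step] for step in Q_MEV)
print(f"Net energy released per helium nucleus formed: {total_q:.2f} MeV")
print(f"Energy released per proton consumed: {total_q / 4:.2f} MeV")
```

The sum, 26.72 MeV, rounds to the 26.7 MeV quoted for the net reaction.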
Helium is not the only element that can be synthesized in the interior of a star.
As the hydrogen in the core gets depleted and becomes helium, the core starts
to cool. The star begins to collapse under its own gravity which increases the
temperature of the core. If this temperature increases to about $10^{8}$ K, fusion
takes place again, this time of helium nuclei into carbon. This kind of process
can generate through fusion higher and higher mass number elements. But
elements more massive than those near the peak of the binding energy curve
in Fig. 13.1 cannot be so produced.
The age of the sun is about $5 \times 10^{9}$ y and it is estimated that there is enough
hydrogen in the sun to keep it going for another 5 billion years. After that, the
hydrogen burning will stop and the sun will begin to cool and will start to
collapse under gravity, which will raise the core temperature. The outer
envelope of the sun will expand, turning it into the so-called red giant.

13.7.4 Controlled thermonuclear fusion

The natural thermonuclear fusion process in a star is replicated in a
thermonuclear fusion device. In controlled fusion reactors, the aim is to
generate steady power by heating the nuclear fuel to a temperature in the
range of $10^{8}$ K. At these temperatures, the fuel is a mixture of positive ions and
electrons (plasma). The challenge is to confine this plasma, since no container
can stand such a high temperature. Several countries around the world
including India are developing techniques in this connection. If successful,
fusion reactors will hopefully supply almost unlimited power to humanity.
NUCLEAR HOLOCAUST

In a single uranium fission about $0.9 \times 235$ MeV ($\approx 200$ MeV) of energy is
liberated. If each nucleus of about 50 kg of $^{235}\mathrm{U}$ undergoes fission the
amount of energy involved is about $4 \times 10^{15}$ J. This energy is equivalent to
about 20,000 tons of TNT, enough for a superexplosion. Uncontrolled
release of large nuclear energy is called an atomic explosion. On August 6,
1945 an atomic device was used in warfare for the first time. The US
dropped an atom bomb on Hiroshima, Japan. The explosion was
equivalent to 20,000 tons of TNT. Instantly the radioactive products
devastated 10 sq km of the city which had 3,43,000 inhabitants. Of this
number 66,000 were killed and 69,000 were injured; more than 67% of the
city's structures were destroyed.
High temperature conditions for fusion reactions can be created by
exploding a fission bomb. Super-explosions equivalent to 10 megatons of
explosive power of TNT were tested in 1954. Such bombs which involve
fusion of isotopes of hydrogen, deuterium and tritium are called hydrogen
bombs. It is estimated that a nuclear arsenal sufficient to destroy every
form of life on this planet several times over is in position to be triggered by
the press of a button. Such a nuclear holocaust will not only destroy the life
that exists now but its radioactive fallout will make this planet unfit for life
for all times. Scenarios based on theoretical calculations predict a long
nuclear winter, as the radioactive waste will hang like a cloud in the earth's
atmosphere and will absorb the sun's radiation.

Example 13.7 Answer the following questions:


(a) Are the equations of nuclear reactions (such as those given in Section
13.7) balanced in the sense a chemical equation (e.g., $2\mathrm{H_2} + \mathrm{O_2} \rightarrow 2\mathrm{H_2O}$) is?
If not, in what sense are they balanced on both sides?
(b) If both the number of protons and the number of neutrons are conserved in
each nuclear reaction, in what way is mass converted into energy (or vice-
versa) in a nuclear reaction?
(c) A general impression exists that mass-energy interconversion takes place
only in nuclear reaction and never in chemical reaction. This is strictly
speaking, incorrect. Explain.
Solution
(a) A chemical equation is balanced in the sense that the number of atoms of
each element is the same on both sides of the equation. A chemical reaction
merely alters the original combinations of atoms. In a nuclear reaction,
elements may be transmuted. Thus, the number of atoms of each element is
not necessarily conserved in a nuclear reaction. However, the number of
protons and the number of neutrons are both separately conserved in a
nuclear reaction. [Actually, even this is not strictly true in the realm of very high
energies; what is strictly conserved is the total charge and total baryon
number. We need not pursue this matter here.]
In nuclear reactions (e.g., Eq. 13.26), the number of protons and the number of
neutrons are the same on the two sides of the equation.
(b) We know that the binding energy of a nucleus gives a negative contribution
to the mass of the nucleus (mass defect). Now, since proton number and
neutron number are conserved in a nuclear reaction, the total rest mass of
neutrons and protons is the same on either side of a reaction. But the total
binding energy of nuclei on the left side need not be the same as that on the
right hand side. The difference in these binding energies appears as energy
released or absorbed in a nuclear reaction. Since binding energy contributes to
mass, we say that the difference in the total mass of nuclei on the two sides
gets converted into energy or vice-versa. It is in this sense that a nuclear
reaction is an example of mass-energy interconversion.
(c) From the point of view of mass-energy interconversion, a chemical reaction
is similar to a nuclear reaction in principle. The energy released or absorbed in
a chemical reaction can be traced to the difference in chemical (not nuclear)
binding energies of atoms and molecules on the two sides of a reaction. Since,
strictly speaking, chemical binding energy also gives a negative contribution
(mass defect) to the total mass of an atom or molecule, we can equally well
say that the difference in the total mass of atoms or molecules, on the two
sides of the chemical reaction gets converted into energy or vice-versa.
However, the mass defects involved in a chemical reaction are almost a million
times smaller than those in a nuclear reaction. This is the reason for the
general impression (which is incorrect) that mass-energy interconversion does
not take place in a chemical reaction.
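The factor of about a million can be made concrete with a rough sketch. It assumes a typical chemical binding energy of 5 eV (an illustrative value, not taken from the text) and the nuclear binding energy of about 8 MeV per nucleon, and converts both to mass using 1 u = 931.5 MeV/c^2.

```python
MEV_PER_U = 931.5  # energy equivalent of 1 u in MeV


def mass_equivalent_u(energy_mev):
    """Mass (in u) equivalent to a given energy (in MeV), from E = m c^2."""
    return energy_mev / MEV_PER_U


# Assumed illustrative energies: about 5 eV for a chemical bond,
# about 8 MeV per nucleon for nuclear binding.
chemical_defect = mass_equivalent_u(5e-6)
nuclear_defect = mass_equivalent_u(8.0)

print(f"Chemical mass defect: {chemical_defect:.2e} u")
print(f"Nuclear mass defect:  {nuclear_defect:.2e} u")
print(f"Ratio nuclear/chemical: {nuclear_defect / chemical_defect:.1e}")
```

A chemical mass defect of a few billionths of an atomic mass unit is far too small to notice in ordinary weighing, which explains the general impression.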

SUMMARY

1. An atom has a nucleus. The nucleus is positively charged. The radius of the
nucleus is smaller than the radius of an atom by a factor of $10^{4}$. More than
99.9% mass of the atom is concentrated in the nucleus.
2. On the atomic scale, mass is measured in atomic mass units (u). By
definition, 1 atomic mass unit (1 u) is 1/12th mass of one atom of $^{12}\mathrm{C}$;
1 u = $1.660563 \times 10^{-27}$ kg.
3. A nucleus contains a neutral particle called neutron. Its mass is almost the
same as that of proton.
4. The atomic number Z is the number of protons in the atomic nucleus of an
element. The mass number A is the total number of protons and neutrons in
the atomic nucleus; A = Z+N; Here N denotes the number of neutrons in the
nucleus.

A nuclear species or a nuclide is represented as $^{A}_{Z}\mathrm{X}$, where X is the chemical
symbol of the species.
Nuclides with the same atomic number Z, but different neutron number N are
called isotopes. Nuclides with the same A are isobars and those with the same
N are isotones.
Most elements are mixtures of two or more isotopes. The atomic mass of an
element is a weighted average of the masses of its isotopes. The weights are
the relative abundances of the isotopes.
5. A nucleus can be considered to be spherical in shape and assigned a
radius. Electron scattering experiments allow determination of the nuclear
radius; it is found that radii of nuclei fit the formula

$R = R_0 A^{1/3}$,
where $R_0$ = a constant = 1.2 fm. This implies that the nuclear density is
independent of A. It is of the order of $10^{17}$ kg/m$^3$.
6. Neutrons and protons are bound in a nucleus by the short-range strong
nuclear force. The nuclear force does not distinguish between neutron and
proton.

7. The nuclear mass M is always less than the total mass, m, of its
constituents. The difference in mass of a nucleus and its constituents is called
the mass defect,
$\Delta M = (Z m_p + (A - Z) m_n) - M$
Using Einstein's mass-energy relation, we express this mass difference in
terms of energy as
$\Delta E_b = \Delta M c^2$

The energy $\Delta E_b$ represents the binding energy of the nucleus. In the mass
number range A = 30 to 170, the binding energy per nucleon is nearly
constant, about 8 MeV/nucleon.
8. Energies associated with nuclear processes are about a million times larger
than those associated with chemical processes.
9. The Q-value of a nuclear process is

Q = final kinetic energy $-$ initial kinetic energy.

Due to conservation of mass-energy, this is also,
Q = (sum of initial masses $-$ sum of final masses)$c^2$
10. Radioactivity is the phenomenon in which nuclei of a given species
transform by giving out $\alpha$ or $\beta$ or $\gamma$ rays; $\alpha$-rays are helium nuclei;
$\beta$-rays are electrons; $\gamma$-rays are electromagnetic radiation of wavelengths
shorter than X-rays.
11. Law of radioactive decay: $N(t) = N(0)\,e^{-\lambda t}$

where $\lambda$ is the decay constant or disintegration constant.

The half-life $T_{1/2}$ of a radionuclide is the time in which N has been reduced to
one-half of its initial value. The mean life $\tau$ is the time at which N has been
reduced to $e^{-1}$ of its initial value.

12. Energy is released when less tightly bound nuclei are transmuted into

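more tightly bound nuclei. In fission, a heavy nucleus like $^{235}_{92}\mathrm{U}$ breaks into
two smaller fragments, e.g.,
$^{235}_{92}\mathrm{U} + {}^{1}_{0}n \rightarrow {}^{133}_{51}\mathrm{Sb} + {}^{99}_{41}\mathrm{Nb} + 4\,{}^{1}_{0}n$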

13. The fact that more neutrons are produced in fission than are consumed
gives the possibility of a chain reaction with each neutron that is produced
triggering another fission. The chain reaction is uncontrolled and rapid in a
nuclear bomb explosion. It is controlled and steady in a nuclear reactor. In a
reactor, the value of the neutron multiplication factor K is maintained at 1.
14. In fusion, lighter nuclei combine to form a larger nucleus. Fusion of
hydrogen nuclei into helium nuclei is the source of energy of all stars including
our sun.
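Three of the formulas in this summary (nuclear radius and density, binding energy, and the decay law) can be tried out numerically with the sketch below. It makes some simplifying assumptions that are not part of the summary: the nuclear mass is approximated as A atomic mass units, the binding energy is computed from atomic masses (with the hydrogen-atom mass in place of the proton mass, so that electron masses cancel), and the decay constant and mean life are obtained from the half-life via the definitions above. All function names are illustrative.

```python
import math

U_KG = 1.660563e-27   # kg, one atomic mass unit
R0_M = 1.2e-15        # m, the constant R0 = 1.2 fm
MEV_PER_U = 931.5     # MeV equivalent of 1 u
M_H = 1.007825        # u, mass of the hydrogen atom
M_N = 1.008665        # u, mass of the neutron


def nuclear_density(a):
    """Density of a nucleus of mass number a, taking its mass as a * u (approximation)."""
    radius = R0_M * a ** (1.0 / 3.0)
    volume = (4.0 / 3.0) * math.pi * radius ** 3
    return a * U_KG / volume


def binding_energy_mev(z, a, atomic_mass_u):
    """Binding energy from the mass defect, using atomic masses throughout."""
    mass_defect = z * M_H + (a - z) * M_N - atomic_mass_u
    return mass_defect * MEV_PER_U


def remaining_fraction(t, half_life):
    """N(t)/N(0) = exp(-lambda t), with lambda = ln 2 / half-life."""
    decay_constant = math.log(2) / half_life
    return math.exp(-decay_constant * t)


def mean_life(half_life):
    """Mean life = 1 / lambda = half-life / ln 2."""
    return half_life / math.log(2)


print(f"Density for A = 4:   {nuclear_density(4):.2e} kg/m^3")
print(f"Density for A = 208: {nuclear_density(208):.2e} kg/m^3")
print(f"Binding energy of helium-4: {binding_energy_mev(2, 4, 4.002603):.2f} MeV")
print(f"Fraction left after one half-life: {remaining_fraction(1.0, 1.0):.3f}")
print(f"Fraction left after one mean life: {remaining_fraction(mean_life(1.0), 1.0):.3f}")
```

The density comes out the same for light and heavy nuclei because the factor A cancels between the mass and the volume.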



Points to Ponder
1. The density of nuclear matter is independent of the size of the nucleus. The
mass density of the atom does not follow this rule.
2. The radius of a nucleus determined by electron scattering is found to be
slightly different from that determined by alpha-particle scattering. This is
because electron scattering senses the charge distribution of the nucleus,
whereas alpha and similar particles sense the nuclear matter.
3. After Einstein showed the equivalence of mass and energy, $E = mc^2$, we
cannot any longer speak of separate laws of conservation of mass and
conservation of energy, but we have to speak of a unified law of conservation
of mass and energy. The most convincing evidence that this principle operates
in nature comes from nuclear physics. It is central to our understanding of
nuclear energy and harnessing it as a source of power. Using the principle, Q
of a nuclear process (decay or reaction) can be expressed also in terms of
initial and final masses.
4. The nature of the binding energy (per nucleon) curve shows that exothermic
nuclear reactions are possible, when two light nuclei fuse or when a heavy
nucleus undergoes fission into nuclei with intermediate mass.
5. For fusion, the light nuclei must have sufficient initial energy to overcome
the coulomb potential barrier. That is why fusion requires very high
temperatures.
6. Although the binding energy (per nucleon) curve is smooth and slowly
varying, it shows peaks at nuclides like $^{4}\mathrm{He}$, $^{16}\mathrm{O}$ etc. This is considered as
evidence of atom-like shell structure in nuclei.
7. Electron and positron are a particle-antiparticle pair. They are identical in
mass; their charges are equal in magnitude and opposite. (It is found that
when an electron and a positron come together, they annihilate each other
giving energy in the form of gamma-ray photons.)
8. In $\beta^{-}$-decay (electron emission), the particle emitted along with electron is
anti-neutrino ($\bar{\nu}$). On the other hand, the particle emitted in $\beta^{+}$-decay (positron
emission) is neutrino ($\nu$). Neutrino and anti-neutrino are a particle-antiparticle
pair. There are antiparticles associated with every particle. What should be
antiproton which is the antiparticle of the proton?

9. A free neutron is unstable ($n \rightarrow p + e^{-} + \bar{\nu}$). But a similar free proton
decay is not possible, since a proton is (slightly) lighter than a neutron.
10. Gamma emission usually follows alpha or beta emission. A nucleus in an
excited (higher) state goes to a lower state by emitting a gamma photon. A
nucleus may be left in an excited state after alpha or beta emission.
Successive emission of gamma rays from the same nucleus (as in case of
$^{60}\mathrm{Ni}$, Fig. 13.4) is a clear proof that nuclei also have discrete energy levels as
do the atoms.
11. Radioactivity is an indication of the instability of nuclei. Stability requires
the ratio of neutron to proton to be around 1:1 for light nuclei. This ratio
increases to about 3:2 for heavy nuclei. (More neutrons are required to
overcome the effect of repulsion among the protons.) Nuclei which are away
from the stability ratio, i.e., nuclei which have an excess of neutrons or protons
are unstable. In fact, only about 10% of known isotopes (of all elements) are
stable. Others have been either artificially produced in the laboratory by
bombarding $\alpha$, p, d, n or other particles on targets of stable nuclear species or
identified in astronomical observations of matter in the universe.

Exercises

You may find the following data useful in solving the exercises:
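$e = 1.6 \times 10^{-19}$ C        $N = 6.023 \times 10^{23}$ per mole
$1/(4\pi\varepsilon_0) = 9 \times 10^{9}$ N m$^2$/C$^2$        $k = 1.381 \times 10^{-23}$ J K$^{-1}$
1 MeV = $1.6 \times 10^{-13}$ J        1 u = 931.5 MeV/$c^2$
1 year = $3.154 \times 10^{7}$ s
$m_H$ = 1.007825 u        $m_n$ = 1.008665 u
$m(^{4}_{2}\mathrm{He})$ = 4.002603 u        $m_e$ = 0.000548 u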

13.1 (a) Two stable isotopes of lithium, $^{6}_{3}\mathrm{Li}$ and $^{7}_{3}\mathrm{Li}$, have
respective abundances of 7.5% and 92.5%. These isotopes have
masses 6.01512 u and 7.01600 u, respectively. Find the atomic
mass of lithium.
(b) Boron has two stable isotopes, $^{10}_{5}\mathrm{B}$ and $^{11}_{5}\mathrm{B}$. Their respective
masses are 10.01294 u and 11.00931 u, and the atomic mass of
boron is 10.811 u. Find the abundances of $^{10}_{5}\mathrm{B}$ and $^{11}_{5}\mathrm{B}$.

13.2 The three stable isotopes of neon, $^{20}_{10}\mathrm{Ne}$, $^{21}_{10}\mathrm{Ne}$ and $^{22}_{10}\mathrm{Ne}$,
have respective abundances of 90.51%, 0.27% and 9.22%. The
atomic masses of the three isotopes are 19.99 u, 20.99 u and
21.99 u, respectively. Obtain the average atomic mass of neon.
13.3 Obtain the binding energy (in MeV) of a nitrogen nucleus
$^{14}_{7}\mathrm{N}$, given $m(^{14}_{7}\mathrm{N})$ = 14.00307 u.
13.4 Obtain the binding energy of the nuclei $^{56}_{26}\mathrm{Fe}$ and $^{209}_{83}\mathrm{Bi}$ in
units of MeV from the following data:
$m(^{56}_{26}\mathrm{Fe})$ = 55.934939 u        $m(^{209}_{83}\mathrm{Bi})$ = 208.980388 u
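13.5 A given coin has a mass of 3.0 g. Calculate the nuclear
energy that would be required to separate all the neutrons and
protons from each other. For simplicity assume that the coin is
entirely made of $^{63}_{29}\mathrm{Cu}$ atoms (of mass 62.92960 u).

13.6 Write nuclear reaction equations for
(i) $\alpha$-decay of $^{226}_{88}\mathrm{Ra}$        (ii) $\alpha$-decay of $^{242}_{94}\mathrm{Pu}$
(iii) $\beta^{-}$-decay of $^{32}_{15}\mathrm{P}$        (iv) $\beta^{-}$-decay of $^{210}_{83}\mathrm{Bi}$
(v) $\beta^{+}$-decay of $^{11}_{6}\mathrm{C}$        (vi) $\beta^{+}$-decay of $^{97}_{43}\mathrm{Tc}$
(vii) Electron capture of $^{120}_{54}\mathrm{Xe}$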

13.7 A radioactive isotope has a half-life of T years. How long will
it take the activity to reduce to (a) 3.125%, (b) 1% of its original
value?
13.8 The normal activity of living carbon-containing matter is found
to be about 15 decays per minute for every gram of carbon. This
activity arises from the small proportion of radioactive $^{14}_{6}\mathrm{C}$ present
with the stable carbon isotope $^{12}_{6}\mathrm{C}$. When the organism is dead,
its interaction with the atmosphere (which maintains the above
equilibrium activity) ceases and its activity begins to drop. From
the known half-life (5730 years) of $^{14}_{6}\mathrm{C}$, and the measured activity,
the age of the specimen can be approximately estimated. This is
the principle of $^{14}_{6}\mathrm{C}$ dating used in archaeology. Suppose a
specimen from Mohenjodaro gives an activity of 9 decays per
minute per gram of carbon. Estimate the approximate age of the
Indus-Valley civilisation.

13.9 Obtain the amount of $^{60}_{27}\mathrm{Co}$ necessary to provide a
radioactive source of 8.0 mCi strength. The half-life of $^{60}_{27}\mathrm{Co}$ is 5.3
years.

13.10 The half-life of $^{90}_{38}\mathrm{Sr}$ is 28 years. What is the disintegration
rate of 15 mg of this isotope?
13.11 Obtain approximately the ratio of the nuclear radii of the
gold isotope $^{197}_{79}\mathrm{Au}$ and the silver isotope $^{107}_{47}\mathrm{Ag}$.


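13.12 Find the Q-value and the kinetic energy of the emitted $\alpha$-
particle in the $\alpha$-decay of (a) $^{226}_{88}\mathrm{Ra}$ and (b) $^{220}_{86}\mathrm{Rn}$.
Given $m(^{226}_{88}\mathrm{Ra})$ = 226.02540 u, $m(^{222}_{86}\mathrm{Rn})$ = 222.01750 u,
$m(^{220}_{86}\mathrm{Rn})$ = 220.01137 u, $m(^{216}_{84}\mathrm{Po})$ = 216.00189 u.
13.13 The radionuclide $^{11}\mathrm{C}$ decays according to
$^{11}_{6}\mathrm{C} \rightarrow {}^{11}_{5}\mathrm{B} + e^{+} + \nu$ : $T_{1/2}$ = 20.3 min
The maximum energy of the emitted positron is 0.960 MeV.
Given the mass values:
$m(^{11}_{6}\mathrm{C})$ = 11.011434 u and $m(^{11}_{5}\mathrm{B})$ = 11.009305 u,
calculate Q and compare it with the maximum energy of the
positron emitted.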

13.14 The nucleus $^{23}_{10}\mathrm{Ne}$ decays by $\beta^{-}$ emission. Write down the
$\beta$-decay equation and determine the maximum kinetic energy of
the electrons emitted. Given that:
$m(^{23}_{10}\mathrm{Ne})$ = 22.994466 u
$m(^{23}_{11}\mathrm{Na})$ = 22.989770 u.
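13.15 The Q value of a nuclear reaction $A + b \rightarrow C + d$ is defined
by
$Q = [m_A + m_b - m_C - m_d]c^2$
where the masses refer to the respective nuclei. Determine from
the given data the Q-value of the following reactions and state
whether the reactions are exothermic or endothermic.
(i) $^{1}_{1}\mathrm{H} + {}^{3}_{1}\mathrm{H} \rightarrow {}^{2}_{1}\mathrm{H} + {}^{2}_{1}\mathrm{H}$
(ii) $^{12}_{6}\mathrm{C} + {}^{12}_{6}\mathrm{C} \rightarrow {}^{20}_{10}\mathrm{Ne} + {}^{4}_{2}\mathrm{He}$
Atomic masses are given to be
$m(^{2}_{1}\mathrm{H})$ = 2.014102 u
$m(^{3}_{1}\mathrm{H})$ = 3.016049 u
$m(^{12}_{6}\mathrm{C})$ = 12.000000 u
$m(^{20}_{10}\mathrm{Ne})$ = 19.992439 u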

13.16 Suppose, we think of fission of a $^{56}_{26}\mathrm{Fe}$ nucleus into two
equal fragments, $^{28}_{13}\mathrm{Al}$. Is the fission energetically possible? Argue
by working out Q of the process. Given $m(^{56}_{26}\mathrm{Fe})$ = 55.93494 u
and $m(^{28}_{13}\mathrm{Al})$ = 27.98191 u.

13.17 The fission properties of $^{239}_{94}\mathrm{Pu}$ are very similar to those of
$^{235}_{92}\mathrm{U}$. The average energy released per fission is 180 MeV. How
much energy, in MeV, is released if all the atoms in 1 kg of pure
$^{239}_{94}\mathrm{Pu}$ undergo fission?
13.18 A 1000 MW fission reactor consumes half of its fuel in 5.00
y. How much $^{235}_{92}\mathrm{U}$ did it contain initially? Assume that the reactor
operates 80% of the time, that all the energy generated arises
from the fission of $^{235}_{92}\mathrm{U}$ and that this nuclide is consumed only by
the fission process.
13.19 How long can an electric lamp of 100 W be kept glowing by
fusion of 2.0 kg of deuterium? Take the fusion reaction as
$^{2}_{1}\mathrm{H} + {}^{2}_{1}\mathrm{H} \rightarrow {}^{3}_{2}\mathrm{He} + n + 3.27\ \mathrm{MeV}$
13.20 Calculate the height of the potential barrier for a head-on
collision of two deuterons. (Hint: The height of the potential barrier
is given by the Coulomb repulsion between the two deuterons
when they just touch each other. Assume that they can be taken
as hard spheres of radius 2.0 fm.)

13.21 From the relation $R = R_0 A^{1/3}$, where $R_0$ is a constant and A
is the mass number of a nucleus, show that the nuclear matter
density is nearly constant (i.e. independent of A).
13.22 For the $\beta^{+}$ (positron) emission from a nucleus, there is
another competing process known as electron capture (electron
from an inner orbit, say, the K-shell, is captured by the nucleus
and a neutrino is emitted).
Show that if $\beta^{+}$ emission is energetically allowed, electron capture
is necessarily allowed but not vice-versa.

Additional Exercises
13.23 In a periodic table the average atomic mass of magnesium
is given as 24.312 u. The average value is based on their relative
natural abundance on earth. The three isotopes and their masses
are $^{24}_{12}\mathrm{Mg}$ (23.98504 u), $^{25}_{12}\mathrm{Mg}$ (24.98584 u) and $^{26}_{12}\mathrm{Mg}$
(25.98259 u). The natural abundance of $^{24}_{12}\mathrm{Mg}$ is 78.99% by mass.
Calculate the abundances of other two isotopes.
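13.24 The neutron separation energy is defined as the energy
required to remove a neutron from the nucleus. Obtain the neutron
separation energies of the nuclei $^{41}_{20}\mathrm{Ca}$ and $^{27}_{13}\mathrm{Al}$ from the
following data:
$m(^{40}_{20}\mathrm{Ca})$ = 39.962591 u
$m(^{41}_{20}\mathrm{Ca})$ = 40.962278 u
$m(^{26}_{13}\mathrm{Al})$ = 25.986895 u
$m(^{27}_{13}\mathrm{Al})$ = 26.981541 u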

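13.25 A source contains two phosphorus radionuclides $^{32}_{15}\mathrm{P}$
($T_{1/2}$ = 14.3 d) and $^{33}_{15}\mathrm{P}$ ($T_{1/2}$ = 25.3 d). Initially, 10% of the decays
come from $^{33}_{15}\mathrm{P}$. How long must one wait until 90% do so?
13.26 Under certain circumstances, a nucleus can decay by
emitting a particle more massive than an $\alpha$-particle. Consider the
following decay processes:
$^{223}_{88}\mathrm{Ra} \rightarrow {}^{209}_{82}\mathrm{Pb} + {}^{14}_{6}\mathrm{C}$
$^{223}_{88}\mathrm{Ra} \rightarrow {}^{219}_{86}\mathrm{Rn} + {}^{4}_{2}\mathrm{He}$
Calculate the Q-values for these decays and determine that both
are energetically allowed.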

13.27 Consider the fission of $^{238}_{92}\mathrm{U}$ by fast neutrons. In one fission
event, no neutrons are emitted and the final end products, after
the beta decay of the primary fragments, are $^{140}_{58}\mathrm{Ce}$ and $^{99}_{44}\mathrm{Ru}$.
Calculate Q for this fission process. The relevant atomic and
particle masses are
$m(^{238}_{92}\mathrm{U})$ = 238.05079 u
$m(^{140}_{58}\mathrm{Ce})$ = 139.90543 u
$m(^{99}_{44}\mathrm{Ru})$ = 98.90594 u

13.28 Consider the D-T reaction (deuterium-tritium fusion)
$^{2}_{1}\mathrm{H} + {}^{3}_{1}\mathrm{H} \rightarrow {}^{4}_{2}\mathrm{He} + n$
(a) Calculate the energy released in MeV in this reaction from the
data:
$m(^{2}_{1}\mathrm{H})$ = 2.014102 u
$m(^{3}_{1}\mathrm{H})$ = 3.016049 u
(b) Consider the radius of both deuterium and tritium to be
approximately 2.0 fm. What is the kinetic energy needed to
overcome the coulomb repulsion between the two nuclei? To what
temperature must the gas be heated to initiate the reaction?
(Hint: Kinetic energy required for one fusion event = average
thermal kinetic energy available with the interacting particles =
2(3kT/2); k = Boltzmann's constant, T = absolute temperature.)

13.29 Obtain the maximum kinetic energy of $\beta$-particles, and the
radiation frequencies of $\gamma$ decays in the decay scheme shown in
Fig. 13.6. You are given that
$m(^{198}\mathrm{Au})$ = 197.968233 u
$m(^{198}\mathrm{Hg})$ = 197.966760 u

Figure 13.6
13.30 Calculate and compare the energy released by (a) fusion of
1.0 kg of hydrogen deep within the Sun and (b) the fission of 1.0 kg of
$^{235}\mathrm{U}$ in a fission reactor.
13.31 Suppose India had a target of producing by 2020 AD,
200,000 MW of electric power, ten percent of which was to be
obtained from nuclear power plants. Suppose we are given that,
on an average, the efficiency of utilization (i.e. conversion to
electric energy) of thermal energy produced in a reactor was 25%.
How much amount of fissionable uranium would our country need
per year by 2020? Take the heat energy per fission of $^{235}\mathrm{U}$ to be
about 200 MeV.
Chapter Eleven

Dual Nature of Radiation and Matter

11.1 INTRODUCTION

Maxwell's equations of electromagnetism and Hertz's experiments
on the generation and detection of electromagnetic waves in 1887
strongly established the wave nature of light. Towards the same
period at the end of the 19th century, experimental investigations on
conduction of electricity (electric discharge) through gases at low
pressure in a discharge tube led to many historic discoveries.
The discovery of X-rays by Roentgen in 1895, and of electron by J. J.
Thomson in 1897, were important milestones in the understanding of
atomic structure. It was found that at sufficiently low pressure of about
0.001 mm of mercury column, a discharge took place between the two
electrodes on applying the electric field to the gas in the discharge
tube. A fluorescent glow appeared on the glass opposite to cathode.
The colour of glow of the glass depended on the type of glass, it being
yellowish-green for soda glass. The cause of this fluorescence was
attributed to the radiation which appeared to be coming from the
cathode. These cathode rays were discovered, in 1870, by William
Crookes who later, in 1879, suggested that these rays consisted of
streams of fast moving negatively charged particles. The British
physicist J. J. Thomson (1856-1940) confirmed this hypothesis. By
applying mutually perpendicular electric and magnetic fields across
the discharge tube, J. J. Thomson was the first to determine
experimentally the speed and the specific charge [charge to mass
ratio (e/m)] of the cathode ray particles. They were found to travel with
speeds ranging from about 0.1 to 0.2 times the speed of light ($3 \times 10^{8}$
m/s). The presently accepted value of e/m is $1.76 \times 10^{11}$ C/kg.
Further, the value of e/m was found to be independent of the nature of
the material/metal used as the cathode (emitter), or the gas introduced
in the discharge tube. This observation suggested the universality of
the cathode ray particles.
Around the same time, in 1887, it was found that certain metals, when
irradiated by ultraviolet light, emitted negatively charged particles
having small speeds. Also, certain metals when heated to a high
temperature were found to emit negatively charged particles. The
value of e/m of these particles was found to be the same as that for
cathode ray particles. These observations thus established that all
these particles, although produced under different conditions, were
identical in nature. J. J. Thomson, in 1897, named these particles as
electrons, and suggested that they were fundamental, universal
constituents of matter. For his epoch-making discovery of electron,
through his theoretical and experimental investigations on conduction
of electricity by gases, he was awarded the Nobel Prize in Physics in
1906. In 1913, the American physicist R. A. Millikan (1868-1953)
performed the pioneering oil-drop experiment for the precise
measurement of the charge on an electron. He found that the charge
on an oil-droplet was always an integral multiple of an elementary
charge, $1.602 \times 10^{-19}$ C. Millikan's experiment established that
electric charge is quantised. From the values of charge (e) and
specific charge (e/m), the mass (m) of the electron could be determined.
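As a quick check of this last step, the sketch below divides the elementary charge by the specific charge, using the two values quoted in this section; the variable names are illustrative.

```python
ELEMENTARY_CHARGE = 1.602e-19   # C, from Millikan's experiment as quoted above
SPECIFIC_CHARGE = 1.76e11       # C/kg, presently accepted e/m as quoted above

# m = e / (e/m)
electron_mass = ELEMENTARY_CHARGE / SPECIFIC_CHARGE
print(f"Electron mass: {electron_mass:.2e} kg")
```

The result is of the order of $10^{-30}$ kg, far smaller than the mass of any atom.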

11.2 ELECTRON EMISSION


We know that metals have free electrons (negatively charged
particles) that are responsible for their conductivity. However, the free
electrons cannot normally escape out of the metal surface. If an
electron attempts to come out of the metal, the metal surface acquires
a positive charge and pulls the electron back to the metal. The free
electron is thus held inside the metal surface by the attractive forces of
the ions. Consequently, the electron can come out of the metal
surface only if it has got sufficient energy to overcome the attractive
pull. A certain minimum amount of energy is required to be given to an
electron to pull it out from the surface of the metal. This minimum
energy required by an electron to escape from the metal surface is
called the work function of the metal. It is generally denoted by $\phi_0$ and
measured in eV (electron volt). One electron volt is the energy gained
by an electron when it has been accelerated by a potential difference
of 1 volt, so that 1 eV = $1.602 \times 10^{-19}$ J.
This unit of energy is commonly used in atomic and nuclear physics.
The work function ($\phi_0$) depends on the properties of the metal and the
nature of its surface. The values of work function of some metals are
given in Table 11.1. These values are approximate as they are very
sensitive to surface impurities.
Note from Table 11.1 that the work function of platinum is the highest
($\phi_0$ = 5.65 eV) while it is the lowest ($\phi_0$ = 2.14 eV) for caesium.
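Since work functions are quoted in electron volts, it is often useful to convert them to joules. The sketch below does this for the two metals mentioned above, using 1 eV = $1.602 \times 10^{-19}$ J; the dictionary and function names are illustrative.

```python
JOULE_PER_EV = 1.602e-19  # J, energy of one electron volt

# Work functions quoted in the text, in eV.
WORK_FUNCTION_EV = {"platinum": 5.65, "caesium": 2.14}


def ev_to_joule(energy_ev):
    """Convert an energy from electron volts to joules."""
    return energy_ev * JOULE_PER_EV


work_function_joule = {metal: ev_to_joule(phi) for metal, phi in WORK_FUNCTION_EV.items()}
for metal, phi_joule in work_function_joule.items():
    print(f"{metal}: {WORK_FUNCTION_EV[metal]} eV = {phi_joule:.3e} J")
```

Both values are of the order of $10^{-19}$ J, which shows why the electron volt is the more convenient unit at this scale.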
The minimum energy required for the electron emission from the metal
surface can be supplied to the free electrons by any one of the
following physical processes:
(i) Thermionic emission: By suitably heating, sufficient thermal energy
can be imparted to the free electrons to enable them to come out of
the metal.

(ii) Field emission: By applying a very strong electric field (of the order
of $10^{8}$ V m$^{-1}$) to a metal, electrons can be pulled out of the metal, as in
a spark plug.
(iii) Photo-electric emission: When light of suitable frequency
illuminates a metal surface, electrons are emitted from the metal
surface. These photo(light)-generated electrons are called
photoelectrons.

11.3 PHOTOELECTRIC EFFECT

11.3.1 Hertz's observations


The phenomenon of photoelectric emission was discovered in 1887 by
Heinrich Hertz (1857-1894), during his electromagnetic wave
experiments. In his experimental investigation on the production of
electromagnetic waves by means of a spark discharge, Hertz
observed that high voltage sparks across the detector loop were
enhanced when the emitter plate was illuminated by ultraviolet light
from an arc lamp.
Light shining on the metal surface somehow facilitated the escape of
free, charged particles which we now know as electrons. When light
falls on a metal surface, some electrons near the surface absorb
enough energy from the incident radiation to overcome the attraction
of the positive ions in the material of the surface. After gaining
sufficient energy from the incident light, the electrons escape from the
surface of the metal into the surrounding space.

11.3.2 Hallwachs' and Lenard's observations


Wilhelm Hallwachs and Philipp Lenard investigated the phenomenon
of photoelectric emission in detail during 1886-1902.
Lenard (1862-1947) observed that when ultraviolet radiations were
allowed to fall on the emitter plate of an evacuated glass tube
enclosing two electrodes (metal plates), current flows in the circuit
(Fig. 11.1). As soon as the ultraviolet radiations were stopped, the
current flow also stopped. These observations indicate that when
ultraviolet radiations fall on the emitter plate C, electrons are ejected
from it which are attracted towards the positive, collector plate A by
the electric field. The electrons flow through the evacuated glass tube,
resulting in the current flow. Thus, light falling on the surface of the
emitter causes current in the external circuit. Hallwachs and Lenard
studied how this photo current varied with collector plate potential, and
with frequency and intensity of incident light.
Hallwachs, in 1888, undertook the study further and connected a
negatively charged zinc plate to an electroscope. He observed that the
zinc plate lost its charge when it was illuminated by ultraviolet light.
Further, the uncharged zinc plate became positively charged when it
was irradiated by ultraviolet light. Positive charge on a positively
charged zinc plate was found to be further enhanced when it was
illuminated by ultraviolet light. From these observations he concluded
that negatively charged particles were emitted from the zinc plate
under the action of ultraviolet light.
After the discovery of the electron in 1897, it became evident that the
incident light causes electrons to be emitted from the emitter plate.
Due to negative charge, the emitted electrons are pushed towards the
collector plate by the electric field. Hallwachs and Lenard also
observed that when ultraviolet light fell on the emitter plate, no
electrons were emitted at all when the frequency of the incident light
was smaller than a certain minimum value, called the threshold
frequency. This minimum frequency depends on the nature of the
material of the emitter plate.
It was found that certain metals like zinc, cadmium, magnesium, etc.,
responded only to ultraviolet light, having short wavelength, to cause
electron emission from the surface. However, some alkali metals such
as lithium, sodium, potassium, caesium and rubidium were sensitive
even to visible light. All these photosensitive substances emit
electrons when they are illuminated by light. After the discovery of
electrons, these electrons were termed as photoelectrons. The
phenomenon is called photoelectric effect.
11.4 EXPERIMENTAL STUDY OF PHOTOELECTRIC EFFECT
Figure 11.1 depicts a schematic view of the arrangement used for the
experimental study of the photoelectric effect. It consists of an
evacuated glass/quartz tube having a photosensitive plate C and
another metal plate A. Monochromatic light from the source S of
sufficiently short wavelength passes through the window W and falls
on the photosensitive plate C (emitter). A transparent quartz window is
sealed on to the glass tube, which permits ultraviolet radiation to pass
through it and irradiate the photosensitive plate C. The electrons are
emitted by the plate C and are collected by the plate A (collector), by
the electric field created by the battery. The battery maintains the
potential difference between the plates C and A, that can be varied.
The polarity of the plates C and A can be reversed by a commutator.
Thus, the plate A can be maintained at a desired positive or negative
potential with respect to emitter C. When the collector plate A is
positive with respect to the emitter plate C, the electrons are attracted
to it. The emission of electrons causes flow of electric current in the
circuit. The potential difference between the emitter and collector
plates is measured by a voltmeter (V) whereas the resulting photo
current flowing in the circuit is measured by a microammeter (μA). The
photoelectric current can be increased or decreased by varying the
potential of collector plate A with respect to the emitter plate C. The
intensity and frequency of the incident light can be varied, as can the
potential difference V between the emitter C and the collector A.
Figure 11.1 Experimental arrangement for study of photoelectric effect.

We can use the experimental arrangement of Fig. 11.1 to study the
variation of photocurrent with (a) intensity of radiation, (b) frequency of
incident radiation, (c) the potential difference between the plates A and
C, and (d) the nature of the material of plate C. Light of different
frequencies can be used by putting appropriate coloured filter or
coloured glass in the path of light falling on the emitter C. The intensity
of light is varied by changing the distance of the light source from the
emitter.
Figure 11.2 Variation of Photoelectric current with intensity of light.

11.4.1 Effect of intensity of light on photocurrent


The collector A is maintained at a positive potential with respect to
emitter C so that electrons ejected from C are attracted towards
collector A. Keeping the frequency of the incident radiation and the
accelerating potential fixed, the intensity of light is varied and the
resulting photoelectric current is measured each time. It is found that
the photocurrent increases linearly with intensity of incident light as
shown graphically in Fig. 11.2. The photocurrent is directly
proportional to the number of photoelectrons emitted per second. This
implies that the number of photoelectrons emitted per second is
directly proportional to the intensity of incident radiation.

11.4.2 Effect of potential on photoelectric current


We first keep the plate A at some positive accelerating potential with
respect to the plate C and illuminate the plate C with light of fixed
frequency and fixed intensity I1. We next vary the positive potential
of plate A gradually and measure the resulting photocurrent each time.
It is found that the photoelectric current increases with increase in
accelerating (positive) potential. At some stage, for a certain positive
potential of plate A, all the emitted electrons are collected by the plate
A and the photoelectric current becomes maximum or saturates. If we
increase the accelerating potential of plate A further, the photocurrent
does not increase. This maximum value of the photoelectric current is
called saturation current. Saturation current corresponds to the case
when all the photoelectrons emitted by the emitter plate C reach the
collector plate A.
We now apply a negative (retarding) potential to the plate A with
respect to the plate C and make it increasingly negative gradually.
When the polarity is reversed, the electrons are repelled and only the
most energetic electrons are able to reach the collector A. The
photocurrent is found to decrease rapidly until it drops to zero at a
certain sharply defined, critical value of the negative potential V0 on
the plate A. For a particular frequency of incident radiation, the
minimum negative (retarding) potential V0 given to the plate A for
which the photocurrent stops or becomes zero is called the cut-off or
stopping potential.
Figure 11.3 Variation of photocurrent with collector plate potential for
different intensity of incident radiation.

The interpretation of the observation in terms of photoelectrons is
straightforward. Not all the photoelectrons emitted from the metal
have the same energy. The photoelectric current is zero when the
stopping potential is sufficient to repel even the most energetic
photoelectrons, with the maximum kinetic energy (Kmax), so that
Kmax = e V0        (11.1)
We can now repeat this experiment with incident radiation of the same
frequency but of higher intensity I2 and I3 (I3 > I2 > I1). We note that
the saturation currents are now found to be at higher values. This
shows that more electrons are being emitted per second, proportional
to the intensity of incident radiation. But the stopping potential remains
the same as that for the incident radiation of intensity I1, as shown
graphically in Fig. 11.3. Thus, for a given frequency of the incident
radiation, the stopping potential is independent of its intensity. In other
words, the maximum kinetic energy of photoelectrons depends on the
frequency of the incident light and the emitter plate material, but is
independent of the intensity of incident radiation.
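
As a small numerical illustration of Eq. (11.1), the sketch below converts a stopping potential into the maximum kinetic energy of the photoelectrons and the corresponding (non-relativistic) maximum speed. The function name and the sample value V0 = 0.60 V are assumptions chosen only for illustration; they are not readings from the experiment described above.

```python
import math

E_CHARGE = 1.602e-19    # elementary charge, C
M_ELECTRON = 9.109e-31  # electron mass, kg


def kmax_from_stopping_potential(v0_volts):
    """Return (Kmax in J, Kmax in eV, maximum speed in m/s) using Kmax = e V0."""
    kmax_joule = E_CHARGE * v0_volts
    kmax_ev = v0_volts  # numerically, Kmax in eV equals V0 in volts
    v_max = math.sqrt(2.0 * kmax_joule / M_ELECTRON)  # non-relativistic speed
    return kmax_joule, kmax_ev, v_max


# Assumed sample stopping potential (illustrative only).
kmax_j, kmax_ev, v_max = kmax_from_stopping_potential(0.60)
print(f"Kmax = {kmax_j:.3e} J = {kmax_ev:.2f} eV, v_max = {v_max:.3e} m/s")
```

Because Kmax depends only on V0, doubling the intensity at fixed frequency would leave every number printed here unchanged.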

11.4.3 Effect of frequency of incident radiation on stopping potential

Figure 11.4 Variation of photoelectric current with collector plate potential for
different frequencies of incident radiation.

We now study the relation between the frequency ν of the incident
radiation and the stopping potential V0. We adjust the light to the same
intensity at various frequencies and study the
variation of photocurrent with collector plate potential. The resulting
variation is shown in Fig. 11.4. We obtain different values of stopping
potential but the same value of the saturation current for incident
radiation of different frequencies. The energy of the emitted electrons
depends on the frequency of the incident radiations. The stopping
potential is more negative for higher frequencies of incident radiation.
Note from Fig. 11.4 that the stopping potentials are in the order V03 >
V02 > V01 if the frequencies are in the order ν3 > ν2 > ν1. This implies
that the greater the frequency of incident light, the greater is the maximum
kinetic energy of the photoelectrons. Consequently, we need a greater
retarding potential to stop them completely. If we plot a graph between
the frequency of incident radiation and the corresponding stopping
potential for different metals we get a straight line, as shown in Fig.
11.5.
Figure 11.5 Variation of stopping potential V0 with frequency of incident radiation
for a given photosensitive material.

The graph shows that
(i) the stopping potential V0 varies linearly with the frequency of
incident radiation for a given photosensitive material.
(ii) there exists a certain minimum cut-off frequency ν0 for which the
stopping potential is zero.
These observations have two implications:
(i) The maximum kinetic energy of the photoelectrons varies linearly
with the frequency of incident radiation, but is independent of its
intensity.
(ii) For a frequency of incident radiation, lower than the cut-off
frequency ν0, no photoelectric emission is possible even if the
intensity is large.
This minimum, cut-off frequency ν0, is called the threshold frequency.
It is different for different metals.

Different photosensitive materials respond differently to light.
Selenium is more sensitive than zinc or copper. The same
photosensitive substance gives different response to light of different
wavelengths. For example, ultraviolet light gives rise to photoelectric
effect in copper while green or red light does not.
Note that in all the above experiments, it is found that, if frequency of
the incident radiation exceeds the threshold frequency, the
photoelectric emission starts instantaneously without any apparent
time lag, even if the incident radiation is very dim. It is now known that
emission starts in a time of the order of 10⁻⁹ s or less.

We now summarise the experimental features and observations
described in this section.
(i) For a given photosensitive material and frequency of incident
radiation (above the threshold frequency), the photoelectric current is
directly proportional to the intensity of incident light (Fig. 11.2).

(ii) For a given photosensitive material and frequency of incident
radiation, saturation current is found to be proportional to the intensity
of incident radiation whereas the stopping potential is independent of
its intensity (Fig. 11.3).
(iii) For a given photosensitive material, there exists a certain
minimum cut-off frequency of the incident radiation, called the
threshold frequency, below which no emission of photoelectrons takes
place, no matter how intense the incident light is. Above the threshold
frequency, the stopping potential or equivalently the maximum kinetic
energy of the emitted photoelectrons increases linearly with the
frequency of the incident radiation, but is independent of its intensity
(Fig. 11.5).
(iv) The photoelectric emission is an instantaneous process without
any apparent time lag (~10⁻⁹ s or less), even when the incident
radiation is made exceedingly dim.

11.5 PHOTOELECTRIC EFFECT AND WAVE THEORY OF LIGHT
The wave nature of light was well established by the end of the
nineteenth century. The phenomena of interference, diffraction and
polarisation were explained in a natural and satisfactory way by the
wave picture of light. According to this picture, light is an
electromagnetic wave consisting of electric and magnetic fields with
continuous distribution of energy over the region of space over which
the wave is extended. Let us now see if this wave picture of light can
explain the observations on photoelectric emission given in the
previous section.
According to the wave picture of light, the free electrons at the surface
of the metal (over which the beam of radiation falls) absorb the radiant
energy continuously. The greater the intensity of radiation, the greater
is the amplitude of electric and magnetic fields. Consequently, the
greater the intensity, the greater should be the energy absorbed by
each electron. In this picture, the maximum kinetic energy of the
photoelectrons on the surface is then expected to increase with
increase in intensity. Also, no matter what the frequency of radiation
is, a sufficiently intense beam of radiation (over sufficient time) should
be able to impart enough energy to the electrons, so that they exceed
the minimum energy needed to escape from the metal surface. A
threshold frequency, therefore, should not exist. These expectations of
the wave theory directly contradict observations (i), (ii) and (iii) given
at the end of sub-section 11.4.3.
Further, we should note that in the wave picture, the absorption of
energy by electron takes place continuously over the entire
wavefront of the radiation. Since a large number of electrons absorb
energy, the energy absorbed per electron per unit time turns out to be
small. Explicit calculations estimate that it can take hours or more for a
single electron to pick up sufficient energy to overcome the work
function and come out of the metal. This conclusion is again in striking
contrast to observation (iv) that the photoelectric emission is
instantaneous. In short, the wave picture is unable to explain the most
basic features of photoelectric emission.
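
The order of magnitude of the delay expected in the wave picture can be sketched with a rough classical estimate. The code below assumes that a single electron continuously collects all the wave energy falling on an atom-sized disc; the beam intensity (10⁻³ W m⁻²) and the atomic radius (10⁻¹⁰ m) are illustrative assumptions, while 2.14 eV is the work function of caesium quoted in Example 11.2. It is only a sketch of the kind of calculation referred to above, not the specific calculation itself.

```python
import math

E_CHARGE = 1.602e-19  # joule per electron volt


def classical_emission_delay(intensity_w_per_m2, work_function_ev,
                             atom_radius_m=1.0e-10):
    """Time (s) for one electron to accumulate the work function energy,
    assuming it absorbs all wave energy incident on an atom-sized disc."""
    area = math.pi * atom_radius_m ** 2            # assumed collecting area, m^2
    power_absorbed = intensity_w_per_m2 * area     # W
    return work_function_ev * E_CHARGE / power_absorbed


# Assumed dim beam on caesium (work function 2.14 eV).
delay = classical_emission_delay(1.0e-3, 2.14)
print(f"classical delay ~ {delay:.2e} s (about {delay / 3600:.1f} hours)")
```

Under these assumptions the delay comes out to a few hours, which is many orders of magnitude longer than the observed 10⁻⁹ s or less.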

11.6 EINSTEIN'S PHOTOELECTRIC EQUATION: ENERGY QUANTUM OF RADIATION

In 1905, Albert Einstein (1879-1955) proposed a radically new picture
of electromagnetic radiation to explain photoelectric effect. In this
picture, photoelectric emission does not take place by continuous
absorption of energy from radiation. Radiation energy is built up of
discrete units, the so-called quanta of energy of radiation. Each
quantum of radiant energy has energy hν, where h is Planck's
constant and ν the frequency of light. In photoelectric effect, an
electron absorbs a quantum of energy (hν) of radiation. If this quantum
of energy absorbed exceeds the minimum energy needed for the
electron to escape from the metal surface (work function φ0), the
electron is emitted with maximum kinetic energy
Kmax = hν − φ0        (11.2)
More tightly bound electrons will emerge with kinetic energies less
than the maximum value. Note that the intensity of light of a given
frequency is determined by the number of photons incident per
second. Increasing the intensity will increase the number of emitted
electrons per second. However, the maximum kinetic energy of the
emitted photoelectrons is determined by the energy of each photon.

Equation (11.2) is known as Einstein's photoelectric equation. We now
see how this equation accounts in a simple and elegant manner for all the
observations on photoelectric effect given at the end of sub-section
11.4.3.

Albert Einstein (1879-1955) Einstein, one of the greatest physicists of all
time, was born in Ulm, Germany. In 1905, he published three path-breaking
papers. In the first paper, he introduced the notion of light quanta (now called
photons) and used it to explain the features of photoelectric effect. In the
second paper, he developed a theory of Brownian motion, which was confirmed
experimentally a few years later and provided convincing evidence of the
atomic picture of matter. The third paper gave birth to the special theory of
relativity. In 1916, he published the general theory of relativity. Some of
Einstein's most significant later contributions are: the notion of stimulated
emission introduced in an alternative derivation of Planck's blackbody radiation
law, static model of the universe which started modern cosmology, quantum
statistics of a gas of massive bosons, and a critical analysis of the foundations
of quantum mechanics. In 1921, he was awarded the Nobel Prize in physics
for his contribution to theoretical physics and the photoelectric effect.
According to Eq. (11.2), Kmax depends linearly on ν, and is
independent of intensity of radiation, in agreement with
observation. This happens because in Einstein's picture,
photoelectric effect arises from the absorption of a single
quantum of radiation by a single electron. The intensity of
radiation (that is proportional to the number of energy quanta per
unit area per unit time) is irrelevant to this basic process.
Since Kmax must be non-negative, Eq. (11.2) implies that
photoelectric emission is possible only if
hν > φ0
or ν > ν0, where
ν0 = φ0/h        (11.3)
Equation (11.3) shows that the greater the work function φ0, the
higher the minimum or threshold frequency ν0 needed to emit
photoelectrons. Thus, there exists a threshold frequency ν0 (=
φ0/h) for the metal surface, below which no photoelectric
emission is possible, no matter how intense the incident radiation
may be or how long it falls on the surface.
In this picture, intensity of radiation, as noted above, is
proportional to the number of energy quanta per unit area per
unit time. The greater the number of energy quanta available, the
greater is the number of electrons absorbing the energy quanta
and greater, therefore, is the number of electrons coming out of
the metal (for ν > ν0). This explains why, for ν > ν0, photoelectric
current is proportional to intensity.
In Einstein's picture, the basic elementary process involved in
photoelectric effect is the absorption of a light quantum by an
electron. This process is instantaneous. Thus, whatever may be
the intensity i.e., the number of quanta of radiation per unit area
per unit time, photoelectric emission is instantaneous. Low
intensity does not mean delay in emission, since the basic
elementary process is the same. Intensity only determines how
many electrons are able to participate in the elementary process
(absorption of a light quantum by a single electron) and,
therefore, the photoelectric current.
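
The content of Eqs. (11.2) and (11.3) can be sketched in a few lines of code. The helper names below are hypothetical, and the rounded constants (h = 6.63 × 10⁻³⁴ J s, 1 eV = 1.6 × 10⁻¹⁹ J) and the caesium work function of 2.14 eV are the values used in the worked examples of this chapter; the three trial frequencies are assumptions chosen to lie below and above the threshold.

```python
H_PLANCK = 6.63e-34  # Planck's constant, J s (rounded)
EV = 1.6e-19         # joule per electron volt (rounded)


def threshold_frequency(work_function_ev):
    """Threshold frequency nu0 = phi0 / h, in Hz (Eq. 11.3)."""
    return work_function_ev * EV / H_PLANCK


def max_kinetic_energy_ev(frequency_hz, work_function_ev):
    """Kmax = h nu - phi0 in eV (Eq. 11.2); None if below threshold."""
    kmax = H_PLANCK * frequency_hz / EV - work_function_ev
    return kmax if kmax >= 0 else None


phi_caesium = 2.14  # eV
print(f"threshold frequency = {threshold_frequency(phi_caesium):.3e} Hz")
for nu in (4.0e14, 6.0e14, 8.0e14):  # assumed trial frequencies
    print(f"nu = {nu:.1e} Hz -> Kmax = {max_kinetic_energy_ev(nu, phi_caesium)}")
```

Intensity does not appear anywhere in these functions: in this picture it affects only how many electrons are emitted per second, not whether emission occurs or the value of Kmax.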
Using Eq. (11.1), the photoelectric equation, Eq. (11.2), can be
written as
e V0 = hν − φ0; for ν ≥ ν0
or V0 = (h/e) ν − φ0/e        (11.4)
This is an important result. It predicts that the V0 versus ν curve is a
straight line with slope = (h/e), independent of the nature of the
material. During 1906-1916, Millikan performed a series of
experiments on photoelectric effect, aimed at disproving Einstein's
photoelectric equation. He measured the slope of the straight line
obtained for sodium, similar to that shown in Fig. 11.5. Using the
known value of e, he determined the value of Planck's constant h.
This value was close to the value of Planck's constant (= 6.626 × 10⁻³⁴ J
s) determined in an entirely different context. In this way, in 1916,
Millikan proved the validity of Einstein's photoelectric equation, instead
of disproving it.
The successful explanation of photoelectric effect using the
hypothesis of light quanta and the experimental determination of
values of h and φ0, in agreement with values obtained from other
experiments, led to the acceptance of Einstein's picture of
photoelectric effect. Millikan verified photoelectric equation with great
precision, for a number of alkali metals over a wide range of radiation
frequencies.
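
The way h and φ0 follow from the straight line of Eq. (11.4) can be sketched with a least-squares fit. The data below are synthetic: they are generated from Eq. (11.4) itself with an assumed work function of 2.75 eV (the sodium value quoted in Example 11.3) and are not Millikan's measurements. The function and variable names are hypothetical.

```python
H_PLANCK = 6.626e-34  # J s, used only to generate the synthetic data
E_CHARGE = 1.602e-19  # C


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


phi0_ev = 2.75  # assumed work function for the synthetic data, eV
frequencies = [7.0e14, 8.0e14, 9.0e14, 1.0e15, 1.1e15]  # Hz, above threshold
stopping_potentials = [H_PLANCK * nu / E_CHARGE - phi0_ev for nu in frequencies]

slope, intercept = fit_line(frequencies, stopping_potentials)
h_est = slope * E_CHARGE     # slope of V0 versus nu is h/e
phi0_est = -intercept        # intercept is -phi0/e, i.e. -phi0 in eV
nu0_est = -intercept / slope # frequency at which V0 = 0

print(f"h = {h_est:.3e} J s, phi0 = {phi0_est:.2f} eV, nu0 = {nu0_est:.3e} Hz")
```

With real measurements the points would scatter about the line, and the same fit would give h and φ0 with an uncertainty set by that scatter.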

11.7 PARTICLE NATURE OF LIGHT: THE PHOTON


Photoelectric effect thus gave evidence to the strange fact that light in
interaction with matter behaved as if it was made of quanta or packets
of energy, each of energy hν.
Is the light quantum of energy to be associated with a particle?
Einstein arrived at the important result, that the light quantum can also
be associated with momentum (hν/c). A definite value of energy as
well as momentum is a strong sign that the light quantum can be
associated with a particle. This particle was later named photon. The
particle-like behaviour of light was further confirmed, in 1924, by the
experiment of A.H. Compton (1892-1962) on scattering of X-rays from
electrons. In 1921, Einstein was awarded the Nobel Prize in Physics
for his contribution to theoretical physics and the photoelectric effect.
In 1923, Millikan was awarded the Nobel Prize in physics for his work
on the elementary charge of electricity and on the photoelectric effect.
We can summarise the photon picture of electromagnetic radiation as
follows:
(i) In interaction of radiation with matter, radiation behaves as if it is
made up of particles called photons.
(ii) Each photon has energy E (= hν) and momentum p (= hν/c), and
speed c, the speed of light.
(iii) All photons of light of a particular frequency ν, or wavelength λ,
have the same energy E (= hν = hc/λ) and momentum p (= hν/c = h/λ),
whatever the intensity of radiation may be. By increasing the intensity
of light of given wavelength, there is only an increase in the number of
photons per second crossing a given area, with each photon having
the same energy. Thus, photon energy is independent of intensity of
radiation.
(iv) Photons are electrically neutral and are not deflected by electric
and magnetic fields.
(v) In a photon-particle collision (such as photon-electron collision),
the total energy and total momentum are conserved. However, the
number of photons may not be conserved in a collision. The photon
may be absorbed or a new photon may be created.

Example 11.1 Monochromatic light of frequency 6.0 × 10¹⁴ Hz is produced by
a laser. The power emitted is 2.0 × 10⁻³ W. (a) What is the energy of a photon
in the light beam? (b) How many photons per second, on an average, are
emitted by the source?
Solution
(a) Each photon has an energy
E = hν = (6.63 × 10⁻³⁴ J s) (6.0 × 10¹⁴ Hz)
= 3.98 × 10⁻¹⁹ J
(b) If N is the number of photons emitted by the source per second, the power
P transmitted in the beam equals N times the energy per photon E, so that P =
N E. Then
N = P/E = (2.0 × 10⁻³ W)/(3.98 × 10⁻¹⁹ J)
= 5.0 × 10¹⁵ photons per second.
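
The arithmetic of Example 11.1 can be checked with a short script. This is only a sketch; the function names are hypothetical and the value of h is the rounded one used in the solution.

```python
H_PLANCK = 6.63e-34  # Planck's constant, J s (rounded, as in the example)


def photon_energy(frequency_hz):
    """Energy of one photon, E = h nu, in joule."""
    return H_PLANCK * frequency_hz


def photon_rate(power_w, frequency_hz):
    """Number of photons emitted per second, N = P / E."""
    return power_w / photon_energy(frequency_hz)


energy = photon_energy(6.0e14)
rate = photon_rate(2.0e-3, 6.0e14)
print(f"E = {energy:.2e} J, N = {rate:.1e} photons per second")
```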

Example 11.2 The work function of caesium is 2.14 eV. Find (a) the threshold
frequency for caesium, and (b) the wavelength of the incident light if the
photocurrent is brought to zero by a stopping potential of 0.60 V.
Solution
(a) For the cut-off or threshold frequency, the energy hν0 of the incident
radiation must be equal to work function φ0, so that
ν0 = φ0/h = (2.14 × 1.6 × 10⁻¹⁹ J)/(6.63 × 10⁻³⁴ J s) = 5.16 × 10¹⁴ Hz
Thus, for frequencies less than this threshold frequency, no photoelectrons are
ejected.
(b) Photocurrent reduces to zero, when maximum kinetic energy of the emitted
photoelectrons equals the potential energy e V0 due to the retarding potential V0.
Einstein's photoelectric equation is
eV0 = hν − φ0 = hc/λ − φ0
or, λ = hc/(eV0 + φ0)
= (6.63 × 10⁻³⁴ J s × 3 × 10⁸ m/s)/((0.60 + 2.14) × 1.6 × 10⁻¹⁹ J) = 454 nm
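
The two results of Example 11.2 can be reproduced numerically as a check. The sketch below uses the rounded constants h = 6.63 × 10⁻³⁴ J s, c = 3 × 10⁸ m/s and 1 eV = 1.6 × 10⁻¹⁹ J; the function names are hypothetical.

```python
H_PLANCK = 6.63e-34  # J s (rounded)
C_LIGHT = 3.0e8      # m/s (rounded)
EV = 1.6e-19         # joule per electron volt (rounded)


def caesium_threshold_frequency(work_function_ev):
    """nu0 = phi0 / h, in Hz."""
    return work_function_ev * EV / H_PLANCK


def wavelength_for_stopping_potential(v0_volts, work_function_ev):
    """lambda = h c / (e V0 + phi0), in metre."""
    photon_energy_joule = (v0_volts + work_function_ev) * EV
    return H_PLANCK * C_LIGHT / photon_energy_joule


nu0 = caesium_threshold_frequency(2.14)
lam = wavelength_for_stopping_potential(0.60, 2.14)
print(f"nu0 = {nu0:.2e} Hz, wavelength = {lam * 1e9:.0f} nm")
```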
Example 11.3 The wavelength of light in the visible region is about 390 nm for
violet colour, about 550 nm (average wavelength) for yellow-green colour and
about 760 nm for red colour.
(a) What are the energies of photons (in eV) at the (i) violet end, (ii) average
wavelength, yellow-green colour, and (iii) red end of the visible spectrum?
(Take h = 6.63 × 10⁻³⁴ J s and 1 eV = 1.6 × 10⁻¹⁹ J.)
(b) From which of the photosensitive materials with work functions listed in
Table 11.1 and using the results of (i), (ii) and (iii) of (a), can you build a
photoelectric device that operates with visible light?
Solution
(a) Energy of the incident photon, E = hν = hc/λ
E = (6.63 × 10⁻³⁴ J s) (3 × 10⁸ m/s)/λ
= (1.989 × 10⁻²⁵ J m)/λ
(i) For violet light, λ1 = 390 nm (lower wavelength end)
Incident photon energy, E1 = (1.989 × 10⁻²⁵ J m)/(390 × 10⁻⁹ m)
= 5.10 × 10⁻¹⁹ J
= 3.19 eV
(ii) For yellow-green light, λ2 = 550 nm (average wavelength)
Incident photon energy, E2 = (1.989 × 10⁻²⁵ J m)/(550 × 10⁻⁹ m)
= 3.62 × 10⁻¹⁹ J = 2.26 eV
(iii) For red light, λ3 = 760 nm (higher wavelength end)
Incident photon energy, E3 = (1.989 × 10⁻²⁵ J m)/(760 × 10⁻⁹ m)
= 2.62 × 10⁻¹⁹ J = 1.64 eV
(b) For a photoelectric device to operate, we require incident light energy E to
be equal to or greater than the work function φ0 of the material. Thus, the
photoelectric device will operate with violet light (with E = 3.19 eV) for the
photosensitive materials Na (with φ0 = 2.75 eV), K (with φ0 = 2.30 eV) and Cs
(with φ0 = 2.14 eV). It will also operate with yellow-green light (with E = 2.26
eV) for Cs (with φ0 = 2.14 eV) only. However, it will not operate with red light
(with E = 1.64 eV) for any of these photosensitive materials.
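
The comparison made in part (b) can be written out as a short check. The sketch below uses only the three work functions quoted in the solution (Na, K and Cs) and the rounded constants of the example; the dictionary and function names are hypothetical.

```python
H_PLANCK = 6.63e-34  # J s (rounded)
C_LIGHT = 3.0e8      # m/s (rounded)
EV = 1.6e-19         # joule per electron volt (rounded)

# Work functions in eV, as quoted in the solution above.
WORK_FUNCTIONS_EV = {"Na": 2.75, "K": 2.30, "Cs": 2.14}


def photon_energy_ev(wavelength_nm):
    """Photon energy E = h c / lambda, expressed in eV."""
    return H_PLANCK * C_LIGHT / (wavelength_nm * 1e-9) / EV


def usable_materials(wavelength_nm):
    """Materials whose work function does not exceed the photon energy."""
    energy = photon_energy_ev(wavelength_nm)
    return sorted(name for name, phi in WORK_FUNCTIONS_EV.items() if energy >= phi)


for colour, wavelength in (("violet", 390), ("yellow-green", 550), ("red", 760)):
    print(f"{colour}: E = {photon_energy_ev(wavelength):.2f} eV, "
          f"works with {usable_materials(wavelength)}")
```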

11.8 WAVE NATURE OF MATTER


The dual (wave-particle) nature of light (electromagnetic radiation, in
general) comes out clearly from what we have learnt in this and the
preceding chapters. The wave nature of light shows up in the
phenomena of interference, diffraction and polarisation. On the other
hand, in photoelectric effect and Compton effect which involve energy
and momentum transfer, radiation behaves as if it is made up of a
bunch of particles, the photons. Whether a particle or wave
description is best suited for understanding an experiment depends on
the nature of the experiment. For example, in the familiar
phenomenon of seeing an object by our eye, both descriptions are
important. The gathering and focussing mechanism of light by the eye-
lens is well described in the wave picture. But its absorption by the
rods and cones (of the retina) requires the photon picture of light.
A natural question arises: If radiation has a dual (wave-particle)
nature, might not the particles of nature (the electrons, protons, etc.)
also exhibit wave-like character? In 1924, the French physicist Louis
Victor de Broglie (pronounced as de Broy) (1892-1987) put forward
the bold hypothesis that moving particles of matter should display
wave-like properties under suitable conditions. He reasoned that
nature was symmetrical and that the two basic physical entities,
matter and energy, must have symmetrical character. If radiation
shows dual aspects, so should matter. De Broglie proposed that the
wavelength λ associated with a particle of momentum p is given as
λ = h/p = h/(m v)        (11.5)
where m is the mass of the particle and v its speed. Equation (11.5) is
known as the de Broglie relation and the wavelength λ of the matter
wave is called de Broglie wavelength. The dual aspect of matter is
evident in the de Broglie relation. On the left hand side of Eq. (11.5), λ
is the attribute of a wave while on the right hand side the momentum p
is a typical attribute of a particle. Planck's constant h relates the two
attributes.
Equation (11.5) for a material particle is basically a hypothesis whose
validity can be tested only by experiment. However, it is interesting to
see that it is satisfied also by a photon. For a photon, as we have
seen,
p = hν/c        (11.6)
Therefore,
h/p = c/ν = λ        (11.7)
That is, the de Broglie wavelength of a photon given by Eq. (11.5)
equals the wavelength of electromagnetic radiation of which the
photon is a quantum of energy and momentum.
Clearly, from Eq. (11.5), λ is smaller for a heavier particle (large m) or
more energetic particle (large v). For example, the de Broglie
wavelength of a ball of mass 0.12 kg moving with a speed of 20 m s⁻¹
is easily calculated:
p = m v = 0.12 kg × 20 m s⁻¹ = 2.40 kg m s⁻¹
λ = h/p = (6.63 × 10⁻³⁴ J s)/(2.40 kg m s⁻¹) = 2.76 × 10⁻³⁴ m

This wavelength is so small that it is beyond any measurement. This is
the reason why macroscopic objects in our daily life do not show
wave-like properties. On the other hand, in the sub-atomic domain, the
wave character of particles is significant and measurable.
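
The contrast between the macroscopic and the sub-atomic domain can be made concrete with Eq. (11.5). In the sketch below the ball is the one considered above, while the electron speed of 10⁶ m s⁻¹ is an assumed, illustrative value; the function name is hypothetical.

```python
H_PLANCK = 6.63e-34  # Planck's constant, J s (rounded)


def de_broglie_wavelength(mass_kg, speed_m_per_s):
    """de Broglie wavelength lambda = h / (m v), in metre (non-relativistic)."""
    return H_PLANCK / (mass_kg * speed_m_per_s)


ball = de_broglie_wavelength(0.12, 20.0)          # ball from the text
electron = de_broglie_wavelength(9.11e-31, 1.0e6)  # assumed electron speed

print(f"ball:     {ball:.2e} m")
print(f"electron: {electron:.2e} m")
```

For the assumed electron speed the wavelength is a fraction of a nanometre, which is comparable to atomic dimensions, whereas the wavelength of the ball is smaller by more than twenty orders of magnitude.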
Photocell

A photocell is a technological application of the photoelectric effect. It is a
device whose electrical properties are affected by light. It is also sometimes
called an electric eye. A photocell consists of a semi-cylindrical photo-sensitive
metal plate C (emitter) and a wire loop A (collector) supported in an evacuated
glass or quartz bulb. It is connected to the external circuit having a high-tension
battery B and microammeter (μA) as shown in the Figure. Sometimes,
instead of the plate C, a thin layer of photosensitive material is pasted on the
inside of the bulb. A part of the bulb is left clean for the light to enter it.

A photocell
When light of suitable wavelength falls on the emitter C, photoelectrons are
emitted. These photoelectrons are drawn to the collector A. Photocurrent of
the order of a few microamperes can be normally obtained from a photocell.
A photocell converts a change in intensity of illumination into a change in
photocurrent. This current can be used to operate control systems and in light
measuring devices. A photocell of lead sulphide sensitive to infrared radiation
is used in electronic ignition circuits.
In scientific work, photocells are used whenever it is necessary to measure
the intensity of light. Light meters in photographic cameras make use of
photocells to measure the intensity of incident light. The photocells, inserted in the
door light electric circuit, are used as automatic door openers. A person
approaching a doorway may interrupt a light beam which is incident on a
photocell. The abrupt change in photocurrent may be used to start a motor
which opens the door or rings an alarm. They are used in the control of a
counting device which records every interruption of the light beam caused by a
person or object passing across the beam. So photocells help count the
persons entering an auditorium, provided they enter the hall one by one. They
are used for detection of traffic law defaulters: an alarm may be sounded
whenever a beam of (invisible) radiation is intercepted.
In a burglar alarm, (invisible) ultraviolet light is continuously made to fall on a
photocell installed at the doorway. A person entering the door interrupts the
beam falling on the photocell. The abrupt change in photocurrent is used to
start an electric bell ringing. In a fire alarm, a number of photocells are installed
at suitable places in a building. In the event of breaking out of fire, light
radiations fall upon the photocell. This completes the electric circuit through an
electric bell or a siren which starts operating as a warning signal.
Photocells are used in the reproduction of sound in motion pictures and in the
television camera for scanning and telecasting scenes. They are used in
industries for detecting minor flaws or holes in metal sheets.

Consider an electron (mass m, charge e) accelerated from rest
through a potential V. The kinetic energy K
of the electron equals the work done (eV) on it by the electric field:
K = e V        (11.8)
Now, K = (1/2) m v² = p²/(2m), so that
p = √(2 m K) = √(2 m e V)        (11.9)
The de Broglie wavelength λ of the electron is then
λ = h/p = h/√(2 m K) = h/√(2 m e V)        (11.10)
Substituting the numerical values of h, m, e,
we get
λ = (1.227/√V) nm        (11.11)

where V is the magnitude of accelerating potential in volts. For a 120
V accelerating potential, Eq. (11.11) gives λ = 0.112 nm. This
wavelength is of the same order as the spacing between the atomic
planes in crystals. This suggests that matter waves associated with an
electron could be verified by crystal diffraction experiments analogous
to X-ray diffraction. We describe the experimental verification of the de
Broglie hypothesis in the next section. In 1929, de Broglie was
awarded the Nobel Prize in Physics for his discovery of the wave
nature of electrons.
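
The numerical factor in Eq. (11.11) and the value quoted for a 120 V accelerating potential can be checked directly from Eq. (11.10). The sketch below uses standard values of h, m and e rounded to four significant figures; the function names are hypothetical, and the expression is non-relativistic, so it is meant only for modest accelerating potentials.

```python
import math

H_PLANCK = 6.626e-34    # Planck's constant, J s
M_ELECTRON = 9.109e-31  # electron mass, kg
E_CHARGE = 1.602e-19    # elementary charge, C


def electron_wavelength_nm(potential_volts):
    """lambda = h / sqrt(2 m e V), converted to nanometre (Eq. 11.10)."""
    momentum = math.sqrt(2.0 * M_ELECTRON * E_CHARGE * potential_volts)
    return H_PLANCK / momentum * 1e9


def electron_wavelength_nm_shortcut(potential_volts):
    """lambda = 1.227 / sqrt(V) nm (Eq. 11.11)."""
    return 1.227 / math.sqrt(potential_volts)


print(f"coefficient = {electron_wavelength_nm(1.0):.4f} nm V^(1/2)")
print(f"lambda(120 V) = {electron_wavelength_nm(120.0):.4f} nm (full expression)")
print(f"lambda(120 V) = {electron_wavelength_nm_shortcut(120.0):.4f} nm (Eq. 11.11)")
```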

Louis Victor de Broglie (1892-1987) French physicist who put forth the
revolutionary idea of wave nature of matter. This idea was developed by Erwin
Schrödinger into a full-fledged theory of quantum mechanics commonly known
as wave mechanics. In 1929, he was awarded the Nobel Prize in Physics for
his discovery of the wave nature of electrons.

The matter-wave picture elegantly incorporated Heisenberg's
uncertainty principle. According to the principle, it is not possible to
measure both the position and momentum of an electron (or any other
particle) at the same time exactly. There is always some uncertainty
(Δx) in the specification of position and some uncertainty (Δp) in the
specification of momentum. The product of Δx and Δp is of the order of
ħ* (with ħ = h/2π), i.e.,
Δx Δp ≈ ħ        (11.12)
Equation (11.12) allows the possibility that Δx is zero; but then Δp
must be infinite in order that the product is non-zero. Similarly, if Δp is
zero, Δx must be infinite. Ordinarily, both Δx and Δp are non-zero such
that their product is of the order of ħ.
Now, if an electron has a definite momentum p, (i.e.p = 0), by the de
Broglie relation, it has a definite wavelength . A wave of definite
(single) wavelength extends all over space. By Borns probability
interpretation this means that the electron is not localised in any finite
region of space. That is, its position uncertainty is infinite (x ),
which is consistent with the uncertainty principle.
In general, the matter wave associated with the electron is not
extended all over space. It is a wave packet extending over some
finite region of space. In that case $\Delta x$ is not infinite but has some finite
value depending on the extension of the wave packet. Also, you must
appreciate that a wave packet of finite extension does not have a
single wavelength. It is built up of wavelengths spread around some
central wavelength.
* A more rigorous treatment gives $\Delta x \, \Delta p \geq \hbar/2$.
By de Broglie's relation, then, the momentum of the electron will also
have a spread, that is, an uncertainty $\Delta p$. This is as expected
from the uncertainty principle. It can be shown that the wave packet
description together with the de Broglie relation and Born's probability
interpretation reproduces Heisenberg's uncertainty principle exactly.
In Chapter 12, the de Broglie relation will be seen to justify Bohr's
postulate on quantisation of angular momentum of electron in an
atom.
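
To get a feel for the sizes involved in Eq. (11.12), the order-of-magnitude relation can be turned into a small calculation. This is a minimal sketch under stated assumptions: the widths used are illustrative values chosen here, the rounded value of h is assumed, and the helper name `momentum_spread` is hypothetical.

```python
import math

PLANCK_H = 6.626e-34               # Planck's constant, J s (rounded, assumed)
HBAR = PLANCK_H / (2 * math.pi)    # reduced Planck constant, J s


def momentum_spread(delta_x):
    """Order-of-magnitude momentum spread from Eq. (11.12):
    delta_p ~ hbar / delta_x (delta_x in metres, result in kg m/s)."""
    if delta_x <= 0:
        raise ValueError("delta_x must be positive")
    return HBAR / delta_x


# Illustrative packet widths: atomic scale, one micrometre, one millimetre.
for width in (1e-10, 1e-6, 1e-3):
    print(f"delta_x = {width:.0e} m -> delta_p ~ {momentum_spread(width):.2e} kg m/s")
```

The tighter the wave packet, the larger the momentum spread, which is the trade-off described in the text. The numbers are estimates only, since Eq. (11.12) fixes the product up to a factor of order one.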

Figure 11.6 (a) The wave packet description of an electron. The wave packet
corresponds to a spread of wavelength around some central wavelength (and
hence by de Broglie relation, a spread in momentum). Consequently, it is
associated with an uncertainty in position ($\Delta x$) and an uncertainty in
momentum ($\Delta p$). (b) The matter wave corresponding to a definite momentum
of an electron extends all over space. In this case, $\Delta p = 0$ and
$\Delta x \to \infty$.

Figure 11.6 shows a schematic diagram of (a) a localised wave
packet, and (b) an extended wave with fixed wavelength.

Example 11.4 What is the de Broglie wavelength associated with (a)
an electron moving with a speed of $5.4 \times 10^6$ m/s, and (b) a ball of
mass 150 g travelling at 30.0 m/s?

Solution
(a) For the electron:
Mass $m = 9.11 \times 10^{-31}$ kg, speed $v = 5.4 \times 10^6$ m/s. Then, momentum
$p = m v = 9.11 \times 10^{-31}$ (kg) $\times\ 5.4 \times 10^6$ (m/s)
$p = 4.92 \times 10^{-24}$ kg m/s

de Broglie wavelength, $\lambda = h/p$
$= \frac{6.63 \times 10^{-34}\ \text{J s}}{4.92 \times 10^{-24}\ \text{kg m/s}}$
$\lambda$ = 0.135 nm
(b) For the ball:
Mass m = 0.150 kg, speed v = 30.0 m/s.
Then momentum $p = m v = 0.150$ (kg) $\times\ 30.0$ (m/s)
p = 4.50 kg m/s

de Broglie wavelength $\lambda = h/p$
$= \frac{6.63 \times 10^{-34}\ \text{J s}}{4.50\ \text{kg m/s}}$
$\lambda = 1.47 \times 10^{-34}$ m
The de Broglie wavelength of electron is comparable with X-ray wavelengths.
However, for the ball it is about $10^{-19}$ times the size of the proton, quite
beyond experimental measurement.
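
The arithmetic of Example 11.4 can be reproduced in a few lines. The sketch below assumes the rounded value h = 6.63 × 10^-34 J s; the function name `de_broglie_wavelength` is an illustrative choice.

```python
PLANCK_H = 6.63e-34  # Planck's constant in J s (rounded, assumed)


def de_broglie_wavelength(mass_kg, speed_m_per_s):
    """Return lambda = h / (m v) in metres (non-relativistic momentum)."""
    return PLANCK_H / (mass_kg * speed_m_per_s)


electron = de_broglie_wavelength(9.11e-31, 5.4e6)  # part (a)
ball = de_broglie_wavelength(0.150, 30.0)          # part (b)

print(f"electron: {electron * 1e9:.3f} nm")
print(f"ball:     {ball:.2e} m")
```

The two results differ by roughly 24 orders of magnitude, which is why wave effects show up for electrons but not for everyday objects.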

Example 11.5 An electron, an $\alpha$-particle, and a proton have the same kinetic
energy. Which of these particles has the shortest de Broglie wavelength?
Solution
For a particle, de Broglie wavelength, $\lambda = h/p$
Kinetic energy, $K = p^2/2m$

Then, $\lambda = h/\sqrt{2mK}$
For the same kinetic energy K, the de Broglie wavelength associated with a
particle is inversely proportional to the square root of its mass. A proton
is 1836 times as massive as an electron, and an $\alpha$-particle is four
times as massive as a proton.
Hence, the $\alpha$-particle has the shortest de Broglie wavelength.
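
The ordering argued in Example 11.5 can be confirmed by evaluating λ = h/√(2mK) for the three particles at one common kinetic energy. This is a sketch only: the 100 eV energy is a hypothetical value picked for illustration, the α-particle mass is taken as four proton masses as in the example, and the names are illustrative.

```python
import math

PLANCK_H = 6.63e-34                  # J s, rounded
ELECTRON_MASS = 9.11e-31             # kg
PROTON_MASS = 1836 * ELECTRON_MASS   # mass ratio quoted in the example
ALPHA_MASS = 4 * PROTON_MASS         # approximation used in the example


def wavelength_at_energy(mass_kg, kinetic_energy_j):
    """lambda = h / sqrt(2 m K), valid for non-relativistic particles."""
    return PLANCK_H / math.sqrt(2 * mass_kg * kinetic_energy_j)


# Hypothetical common kinetic energy of 100 eV, chosen only for illustration.
COMMON_ENERGY_J = 100 * 1.6e-19

masses = {"electron": ELECTRON_MASS, "proton": PROTON_MASS, "alpha": ALPHA_MASS}
wavelengths = {name: wavelength_at_energy(m, COMMON_ENERGY_J)
               for name, m in masses.items()}

# List the particles from shortest to longest wavelength.
for name, lam in sorted(wavelengths.items(), key=lambda item: item[1]):
    print(f"{name:8s} {lam:.3e} m")

shortest = min(wavelengths, key=wavelengths.get)
print("shortest wavelength:", shortest)
```

Because only the mass differs between the three cases, the ordering does not depend on the particular energy chosen.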
Example 11.6 A particle is moving three times as fast as an electron. The ratio
of the de Broglie wavelength of the particle to that of the electron is
$1.813 \times 10^{-4}$. Calculate the particle's mass and identify the particle.
Solution
de Broglie wavelength of a moving particle, having mass m and
velocity v:
$\lambda = \frac{h}{p} = \frac{h}{mv}$
Mass, $m = h/\lambda v$
For an electron, mass $m_e = h/\lambda_e v_e$
Now, we have $v/v_e = 3$ and

$\lambda/\lambda_e = 1.813 \times 10^{-4}$

Then, mass of the particle, $m = m_e \left(\frac{\lambda_e}{\lambda}\right) \left(\frac{v_e}{v}\right)$

$m = (9.11 \times 10^{-31}\ \text{kg}) \times (1/3) \times (1/1.813 \times 10^{-4})$

$m = 1.675 \times 10^{-27}$ kg.

Thus, the particle, with this mass could be a proton or a neutron.
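
The mass in Example 11.6 follows from the two ratios alone, which a short calculation makes explicit. In this sketch the proton and neutron masses used for comparison are rounded reference values assumed here (they are not given in the example), and `mass_from_ratios` is a hypothetical helper name.

```python
ELECTRON_MASS = 9.11e-31   # kg, as used in the example

# Rounded reference masses for comparison (assumed, not from the example)
PROTON_MASS = 1.673e-27    # kg
NEUTRON_MASS = 1.675e-27   # kg


def mass_from_ratios(speed_ratio, wavelength_ratio, reference_mass=ELECTRON_MASS):
    """m = m_ref * (lambda_ref / lambda) * (v_ref / v), from lambda = h/(m v).

    speed_ratio is v / v_ref and wavelength_ratio is lambda / lambda_ref."""
    return reference_mass / (speed_ratio * wavelength_ratio)


mass = mass_from_ratios(speed_ratio=3.0, wavelength_ratio=1.813e-4)
print(f"particle mass: {mass:.3e} kg")
print(f"relative to proton:  {mass / PROTON_MASS:.4f}")
print(f"relative to neutron: {mass / NEUTRON_MASS:.4f}")
```

Planck's constant cancels in the ratio, so the result depends only on the electron mass and the two measured ratios. The proton and neutron masses differ by about 0.1 percent, so data quoted to four significant figures cannot cleanly separate them, in line with the example's conclusion.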

Probability interpretation to matter waves

It is worth pausing here to reflect on just what a matter wave associated with a
particle, say, an electron, means. Actually, a truly satisfactory physical
understanding of the dual nature of matter and radiation has not emerged so
far. The great founders of quantum mechanics (Niels Bohr, Albert Einstein,
and many others) struggled with this and related concepts for long. Still the
deep physical interpretation of quantum mechanics continues to be an area of
active research. Despite this, the concept of matter wave has been
mathematically introduced in modern quantum mechanics with great success.
An important milestone in this connection was when Max Born (1882-1970)
suggested a probability interpretation to the matter wave amplitude. According
to this, the intensity (square of the amplitude) of the matter wave at a point
determines the probability density of the particle at that point. Probability
density means probability per unit volume. Thus, if A is the amplitude of the
wave at a point, $|A|^2 \Delta V$ is the probability of the particle being found in a small
volume $\Delta V$ around that point. Thus, if the intensity of matter wave is large in a
certain region, there is a greater probability of the particle being found there
than where the intensity is small.

Example 11.7 What is the de Broglie wavelength associated with an electron,
accelerated through a potential difference of 100 volts?
Solution Accelerating potential V = 100 V. The de Broglie wavelength $\lambda$ is

$\lambda = h/p = \frac{1.227}{\sqrt{V}}$ nm

$\lambda = \frac{1.227}{\sqrt{100}}$ nm = 0.123 nm

The de Broglie wavelength associated with an electron in this case is of the
order of X-ray wavelengths.

11.9 DAVISSON AND GERMER EXPERIMENT


The wave nature of electrons was first experimentally verified by C.J.
Davisson and L.H. Germer in 1927 and independently by G.P.
Thomson, in 1928, who observed diffraction effects with beams of
electrons scattered by crystals. Davisson and Thomson shared the
Nobel Prize in 1937 for their experimental discovery of diffraction of
electrons by crystals.
The experimental arrangement used by Davisson and Germer is
schematically shown in Fig. 11.7. It consists of an electron gun which
comprises a tungsten filament F, coated with barium oxide and
heated by a low voltage power supply (L.T. or battery). Electrons
emitted by the filament are accelerated to a desired velocity by
applying suitable potential/voltage from a high voltage power supply
(H.T. or battery). They are made to pass through a cylinder with fine
holes along its axis, producing a fine collimated beam. The beam is
made to fall on the surface of a nickel crystal. The electrons are
scattered in all directions by the atoms of the crystal. The intensity of
the electron beam, scattered in a given direction, is measured by the
electron detector (collector). The detector can be moved on a circular
scale and is connected to a sensitive galvanometer, which records the
current. The deflection of the galvanometer is proportional to the
intensity of the electron beam entering the collector. The apparatus is
enclosed in an evacuated chamber. By moving the detector on the
circular scale at different positions, the intensity of the scattered
electron beam is measured for different values of the angle of scattering
$\theta$, which is the angle between the incident and the scattered electron
beams. The variation of the intensity (I) of the scattered electrons with
the angle of scattering $\theta$ is obtained for different accelerating voltages.
Figure 11.7 Davisson-Germer electron diffraction arrangement.

The experiment was performed by varying the accelerating voltage
from 44 V to 68 V. It was noticed that a strong peak appeared in the
intensity (I) of the scattered electron for an accelerating voltage of 54 V
at a scattering angle $\theta = 50^\circ$.
The appearance of the peak in a particular direction is due to the
constructive interference of electrons scattered from different layers of
the regularly spaced atoms of the crystals. From the electron
diffraction measurements, the wavelength of matter waves was found
to be 0.165 nm.
The de Broglie wavelength $\lambda$ associated with electrons, using
Eq. (11.11), for V = 54 V is given by

$\lambda = h/p = \frac{1.227}{\sqrt{V}}$ nm

$\lambda = \frac{1.227}{\sqrt{54}}$ nm = 0.167 nm
Thus, there is an excellent agreement between the theoretical value
and the experimentally obtained value of de Broglie wavelength.
Davisson-Germer experiment thus strikingly confirms the wave nature
of electrons and the de Broglie relation. More recently, in 1989, the
wave nature of a beam of electrons was experimentally demonstrated
in a double-slit experiment, similar to that used for the wave nature of
light. Also, in an experiment in 1994, interference fringes were
obtained with the beams of iodine molecules, which are about a
million times more massive than electrons.
The de Broglie hypothesis has been basic to the development of
modern quantum mechanics. It has also led to the field of electron
optics. The wave properties of electrons have been utilised in the
design of electron microscope which is a great improvement, with
higher resolution, over the optical microscope.
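
The comparison at the heart of the Davisson-Germer result, the wavelength predicted by Eq. (11.11) against the value measured by diffraction, can be laid out as a small calculation. This is a sketch for illustration: the function and variable names are hypothetical, and only the 54 V and 0.165 nm figures come from the text.

```python
import math


def predicted_wavelength_nm(volts):
    """Eq. (11.11): lambda = 1.227 / sqrt(V) nm, with V in volts."""
    return 1.227 / math.sqrt(volts)


MEASURED_NM = 0.165    # wavelength obtained from the diffraction measurement
PEAK_VOLTAGE = 54.0    # accelerating voltage at which the strong peak appeared

predicted = predicted_wavelength_nm(PEAK_VOLTAGE)
percent_difference = abs(predicted - MEASURED_NM) / MEASURED_NM * 100

print(f"predicted: {predicted:.3f} nm, measured: {MEASURED_NM:.3f} nm")
print(f"difference: {percent_difference:.1f} %")

# Predicted wavelengths across the 44 V to 68 V range used in the experiment.
for volts in range(44, 69, 4):
    print(f"{volts} V -> {predicted_wavelength_nm(volts):.3f} nm")
```

The predicted and measured values agree to within about one or two percent, and across the whole voltage range the wavelength stays in the range typical of atomic-plane spacings in crystals.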

Summary

1. The minimum energy needed by an electron to come out from a metal
surface is called the work function of the metal. Energy (greater than the work
function $\phi_0$) required for electron emission from the metal surface can be
supplied by suitably heating or applying strong electric field or irradiating it by
light of suitable frequency.
2. Photoelectric effect is the phenomenon of emission of electrons by metals
when illuminated by light of suitable frequency. Certain metals respond to
ultraviolet light while others are sensitive even to the visible light.
Photoelectric effect involves conversion of light energy into electrical energy. It
follows the law of conservation of energy. The photoelectric emission is an
instantaneous process and possesses certain special features.
3. Photoelectric current depends on (i) the intensity of incident light, (ii) the
potential difference applied between the two electrodes, and (iii) the nature of
the emitter material.
4. The stopping potential ($V_0$) depends on (i) the frequency of incident light,
and (ii) the nature of the emitter material. For a given frequency of incident
light, it is independent of its intensity. The stopping potential is directly related
to the maximum kinetic energy of electrons emitted:
$e V_0 = \frac{1}{2} m v_{max}^2 = K_{max}$.
5. Below a certain frequency (threshold frequency) $\nu_0$, characteristic of the
metal, no photoelectric emission takes place, no matter how large the intensity
may be.
6. The classical wave theory could not explain the main features of
photoelectric effect. Its picture of continuous absorption of energy from
radiation could not explain the independence of $K_{max}$ on intensity, the
existence of $\nu_0$ and the instantaneous nature of the process. Einstein
explained these features on the basis of photon picture of light. According to
this, light is composed of discrete packets of energy called quanta or photons.
Each photon carries an energy E (= $h\nu$) and momentum p (= $h/\lambda$), which
depend on the frequency ($\nu$) of incident light and not on its intensity.
Photoelectric emission from the metal surface occurs due to absorption of a
photon by an electron.
7. Einstein's photoelectric equation is in accordance with the energy
conservation law as applied to the photon absorption by an electron in the
metal. The maximum kinetic energy $\frac{1}{2} m v_{max}^2$ is equal to
the photon energy ($h\nu$) minus the work function $\phi_0$ (= $h\nu_0$) of the
target metal:

$\frac{1}{2} m v_{max}^2 = V_0 e = h\nu - \phi_0 = h(\nu - \nu_0)$
This photoelectric equation explains all the features of the photoelectric effect.
Millikan's first precise measurements confirmed Einstein's photoelectric
equation and obtained an accurate value of Planck's constant h. This led to the
acceptance of particle or photon description (nature) of electromagnetic
radiation, introduced by Einstein.
8. Radiation has dual nature: wave and particle. The nature of experiment
determines whether a wave or particle description is best suited for
understanding the experimental result. Reasoning that radiation and matter
should be symmetrical in nature, Louis Victor de Broglie attributed a wave-like
character to matter (material particles). The waves associated with the moving
material particles are called matter waves or de Broglie waves.
9. The de Broglie wavelength ($\lambda$) associated with a moving particle is related to
its momentum p as: $\lambda = h/p$. The dualism of matter is inherent in the de Broglie
relation which contains a wave concept ($\lambda$) and a particle concept (p). The de
Broglie wavelength is independent of the charge and nature of the material
particle. It is significantly measurable (of the order of the atomic-planes
spacing in crystals) only in case of sub-atomic particles like electrons, protons,
etc. (due to smallness of their masses and hence, momenta). However, it is
indeed very small, quite beyond measurement, in case of macroscopic objects,
commonly encountered in everyday life.
10. Electron diffraction experiments by Davisson and Germer, and by G. P.
Thomson, as well as many later experiments, have verified and confirmed the
wave-nature of electrons. The de Broglie hypothesis of matter waves supports
Bohr's concept of stationary orbits.
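
Summary points 4 to 7 can be tied together in one small calculation based on Einstein's photoelectric equation. The sketch below is illustrative only: the 2.0 eV work function and the two frequencies are hypothetical values chosen here, the constants are rounded, and the function names are not from the text.

```python
PLANCK_H = 6.63e-34          # J s, rounded
ELECTRON_CHARGE = 1.6e-19    # C, rounded


def stopping_potential(frequency_hz, work_function_ev):
    """Stopping potential V0 (volts) from e*V0 = h*nu - phi0.

    Returns None when h*nu does not exceed the work function, i.e. below
    the threshold frequency, where no photoemission occurs."""
    photon_energy_ev = PLANCK_H * frequency_hz / ELECTRON_CHARGE
    k_max_ev = photon_energy_ev - work_function_ev
    # K_max expressed in eV is numerically equal to V0 expressed in volts.
    return k_max_ev if k_max_ev > 0 else None


def threshold_frequency(work_function_ev):
    """nu0 = phi0 / h, with phi0 converted from eV to joules."""
    return work_function_ev * ELECTRON_CHARGE / PLANCK_H


WORK_FUNCTION_EV = 2.0  # hypothetical metal, chosen only for illustration
print(f"threshold: {threshold_frequency(WORK_FUNCTION_EV):.3e} Hz")
for nu in (4.0e14, 8.0e14):
    print(f"nu = {nu:.1e} Hz -> V0 = {stopping_potential(nu, WORK_FUNCTION_EV)}")
```

Intensity does not appear anywhere in the calculation, in keeping with point 4: for a given metal the stopping potential is fixed by the frequency alone.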

Points to Ponder

1. Free electrons in a metal are free in the sense that they move inside the
metal in a constant potential (This is only an approximation). They are not free
to move out of the metal. They need additional energy to get out of the metal.
2. Free electrons in a metal do not all have the same energy. Like molecules in
a gas jar, the electrons have a certain energy distribution at a given
temperature. This distribution is different from the usual Maxwell's distribution
that you have learnt in the study of kinetic theory of gases. You will learn about
it in later courses, but the difference has to do with the fact that electrons obey
Pauli's exclusion principle.
3. Because of the energy distribution of free electrons in a metal, the energy
required by an electron to come out of the metal is different for different
electrons. Electrons with higher energy require less additional energy to come
out of the metal than those with lower energies. Work function is the least
energy required by an electron to come out of the metal.
4. Observations on photoelectric effect imply that in the event of matter-light
interaction, absorption of energy takes place in discrete units of $h\nu$. This is not
quite the same as saying that light consists of particles, each of energy $h\nu$.
5. Observations on the stopping potential (its independence of intensity and
dependence on frequency) are the crucial discriminator between the wave-
picture and photon-picture of photoelectric effect.

6. The wavelength of a matter wave given by $\lambda = h/p$ has physical
significance; its phase velocity $v_p$ has no physical
significance. However, the group velocity of the matter wave is physically
meaningful and equals the velocity of the particle.

Exercises

11.1 Find the
(a) maximum frequency, and
(b) minimum wavelength of X-rays produced by 30 kV electrons.
11.2 The work function of caesium metal is 2.14 eV. When light of
frequency $6 \times 10^{14}$ Hz is incident on the metal surface,
photoemission of electrons occurs. What is the
(a) maximum kinetic energy of the emitted electrons,
(b) stopping potential, and
(c) maximum speed of the emitted photoelectrons?

11.3 The photoelectric cut-off voltage in a certain experiment is 1.5
V. What is the maximum kinetic energy of photoelectrons emitted?
11.4 Monochromatic light of wavelength 632.8 nm is produced by
a helium-neon laser. The power emitted is 9.42 mW.

(a) Find the energy and momentum of each photon in the light
beam,
(b) How many photons per second, on the average, arrive at a
target irradiated by this beam? (Assume the beam to have uniform
cross-section which is less than the target area), and
(c) How fast does a hydrogen atom have to travel in order to have
the same momentum as that of the photon?
11.5 The energy flux of sunlight reaching the surface of the earth
is $1.388 \times 10^3$ W/m$^2$. How many photons (nearly) per square metre
are incident on the Earth per second? Assume that the photons in
the sunlight have an average wavelength of 550 nm.
11.6 In an experiment on photoelectric effect, the slope of the cut-
off voltage versus frequency of incident light is found to be
$4.12 \times 10^{-15}$ V s. Calculate the value of Planck's constant.
11.7 A 100W sodium lamp radiates energy uniformly in all
directions. The lamp is located at the centre of a large sphere that
absorbs all the sodium light which is incident on it. The wavelength
of the sodium light is 589 nm. (a) What is the energy per photon
associated with the sodium light? (b) At what rate are the photons
delivered to the sphere?
11.8 The threshold frequency for a certain metal is $3.3 \times 10^{14}$ Hz.
If light of frequency $8.2 \times 10^{14}$ Hz is incident on the metal, predict
the cut-off voltage for the photoelectric emission.
11.9 The work function for a certain metal is 4.2 eV. Will this metal
give photoelectric emission for incident radiation of wavelength
330 nm?
11.10 Light of frequency $7.21 \times 10^{14}$ Hz is incident on a metal
surface. Electrons with a maximum speed of $6.0 \times 10^5$ m/s are
ejected from the surface. What is the threshold frequency for
photoemission of electrons?
11.11 Light of wavelength 488 nm is produced by an argon laser
which is used in the photoelectric effect. When light from this
spectral line is incident on the emitter, the stopping (cut-off)
potential of photoelectrons is 0.38 V. Find the work function of the
material from which the emitter is made.
11.12 Calculate the
(a) momentum, and
(b) de Broglie wavelength of the electrons accelerated through a
potential difference of 56 V.
11.13 What is the
(a) momentum,
(b) speed, and
(c) de Broglie wavelength of an electron with kinetic energy of
120 eV?
11.14 The wavelength of light from the spectral emission line of
sodium is 589 nm. Find the kinetic energy at which
(a) an electron, and

(b) a neutron, would have the same de Broglie wavelength.


11.15 What is the de Broglie wavelength of
(a) a bullet of mass 0.040 kg travelling at the speed of 1.0 km/s,
(b) a ball of mass 0.060 kg moving at a speed of 1.0 m/s, and
(c) a dust particle of mass $1.0 \times 10^{-9}$ kg drifting with a speed of
2.2 m/s?
11.16 An electron and a photon each have a wavelength of 1.00
nm. Find
(a) their momenta,

(b) the energy of the photon, and


(c) the kinetic energy of electron.
11.17 (a) For what kinetic energy of a neutron will the associated
de Broglie wavelength be $1.40 \times 10^{-10}$ m?
(b) Also find the de Broglie wavelength of a neutron, in thermal
equilibrium with matter, having an average kinetic energy of (3/2) kT
at 300 K.
11.18 Show that the wavelength of electromagnetic radiation is
equal to the de Broglie wavelength of its quantum (photon).

11.19 What is the de Broglie wavelength of a nitrogen molecule in
air at 300 K? Assume that the molecule is moving with the root-
mean-square speed of molecules at this temperature. (Atomic
mass of nitrogen = 14.0076 u)
Additional Exercises

11.20 (a) Estimate the speed with which electrons emitted from a
heated emitter of an evacuated tube impinge on the collector
maintained at a potential difference of 500 V with respect to the
emitter. Ignore the small initial speeds of the electrons.
The specific charge of the electron, i.e., its e/m is given to be
$1.76 \times 10^{11}$ C kg$^{-1}$.
(b) Use the same formula you employ in (a) to obtain electron
speed for a collector potential of 10 MV. Do you see what is
wrong? In what way is the formula to be modified?
11.21 (a) A monoenergetic electron beam with electron speed
of $5.20 \times 10^6$ m s$^{-1}$ is subject to a magnetic field of
$1.30 \times 10^{-4}$ T normal to the beam velocity. What is the radius of
the circle traced by the beam, given e/m for electron equals
$1.76 \times 10^{11}$ C kg$^{-1}$?
(b) Is the formula you employ in (a) valid for calculating radius of
the path of a 20 MeV electron beam? If not, in what way is it
modified?

[Note: Exercises 11.20(b) and 11.21(b) take you to relativistic
mechanics which is beyond the scope of this book. They have
been inserted here simply to emphasise the point that the formulas
you use in part (a) of the exercises are not valid at very high
speeds or energies. See answers at the end to know what very
high speed or energy means.]
11.22 An electron gun with its collector at a potential of 100 V fires
out electrons in a spherical bulb containing hydrogen gas at low
pressure ($\sim 10^{-2}$ mm of Hg). A magnetic field of $2.83 \times 10^{-4}$ T
curves the path of the electrons in a circular orbit of radius 12.0
cm. (The path can be viewed because the gas ions in the path
focus the beam by attracting electrons, and emitting light by
electron capture; this method is known as the fine beam tube
method.) Determine e/m from the data.

11.23 (a) An X-ray tube produces a continuous spectrum of
radiation with its short wavelength end at 0.45 Å. What is the
maximum energy of a photon in the radiation?
(b) From your answer to (a), guess what order of accelerating
voltage (for electrons) is required in such a tube?
11.24 In an accelerator experiment on high-energy collisions of
electrons with positrons, a certain event is interpreted as
annihilation of an electron-positron pair of total energy 10.2 BeV
into two $\gamma$-rays of equal energy. What is the wavelength
associated with each $\gamma$-ray? (1 BeV = $10^9$ eV)
11.25 Estimating the following two numbers should be interesting.
The first number will tell you why radio engineers do not need to
worry much about photons! The second number tells you why our
eye can never count photons, even in barely detectable light.
(a) The number of photons emitted per second by a Medium wave
transmitter of 10 kW power, emitting radiowaves of wavelength
500 m.
(b) The number of photons entering the pupil of our eye per
second corresponding to the minimum intensity of white light that
we humans can perceive ($\sim 10^{-10}$ W m$^{-2}$). Take the area of the
pupil to be about 0.4 cm$^2$, and the average frequency of white light
to be about $6 \times 10^{14}$ Hz.
11.26 Ultraviolet light of wavelength 2271 Å from a 100 W mercury
source irradiates a photo-cell made of molybdenum metal. If the
stopping potential is 1.3 V, estimate the work function of the
metal. How would the photo-cell respond to a high intensity
($\sim 10^5$ W m$^{-2}$) red light of wavelength 6328 Å produced by a He-Ne
laser?

11.27 Monochromatic radiation of wavelength 640.2 nm (1 nm =
$10^{-9}$ m) from a neon lamp irradiates photosensitive material made
of caesium on tungsten. The stopping voltage is measured to be
0.54 V. The source is replaced by an iron source and its 427.2 nm
line irradiates the same photo-cell. Predict the new stopping
voltage.
11.28 A mercury lamp is a convenient source for studying
frequency dependence of photoelectric emission, since it gives a
number of spectral lines ranging from the UV to the red end of the
visible spectrum. In our experiment with rubidium photo-cell, the
following lines from a mercury source were used:

$\lambda_1$ = 3650 Å, $\lambda_2$ = 4047 Å, $\lambda_3$ = 4358 Å, $\lambda_4$ = 5461 Å, $\lambda_5$ = 6907 Å.

The stopping voltages, respectively, were measured to be:

$V_{01}$ = 1.28 V, $V_{02}$ = 0.95 V, $V_{03}$ = 0.74 V, $V_{04}$ = 0.16 V, $V_{05}$ = 0 V

Determine the value of Planck's constant h, the threshold
frequency and work function for the material.

[Note: You will notice that to get h from the data, you will need to
know e (which you can take to be $1.6 \times 10^{-19}$ C). Experiments of
this kind on Na, Li, K, etc. were performed by Millikan, who, using
his own value of e (from the oil-drop experiment) confirmed
Einstein's photoelectric equation and at the same time gave an
independent estimate of the value of h.]
11.29 The work function for the following metals is given:

Na: 2.75 eV; K: 2.30 eV; Mo: 4.17 eV; Ni: 5.15 eV. Which of these
metals will not give photoelectric emission for a radiation of
wavelength 3300 Å from a He-Cd laser placed 1 m away from the
photocell? What happens if the laser is brought nearer and placed
50 cm away?
11.30 Light of intensity $10^{-5}$ W m$^{-2}$ falls on a sodium photo-cell of
surface area 2 cm$^2$. Assuming that the top 5 layers of sodium
absorb the incident energy, estimate time required for
photoelectric emission in the wave-picture of radiation. The work
function for the metal is given to be about 2 eV. What is the
implication of your answer?
11.31 Crystal diffraction experiments can be performed using X-
rays, or electrons accelerated through appropriate voltage. Which
probe has greater energy? (For quantitative comparison, take the
wavelength of the probe equal to 1 Å, which is of the order of inter-
atomic spacing in the lattice) ($m_e = 9.11 \times 10^{-31}$ kg).
11.32 (a) Obtain the de Broglie wavelength of a neutron of kinetic
energy 150 eV. As you have seen in Exercise 11.31, an electron
beam of this energy is suitable for crystal diffraction experiments.
Would a neutron beam of the same energy be equally suitable?
Explain. ($m_n = 1.675 \times 10^{-27}$ kg)
(b) Obtain the de Broglie wavelength associated with thermal
neutrons at room temperature (27 °C). Hence explain why a fast
neutron beam needs to be thermalised with the environment
before it can be used for neutron diffraction experiments.
11.33 An electron microscope uses electrons accelerated by a
voltage of 50 kV. Determine the de Broglie wavelength associated
with the electrons. If other factors (such as numerical aperture,
etc.) are taken to be roughly the same, how does the resolving
power of an electron microscope compare with that of an optical
microscope which uses yellow light?
11.34 The wavelength of a probe is roughly a measure of the size
of a structure that it can probe in some detail. The quark
structure of protons and neutrons appears at the minute length-
scale of $10^{-15}$ m or less. This structure was first probed in early 1970s
using high energy electron beams produced by a linear
accelerator at Stanford, USA. Guess what might have been the
order of energy of these electron beams. (Rest mass energy of
electron = 0.511 MeV.)
11.35 Find the typical de Broglie wavelength associated with a He
atom in helium gas at room temperature (27 °C) and 1 atm
pressure; and compare it with the mean separation between two
atoms under these conditions.
11.36 Compute the typical de Broglie wavelength of an electron in
a metal at 27 °C and compare it with the mean separation
between two electrons in a metal which is given to be about
$2 \times 10^{-10}$ m.

[Note: Exercises 11.35 and 11.36 reveal that while the wave-
packets associated with gaseous molecules under ordinary
conditions are non-overlapping, the electron wave-packets in a
metal strongly overlap with one another. This suggests that
whereas molecules in an ordinary gas can be distinguished from one
another, electrons in a metal cannot be distinguished from one
another. This indistinguishability has many fundamental
implications which you will explore in more advanced Physics
courses.]
11.37 Answer the following questions:
(a) Quarks inside protons and neutrons are thought to carry
fractional charges [(+2/3)e ; (-1/3)e]. Why do they not show up in
Millikan's oil-drop experiment?
(b) What is so special about the combination e/m? Why do we not
simply talk of e and m separately?
(c) Why should gases be insulators at ordinary pressures and start
conducting at very low pressures?
(d) Every metal has a definite work function. Why do all
photoelectrons not come out with the same energy if incident
radiation is monochromatic? Why is there an energy distribution of
photoelectrons?
(e) The energy and momentum of an electron are related to the
frequency and wavelength of the associated matter wave by the
relations:

$E = h\nu$, $p = h/\lambda$

But while the value of $\lambda$ is physically significant, the value of $\nu$ (and
therefore, the value of the phase speed $\nu\lambda$) has no physical
significance. Why?

Appendix

11.1 The history of wave-particle flip-flop


What is light? This question has haunted mankind for a long time. But
systematic experiments were done by scientists since the dawn of the
scientific and industrial era, about four centuries ago. Around the
same time, theoretical models about what light is made of were
developed. While building a model in any branch of science, it is
essential to see that it is able to explain all the experimental
observations existing at that time. It is therefore appropriate to
summarize some observations about light that were known in the
seventeenth century.

The properties of light known at that time included (a) rectilinear
propagation of light, (b) reflection from plane and curved surfaces, (c)
refraction at the boundary of two media, (d) dispersion into various
colours, (e) high speed. Appropriate laws were formulated for the first
four phenomena. For example, Snell formulated his laws of refraction
in 1621. Several scientists right from the days of Galileo had tried to
measure the speed of light. But they had not been able to do so. They
had only concluded that it was higher than the limit of their
measurement.
Two models of light were also proposed in the seventeenth century.
Descartes, in early decades of seventeenth century, proposed that
light consists of particles, while Huygens, around 1650-60, proposed
that light consists of waves. Descartes' proposal was merely a
philosophical model, devoid of any experiments or scientific
arguments. Newton soon after, around 1660-70, extended Descartes'
particle model, known as corpuscular theory, built it up as a scientific
theory, and explained various known properties with it. These models,
light as waves and as particles, in a sense, are quite opposite of each
other. But both models could explain all the known properties of light.
There was nothing to choose between them.
The history of the development of these models over the next few
centuries is interesting. Bartholinus, in 1669, discovered double
refraction of light in some crystals, and Huygens, in 1678, was quick to
explain it on the basis of his wave theory of light. In spite of this, for
over one hundred years, Newton's particle model was firmly believed
and preferred over the wave model. This was partly because of its
simplicity and partly because of Newton's influence on contemporary
physics.

Then in 1801, Young performed his double-slit experiment and
observed interference fringes. This phenomenon could be explained
only by wave theory. It was realized that diffraction was another
phenomenon which could be explained only by wave theory. In fact, it
was a natural consequence of Huygens' idea of secondary wavelets
emanating from every point in the path of light. These experiments
could not be explained by assuming that light consists of particles.
Another phenomenon, polarisation, was discovered around 1810,
and this too could be naturally explained by the wave theory. Thus
wave theory of Huygens came to the forefront and Newton's particle
theory went into the background. This situation again continued for
almost a century.
Better experiments were performed in the nineteenth century to
determine the speed of light. With more accurate experiments, a value
of $3 \times 10^8$ m/s for speed of light in vacuum was arrived at. Around
1860, Maxwell proposed his equations of electromagnetism and it was
realized that all electromagnetic phenomena known at that time could
be explained by Maxwell's four equations. Soon Maxwell showed that
electric and magnetic fields could propagate through empty space
(vacuum) in the form of electromagnetic waves. He calculated the
speed of these waves and arrived at a theoretical value of
$2.998 \times 10^8$ m/s. The close agreement of this value with the experimental value
suggested that light consists of electromagnetic waves. In 1887 Hertz
demonstrated the generation and detection of such waves. This
established the wave theory of light on a firm footing. We might say
that while eighteenth century belonged to the particle model, the
nineteenth century belonged to the wave model of light.

A large number of experiments were done during the period 1850-1900
on heat and related phenomena, an altogether different area of
physics. Theories and models like kinetic theory and thermodynamics
were developed which quite successfully explained the various
phenomena, except one.
Every body at any temperature emits radiation of all wavelengths. It
also absorbs radiation falling on it. A body which absorbs all the
radiation falling on it is called a black body. It is an ideal concept in
physics, like concepts of a point mass or uniform motion. A graph of
the intensity of radiation emitted by a body versus wavelength is called
the black body spectrum. No theory in those days could explain the
complete black body spectrum!
In 1900, Planck hit upon a novel idea. If we assume, he said, that
radiation is emitted in packets of energy instead of continuously as in
a wave, then we can explain the black body spectrum. Planck himself
regarded these quanta, or packets, as a property of emission and
absorption, rather than that of light. He derived a formula which
agreed with the entire spectrum. This was a confusing mixture of wave
and particle pictures: radiation is emitted as a particle, it travels as a
wave, and is again absorbed as a particle! Moreover, this put
physicists in a dilemma. Should we again accept the particle picture of
light just to explain one phenomenon? Then what happens to the
phenomena of interference and diffraction which cannot be explained
by the particle model?
But soon in 1905, Einstein explained the photoelectric effect by
assuming the particle picture of light. In 1907, Debye explained the
low temperature specific heats of solids by using the particle picture
for lattice vibrations in a crystalline solid. Both these phenomena
belonging to widely diverse areas of physics could be explained only
by the particle model and not by the wave model. In 1923, Compton's
X-ray scattering experiments from atoms also went in favour of the
particle picture. This increased the dilemma further.
Thus by 1923, physicists were faced with the following situation. (a) There
were some phenomena like rectilinear propagation, reflection,
refraction, which could be explained by either particle model or by
wave model. (b) There were some phenomena such as diffraction and
interference which could be explained only by the wave model but not
by the particle model. (c) There were some phenomena such as black
body radiation, photoelectric effect, and Compton scattering which
could be explained only by the particle model but not by the wave
model. Somebody in those days aptly remarked that light behaves as
a particle on Mondays, Wednesdays and Fridays, and as a wave on
Tuesdays, Thursdays and Saturdays, and we don't talk of light on
Sundays!
In 1924, de Broglie proposed his theory of wave-particle duality in
which he said that not only photons of light but also particles of
matter such as electrons and atoms possess a dual character,
sometimes behaving like a particle and sometimes as a wave. He
gave a formula connecting their mass, velocity, momentum (particle
characteristics), with their wavelength and frequency (wave
characteristics)! In 1927 Thomson, and Davisson and Germer, in
separate experiments, showed that electrons did behave like waves
with a wavelength which agreed with that given by de Broglie's
formula. Their experiments were on diffraction of electrons through
crystalline solids, in which the regular arrangement of atoms acted like
a grating. Very soon, diffraction experiments with other particles such
as neutrons and protons were performed and these too agreed with
de Broglie's formula. This confirmed wave-particle duality as an
established principle of physics. Here was a principle, physicists
thought, which explained all the phenomena mentioned above not
only for light but also for the so-called particles.
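
The passage above refers to de Broglie's formula without writing it out. The sketch below assumes its standard form, wavelength = h / (m × v), together with the non-relativistic momentum of an electron accelerated through a potential difference V, p = sqrt(2 × m × e × V). The constants and the 54 V example are illustrative assumptions and do not come from the text.

```python
import math

# Assumed constants (SI units), not stated in the text:
PLANCK_H = 6.62607015e-34           # Planck constant, J s
ELECTRON_MASS = 9.1093837e-31       # electron rest mass, kg
ELEMENTARY_CHARGE = 1.602176634e-19 # magnitude of electron charge, C


def de_broglie_wavelength(mass, speed):
    """Wavelength (m) of a particle of given mass (kg) and speed (m/s)."""
    return PLANCK_H / (mass * speed)


def electron_wavelength_from_voltage(volts):
    """Wavelength (m) of an electron accelerated from rest through `volts`.

    Uses the non-relativistic momentum p = sqrt(2 m e V).
    """
    momentum = math.sqrt(2 * ELECTRON_MASS * ELEMENTARY_CHARGE * volts)
    return PLANCK_H / momentum


wavelength = electron_wavelength_from_voltage(54.0)
print(f"Electron wavelength at 54 V: {wavelength * 1e9:.3f} nm")
```

A wavelength of this size is comparable to the spacing between atoms in a crystal, which is why a crystalline solid can act as a diffraction grating for electrons.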
But there was no basic theoretical foundation for wave-particle duality.
De Broglie's proposal was merely a qualitative argument based on
symmetry of nature. Wave-particle duality was at best a principle, not
an outcome of a sound fundamental theory. It is true that all
experiments agreed with de Broglie's formula. But physics
does not work that way. On the one hand, it needs experimental
confirmation, while on the other hand, it also needs sound theoretical
basis for the models proposed. This was developed over the next two
decades. Dirac developed his theory of radiation in about 1928, and
Heisenberg and Pauli gave it a firm footing by 1930. Tomonaga,
Schwinger, and Feynman, in the late 1940s, produced further refinements
and cleared the theory of inconsistencies which were noticed. All
these theories mainly put wave-particle duality on a theoretical footing.
Although the story continues, it grows more and more complex and
beyond the scope of this note. But we have here the essential
structure of what happened, and let us be satisfied with it at the
moment. Now it is regarded as a natural consequence of present
theories of physics that electromagnetic radiation as well as particles
of matter exhibit both wave and particle properties in different
experiments, and sometimes even in the different parts of the same
experiment.
Unit 1

The Solid State

Objectives

After studying this Unit, you will be able to

describe general characteristics of solid state;

distinguish between amorphous and crystalline solids;

classify crystalline solids on the basis of the nature of binding forces;

define crystal lattice and unit cells;

explain close packing of particles;

describe different types of voids and close packed structures;

calculate the packing efficiency of different types of cubic unit cells;

correlate the density of a substance with its unit cell properties;

describe the imperfections in solids and their effect on properties;

correlate the electrical and magnetic properties of solids and their
structure.
The vast majority of solid substances like high temperature
superconductors, biocompatible plastics, silicon chips, etc.
are destined to play an ever expanding role in future
development of science.

We are mostly surrounded by solids and we use them more often than liquids
and gases. For different applications we need solids with widely different
properties. These properties depend upon the nature of constituent particles
and the binding forces operating between them. Therefore, study of the
structure of solids is important. The correlation between structure and
properties helps in discovering new solid materials with desired properties
like high temperature superconductors, magnetic materials, biodegradable
polymers for packaging, biocompliant solids for surgical implants, etc.

From our earlier studies, we know that liquids and gases are called fluids
because of their ability to flow. The fluidity in both of these states is due to
the fact that the molecules are free to move about. On the contrary, the
constituent particles in solids have fixed positions and can only oscillate
about their mean positions. This explains the rigidity in solids. In crystalline
solids, the constituent particles are arranged in regular patterns.

In this Unit, we shall discuss different possible arrangements of particles
resulting in several types of structures. The correlation between the nature of
interactions within the constituent particles and several properties of solids
will also be explored. How these properties get modified due to structural
imperfections or by the presence of impurities in minute amounts will also
be discussed.
1.1 General Characteristics of Solid State

In Class XI you have learnt that matter can exist in three states namely, solid,
liquid and gas. Under a given set of conditions of temperature and pressure,
which of these would be the most stable state of a given substance depends
upon the net effect of two opposing factors. Intermolecular forces tend to
keep the molecules (or atoms or ions) closer, whereas thermal energy tends to
keep them apart by making them move faster. At sufficiently low
temperature, the thermal energy is low and intermolecular forces bring them
so close that they cling to one another and occupy fixed positions. These can
still oscillate about their mean positions and the substance exists in solid
state. The following are the characteristic properties of the solid state:

(i) They have definite mass, volume and shape.

(ii) Intermolecular distances are short.

(iii) Intermolecular forces are strong.

(iv) Their constituent particles (atoms, molecules or ions) have fixed
positions and can only oscillate about their mean positions.

(v) They are incompressible and rigid.

1.2 Amorphous and Crystalline Solids

Solids can be classified as crystalline or amorphous on the basis of the nature
of order present in the arrangement of their constituent particles. A crystalline
solid usually consists of a large number of small crystals, each of them
having a definite characteristic geometrical shape. In a crystal, the
arrangement of constituent particles (atoms, molecules or ions) is ordered. It
has long range order which means that there is a regular pattern of
arrangement of particles which repeats itself periodically over the entire
crystal. Sodium chloride and quartz are typical examples of crystalline solids.
An amorphous solid (Greek amorphos = no form) consists of particles of
irregular shape. The arrangement of constituent particles (atoms, molecules
or ions) in such a solid has only short range order. In such an arrangement, a
regular and periodically repeating pattern is observed over short distances
only. Such portions are scattered and in between the arrangement is
disordered. The structures of quartz (crystalline) and quartz glass
(amorphous) are shown in Fig. 1.1 (a) and (b) respectively. Although the two
structures are almost identical, in the case of amorphous quartz glass there
is no long range order. The structure of amorphous solids is similar to that of
liquids. Glass, rubber and plastics are typical examples of amorphous solids.
Due to the differences in the arrangement of the constituent particles, the two
types of solids differ in their properties.

Fig. 1.1: Two dimensional structure of (a) quartz and (b) quartz glass
Crystalline solids have a sharp melting point. On the other hand, amorphous
solids soften over a range of temperature and can be moulded and blown into
various shapes. On heating they become crystalline at some temperature.
Some glass objects from ancient civilisations are found to become milky in
appearance because of some crystallisation. Like liquids, amorphous solids
have a tendency to flow, though very slowly. Therefore, sometimes these are
called pseudo solids or super cooled liquids. Glass panes fixed to windows or
doors of old buildings are invariably found to be slightly thicker at the bottom
than at the top. This is because the glass flows down very slowly and makes
the bottom portion slightly thicker.

Crystalline solids are anisotropic in nature, that is, some of their physical
properties like electrical resistance or refractive index show different values
when measured along different directions in the same crystals. This arises
from different arrangement of particles in different directions. This is
illustrated in Fig. 1.2. Since the arrangement of particles is different along
different directions, the value of same physical property is found to be
different along each direction.

Fig. 1.2: Anisotropy in crystals is due to different arrangement of particles along different
directions.

Amorphous solids, on the other hand, are isotropic in nature. This is because
there is no long range order in them and the arrangement is irregular along all
directions. Therefore, the value of any physical property would be the same along
any direction. These differences are summarised in Table 1.1.

Table 1.1: Distinction between Crystalline and Amorphous Solids

Property | Crystalline solids | Amorphous solids
Shape | Definite characteristic geometrical shape | Irregular shape
Melting point | Melt at a sharp and characteristic temperature | Gradually soften over a range of temperature
Cleavage property | When cut with a sharp edged tool, they split into two pieces and the newly generated surfaces are plain and smooth | When cut with a sharp edged tool, they cut into two pieces with irregular surfaces
Heat of fusion | They have a definite and characteristic heat of fusion | They do not have definite heat of fusion
Anisotropy | Anisotropic in nature | Isotropic in nature
Nature | True solids | Pseudo solids or super cooled liquids
Order in arrangement of constituent particles | Long range order | Only short range order

Amorphous solids are useful materials. Glass, rubber and plastics find many
applications in our daily lives. Amorphous silicon is one of the best
photovoltaic materials available for conversion of sunlight into electricity.

Intext Questions

1.1 Why are solids rigid?

1.2 Why do solids have a definite volume?

1.3 Classify the following as amorphous or crystalline solids:
Polyurethane, naphthalene, benzoic acid, teflon, potassium nitrate,
cellophane, polyvinyl chloride, fibre glass, copper.

1.4 Why is glass considered a super cooled liquid?

1.5 Refractive index of a solid is observed to have the same value along
all directions. Comment on the nature of this solid. Would it show
cleavage property?
1.3 Classification of Crystalline Solids

In Section 1.2, we have learnt about amorphous substances and that they have
only short range order. However, most of the solid substances are crystalline
in nature. For example, all the metallic elements like iron, copper and silver;
non metallic elements like sulphur, phosphorus and iodine and compounds
like sodium chloride, zinc sulphide and naphthalene form crystalline solids.

Crystalline solids can be classified on the basis of nature of intermolecular
forces operating in them into four categories viz., molecular, ionic, metallic
and covalent solids. Let us now learn about these categories.

1.3.1 Molecular Solids

Molecules are the constituent particles of molecular solids. These are further
sub divided into the following categories:

(i) Non polar Molecular Solids: They comprise either atoms, for example,
argon and helium, or molecules formed by non polar covalent bonds, for
example H2, Cl2 and I2. In these solids, the atoms or molecules are held by
weak dispersion forces or London forces about which you have learnt in
Class XI. These solids are soft and non-conductors of electricity. They have
low melting points and are usually in liquid or gaseous state at room
temperature and pressure.
(ii) Polar Molecular Solids: The molecules of substances like HCl, SO2, etc.
are formed by polar covalent bonds. The molecules in such solids are held
together by relatively stronger dipole-dipole interactions. These solids are
soft and non-conductors of electricity. Their melting points are higher than
those of non polar molecular solids yet most of these are gases or liquids
under room temperature and pressure. Solid SO2 and solid NH3 are some
examples of such solids.

(iii) Hydrogen Bonded Molecular Solids: The molecules of such solids
contain polar covalent bonds between H and F, O or N atoms. Strong
hydrogen bonding binds molecules of such solids like H2O (ice). They are
non-conductors of electricity. Generally they are volatile liquids or soft solids
under room temperature and pressure.

1.3.2 Ionic Solids

Ions are the constituent particles of ionic solids. Such solids are formed by
the three dimensional arrangements of cations and anions bound by strong
coulombic (electrostatic) forces. These solids are hard and brittle in nature.
They have high melting and boiling points. Since the ions are not free to
move about, they are electrical insulators in the solid state. However, in the
molten state or when dissolved in water, the ions become free to move about
and they conduct electricity.

1.3.3 Metallic Solids


Metals are an orderly collection of positive ions surrounded by and held together
by a sea of free electrons. These electrons are mobile and are evenly spread
out throughout the crystal. Each metal atom contributes one or more electrons
towards this sea of mobile electrons. These free and mobile electrons are
responsible for high electrical and thermal conductivity of metals. When an
electric field is applied, these electrons flow through the network of positive
ions. Similarly, when heat is supplied to one portion of a metal, the thermal
energy is uniformly spread throughout by free electrons. Another important
characteristic of metals is their lustre and colour in certain cases. This is also
due to the presence of free electrons in them. Metals are highly malleable and
ductile.

1.3.4 Covalent or Network Solids

A wide variety of crystalline solids of non-metals result from the formation
of covalent bonds between adjacent atoms throughout the crystal. They are
also called giant molecules. Covalent bonds are strong and directional in
nature, therefore atoms are held very strongly at their positions. Such solids
are very hard and brittle. They have extremely high melting points and may
even decompose before melting. They are insulators and do not conduct
electricity. Diamond (Fig. 1.3) and silicon carbide are typical examples of
such solids. Graphite is soft and a conductor of electricity. Its exceptional
properties are due to its typical structure (Fig. 1.4). Carbon atoms are
arranged in different layers and each atom is covalently bonded to three of its
neighbouring atoms in the same layer. The fourth valence electron of each
atom is present between different layers and is free to move about. These free
electrons make graphite a good conductor of electricity. Different layers can
slide one over the other. This makes graphite a soft solid and a good solid
lubricant.

Fig. 1.3: Network structure of diamond
Fig. 1.4: Structure of graphite

The different properties of the four types of solids are listed in Table 1.2.

Table 1.2: Different Types of Solids

Type of Solid | Constituent Particles | Bonding/Attractive Forces | Examples | Physical Nature
(1) Molecular solids, (i) Non polar | Molecules | Dispersion or London forces | Ar, CCl4, H2, I2, CO2 | Soft
(1) Molecular solids, (ii) Polar | Molecules | Dipole-dipole interactions | HCl, SO2 | Soft
(1) Molecular solids, (iii) Hydrogen bonded | Molecules | Hydrogen bonding | H2O (ice) | Hard
(2) Ionic solids | Ions | Coulombic or electrostatic | NaCl, MgO, ZnS, CaF2 | Hard but brittle
(3) Metallic solids | Positive ions in a sea of delocalised electrons | Metallic bonding | Fe, Cu, Ag | Hard but malleable
(4) Covalent or network solids | Atoms | Covalent bonding | SiO2 (quartz), SiC, C (diamond), AlN | Hard
(4) Covalent or network solids | Atoms | Covalent bonding | C (graphite) | Soft

Intext Questions

1.6 Classify the following solids in different categories based on the
nature of intermolecular forces operating in them:
Potassium sulphate, tin, benzene, urea, ammonia, water, zinc sulphide,
graphite, rubidium, argon, silicon carbide.

1.7 Solid A is a very hard electrical insulator in solid as well as in molten
state and melts at extremely high temperature. What type of solid is it?

1.8 Ionic solids conduct electricity in molten state but not in solid state.
Explain.

1.9 What type of solids are electrical conductors, malleable and ductile?

1.4 Crystal Lattices and Unit Cells

The main characteristic of crystalline solids is a regular and repeating pattern
of constituent particles. If the three dimensional arrangement of constituent
particles in a crystal is represented diagrammatically, with each particle
depicted as a point, the arrangement is called a crystal lattice. Thus, a regular
three dimensional arrangement of points in space is called a crystal lattice. A
portion of a crystal lattice is shown in Fig. 1.5.

Fig. 1.5: A portion of a three dimensional cubic lattice and its unit cell.
There are only 14 possible three dimensional lattices. These are called
Bravais Lattices (after the French mathematician who first described them).
The following are the characteristics of a crystal lattice:

(a) Each point in a lattice is called a lattice point or lattice site.

(b) Each point in a crystal lattice represents one constituent particle which
may be an atom, a molecule (group of atoms) or an ion.

(c) Lattice points are joined by straight lines to bring out the geometry of the
lattice. Unit cell is the smallest portion of a crystal lattice which, when
repeated in different directions, generates the entire lattice.

A unit cell is characterised by:

(i) its dimensions along the three edges, a, b and c. These edges may or may
not be mutually perpendicular.

(ii) angles between the edges, α (between b and c), β (between a and c) and
γ (between a and b). Thus, a unit cell is characterised by six parameters, a, b, c,
α, β and γ. These parameters of a typical unit cell are shown in Fig. 1.6.

Fig. 1.6: Illustration of parameters of a unit cell.
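
The six parameters a, b, c, α, β and γ fix the size and shape of a unit cell completely. As an illustration, the sketch below computes the cell volume from them using the general parallelepiped formula V = abc × sqrt(1 − cos²α − cos²β − cos²γ + 2 cos α cos β cos γ). This formula is a standard geometrical result that is not derived in this Unit, and the function name is hypothetical.

```python
import math


def unit_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume of a unit cell from its six parameters (angles in degrees).

    alpha is the angle between b and c, beta between a and c,
    gamma between a and b, as defined in the text.
    """
    cos_a = math.cos(math.radians(alpha_deg))
    cos_b = math.cos(math.radians(beta_deg))
    cos_g = math.cos(math.radians(gamma_deg))
    shape_factor = math.sqrt(
        1 - cos_a**2 - cos_b**2 - cos_g**2 + 2 * cos_a * cos_b * cos_g
    )
    return a * b * c * shape_factor


# A cubic cell (all angles 90 degrees) reduces to a * b * c.
print(unit_cell_volume(2.0, 2.0, 2.0, 90, 90, 90))
# A hexagonal cell (gamma = 120 degrees) is smaller than a * b * c.
print(unit_cell_volume(1.0, 1.0, 2.0, 90, 90, 120))
```

When all three angles are 90°, the square-root factor becomes 1 and the volume is simply the product of the three edge lengths.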

1.4.1 Primitive and Centred Unit Cells


Unit cells can be broadly divided into two categories, primitive and centred
unit cells.

(a) Primitive Unit Cells

When constituent particles are present only on the corner positions of a unit
cell, it is called a primitive unit cell.

(b) Centred Unit Cells

When a unit cell contains one or more constituent particles present at
positions other than corners in addition to those at corners, it is called a
centred unit cell. Centred unit cells are of three types:

(i) Body-Centred Unit Cells: Such a unit cell contains one constituent particle
(atom, molecule or ion) at its body-centre besides the ones that are at its
corners.

(ii) Face-Centred Unit Cells: Such a unit cell contains one constituent
particle present at the centre of each face, besides the ones that are at its
corners.

(iii) End-Centred Unit Cells: In such a unit cell, one constituent particle is
present at the centre of any two opposite faces besides the ones present at its
corners.

In all, there are seven types of primitive unit cells (Fig. 1.7).
Fig. 1.7: Seven primitive unit cells in crystals

Their characteristics along with the centred unit cells they can form have
been listed in Table 1.3.

Table 1.3: Seven Primitive Unit Cells and their Possible Variations as
Centred Unit Cells

Crystal system | Possible variations | Axial distances or edge lengths | Axial angles | Examples
Cubic | Primitive, Body-centred, Face-centred | a = b = c | α = β = γ = 90° | NaCl, Zinc blende, Cu
Tetragonal | Primitive, Body-centred | a = b ≠ c | α = β = γ = 90° | White tin, SnO2, TiO2, CaSO4
Orthorhombic | Primitive, Body-centred, Face-centred, End-centred | a ≠ b ≠ c | α = β = γ = 90° | Rhombic sulphur, KNO3, BaSO4
Hexagonal | Primitive | a = b ≠ c | α = β = 90°, γ = 120° | Graphite, ZnO, CdS
Rhombohedral or Trigonal | Primitive | a = b = c | α = β = γ ≠ 90° | Calcite (CaCO3), HgS (cinnabar)
Monoclinic | Primitive, End-centred | a ≠ b ≠ c | α = γ = 90°, β ≠ 90° | Monoclinic sulphur, Na2SO4.10H2O
Triclinic | Primitive | a ≠ b ≠ c | α ≠ β ≠ γ ≠ 90° | K2Cr2O7, CuSO4.5H2O, H3BO3
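
The conditions in Table 1.3 can be read as a decision procedure. The sketch below is a simplified illustration that applies the table literally to a set of six parameters. It assumes the axes are labelled exactly as in the table, and it is not how crystal systems are formally assigned (that depends on symmetry). The function name and the tolerance are hypothetical.

```python
def classify_crystal_system(a, b, c, alpha, beta, gamma, tol=1e-6):
    """Return the crystal system whose Table 1.3 conditions match.

    Edge lengths in any consistent unit, angles in degrees.
    Returns "unclassified" if no row of the table matches as labelled.
    """
    def same(x, y):
        return abs(x - y) <= tol

    right = [same(angle, 90.0) for angle in (alpha, beta, gamma)]
    edges_equal = same(a, b) and same(b, c)
    edges_differ = not (same(a, b) or same(b, c) or same(a, c))
    angles_equal = same(alpha, beta) and same(beta, gamma)
    angles_differ = not (
        same(alpha, beta) or same(beta, gamma) or same(alpha, gamma)
    )

    if all(right) and edges_equal:
        return "cubic"
    if all(right) and same(a, b) and not same(b, c):
        return "tetragonal"
    if all(right) and edges_differ:
        return "orthorhombic"
    if (right[0] and right[1] and same(gamma, 120.0)
            and same(a, b) and not same(b, c)):
        return "hexagonal"
    if edges_equal and angles_equal and not right[0]:
        return "rhombohedral"
    if edges_differ and right[0] and right[2] and not right[1]:
        return "monoclinic"
    if edges_differ and angles_differ and not any(right):
        return "triclinic"
    return "unclassified"


print(classify_crystal_system(5.0, 5.0, 5.0, 90, 90, 90))
print(classify_crystal_system(3.0, 3.0, 5.0, 90, 90, 120))
```

The order of the checks follows the rows of the table; a set of parameters that fits no row as labelled is reported as "unclassified" rather than forced into a category.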

Unit Cells of 14 Types of Bravais Lattices


1.5 Number of Atoms in a Unit Cell

We know that any crystal lattice is made up of a very large number of unit
cells and every lattice point is occupied by one constituent particle (atom,
molecule or ion). Let us now work out what portion of each particle belongs
to a particular unit cell.

We shall consider three types of cubic unit cells and for simplicity assume
that the constituent particle is an atom.

1.5.1 Primitive Cubic Unit Cell

A primitive cubic unit cell has atoms only at its corners. Each atom at a corner is
shared between eight adjacent unit cells as shown in Fig. 1.8: four unit cells in
the same layer and four unit cells of the upper (or lower) layer. Therefore, only
1/8th of an atom (or molecule or ion) actually belongs to a particular unit cell.
In Fig. 1.9, a primitive cubic unit cell has been depicted in
three different ways. Each small sphere in Fig. 1.9 (a) represents only the
centre of the particle occupying that position and not its actual size. Such
structures are called open structures. The arrangement of particles is easier to
follow in open structures.
Fig. 1.9 (b) depicts space-filling representation of the unit cell with actual
particle size and Fig. 1.9 (c) shows the actual portions of different atoms
present in a cubic unit cell.
In all, since each cubic unit cell has 8 atoms on its corners, the total number
of atoms in one unit cell is 8 × 1/8 = 1 atom.

Fig. 1.8: In a simple cubic unit cell, each corner atom is shared between 8 unit cells.

Fig. 1.9: A primitive cubic unit cell (a) open structure (b) space-filling structure (c) actual
portions of atoms belonging to one unit cell.
1.5.2 Body-Centred Cubic Unit Cell

A body-centred cubic (bcc) unit cell has an atom at each of its corners and
also one atom at its body centre. Fig. 1.10 depicts (a) open structure (b) space
filling model and (c) the unit cell with portions of atoms actually belonging to
it. It can be seen that the atom at the body centre wholly belongs to the unit
cell in which it is present. Thus in a body-centred cubic (bcc) unit cell:

Fig. 1.10: A body-centred cubic unit cell (a) open structure (b) spacefilling structure (c)
actual portions of atoms belonging to one unit cell.

(i) 8 corners × 1/8 per corner atom = 8 × 1/8 = 1 atom

(ii) 1 body centre atom = 1 × 1 = 1 atom

Total number of atoms per unit cell = 2 atoms

1.5.3 Face-Centred Cubic Unit Cell

A face-centred cubic (fcc) unit cell contains atoms at all the corners and at the
centre of all the faces of the cube. It can be seen in Fig. 1.11 that each atom
located at the face-centre is shared between two adjacent unit cells and only
1/2 of each atom belongs to a unit cell. Fig. 1.12 depicts (a) open structure (b)
space-filling model and (c) the unit cell with portions of atoms actually
belonging to it. Thus, in a face-centred cubic (fcc) unit cell:

(i) 8 corner atoms × 1/8 atom per unit cell = 8 × 1/8 = 1 atom

(ii) 6 face-centred atoms × 1/2 atom per unit cell = 6 × 1/2 = 3 atoms

Total number of atoms per unit cell = 4 atoms

Fig. 1.11: An atom at face centre of unit cell is shared between 2 unit cells
Fig. 1.12: A face-centred cubic unit cell (a) open structure (b) space filling structure (c)
actual portions of atoms belonging to one unit cell.
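
The counting in Sections 1.5.1 to 1.5.3 follows one rule: multiply the number of atoms at each kind of position by the fraction of each atom that belongs to the cell. A minimal sketch of this bookkeeping is given below; the function and dictionary names are hypothetical, while the sharing fractions (1/8 for a corner, 1/2 for a face centre, 1 for the body centre) are the ones used in the text.

```python
from fractions import Fraction

# Fraction of an atom that belongs to one cubic unit cell, by position.
SHARE = {
    "corner": Fraction(1, 8),  # shared between 8 unit cells
    "face": Fraction(1, 2),    # shared between 2 unit cells
    "body": Fraction(1, 1),    # wholly inside the unit cell
}


def atoms_per_unit_cell(corners=0, faces=0, body=0):
    """Total number of atoms belonging to one cubic unit cell."""
    return (corners * SHARE["corner"]
            + faces * SHARE["face"]
            + body * SHARE["body"])


primitive = atoms_per_unit_cell(corners=8)
bcc = atoms_per_unit_cell(corners=8, body=1)
fcc = atoms_per_unit_cell(corners=8, faces=6)
print(primitive, bcc, fcc)  # 1 2 4
```

Exact fractions are used instead of floating point numbers so that the totals come out as whole numbers, matching the hand calculation.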

Intext Questions

1.10 Give the significance of a 'lattice point'.

1.11 Name the parameters that characterise a unit cell.

1.12 Distinguish between
(i) Hexagonal and monoclinic unit cells
(ii) Face-centred and end-centred unit cells.

1.13 Explain what portion of an atom located at (i) a corner and (ii) the
body-centre of a cubic unit cell is part of its neighbouring unit cell.

1.6 Close Packed Structures

In solids, the constituent particles are close-packed, leaving the minimum
vacant space. Let us consider the constituent particles as identical hard
spheres and build up the three dimensional structure in three steps.

(a) Close Packing in One Dimension


There is only one way of arranging spheres in a one dimensional close
packed structure, that is to arrange them in a row and touching each other
(Fig. 1.13).

Fig. 1.13: Close packing of spheres in one dimension

In this arrangement, each sphere is in contact with two of its neighbours. The
number of nearest neighbours of a particle is called its coordination number.
Thus, in one dimensional close packed arrangement, the coordination number
is 2.

(b) Close Packing in Two Dimensions

Two dimensional close packed structure can be generated by stacking
(placing) the rows of close packed spheres. This can be done in two different
ways.

(i) The second row may be placed in contact with the first one such that the
spheres of the second row are exactly above those of the first row. The
spheres of the two rows are aligned horizontally as well as vertically. If we
call the first row an 'A' type row, the second row, being exactly the same as
the first one, is also of 'A' type. Similarly, we may place more rows to obtain
an AAA type of arrangement as shown in Fig. 1.14 (a).
Fig. 1.14: (a) Square close packing (b) hexagonal close packing of spheres in two
dimensions

In this arrangement, each sphere is in contact with four of its neighbours.
Thus, the two dimensional coordination number is 4. Also, if the centres of
these 4 immediate neighbouring spheres are joined, a square is formed.
Hence this packing is called square close packing in two dimensions.

(ii) The second row may be placed above the first one in a staggered manner
such that its spheres fit in the depressions of the first row. If the arrangement
of spheres in the first row is called 'A' type, the one in the second row is
different and may be called 'B' type. When the third row is placed adjacent to
the second in a staggered manner, its spheres are aligned with those of the first
row. Hence this row is also of 'A' type. The spheres of a similarly placed
fourth row will be aligned with those of the second row ('B' type). Hence this
arrangement is of ABAB type. In this arrangement there is less free space and
this packing is more efficient than the square close packing. Each sphere is in
contact with six of its neighbours and the two dimensional coordination
number is 6. The centres of these six spheres are at the corners of a regular
hexagon (Fig. 1.14 (b)); hence this packing is called two dimensional hexagonal
close-packing. It can be seen in Fig. 1.14 (b) that in this layer there are
some voids (empty spaces). These are triangular in shape. The triangular
voids are of two different types. In one row, the apices of the triangles
point upwards and in the next row downwards.
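
The statement that hexagonal close packing leaves less free space than square close packing can be checked numerically. The sketch below treats each sphere as a circle of radius r in the plane and compares the area covered in one repeating cell. It assumes a square cell of side 2r for the AAA arrangement and a 60° rhombus of side 2r for the ABAB arrangement, each containing one whole circle. These cells and the function names are illustrative choices and are not taken from the text.

```python
import math


def square_packing_fraction(r=1.0):
    """Fraction of area covered by circles in two dimensional square packing."""
    cell_area = (2 * r) ** 2       # square cell of side 2r holds one circle
    return math.pi * r**2 / cell_area


def hexagonal_packing_fraction(r=1.0):
    """Fraction of area covered by circles in two dimensional hexagonal packing."""
    # A rhombus of side 2r with a 60 degree angle holds one circle.
    cell_area = (2 * r) ** 2 * math.sin(math.radians(60))
    return math.pi * r**2 / cell_area


print(f"Square close packing:    {square_packing_fraction():.3f}")
print(f"Hexagonal close packing: {hexagonal_packing_fraction():.3f}")
```

The covered fraction is about 0.785 for the square arrangement and about 0.907 for the hexagonal one, and it does not depend on the radius chosen.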

(c) Close Packing in Three Dimensions

All real structures are three dimensional structures. They can be obtained by
stacking two dimensional layers one above the other. In the last Section, we
discussed close packing in two dimensions which can be of two types; square
close-packed and hexagonal close-packed. Let us see what types of three
dimensional close packing can be obtained from these.

(i) Three dimensional close packing from two dimensional square close-
packed layers: While placing the second square close-packed layer above the
first we follow the same rule that was followed when one row was placed
adjacent to the other. The second layer is placed over the first layer such that
the spheres of the upper layer are exactly above those of the first layer. In this
arrangement spheres of both the layers are perfectly aligned horizontally as
well as vertically as shown in Fig. 1.15.

Similarly, we may place more layers one above the other. If the arrangement
of spheres in the first layer is called A type, all the layers have the same
arrangement. Thus this lattice has AAA... type pattern. The lattice thus
generated is the simple cubic lattice, and its unit cell is the primitive cubic
unit cell (See Fig. 1.9).
Fig. 1.15: Simple cubic lattice formed by A A A .... arrangement
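
The AAA... stacking can also be explored by generating the sphere centres directly. The sketch below builds a small block of a simple cubic lattice and counts the nearest neighbours of the central sphere. The spacing between centres is taken as one unit (twice the sphere radius), and the function names are hypothetical. The count of six nearest neighbours is a standard result for the simple cubic lattice that is not stated in the text above.

```python
import itertools
import math


def simple_cubic_points(n, spacing=1.0):
    """Centres of spheres in an n x n x n block of AAA... stacked layers."""
    return [
        (i * spacing, j * spacing, k * spacing)
        for i, j, k in itertools.product(range(n), repeat=3)
    ]


def coordination_number(points, centre, spacing=1.0, tol=1e-9):
    """Number of points that touch the sphere at `centre`.

    Two spheres touch when their centres are exactly one spacing apart.
    """
    return sum(
        1
        for p in points
        if p != centre and abs(math.dist(p, centre) - spacing) <= tol
    )


points = simple_cubic_points(3)
cn = coordination_number(points, (1.0, 1.0, 1.0))
print(cn)  # 6
```

Four of the six neighbours lie in the same square close-packed layer and the other two lie directly above and below, one in each adjacent layer.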

(ii) Three dimensional close packing from two dimensional hexagonal close
packed layers: Three dimensional close packed structure can be generated by
placing layers one over the other.

(a) Placing second layer over the first layer

Let us take a two dimensional hexagonal close packed layer A and place a
similar layer above it such that the spheres of the second layer are placed in
the depressions of the first layer. Since the spheres of the two layers are
aligned differently, let us call the second layer as B. It can be observed from
Fig. 1.16 that not all the triangular voids of the first layer are covered by the
spheres of the second layer. This gives rise to different arrangements.
Wherever a sphere of the second layer is above the void of the first layer (or
vice versa) a tetrahedral void is formed. These voids are called tetrahedral
voids because a tetrahedron is formed when the centres of these four spheres
are joined. They have been marked as 'T' in Fig. 1.16. One such void has
been shown separately in Fig. 1.17.

Fig. 1.16: A stack of two layers of close packed spheres and voids generated in them. T =
Tetrahedral void; O = Octahedral void

Fig. 1.17: Tetrahedral and octahedral voids (a) top view (b) exploded side view and (c)
geometrical shape of the void.

At other places, the triangular voids in the second layer are above the
triangular voids in the first layer, and the triangular shapes of these do not
overlap. One of them has the apex of the triangle pointing upwards and the
other downwards. These voids have been marked as O in Fig. 1.16. Such
voids are surrounded by six spheres and are called octahedral voids. One such
void has been shown separately in Fig. 1.17. The number of these two types
of voids depends upon the number of close packed spheres.

Let the number of close packed spheres be N, then:

The number of octahedral voids generated = N

The number of tetrahedral voids generated = 2N

(b) Placing third layer over the second layer


When third layer is placed over the second, there are two possibilities.

(i) Covering Tetrahedral Voids: Tetrahedral voids of the second layer may be
covered by the spheres of the third layer. In this case, the spheres of the third
layer are exactly aligned with those of the first layer. Thus, the pattern of
spheres is repeated in alternate layers. This pattern is often written as ABAB
....... pattern. This structure is called hexagonal close packed (hcp) structure
(Fig. 1.18). This sort of arrangement of atoms is found in many metals like
magnesium and zinc.

Fig. 1.18 (a) Hexagonal cubic close-packing exploded view showing stacking of layers of
spheres (b) four layers stacked in each case and (c) geometry of packing.
Fig. 1.19 (a) ABCABC... arrangement of layers when octahedral void is covered (b)
fragment of structure formed by this arrangement resulting in cubic close packed (ccp) or
face centred cubic (fcc) structure.

(ii) Covering Octahedral Voids: The third layer may be placed above the
second layer in a manner such that its spheres cover the octahedral voids.
When placed in this manner, the spheres of the third layer are not aligned
with those of either the first or the second layer. This arrangement is called
C type. Only when fourth layer is placed, its spheres are aligned with those
of the first layer as shown in Figs. 1.18 and 1.19. This pattern of layers is
often written as ABCABC ........... This structure is called cubic close packed
(ccp) or face-centred cubic (fcc) structure. Metals such as copper and silver
crystallise in this structure.

Both these types of close packing are highly efficient and 74% space in the
crystal is filled. In either of them, each sphere is in contact with twelve
spheres. Thus, the coordination number is 12 in either of these two structures.
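
The two stacking rules described above can be checked mechanically. The Python sketch below is illustrative only: the function name `classify_stacking` and its return labels are assumptions introduced here, and it assumes the layers are labelled with the letters A, B and C. A sequence is treated as hcp when every layer repeats two layers later (ABAB...), and as ccp when every three consecutive layers are all different (ABCABC...).

```python
def classify_stacking(sequence):
    """Classify a string of close-packed layer labels, e.g. 'ABABAB' or 'ABCABC'."""
    layers = list(sequence)
    # Two adjacent hexagonal layers with the same label would sit sphere on
    # sphere, not in the depressions, so the stack would not be close packed.
    if any(a == b for a, b in zip(layers, layers[1:])):
        return "not close packed"
    if len(layers) < 3:
        return "undetermined"
    # hcp: the third layer is aligned with the first (ABAB...).
    if all(a == c for a, c in zip(layers, layers[2:])):
        return "hcp"
    # ccp: three consecutive layers are all different (ABCABC...).
    if all(len({a, b, c}) == 3 for a, b, c in zip(layers, layers[1:], layers[2:])):
        return "ccp"
    return "mixed"


for stack in ("ABABAB", "ABCABC", "ABAC"):
    print(stack, "->", classify_stacking(stack))
```

In both the hcp and the ccp case each sphere still touches twelve neighbours, so the classification changes the symmetry of the lattice, not the coordination number.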

1.6.1 Formula of a Compound and Number of Voids Filled

Earlier in the section, we have learnt that when particles are close-packed
resulting in either ccp or hcp structure, two types of voids are generated.
While the number of octahedral voids present in a lattice is equal to the
number of close packed particles, the number of tetrahedral voids generated
is twice this number. In ionic solids, the bigger ions (usually anions) form the
close packed structure and the smaller ions (usually cations) occupy the
voids. If the latter ion is small enough, tetrahedral voids are occupied; if it is
bigger, octahedral voids are occupied. Not all octahedral or tetrahedral voids are
occupied. In a given compound, the fraction of octahedral or tetrahedral voids
that are occupied, depends upon the chemical formula of the compound, as
can be seen from the following examples.

Example 1.1

A compound is formed by two elements X and Y. Atoms of the element
Y (as anions) make ccp and those of the element X (as cations) occupy
all the octahedral voids. What is the formula of the compound?

Solution

The ccp lattice is formed by the element Y. The number of octahedral
voids generated would be equal to the number of atoms of Y present in
it. Since all the octahedral voids are occupied by the atoms of X, their
number would also be equal to that of the element Y. Thus, the atoms of
elements X and Y are present in equal numbers or 1:1 ratio. Therefore,
the formula of the compound is XY.

Example 1.2

Atoms of element B form hcp lattice and those of the element A occupy
2/3rd of tetrahedral voids. What is the formula of the compound formed
by the elements A and B?

Solution

The number of tetrahedral voids formed is equal to twice the number of
atoms of element B and only 2/3rd of these are occupied by the atoms of
element A. Hence the ratio of the number of atoms of A and B is
2 × (2/3) : 1 or 4 : 3 and the formula of the compound is A₄B₃.
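
The reasoning used in Examples 1.1 and 1.2 can be sketched in a few lines of Python. This is only an illustration: the function name `void_formula_ratio` and the dictionary `VOIDS_PER_SPHERE` are names assumed here. The only inputs taken from the text are that N close-packed spheres generate N octahedral and 2N tetrahedral voids.

```python
from fractions import Fraction

# Voids generated per close-packed sphere (from the text: N octahedral, 2N tetrahedral).
VOIDS_PER_SPHERE = {"octahedral": 1, "tetrahedral": 2}


def void_formula_ratio(void_type, fraction_filled):
    """Return the smallest whole-number ratio (atoms in voids, atoms in lattice)."""
    per_lattice_atom = VOIDS_PER_SPHERE[void_type] * Fraction(fraction_filled)
    # A value p/q in lowest terms means p void atoms for every q lattice atoms.
    return per_lattice_atom.numerator, per_lattice_atom.denominator


# Example 1.1: X fills all octahedral voids of a ccp lattice of Y.
print(void_formula_ratio("octahedral", 1))                # ratio X : Y
# Example 1.2: A fills 2/3 of the tetrahedral voids of an hcp lattice of B.
print(void_formula_ratio("tetrahedral", Fraction(2, 3)))  # ratio A : B
```

Using exact fractions rather than floating-point numbers keeps the ratio in lowest terms, which is what a chemical formula needs.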

Locating Tetrahedral and Octahedral Voids

We know that close packed structures have both tetrahedral and
octahedral voids. Let us take ccp (or fcc) structure and locate these voids
in it.

(a) Locating Tetrahedral Voids

Let us consider a unit cell of ccp or fcc lattice [Fig. 1(a)]. The unit cell is
divided into eight small cubes.

Each small cube has atoms at alternate corners [Fig. 1(a)]. In all, each
small cube has 4 atoms. When joined to each other, they make a regular
tetrahedron. Thus, there is one tetrahedral void in each small cube and
eight tetrahedral voids in total. Each of the eight small cubes has one
void in one unit cell of ccp structure. We know that ccp structure has 4
atoms per unit cell. Thus, the number of tetrahedral voids is twice the
number of atoms.

Fig. 1: (a) Eight tetrahedral voids per unit cell of ccp structure (b) one tetrahedral
void showing the geometry.

(b) Locating Octahedral Voids

Let us again consider a unit cell of ccp or fcc lattice [Fig. 2(a)]. The body
centre of the cube, C is not occupied but it is surrounded by six atoms on
face centres. If these face centres are joined, an octahedron is generated.
Thus, this unit cell has one octahedral void at the body centre of the
cube.

Besides the body centre, there is one octahedral void at the centre of each
of the 12 edges. [Fig. 2(b)]. It is surrounded by six atoms, four belonging
to the same unit cell (2 on the corners and 2 on face centre) and two
belonging to two adjacent unit cells. Since each edge of the cube is
shared between four adjacent unit cells, so is the octahedral void located
on it. Only 1/4th of each void belongs to a particular unit cell.

Fig. 2: Location of octahedral voids per unit cell of ccp or fcc lattice (a) at the body
centre of the cube and (b) at the centre of each edge (only one such void is shown).

Thus in cubic close packed structure:

Octahedral void at the body-centre of the cube = 1

12 octahedral voids located one at each edge and shared between four unit
cells = 12 × 1/4 = 3

Therefore, total number of octahedral voids = 1 + 3 = 4

We know that in ccp structure, each unit cell has 4 atoms. Thus, the
number of octahedral voids is equal to this number.
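
The counting above can be reproduced with exact fractions. The sketch below is illustrative; the variable names are assumptions made here, and the corner and face-centre sharing factors (1/8 and 1/2) are the standard ones for a cubic unit cell.

```python
from fractions import Fraction

# Atoms in an fcc unit cell: 8 corners shared by 8 cells, 6 face centres shared by 2.
atoms = 8 * Fraction(1, 8) + 6 * Fraction(1, 2)

# Octahedral voids: 1 at the body centre (unshared) plus 12 edge centres,
# each shared between four unit cells.
octahedral_voids = 1 + 12 * Fraction(1, 4)

# Tetrahedral voids: one inside each of the eight small cubes, none shared.
tetrahedral_voids = 8

print(atoms, octahedral_voids, tetrahedral_voids)
```

The totals agree with the general rule stated earlier: N octahedral and 2N tetrahedral voids for N close-packed spheres.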
1.7 Packing Efficiency

In whatever way the constituent particles (atoms, molecules or ions) are
packed, there is always some free space in the form of voids. Packing
efficiency is the percentage of total space filled by the particles. Let us
calculate the packing efficiency in different types of structures.

1.7.1 Packing Efficiency in hcp and ccp Structures

Both types of close packing (hcp and ccp) are equally efficient. Let us
calculate the efficiency of packing in ccp structure. In Fig. 1.20 let the unit
cell edge length be a and face diagonal AC = b.

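In $\triangle ABC$,

\[ AC^2 = b^2 = BC^2 + AB^2 = a^2 + a^2 = 2a^2 \quad\text{or}\quad b = \sqrt{2}\,a \]

If r is the radius of the sphere, we find

\[ b = 4r = \sqrt{2}\,a \quad\text{or}\quad a = \frac{4r}{\sqrt{2}} = 2\sqrt{2}\,r \]

(we can also write, $r = \dfrac{a}{2\sqrt{2}}$)

We know that each unit cell in ccp structure has effectively 4 spheres. Total
volume of four spheres is equal to $4 \times \frac{4}{3}\pi r^3$ and volume of the cube
is $a^3$ or $(2\sqrt{2}\,r)^3$.

Therefore,

\[ \text{Packing efficiency} = \frac{\text{Volume occupied by four spheres in the unit cell} \times 100}{\text{Total volume of the unit cell}}\,\% = \frac{4 \times \frac{4}{3}\pi r^3 \times 100}{(2\sqrt{2}\,r)^3}\,\% = \frac{\frac{16}{3}\pi r^3 \times 100}{16\sqrt{2}\,r^3}\,\% = 74\% \]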

Fig. 1.20: Cubic close packing; other sides are not provided with spheres for the sake of clarity.

1.7.2 Efficiency of Packing in Body-Centred Cubic Structures

From Fig. 1.21, it is clear that the atom at the centre will be in touch with the
other two atoms diagonally arranged.

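In $\triangle EFD$,

\[ b^2 = a^2 + a^2 = 2a^2, \qquad b = \sqrt{2}\,a \]

Now in $\triangle AFD$,

\[ c^2 = a^2 + b^2 = a^2 + 2a^2 = 3a^2, \qquad c = \sqrt{3}\,a \]

The length of the body diagonal c is equal to 4r, where r is the radius of the
sphere (atom), as all the three spheres along the diagonal touch each other.

Therefore, $\sqrt{3}\,a = 4r$

\[ a = \frac{4r}{\sqrt{3}} \]

Also we can write, $r = \dfrac{\sqrt{3}}{4}\,a$

In this type of structure, total number of atoms is 2 and their volume is
$2 \times \frac{4}{3}\pi r^3$.

Volume of the cube, $a^3$, will be equal to $\left(\dfrac{4}{\sqrt{3}}\,r\right)^3$ or $\dfrac{64}{3\sqrt{3}}\,r^3$.

Therefore,

\[ \text{Packing efficiency} = \frac{\text{Volume occupied by two spheres in the unit cell} \times 100}{\text{Total volume of the unit cell}}\,\% = \frac{2 \times \frac{4}{3}\pi r^3 \times 100}{\left(\frac{4}{\sqrt{3}}\,r\right)^3}\,\% = \frac{\frac{8}{3}\pi r^3 \times 100}{\frac{64}{3\sqrt{3}}\,r^3}\,\% = 68\% \]
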
Fig. 1.21: Body-centred cubic unit cell (spheres along the body diagonal are shown
with solid boundaries).

1.7.3 Packing Efficiency in Simple Cubic Lattice

In a simple cubic lattice the atoms are located only on the corners of the cube.
The particles touch each other along the edge (Fig. 1.22).

Thus, the edge length or side of the cube a, and the radius of each particle, r
are related as

a = 2r

The volume of the cubic unit cell = a3 = (2r)3 = 8r3

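Since a simple cubic unit cell contains only 1 atom,

The volume of the occupied space = $\frac{4}{3}\pi r^3$

\[ \text{Packing efficiency} = \frac{\text{Volume of one atom}}{\text{Volume of cubic unit cell}} \times 100\,\% = \frac{\frac{4}{3}\pi r^3}{8r^3} \times 100 = \frac{\pi}{6} \times 100 = 52.36\% = 52.4\% \]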

Thus, we may conclude that ccp and hcp structures have maximum packing
efficiency.
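
The three results can be cross-checked numerically. The following Python sketch is an illustration, not part of the original derivation; the function name `packing_efficiency` and the dictionary `structures` are assumed names. Each structure is described by the number of atoms per unit cell and the ratio a/r that follows from the touching condition (a = 2r for simple cubic, √3 a = 4r for bcc, √2 a = 4r for ccp).

```python
import math


def packing_efficiency(atoms_per_cell, edge_in_radii):
    """Percentage of a cubic cell filled by spheres of radius r.

    edge_in_radii is the ratio a/r, so all volumes are in units of r^3.
    """
    sphere_volume = atoms_per_cell * (4.0 / 3.0) * math.pi
    cell_volume = edge_in_radii ** 3
    return 100.0 * sphere_volume / cell_volume


structures = {
    "simple cubic": (1, 2.0),                 # a = 2r
    "bcc": (2, 4.0 / math.sqrt(3)),           # sqrt(3) * a = 4r
    "ccp (fcc)": (4, 2.0 * math.sqrt(2)),     # sqrt(2) * a = 4r
}

for name, (z, a_over_r) in structures.items():
    print(f"{name}: {packing_efficiency(z, a_over_r):.1f} %")
```

Because the radius cancels, the efficiency depends only on the geometry of the cell and not on the size of the atom.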

Fig. 1.22 Simple cubic unit cell. The spheres are in contact with each other along the edge
of the cube.

1.8 Calculations Involving Unit Cell Dimensions

From the unit cell dimensions, it is possible to calculate the volume of the
unit cell. Knowing the density of the metal, we can calculate the mass of the
atoms in the unit cell. The determination of the mass of a single atom gives
an accurate method of determination of Avogadro constant. Suppose, edge
length of a unit cell of a cubic crystal determined by X-ray diffraction is a, d
the density of the solid substance and M the molar mass. In case of cubic
crystal:

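Volume of a unit cell = $a^3$

Mass of the unit cell

= number of atoms in unit cell × mass of each atom = z × m

(Here z is the number of atoms present in one unit cell and m is the mass of a
single atom)

Mass of an atom present in the unit cell:

\[ m = \frac{M}{N_A} \quad (M \text{ is molar mass}) \]

Therefore, density of the unit cell

\[ d = \frac{\text{mass of unit cell}}{\text{volume of unit cell}} = \frac{z \cdot m}{a^3} = \frac{z \cdot M}{a^3 \cdot N_A} \]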

Remember, the density of the unit cell is the same as the density of the
substance. The density of the solid can always be determined by other
methods. Out of the five parameters (d, z, M, a and $N_A$), if any four are
known, we can determine the fifth.
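
The relation d = zM/(a³N_A) can be rearranged for whichever quantity is unknown. The Python sketch below shows three such rearrangements; the function names and the unit choice (g/mol, cm, g/cm³) are assumptions made for illustration, and the Avogadro constant is taken as 6.022 × 10²³ mol⁻¹.

```python
AVOGADRO = 6.022e23  # Avogadro constant N_A in mol^-1


def unit_cell_density(z, molar_mass, edge_cm, n_a=AVOGADRO):
    """Density d = z*M / (a^3 * N_A), in g/cm^3 for M in g/mol and a in cm."""
    return z * molar_mass / (edge_cm ** 3 * n_a)


def molar_mass_from_cell(d, z, edge_cm, n_a=AVOGADRO):
    """Molar mass M = d * a^3 * N_A / z, in g/mol."""
    return d * edge_cm ** 3 * n_a / z


def atoms_per_unit_cell(d, molar_mass, edge_cm, n_a=AVOGADRO):
    """Number of atoms per unit cell, z = d * a^3 * N_A / M."""
    return d * edge_cm ** 3 * n_a / molar_mass


# Illustrative values: an fcc metal (z = 4) with M = 107.9 g/mol and a = 408.6 pm.
edge = 408.6e-10  # 408.6 pm expressed in cm
print(round(unit_cell_density(4, 107.9, edge), 2), "g/cm^3")
```

The main practical hazard is unit conversion: 1 pm = 10⁻¹⁰ cm, so the edge length must be converted before it is cubed.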

Example 1.3

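An element has a body-centred cubic (bcc) structure with a cell edge of
288 pm. The density of the element is 7.2 g/cm³. How many atoms are
present in 208 g of the element?

Solution

Volume of the unit cell = (288 pm)³
= (288 × 10⁻¹² m)³ = (288 × 10⁻¹⁰ cm)³
= 2.39 × 10⁻²³ cm³

\[ \text{Volume of 208 g of the element} = \frac{\text{mass}}{\text{density}} = \frac{208\ \text{g}}{7.2\ \text{g cm}^{-3}} = 28.88\ \text{cm}^3 \]

\[ \text{Number of unit cells in this volume} = \frac{28.88\ \text{cm}^3}{2.39 \times 10^{-23}\ \text{cm}^3/\text{unit cell}} = 12.08 \times 10^{23}\ \text{unit cells} \]

Since each bcc cubic unit cell contains 2 atoms, the total number of atoms
in 208 g = 2 (atoms/unit cell) × 12.08 × 10²³ unit cells
= 24.16 × 10²³ atoms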
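
The same steps can be followed in Python as a check on the arithmetic. This is a sketch with variable names chosen here for illustration. Because it carries full precision instead of rounding each intermediate value, it gives about 2.42 × 10²⁴ atoms, which agrees with the hand calculation to the precision of the data.

```python
edge_cm = 288e-10        # cell edge: 288 pm expressed in cm
density = 7.2            # g/cm^3
sample_mass = 208.0      # g
atoms_per_cell = 2       # body-centred cubic unit cell

cell_volume = edge_cm ** 3                  # volume of one unit cell, cm^3
sample_volume = sample_mass / density       # volume of the 208 g sample, cm^3
unit_cells = sample_volume / cell_volume    # number of unit cells in the sample
atoms = atoms_per_cell * unit_cells         # total number of atoms

print(f"{atoms:.3e} atoms")
```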
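
Example 1.4

X-ray diffraction studies show that copper crystallises in an fcc unit cell
with cell edge of 3.608 × 10⁻⁸ cm. In a separate experiment, copper is
determined to have a density of 8.92 g/cm³. Calculate the atomic mass of
copper.

Solution

In case of fcc lattice, number of atoms per unit cell, z = 4 atoms

Therefore,

\[ M = \frac{d\,N_A\,a^3}{z} = \frac{8.92\ \text{g cm}^{-3} \times 6.022 \times 10^{23}\ \text{atoms mol}^{-1} \times (3.608 \times 10^{-8}\ \text{cm})^3}{4\ \text{atoms}} = 63.1\ \text{g/mol} \]

Atomic mass of copper = 63.1 u
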
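Example 1.5

Silver forms ccp lattice and X-ray studies of its crystals show that the
edge length of its unit cell is 408.6 pm. Calculate the density of silver
(Atomic mass = 107.9 u).

Solution

Since the lattice is ccp, the number of silver atoms per unit cell = z = 4

Molar mass of silver = 107.9 g mol⁻¹ = 107.9 × 10⁻³ kg mol⁻¹

Edge length of unit cell = a = 408.6 pm = 408.6 × 10⁻¹² m

Density,

\[ d = \frac{z\,M}{a^3\,N_A} = \frac{4 \times (107.9 \times 10^{-3}\ \text{kg mol}^{-1})}{(408.6 \times 10^{-12}\ \text{m})^3 \times (6.022 \times 10^{23}\ \text{mol}^{-1})} = 10.5 \times 10^{3}\ \text{kg m}^{-3} = 10.5\ \text{g cm}^{-3} \]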

Intext Questions
1.14 What is the two dimensional coordination number of a molecule in
square close-packed layer?

1.15 A compound forms hexagonal close-packed structure. What is the
total number of voids in 0.5 mol of it? How many of these are tetrahedral
voids?

1.16 A compound is formed by two elements M and N. The element N
forms ccp and atoms of M occupy 1/3rd of tetrahedral voids. What is the
formula of the compound?

1.17 Which of the following lattices has the highest packing efficiency
(i) simple cubic (ii) body-centred cubic and (iii) hexagonal close-packed
lattice?

1.18 An element with molar mass 2.7 × 10⁻² kg mol⁻¹ forms a cubic unit
cell with edge length 405 pm. If its density is 2.7 × 10³ kg m⁻³, what is the
nature of the cubic unit cell?

1.9 Imperfections in Solids

Although crystalline solids have short range as well as long range order in the
arrangement of their constituent particles, crystals are not perfect. Usually
a solid consists of an aggregate of a large number of small crystals. These
small crystals have defects in them. This happens when crystallisation
process occurs at fast or moderate rate. Single crystals are formed when the
process of crystallisation occurs at extremely slow rate. Even these crystals
are not free of defects. The defects are basically irregularities in the
arrangement of constituent particles. Broadly speaking, the defects are of two
types, namely, point defects and line defects. Point defects are the
irregularities or deviations from ideal arrangement around a point or an atom
in a crystalline substance, whereas the line defects are the irregularities or
deviations from ideal arrangement in entire rows of lattice points. These
irregularities are called crystal defects. We shall confine our discussion to
point defects only.

1.9.1 Types of Point Defects

Point defects can be classified into three types : (i) stoichiometric defects (ii)
impurity defects and (iii) non-stoichiometric defects.

(a) Stoichiometric Defects

These are the point defects that do not disturb the stoichiometry of the solid.
They are also called intrinsic or thermodynamic defects. Basically these are
of two types, vacancy defects and interstitial defects.

(i) Vacancy Defect: When some of the lattice sites are vacant, the crystal is
said to have vacancy defect (Fig. 1.23). This results in decrease in density of
the substance. This defect can also develop when a substance is heated.

(ii) Interstitial Defect: When some constituent particles (atoms or molecules)
occupy an interstitial site, the crystal is said to have interstitial defect (Fig.
1.24). This defect increases the density of the substance.

Vacancy and interstitial defects as explained above can be shown by non-
ionic solids. Ionic solids must always maintain electrical neutrality. Rather
than simple vacancy or interstitial defects, they show these defects as Frenkel
and Schottky defects.

Fig. 1.23: Vacancy defects

Fig. 1.24: Interstitial defects

(iii) Frenkel Defect: This defect is shown by ionic solids. The smaller ion
(usually cation) is dislocated from its normal site to an interstitial site (Fig.
1.25). It creates a vacancy defect at its original site and an interstitial defect
at its new location.

Frenkel defect is also called dislocation defect. It does not change the
density of the solid. Frenkel defect is shown by ionic substance in which
there is a large difference in the size of ions, for example, ZnS, AgCl, AgBr
and AgI due to small size of Zn²⁺ and Ag⁺ ions.

(iv) Schottky Defect: It is basically a vacancy defect in ionic solids. In order
to maintain electrical neutrality, the numbers of missing cations and anions are
equal (Fig. 1.26).

Like simple vacancy defect, Schottky defect also decreases the density of the
substance. Number of such defects in ionic solids is quite significant. For
example, in NaCl there are approximately 10⁶ Schottky pairs per cm³ at room
temperature. In 1 cm³ there are about 10²² ions. Thus, there is one Schottky
defect per 10¹⁶ ions. Schottky defect is shown by ionic substances in which
the cation and anion are of almost similar sizes. For example, NaCl, KCl,
CsCl and AgBr. It may be noted that AgBr shows both, Frenkel as well as
Schottky defects.

(b) Impurity Defects

If molten NaCl containing a little amount of SrCl₂ is crystallised, some of the
sites of Na⁺ ions are occupied by Sr²⁺ (Fig. 1.27). Each Sr²⁺ replaces two
Na⁺ ions. It occupies the site of one ion and the other site remains vacant.
The cationic vacancies thus produced are equal in number to that of Sr²⁺ ions.
Another similar example is the solid solution of CdCl₂ and AgCl.
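
Since each Sr²⁺ ion creates exactly one cation vacancy, the number of vacancies follows directly from the amount of dopant. The Python sketch below is illustrative: the function name `cation_vacancies` is assumed, the doping level is expressed as moles of SrCl₂ per 100 moles of NaCl, and the figure of 10⁻³ mol % is a hypothetical input, not a value from the text.

```python
AVOGADRO = 6.022e23  # Avogadro constant in mol^-1


def cation_vacancies(dopant_mol_percent, moles_of_host=1.0):
    """Cation vacancies created by a divalent dopant in a 1:1 host such as NaCl.

    dopant_mol_percent is taken as moles of dopant per 100 moles of host
    (an assumption of this sketch). One vacancy forms per divalent cation.
    """
    dopant_ions = moles_of_host * dopant_mol_percent / 100.0 * AVOGADRO
    return dopant_ions


# Hypothetical doping level of 1e-3 mol % SrCl2 in one mole of NaCl.
print(f"{cation_vacancies(1e-3):.3e} vacancies per mole of NaCl")
```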

Fig. 1.25: Frenkel defects

Fig. 1.26: Schottky defects

Fig. 1.27: Introduction of cation vacancy in NaCl by substitution of Na⁺ by Sr²⁺

(c) Non-Stoichiometric Defects

The defects discussed so far do not disturb the stoichiometry of the
crystalline substance. However, a large number of non-stoichiometric
inorganic solids are known which contain the constituent elements in non-
stoichiometric ratio due to defects in their crystal structures. These defects are
of two types: (i) metal excess defect and (ii) metal deficiency defect.

(i) Metal Excess Defect

Metal excess defect due to anionic vacancies: Alkali halides like
NaCl and KCl show this type of defect. When crystals of NaCl are
heated in an atmosphere of sodium vapour, the sodium atoms are
deposited on the surface of the crystal. The Cl⁻ ions diffuse to the
surface of the crystal and combine with Na atoms to give NaCl. This
happens by loss of electron by sodium atoms to form Na⁺ ions. The
released electrons diffuse into the crystal and occupy anionic sites (Fig.
1.28). As a result the crystal now has an excess of sodium. The anionic
sites occupied by unpaired electrons are called F-centres (from the
German word Farbenzenter for colour centre). They impart yellow
colour to the crystals of NaCl. The colour results by excitation of these
electrons when they absorb energy from the visible light falling on the
crystals. Similarly, excess of lithium makes LiCl crystals pink and
excess of potassium makes KCl crystals violet (or lilac).

Metal excess defect due to the presence of extra cations at interstitial
sites: Zinc oxide is white in colour at room temperature. On heating it
loses oxygen and turns yellow.

\[ \mathrm{ZnO} \xrightarrow{\text{heating}} \mathrm{Zn^{2+}} + \tfrac{1}{2}\,\mathrm{O_2} + 2e^- \]

Now there is excess of zinc in the crystal and its formula becomes Zn₁₊ₓO.
The excess Zn²⁺ ions move to interstitial sites and the electrons to
neighbouring interstitial sites.

(ii) Metal Deficiency Defect

There are many solids which are difficult to prepare in the stoichiometric
composition and contain less amount of the metal as compared to the
stoichiometric proportion. A typical example of this type is FeO which is
mostly found with a composition of Fe₀.₉₅O. It may actually range from
Fe₀.₉₃O to Fe₀.₉₆O. In crystals of FeO some Fe²⁺ cations are missing and the
loss of positive charge is made up by the presence of required number of Fe³⁺
ions.
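
The "required number" of Fe³⁺ ions follows from charge balance. The Python sketch below is an illustration under stated assumptions: the function name `iron_oxidation_split` is hypothetical, iron is assumed to be present only as Fe²⁺ and Fe³⁺, and oxygen only as O²⁻. For a formula FeₓO with y ions of Fe³⁺ per oxide ion, neutrality requires 2(x − y) + 3y = 2, so y = 2 − 2x.

```python
from fractions import Fraction


def iron_oxidation_split(x):
    """For Fe_xO, return (Fe2+ per oxide ion, Fe3+ per oxide ion).

    Charge balance with O2-: 2*(x - y) + 3*y = 2, hence y = 2 - 2*x.
    Pass x as a string such as "0.95" to keep the arithmetic exact.
    """
    x = Fraction(x)
    fe3 = 2 - 2 * x
    fe2 = x - fe3
    return fe2, fe3


fe2, fe3 = iron_oxidation_split("0.95")
print(f"Fe2+ = {float(fe2):.2f}, Fe3+ = {float(fe3):.2f} per oxide ion")
print(f"Fraction of iron present as Fe3+ = {float(fe3 / (fe2 + fe3)):.3f}")
```

For Fe₀.₉₅O this gives 0.85 Fe²⁺ and 0.10 Fe³⁺ per oxide ion, so roughly one iron ion in ten carries the extra positive charge.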
Fig. 1.28: An F-centre in a crystal

1.10 Electrical Properties

Solids exhibit an amazing range of electrical conductivities, extending over
27 orders of magnitude ranging from 10⁻²⁰ to 10⁷ ohm⁻¹ m⁻¹. Solids can be
classified into three types on the basis of their conductivities.

(i) Conductors: The solids with conductivities ranging between 10⁴ and 10⁷
ohm⁻¹ m⁻¹ are called conductors. Metals, with conductivities of the order of
10⁷ ohm⁻¹ m⁻¹, are good conductors.

(ii) Insulators: These are the solids with very low conductivities ranging
between 10⁻²⁰ and 10⁻¹⁰ ohm⁻¹ m⁻¹.

(iii) Semiconductors: These are the solids with conductivities in the
intermediate range from 10⁻⁶ to 10⁴ ohm⁻¹ m⁻¹.
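
The three ranges can be written as a simple classifier. The Python sketch below is illustrative; the function name `classify_by_conductivity` is assumed, and two choices are made here that the text does not specify: a value of exactly 10⁴ ohm⁻¹ m⁻¹ is counted as a conductor, and values that fall outside all three quoted ranges (for example between 10⁻¹⁰ and 10⁻⁶ ohm⁻¹ m⁻¹) are reported as unclassified.

```python
def classify_by_conductivity(sigma):
    """Classify a solid from its conductivity sigma in ohm^-1 m^-1."""
    if 1e4 <= sigma <= 1e7:
        return "conductor"
    if 1e-6 <= sigma < 1e4:
        return "semiconductor"
    if 1e-20 <= sigma <= 1e-10:
        return "insulator"
    # The quoted ranges leave gaps; this sketch does not guess.
    return "unclassified"


for value in (1e7, 1.0, 1e-15, 1e-8):
    print(value, "->", classify_by_conductivity(value))
```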

1.10.1 Conduction of Electricity in Metals


A conductor may conduct electricity through movement of electrons or ions.
Metallic conductors belong to the former category and electrolytes to the
latter.

Metals conduct electricity in solid as well as molten state. The conductivity
of metals depends upon the number of valence electrons available per atom.
The atomic orbitals of metal atoms form molecular orbitals which are so
close in energy to each other as to form a band. If this band is partially filled
or it overlaps with a higher energy unoccupied conduction band, then
electrons can flow easily under an applied electric field and the metal shows
conductivity (Fig. 1.29 a).

If the gap between filled valence band and the next higher unoccupied band
(conduction band) is large, electrons cannot jump to it and such a substance
has very small conductivity and it behaves as an insulator (Fig. 1.29 b).

1.10.2 Conduction of Electricity in Semiconductors

In case of semiconductors, the gap between the valence band and conduction
band is small (Fig. 1.29c). Therefore, some electrons may jump to conduction
band and show some conductivity. Electrical conductivity of semiconductors
increases with rise in temperature, since more electrons can jump to the
conduction band. Substances like silicon and germanium show this type of
behaviour and are called intrinsic semiconductors.

The conductivity of these intrinsic semiconductors is too low to be of
practical use. Their conductivity is increased by adding an appropriate
amount of suitable impurity. This process is called doping. Doping can be
done with an impurity which is electron rich or electron deficient as
compared to the intrinsic semiconductor silicon or germanium. Such
impurities introduce electronic defects in them.

Fig. 1.29 Distinction among (a) metals (b) insulators and (c) semiconductors. In each case,
an unshaded area represents a conduction band.

(a) Electron rich impurities

Silicon and germanium belong to group 14 of the periodic table and have four
valence electrons each. In their crystals each atom forms four covalent bonds
with its neighbours (Fig. 1.30 a). When doped with a group 15 element like P
or As, which contains five valence electrons, they occupy some of the lattice
sites in silicon or germanium crystal (Fig. 1.30 b). Four out of five electrons
are used in the formation of four covalent bonds with the four neighbouring
silicon atoms. The fifth electron is extra and becomes delocalised. These
delocalised electrons increase the conductivity of doped silicon (or
germanium). Here the increase in conductivity is due to the negatively
charged electron, hence silicon doped with electron-rich impurity is called n-
type semiconductor.

(b) Electron deficit impurities

Silicon or germanium can also be doped with a group 13 element like B, Al
or Ga which contains only three valence electrons. The place where the
fourth valence electron is missing is called electron hole or electron vacancy
(Fig. 1.30 c). An electron from a neighbouring atom can come and fill the
electron hole, but in doing so it would leave an electron hole at its original
position. If it happens, it would appear as if the electron hole has moved in
the direction opposite to that of the electron that filled it. Under the influence
of electric field, electrons would move towards the positively charged plate
through electronic holes, but it would appear as if electron holes are
positively charged and are moving towards negatively charged plate.
Semiconductors of this type are called p-type semiconductors.
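
The rule for the two kinds of doping reduces to comparing valence electrons. The Python sketch below is illustrative; the names `DOPANT_GROUP` and `doped_type` are assumed, and the table contains only the dopants named in the text. A dopant with more valence electrons than the four of silicon or germanium donates an electron (n-type); one with fewer creates an electron hole (p-type).

```python
# Periodic-table groups of the dopants named in the text.
DOPANT_GROUP = {"B": 13, "Al": 13, "Ga": 13, "P": 15, "As": 15}
HOST_VALENCE_ELECTRONS = 4  # silicon or germanium (group 14)


def doped_type(dopant):
    """Return the semiconductor type produced by doping Si or Ge with `dopant`."""
    # For groups 13 to 15 the number of valence electrons is the group number minus 10.
    valence_electrons = DOPANT_GROUP[dopant] - 10
    if valence_electrons > HOST_VALENCE_ELECTRONS:
        return "n-type"   # extra electron becomes delocalised
    if valence_electrons < HOST_VALENCE_ELECTRONS:
        return "p-type"   # missing electron leaves an electron hole
    return "no extra carriers"


for element in ("P", "As", "B", "Ga"):
    print(element, "->", doped_type(element))
```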

Fig. 1.30: Creation of n-type and p-type semiconductors by doping with group 13 and
group 15 elements.

Applications of n-type and p-type semiconductors

Various combinations of n-type and p-type semiconductors are used for
making electronic components. Diode is a combination of n-type and p-type
semiconductors and is used as a rectifier. Transistors are made by
sandwiching a layer of one type of semiconductor between two layers of the
other type of semiconductor. npn and pnp type of transistors are used to
detect or amplify radio or audio signals. The solar cell is an efficient photo-
diode used for conversion of light energy into electrical energy.

Germanium and silicon are group 14 elements and therefore, have a
characteristic valence of four and form four bonds as in diamond. A large
variety of solid state materials have been prepared by combination of groups
13 and 15 or 12 and 16 to simulate average valence of four as in Ge or Si.
Typical compounds of groups 13-15 are InSb, AlP and GaAs. Gallium
arsenide (GaAs) semiconductors have very fast response and have
revolutionised the design of semiconductor devices. ZnS, CdS, CdSe and
HgTe are examples of groups 12-16 compounds. In these compounds, the
bonds are not perfectly covalent and the ionic character depends on the
electronegativities of the two elements.

It is interesting to learn that transition metal oxides show marked differences
in electrical properties. TiO, CrO₂ and ReO₃ behave like metals. Rhenium
oxide, ReO₃, is like metallic copper in its conductivity and appearance.
Certain other oxides like VO, VO₂, VO₃ and TiO₃ show metallic or
insulating properties depending on temperature.

1.11 Magnetic Properties


Every substance has some magnetic properties associated with it. The origin
of these properties lies in the electrons. Each electron in an atom behaves like
a tiny magnet. Its magnetic moment originates from two types of motions (i)
its orbital motion around the nucleus and (ii) its spin around its own axis (Fig.
1.31). Electron being a charged particle and undergoing these motions can be
considered as a small loop of current which possesses a magnetic moment.
Thus, each electron has a permanent spin and an orbital magnetic moment
associated with it. Magnitude of this magnetic moment is very small and is
measured in the unit called Bohr magneton, $\mu_B$. It is equal to
9.27 × 10⁻²⁴ A m².

On the basis of their magnetic properties, substances can be classified into
five categories: (i) paramagnetic (ii) diamagnetic (iii) ferromagnetic (iv)
antiferromagnetic and (v) ferrimagnetic.

(i) Paramagnetism: Paramagnetic substances are weakly attracted by a
magnetic field. They are magnetised in a magnetic field in the same direction.
They lose their magnetism in the absence of magnetic field. Paramagnetism is
due to presence of one or more unpaired electrons which are attracted by the
magnetic field. O₂, Cu²⁺, Fe³⁺, Cr³⁺ are some examples of such substances.

Fig.1.31: Demonstration of the magnetic moment associated with (a) an orbiting electron
and (b) a spinning electron.

(ii) Diamagnetism: Diamagnetic substances are weakly repelled by a
magnetic field. H₂O, NaCl and C₆H₆ are some examples of such substances.
They are weakly magnetised in a magnetic field in opposite direction.
Diamagnetism is shown by those substances in which all the electrons are
paired and there are no unpaired electrons. Pairing of electrons cancels their
magnetic moments and they lose their magnetic character.

(iii) Ferromagnetism: A few substances like iron, cobalt, nickel, gadolinium
and CrO₂ are attracted very strongly by a magnetic field. Such substances are
called ferromagnetic substances. Besides strong attractions, these substances
can be permanently magnetised. In solid state, the metal ions of
ferromagnetic substances are grouped together into small regions called
domains. Thus, each domain acts as a tiny magnet. In an unmagnetised piece
of a ferromagnetic substance the domains are randomly oriented and their
magnetic moments get cancelled. When the substance is placed in a magnetic
field all the domains get oriented in the direction of the magnetic field (Fig.
1.32 a) and a strong magnetic effect is produced. This ordering of domains
persists even when the magnetic field is removed and the ferromagnetic
substance becomes a permanent magnet.

(iv) Antiferromagnetism: Substances like MnO showing anti-ferromagnetism
have domain structure similar to ferromagnetic substance, but their domains
are oppositely oriented and cancel out each other's magnetic moment (Fig.
1.32 b).

(v) Ferrimagnetism: Ferrimagnetism is observed when the magnetic moments
of the domains in the substance are aligned in parallel and anti-parallel
directions in unequal numbers (Fig. 1.32 c). They are weakly attracted by
magnetic field as compared to ferromagnetic substances. Fe₃O₄ (magnetite)
and ferrites like MgFe₂O₄ and ZnFe₂O₄ are examples of such substances.
These substances also lose ferrimagnetism on heating and become
paramagnetic.

Fig. 1.32: Schematic alignment of magnetic moments in (a) ferromagnetic (b)
antiferromagnetic and (c) ferrimagnetic.

Intext Questions

1.19 What type of defect can arise when a solid is heated? Which
physical property is affected by it and in what way?

1.20 What type of stoichiometric defect is shown by: (i) ZnS (ii) AgBr

1.21 Explain how vacancies are introduced in an ionic solid when a
cation of higher valence is added as an impurity in it.

1.22 Ionic solids, which have anionic vacancies due to metal excess
defect, develop colour. Explain with the help of a suitable example.

1.23 A group 14 element is to be converted into n-type semiconductor by
doping it with a suitable impurity. To which group should this impurity
belong?

1.24 What type of substances would make better permanent magnets,
ferromagnetic or ferrimagnetic? Justify your answer.

Summary

Solids have definite mass, volume and shape. This is due to the fixed
position of their constituent particles, short distances and strong
interactions between them. In amorphous solids, the arrangement of
constituent particles has only short range order and consequently they
behave like super cooled liquids, do not have sharp melting points and
are isotropic in nature. In crystalline solids there is long range order in
the arrangement of their constituent particles. They have sharp melting
points, are anisotropic in nature and their particles have characteristic
shapes. Properties of crystalline solids depend upon the nature of
interactions between their constituent particles. On this basis, they can be
divided into four categories, namely: molecular, ionic, metallic and
covalent solids. They differ widely in their properties.

The constituent particles in crystalline solids are arranged in a regular
pattern which extends throughout the crystal. This arrangement is often
depicted in the form of a three dimensional array of points which is
called crystal lattice. Each lattice point gives the location of one particle
in space. In all, fourteen different types of lattices are possible which are
called Bravais lattices. Each lattice can be generated by repeating its
small characteristic portion called unit cell. A unit cell is characterised
by its edge lengths and three angles between these edges. Unit cells can
be either primitive which have particles only at their corner positions or
centred. The centred unit cells have additional particles at their body
centre (body-centred), at the centre of each face (face-centred) or at the
centre of two opposite faces (end-centred). There are seven types of
primitive unit cells. Taking centred unit cells also into account, there are
fourteen types of unit cells in all, which result in fourteen Bravais
lattices.

Close-packing of particles results in two highly efficient lattices,
hexagonal close-packed (hcp) and cubic close-packed (ccp). The latter is
also called face-centred cubic (fcc) lattice. In both of these packings 74%
space is filled. The remaining space is present in the form of two types of
voids: octahedral voids and tetrahedral voids. Other types of packing are not
close-packings and have less efficient packing of particles. While in
body-centred cubic lattice (bcc) 68% space is filled, in simple cubic lattice
only 52.4% space is filled.

Solids are not perfect in structure. There are different types of
imperfections or defects in them. Point defects and line defects are
common types of defects. Point defects are of three types -
stoichiometric defects, impurity defects and non-stoichiometric defects.
Vacancy defects and interstitial defects are the two basic types of
stoichiometric point defects. In ionic solids, these defects are present as
Frenkel and Schottky defects. Impurity defects are caused by the
presence of an impurity in the crystal. In ionic solids, when the ionic
impurity has a different valence than the main compound, some
vacancies are created. Non-stoichiometric defects are of metal excess type
and metal deficient type. Sometimes calculated amounts of impurities are
introduced by doping in semiconductors that change their electrical
properties. Such materials are widely used in electronics industry. Solids
show many types of magnetic properties like paramagnetism,
diamagnetism, ferromagnetism, antiferromagnetism and ferrimagnetism.
These properties are used in audio, video and other recording devices.
All these properties can be correlated with their electronic configurations
or structures.

Exercises

1.1 Define the term 'amorphous'. Give a few examples of amorphous
solids.

1.2 What makes a glass different from a solid such as quartz? Under
what conditions could quartz be converted into glass?

1.3 Classify each of the following solids as ionic, metallic, molecular,
network (covalent) or amorphous.

(i) Tetraphosphorus decoxide (P$_4$O$_{10}$)
(ii) Ammonium phosphate, (NH$_4$)$_3$PO$_4$
(iii) SiC
(iv) I$_2$
(v) P$_4$
(vi) Plastic
(vii) Graphite
(viii) Brass
(ix) Rb
(x) LiBr
(xi) Si

1.4 (i) What is meant by the term 'coordination number'?

(ii) What is the coordination number of atoms:

(a) in a cubic close-packed structure?
(b) in a body-centred cubic structure?

1.5 How can you determine the atomic mass of an unknown metal if you
know its density and the dimension of its unit cell? Explain.

1.6 'Stability of a crystal is reflected in the magnitude of its melting
point'. Comment. Collect melting points of solid water, ethyl alcohol,
diethyl ether and methane from a data book. What can you say about the
intermolecular forces between these molecules?

1.7 How will you distinguish between the following pairs of terms:

(i) Hexagonal close-packing and cubic close-packing?
(ii) Crystal lattice and unit cell?
(iii) Tetrahedral void and octahedral void?

1.8 How many lattice points are there in one unit cell of each of the
following lattices?

(i) Face-centred cubic
(ii) Face-centred tetragonal
(iii) Body-centred

1.9 Explain

(i) The basis of similarities and differences between metallic and ionic
crystals.
(ii) Ionic solids are hard and brittle.

1.10 Calculate the efficiency of packing in case of a metal crystal for

(i) simple cubic
(ii) body-centred cubic
(iii) face-centred cubic (with the assumptions that atoms are touching
each other).

1.11 Silver crystallises in fcc lattice. If edge length of the cell is
$4.07 \times 10^{-8}$ cm and density is 10.5 g cm$^{-3}$, calculate the atomic mass of silver.

1.12 A cubic solid is made of two elements P and Q. Atoms of Q are at
the corners of the cube and P at the body-centre. What is the formula of
the compound? What are the coordination numbers of P and Q?

1.13 Niobium crystallises in body-centred cubic structure. If density is
8.55 g cm$^{-3}$, calculate atomic radius of niobium using its atomic mass 93
u.

1.14 If the radius of the octahedral void is r and radius of the atoms in
close-packing is R, derive relation between r and R.

1.15 Copper crystallises into a fcc lattice with edge length $3.61 \times 10^{-8}$
cm. Show that the calculated density is in agreement with its measured
value of 8.92 g cm$^{-3}$.

1.16 Analysis shows that nickel oxide has the formula Ni$_{0.98}$O$_{1.00}$. What
fractions of nickel exist as Ni$^{2+}$ and Ni$^{3+}$ ions?

1.17 What is a semiconductor? Describe the two main types of
semiconductors and contrast their conduction mechanism.

1.18 Non-stoichiometric cuprous oxide, Cu$_2$O, can be prepared in the
laboratory. In this oxide, copper to oxygen ratio is slightly less than 2:1.
Can you account for the fact that this substance is a p-type
semiconductor?

1.19 Ferric oxide crystallises in a hexagonal close-packed array of oxide
ions with two out of every three octahedral holes occupied by ferric ions.
Derive the formula of the ferric oxide.

1.20 Classify each of the following as being either a p-type or an n-type
semiconductor:
(i) Ge doped with In (ii) Si doped with B.

1.21 Gold (atomic radius = 0.144 nm) crystallises in a face-centred unit
cell. What is the length of a side of the cell?

1.22 In terms of band theory, what is the difference
(i) between a conductor and an insulator
(ii) between a conductor and a semiconductor?

1.23 Explain the following terms with suitable examples:
(i) Schottky defect (ii) Frenkel defect (iii) Interstitials and (iv) F-centres.

1.24 Aluminium crystallises in a cubic close-packed structure. Its
metallic radius is 125 pm.

(i) What is the length of the side of the unit cell?
(ii) How many unit cells are there in 1.00 cm$^3$ of aluminium?

1.25 If NaCl is doped with $10^{-3}$ mol % of SrCl$_2$, what is the
concentration of cation vacancies?

1.26 Explain the following with suitable examples:
(i) Ferromagnetism
(ii) Paramagnetism
(iii) Ferrimagnetism
(iv) Antiferromagnetism
(v) 12-16 and 13-15 group compounds.

Answers to Some Intext Questions

1.14 4
1.15 Total number of voids = $9.033 \times 10^{23}$; number of tetrahedral voids =
$6.022 \times 10^{23}$
1.16 M2N3
1.18 ccp

Chapter Two

Electrostatic Potential and Capacitance

2.1 INTRODUCTION
In Chapters 6 and 8 (Class XI), the notion of potential energy was
introduced. When an external force does work in taking a body from a
point to another against a force like spring force or gravitational force,
that work gets stored as potential energy of the body. When the
external force is removed, the body moves, gaining kinetic energy and
losing an equal amount of potential energy. The sum of kinetic
and potential energies is thus conserved. Forces of this kind are called
conservative forces. Spring force and gravitational force are examples
of conservative forces.

Coulomb force between two (stationary) charges is also a
conservative force. This is not surprising, since both have inverse-square
dependence on distance and differ mainly in the proportionality
constants: the masses in the gravitational law are replaced by
charges in Coulomb's law. Thus, like the potential energy of a mass in
a gravitational field, we can define electrostatic potential energy of a
charge in an electrostatic field.
Consider an electrostatic field E due to some charge configuration.
First, for simplicity, consider the field E due to a charge Q placed at
the origin. Now, imagine that we bring a test charge q from a point R
to a point P against the repulsive force on it due to the charge Q. With
reference to Fig. 2.1, this will happen if Q and q are both positive or
both negative. For definiteness, let us take Q, q > 0.

Figure 2.1 A test charge q (> 0) is moved from the point R to the point P against
the repulsive force on it by the charge Q(> 0) placed at the origin.

Two remarks may be made here. First, we assume that the test
charge q is so small that it does not disturb the original configuration,
namely the charge Q at the origin (or else, we keep Q fixed at the
origin by some unspecified force). Second, in bringing the charge q
from R to P, we apply an external force Fext just enough to counter the
repulsive electric force FE (i.e., Fext = −FE). This means there is no net
force on or acceleration of the charge q when it is brought from R to P,
i.e., it is brought with infinitesimally slow constant speed. In this
situation, work done by the external force is the negative of the work
done by the electric force, and gets fully stored in the form of potential
energy of the charge q. If the external force is removed on reaching P,
the electric force will take the charge away from Q; the stored energy
(potential energy) at P is used to provide kinetic energy to the charge
q in such a way that the sum of the kinetic and potential energies is
conserved.
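Thus, work done by external forces in moving a charge q from R to P
is

$$W_{RP} = \int_R^P \mathbf{F}_{ext} \cdot d\mathbf{r} = -\int_R^P \mathbf{F}_E \cdot d\mathbf{r}$$ (2.1)
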
This work done is against electrostatic repulsive force and gets stored
as potential energy.
At every point in the electric field, a particle with charge q possesses a
certain electrostatic potential energy; this work done increases its
potential energy by an amount equal to the potential energy difference
between points R and P.
Thus, potential energy difference

$$\Delta U = U_P - U_R = W_{RP}$$ (2.2)

(Note here that this displacement is in an opposite sense to the
electric force and hence work done by electric field is negative, i.e.,
−WRP.)
Therefore, we can define electric potential energy difference between
two points as the work required to be done by an external force in
moving (without accelerating) charge q from one point to another for
electric field of any arbitrary charge configuration.
Two important comments may be made at this stage:
(i) The right side of Eq. (2.2) depends only on the initial and final
positions of the charge. It means that the work done by an
electrostatic field in moving a charge from one point to another
depends only on the initial and the final points and is independent of
the path taken to go from one point to the other. This is the
fundamental characteristic of a conservative force. The concept of the
potential energy would not be meaningful if the work depended on the
path. The path-independence of work done by an electrostatic field
can be proved using Coulomb's law. We omit this proof here.
(ii) Equation (2.2) defines potential energy difference in terms of the
physically meaningful quantity work. Clearly, potential energy so
defined is undetermined to within an additive constant. What this
means is that the actual value of potential energy is not physically
significant; it is only the difference of potential energy that is
significant. We can always add an arbitrary constant α to potential
energy at every point, since this will not change the potential energy
difference:

$$(U_P + \alpha) - (U_R + \alpha) = U_P - U_R$$

Put differently, there is a freedom in choosing the point where
potential energy is zero. A convenient choice is to have electrostatic
potential energy zero at infinity. With this choice, if we take the point R
at infinity, we get from Eq. (2.2)

$$W_{\infty P} = U_P - U_\infty = U_P$$ (2.3)

Since the point P is arbitrary, Eq. (2.3) provides us with a definition of
potential energy of a charge q at any point. Potential energy of charge
q at a point (in the presence of field due to any charge configuration)
is the work done by the external force (equal and opposite to the
electric force) in bringing the charge q from infinity to that point.

Count Alessandro Volta (1745–1827)

Italian physicist and professor at Pavia. Volta established that the animal
electricity observed by Luigi Galvani (1737–1798) in experiments with frog
muscle tissue placed in contact with dissimilar metals was not due
to any exceptional property of animal tissues but was also
generated whenever any wet body was sandwiched between
dissimilar metals. This led him to develop the first voltaic pile, or
battery, consisting of a large stack of moist disks of cardboard
(electrolyte) sandwiched between disks of metal (electrodes).

2.2 ELECTROSTATIC POTENTIAL


Consider any general static charge configuration. We define potential
energy of a test charge q in terms of the work done on the charge q.
This work is obviously proportional to q, since the force at any point is
qE, where E is the electric field at that point due to the given charge
configuration. It is, therefore, convenient to divide the work by the
amount of charge q, so that the resulting quantity is independent of q.
In other words, work done per unit test charge is characteristic of the
electric field associated with the charge configuration. This leads to
the idea of electrostatic potential V due to a given charge
configuration. From Eq. (2.1), we get:
Work done by external force in bringing a unit positive charge from
point R to P

$$= V_P - V_R = \frac{U_P - U_R}{q}$$ (2.4)

where VP and VR are the electrostatic potentials at P and R,
respectively. Note, as before, that it is not the actual value of potential
but the potential difference that is physically significant. If, as before,
we choose the potential to be zero at infinity, Eq. (2.4) implies:
Work done by an external force in bringing a unit positive charge from
infinity to a point = electrostatic potential (V) at that point.
Figure 2.2 Work done on a test charge q by the electrostatic field due to any given
charge configuration is independent of the path, and depends only on its initial and
final positions.

In other words, the electrostatic potential (V) at any point in a region
with electrostatic field is the work done in bringing a unit
positive charge (without acceleration) from infinity to that point.
The qualifying remarks made earlier regarding potential energy also
apply to the definition of potential. To obtain the work done per unit
test charge, we should take an infinitesimal test charge q, obtain the
work done W in bringing it from infinity to the point and determine the
ratio W/q. Also, the external force at every point of the path is to be
equal and opposite to the electrostatic force on the test charge at that
point.

2.3 POTENTIAL DUE TO A POINT CHARGE


Consider a point charge Q at the origin (Fig. 2.3). For definiteness,
take Q to be positive. We wish to determine the potential at any point
P with position vector r from the origin. For that we must calculate the
work done in bringing a unit positive test charge from infinity to the
point P. For Q > 0, the work done against the repulsive force on the
test charge is positive. Since work done is independent of the path, we
choose a convenient path along the radial direction from infinity to
the point P.
Figure 2.3 Work done in bringing a unit positive test charge from infinity to the
point P, against the repulsive force of charge Q (Q > 0), is the potential at P due to
the charge Q.

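At some intermediate point P′ on the path, the electrostatic force on a
unit positive charge is

$$\frac{Q \times 1}{4\pi\varepsilon_0 r'^2}\,\hat{\mathbf{r}}'$$ (2.5)

where $\hat{\mathbf{r}}'$ is the unit vector along OP′. Work done against this force
from r′ to r′ + Δr′ is

$$\Delta W = -\frac{Q}{4\pi\varepsilon_0 r'^2}\,\Delta r'$$ (2.6)

The negative sign appears because for Δr′ < 0, ΔW is positive. Total
work done (W) by the external force is obtained by integrating Eq.
(2.6) from r′ = ∞ to r′ = r,

$$W = -\int_\infty^r \frac{Q}{4\pi\varepsilon_0 r'^2}\,dr' = \left.\frac{Q}{4\pi\varepsilon_0 r'}\right|_\infty^r = \frac{Q}{4\pi\varepsilon_0 r}$$ (2.7)

This, by definition, is the potential at P due to the charge Q

$$V(r) = \frac{1}{4\pi\varepsilon_0}\frac{Q}{r}$$ (2.8)
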
Equation (2.8) is true for any sign of the charge Q, though we
considered Q > 0 in its derivation. For Q < 0, V < 0, i.e., work done (by
the external force) per unit positive test charge in bringing it from
infinity to the point is negative. This is equivalent to saying that work
done by the electrostatic force in bringing the unit positive charge from
infinity to the point P is positive. [This is as it should be, since for Q <
0, the force on a unit positive test charge is attractive, so that the
electrostatic force and the displacement (from infinity to P) are in the
same direction.] Finally, we note that Eq. (2.8) is consistent with the
choice that potential at infinity be zero.
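
The integration leading to Eq. (2.7) can be checked numerically by adding up the small amounts of work of Eq. (2.6) along a radial path. The sketch below is only an illustration: the function name, the step count and the large but finite distance `r_far` that stands in for infinity are all assumptions made here.

```python
K = 8.9875517923e9  # 1/(4*pi*epsilon_0) in N m^2 C^-2

def work_from_far(Q, r, r_far=1.0e6, steps=200_000):
    """Work per unit positive charge done by the external force on a radial
    path from r_far to r, summed with the midpoint rule."""
    ratio = (r / r_far) ** (1.0 / steps)  # less than 1, so each step moves inward
    total = 0.0
    r_prev = r_far
    for _ in range(steps):
        r_next = r_prev * ratio
        r_mid = 0.5 * (r_prev + r_next)
        delta_r = r_next - r_prev                # negative on the way in
        total += -K * Q / r_mid ** 2 * delta_r   # Eq. (2.6) for one small step
        r_prev = r_next
    return total

Q, r = 4e-7, 0.09  # illustrative values in coulomb and metre
W_numeric = work_from_far(Q, r)
V_formula = K * Q / r  # Eq. (2.8)
print(f"summed work = {W_numeric:.6e} J/C, Q/(4 pi eps0 r) = {V_formula:.6e} V")
```

The two numbers should agree closely; the small residual comes from starting at a finite distance rather than at infinity.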
Figure 2.4 Variation of potential V with r [in units of $(Q/4\pi\varepsilon_0)$ m$^{-1}$] (blue curve) and
field with r [in units of $(Q/4\pi\varepsilon_0)$ m$^{-2}$] (black curve) for a point charge Q.

Figure 2.4 shows how the electrostatic potential ($\propto 1/r$) and the
electrostatic field ($\propto 1/r^2$) vary with r.

Example 2.1
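(a) Calculate the potential at a point P due to a charge of $4 \times 10^{-7}$ C
located 9 cm away.

(b) Hence obtain the work done in bringing a charge of $2 \times 10^{-9}$ C
from infinity to the point P. Does the answer depend on the path
along which the charge is brought?
Solution

(a) $$V = \frac{1}{4\pi\varepsilon_0}\frac{Q}{r} = 9 \times 10^{9}\ \mathrm{N\,m^2\,C^{-2}} \times \frac{4 \times 10^{-7}\ \mathrm{C}}{0.09\ \mathrm{m}} = 4 \times 10^{4}\ \mathrm{V}$$

(b) $$W = qV = 2 \times 10^{-9}\ \mathrm{C} \times 4 \times 10^{4}\ \mathrm{V} = 8 \times 10^{-5}\ \mathrm{J}$$

No, work done will be path independent. Any arbitrary infinitesimal
path can be resolved into two perpendicular displacements: one
along r and another perpendicular to r. The work done
corresponding to the latter will be zero.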
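
The arithmetic of Example 2.1 can be reproduced in a few lines. This is a minimal sketch; the function names are chosen here for illustration, and the Coulomb constant is taken at its more precise value rather than the rounded $9 \times 10^{9}$ used in the solution.

```python
K = 8.9875517923e9  # 1/(4*pi*epsilon_0) in N m^2 C^-2

def point_charge_potential(charge_c: float, distance_m: float) -> float:
    """Potential of a point charge, with the potential at infinity taken as zero."""
    return K * charge_c / distance_m

def work_from_infinity(test_charge_c: float, potential_v: float) -> float:
    """Work done by the external force in bringing a charge from infinity."""
    return test_charge_c * potential_v

V = point_charge_potential(4e-7, 0.09)  # part (a): 9 cm = 0.09 m
W = work_from_infinity(2e-9, V)         # part (b)
print(f"V = {V:.3e} V")
print(f"W = {W:.3e} J")
```

The results should be close to $4 \times 10^{4}$ V and $8 \times 10^{-5}$ J; the slight difference comes only from the rounding of the constant.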
2.4 POTENTIAL DUE TO AN ELECTRIC DIPOLE
As we learnt in the last chapter, an electric dipole consists of two
charges q and −q separated by a (small) distance 2a. Its total charge
is zero. It is characterised by a dipole moment vector p whose
magnitude is q × 2a and which points in the direction from −q to q (Fig.
2.5). We also saw that the electric field of a dipole at a point with
position vector r depends not just on the magnitude r, but also on the
angle between r and p. Further, the field falls off, at large distance, not
as $1/r^2$ (typical of field due to a single charge) but as $1/r^3$. We now
determine the electric potential due to a dipole and contrast it with the
potential due to a single charge.
As before, we take the origin at the centre of the dipole. Now we know
that the electric field obeys the superposition principle. Since potential
is related to the work done by the field, electrostatic potential also
follows the superposition principle. Thus, the potential due to the
dipole is the sum of potentials due to the charges q and −q

$$V = \frac{1}{4\pi\varepsilon_0}\left(\frac{q}{r_1} - \frac{q}{r_2}\right)$$ (2.9)

where r1 and r2 are the distances of the point P from q and −q,
respectively.
Now, by geometry, with θ the angle between r and p,

$$r_1^2 = r^2 + a^2 - 2ar\cos\theta$$

$$r_2^2 = r^2 + a^2 + 2ar\cos\theta$$ (2.10)

Figure 2.5 Quantities involved in the calculation of potential due to a dipole.

We take r much greater than a ($r \gg a$) and retain terms only up to
the first order in a/r

$$r_1^2 = r^2\left(1 - \frac{2a\cos\theta}{r} + \frac{a^2}{r^2}\right) \cong r^2\left(1 - \frac{2a\cos\theta}{r}\right)$$ (2.11)

Similarly,

$$r_2^2 \cong r^2\left(1 + \frac{2a\cos\theta}{r}\right)$$ (2.12)

Using the binomial theorem and retaining terms up to the first order in
a/r, we obtain

$$\frac{1}{r_1} \cong \frac{1}{r}\left(1 - \frac{2a\cos\theta}{r}\right)^{-1/2} \cong \frac{1}{r}\left(1 + \frac{a}{r}\cos\theta\right)$$ [2.13(a)]

$$\frac{1}{r_2} \cong \frac{1}{r}\left(1 + \frac{2a\cos\theta}{r}\right)^{-1/2} \cong \frac{1}{r}\left(1 - \frac{a}{r}\cos\theta\right)$$ [2.13(b)]

Using Eqs. (2.9) and (2.13) and p = 2qa, we get

$$V = \frac{q}{4\pi\varepsilon_0}\frac{2a\cos\theta}{r^2} = \frac{p\cos\theta}{4\pi\varepsilon_0 r^2}$$ (2.14)

Now, $p\cos\theta = \mathbf{p}\cdot\hat{\mathbf{r}}$

where $\hat{\mathbf{r}}$ is the unit vector along the position vector OP.
The electric potential of a dipole is then given by

$$V = \frac{1}{4\pi\varepsilon_0}\frac{\mathbf{p}\cdot\hat{\mathbf{r}}}{r^2}; \quad (r \gg a)$$ (2.15)

Equation (2.15) is, as indicated, approximately true only for distances
large compared to the size of the dipole, so that higher order terms
in a/r are negligible. For a point dipole p at the origin, Eq. (2.15) is,
however, exact.
From Eq. (2.15), potential on the dipole axis (θ = 0, π) is given by

$$V = \pm\frac{1}{4\pi\varepsilon_0}\frac{p}{r^2}$$ (2.16)

(Positive sign for θ = 0, negative sign for θ = π.) The potential in the
equatorial plane (θ = π/2) is zero.
The important contrasting features of electric potential of a dipole from
that due to a single charge are clear from Eqs. (2.8) and (2.15):

(i) The potential due to a dipole depends not just on r but also on the
angle θ between the position vector r and the dipole moment vector p.
(It is, however, axially symmetric about p. That is, if you rotate the
position vector r about p, keeping θ fixed, the points corresponding to
P on the cone so generated will have the same potential as at P.)

(ii) The electric dipole potential falls off, at large distance, as $1/r^2$, not
as 1/r, characteristic of the potential due to a single charge. (You can
refer to Fig. 2.4 for graphs of $1/r^2$ versus r and 1/r versus r, drawn
there in another context.)
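
How good the far-field formula is can be seen by comparing it with the exact sum of the two point-charge potentials. The sketch below assumes illustrative values for q, a and the angle, and its function names are not from the text; the exact expression uses the distances r1 and r2 obtained from the geometry of the dipole.

```python
import math

K = 8.9875517923e9  # 1/(4*pi*epsilon_0) in N m^2 C^-2

def dipole_potential_exact(q, a, r, theta):
    """Sum of the potentials of +q and -q placed at +a and -a along p."""
    r1 = math.sqrt(r * r + a * a - 2 * a * r * math.cos(theta))  # distance from +q
    r2 = math.sqrt(r * r + a * a + 2 * a * r * math.cos(theta))  # distance from -q
    return K * (q / r1 - q / r2)

def dipole_potential_far(q, a, r, theta):
    """Far-field form p cos(theta) / (4 pi eps0 r^2), valid for r >> a."""
    p = 2 * q * a
    return K * p * math.cos(theta) / r ** 2

q, a = 1e-9, 1e-3     # assumed: 1 nC charges separated by 2 mm
theta = math.pi / 3   # assumed angle between r and p
for r in (0.01, 0.1, 1.0):
    exact = dipole_potential_exact(q, a, r, theta)
    approx = dipole_potential_far(q, a, r, theta)
    print(f"r = {r} m: relative error = {abs(approx - exact) / abs(exact):.2e}")
```

The relative error should shrink roughly as $(a/r)^2$, which is the size of the terms dropped in the binomial expansion.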

2.5 POTENTIAL DUE TO A SYSTEM OF CHARGES


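Consider a system of charges q1, q2, ..., qn with position vectors r1, r2,
..., rn relative to some origin (Fig. 2.6). The potential V1 at P due to the
charge q1 is

$$V_1 = \frac{1}{4\pi\varepsilon_0}\frac{q_1}{r_{1P}}$$

where r1P is the distance between q1 and P.
Similarly, the potential V2 at P due to q2 and V3 due to q3 are given by

$$V_2 = \frac{1}{4\pi\varepsilon_0}\frac{q_2}{r_{2P}}, \quad V_3 = \frac{1}{4\pi\varepsilon_0}\frac{q_3}{r_{3P}}$$

where r2P and r3P are the distances of P from charges q2 and q3,
respectively; and so on for the potential due to other charges. By the
superposition principle, the potential V at P due to the total charge
configuration is the algebraic sum of the potentials due to the
individual charges

$$V = V_1 + V_2 + \ldots + V_n$$ (2.17)

$$= \frac{1}{4\pi\varepsilon_0}\left(\frac{q_1}{r_{1P}} + \frac{q_2}{r_{2P}} + \ldots + \frac{q_n}{r_{nP}}\right)$$ (2.18)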

Figure 2.6 Potential at a point due to a system of charges is the sum of potentials
due to individual charges.

If we have a continuous charge distribution characterised by a charge
density ρ(r), we divide it, as before, into small volume elements each
of size Δv and carrying a charge ρΔv. We then determine the potential
due to each volume element and sum (strictly speaking, integrate)
over all such contributions, and thus determine the potential due to the
entire distribution.
We have seen in Chapter 1 that for a uniformly charged spherical
shell, the electric field outside the shell is as if the entire charge is
concentrated at the centre. Thus, the potential outside the shell is
given by

$$V = \frac{1}{4\pi\varepsilon_0}\frac{q}{r} \quad (r \geq R)$$ [2.19(a)]

where q is the total charge on the shell and R its radius. The electric
field inside the shell is zero. This implies (Section 2.6) that potential is
constant inside the shell (as no work is done in moving a charge
inside the shell), and, therefore, equals its value at the surface, which
is

$$V = \frac{1}{4\pi\varepsilon_0}\frac{q}{R} \quad (r \leq R)$$ [2.19(b)]

Example 2.2 Two charges $3 \times 10^{-8}$ C and $-2 \times 10^{-8}$ C are
located 15 cm apart. At what point on the line joining the two
charges is the electric potential zero? Take the potential at infinity
to be zero.
Solution Let us take the origin O at the location of the positive
charge. The line joining the two charges is taken to be the x-axis;
the negative charge is taken to be on the right side of the origin,
at the point A (Fig. 2.7).

Figure 2.7

Let P be the required point on the x-axis where the potential is
zero. If x is the x-coordinate of P, obviously x must be positive.
(There is no possibility of potentials due to the two charges adding
up to zero for x < 0.) If x lies between O and A, we have

$$\frac{1}{4\pi\varepsilon_0}\left[\frac{3 \times 10^{-8}}{x \times 10^{-2}} - \frac{2 \times 10^{-8}}{(15 - x) \times 10^{-2}}\right] = 0$$

where x is in cm. That is,

$$\frac{3}{x} - \frac{2}{15 - x} = 0$$

which gives x = 9 cm.
If x lies on the extended line OA, the required condition is

$$\frac{3}{x} - \frac{2}{x - 15} = 0$$

which gives

x = 45 cm
Thus, electric potential is zero at 9 cm and 45 cm away from the
positive charge on the side of the negative charge. Note that the
formula for potential used in the calculation required choosing
potential to be zero at infinity.
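
The two zero-potential points of Example 2.2 can also be located numerically by applying the superposition principle along the x-axis and searching for a sign change. This is a sketch under assumptions: the helper names and the bracketing intervals are chosen here, and bisection is used simply because it is the most transparent root-finding method.

```python
K = 8.9875517923e9  # 1/(4*pi*epsilon_0) in N m^2 C^-2

def potential_on_axis(x_cm, charges):
    """Superposed potential at x (in cm); charges is a list of (charge in C, position in cm)."""
    return sum(K * q / (abs(x_cm - pos_cm) * 1e-2) for q, pos_cm in charges)

def bisect_zero(f, lo, hi, tol=1e-10):
    """Bisection search for a zero of f in [lo, hi]; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

CHARGES = [(3e-8, 0.0), (-2e-8, 15.0)]  # positive charge at O, negative charge 15 cm away

def V(x_cm):
    return potential_on_axis(x_cm, CHARGES)

x_between = bisect_zero(V, 1.0, 14.0)   # between the two charges
x_beyond = bisect_zero(V, 16.0, 200.0)  # beyond the negative charge
print(f"zero potential at x = {x_between:.3f} cm and x = {x_beyond:.3f} cm")
```

The search should return points near 9 cm and 45 cm, in agreement with the algebraic solution.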

Example 2.3 Figures 2.8 (a) and (b) show the field lines of a
positive and negative point charge respectively.

Figure 2.8

(a) Give the signs of the potential difference VP − VQ; VB − VA.
(b) Give the sign of the potential energy difference of a small
negative charge between the points Q and P; A and B.
(c) Give the sign of the work done by the field in moving a small
positive charge from Q to P.

(d) Give the sign of the work done by the external agency in
moving a small negative charge from B to A.
(e) Does the kinetic energy of a small negative charge increase or
decrease in going from B to A?
Solution

(a) As V ∝ 1/r, VP > VQ. Thus, (VP − VQ) is positive. Also VB is less
negative than VA. Thus, VB > VA or (VB − VA) is positive.
(b) A small negative charge will be attracted towards positive
charge. The negative charge moves from higher potential energy
to lower potential energy. Therefore the sign of potential energy
difference of a small negative charge between Q and P is positive.
Similarly, (P.E.)A > (P.E.)B and hence sign of potential energy
difference is positive.
(c) In moving a small positive charge from Q to P, work has to be
done by an external agency against the electric field. Therefore,
work done by the field is negative.
(d) In moving a small negative charge from B to A work has to be
done by the external agency. It is positive.
(e) Due to force of repulsion on the negative charge, velocity
decreases and hence the kinetic energy decreases in going from
B to A.

Electric potential, equipotential surfaces:
http://video.mit.edu/watch/4-electrostatic-potential-elctric-energy-ev-conservative-field-equipotential-sufaces-12584/

2.6 EQUIPOTENTIAL SURFACES
An equipotential surface is a surface with a constant value of potential
at all points on the surface. For a single charge q, the potential is
given by Eq. (2.8):

$$V = \frac{1}{4\pi\varepsilon_0}\frac{q}{r}$$

This shows that V is a constant if r is constant. Thus, equipotential
surfaces of a single point charge are concentric spherical surfaces
centred at the charge.

Figure 2.9 For a single charge q (a) equipotential surfaces are spherical surfaces centred at the charge, and (b)
electric field lines are radial, starting from the charge if q > 0.

Now the electric field lines for a single charge q are radial lines
starting from or ending at the charge, depending on whether q is
positive or negative. Clearly, the electric field at every point is normal
to the equipotential surface passing through that point. This is true in
general: for any charge configuration, equipotential surface through a
point is normal to the electric field at that point. The proof of this
statement is simple.

Figure 2.10 Equipotential surfaces for a uniform electric field.

If the field were not normal to the equipotential surface, it would have
non-zero component along the surface. To move a unit test charge
against the direction of the component of the field, work would have to
be done. But this is in contradiction to the definition of an equipotential
surface: there is no potential difference between any two points on the
surface and no work is required to move a test charge on the surface.
The electric field must, therefore, be normal to the equipotential
surface at every point. Equipotential surfaces offer an alternative
visual picture in addition to the picture of electric field lines around a
charge configuration.

For a uniform electric field E, say, along the x-axis, the equipotential
surfaces are planes normal to the x-axis, i.e., planes parallel to the
y-z plane (Fig. 2.10). Equipotential surfaces for (a) a dipole and (b) two
identical positive charges are shown in Fig. 2.11.

Figure 2.11 Some equipotential surfaces for (a) a dipole, (b) two identical positive
charges.

2.6.1 Relation between field and potential


Consider two closely spaced equipotential surfaces A and B (Fig.
2.12) with potential values V and V + δV, where δV is the change in V
in the direction of the electric field E. Let P be a point on the surface B.
δl is the perpendicular distance of the surface A from P. Imagine that a
unit positive charge is moved along this perpendicular from the
surface B to surface A against the electric field. The work done in this
process is |E| δl.

Figure 2.12 From the potential to the field.

This work equals the potential difference VA − VB.

Thus,

$$|\mathbf{E}|\,\delta l = V - (V + \delta V) = -\delta V$$

i.e., $$|\mathbf{E}| = -\frac{\delta V}{\delta l}$$ (2.20)

Since δV is negative, δV = −|δV|. We can rewrite
Eq. (2.20) as

$$|\mathbf{E}| = -\frac{\delta V}{\delta l} = +\frac{|\delta V|}{\delta l}$$ (2.21)

We thus arrive at two important conclusions concerning the relation
between electric field and potential:
(i) Electric field is in the direction in which the potential decreases
most steeply.
(ii) Its magnitude is given by the change in the magnitude of potential
per unit displacement normal to the equipotential surface at the point.
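
The relation between field and potential can be tried out on the point-charge potential, whose equipotential surfaces are spheres, so that the normal displacement is radial. The sketch below uses a central difference over a small radial step as a stand-in for δV/δl; the step size and function names are assumptions made for illustration.

```python
K = 8.9875517923e9  # 1/(4*pi*epsilon_0) in N m^2 C^-2

def potential(Q, r):
    """Point-charge potential Q / (4 pi eps0 r)."""
    return K * Q / r

def field_from_potential(Q, r, dl=1e-6):
    """Estimate |E| as -dV/dl across neighbouring equipotential spheres."""
    return -(potential(Q, r + dl) - potential(Q, r - dl)) / (2 * dl)

Q, r = 4e-7, 0.09  # illustrative values in coulomb and metre
E_numeric = field_from_potential(Q, r)
E_coulomb = K * Q / r ** 2  # field of a point charge, for comparison
print(f"-dV/dl = {E_numeric:.6e} V/m, Coulomb field = {E_coulomb:.6e} N/C")
```

For Q > 0 the potential falls with increasing r, so the estimate is positive and the field points radially outward, the direction of steepest decrease of V.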

2.7 POTENTIAL ENERGY OF A SYSTEM OF CHARGES
Consider first the simple case of two charges q1 and q2 with position
vectors r1 and r2 relative to some origin. Let us calculate the work done
(externally) in building up this configuration. This means that we
consider the charges q1 and q2 initially at infinity and determine the
work done by an external agency to bring the charges to the given
locations. Suppose, first the charge q1 is brought from infinity to the
point r1. There is no external field against which work needs to be
done, so work done in bringing q1 from infinity to r1 is zero. This
charge produces a potential in space given by

$$V_1 = \frac{1}{4\pi\varepsilon_0}\frac{q_1}{r_{1P}}$$

where r1P is the distance of a point P in space from the location of q1.
From the definition of potential, work done in bringing charge q2 from
infinity to the point r2 is q2 times the potential at r2 due to q1:

$$\text{work done on } q_2 = \frac{1}{4\pi\varepsilon_0}\frac{q_1 q_2}{r_{12}}$$

where r12 is the distance between points 1 and 2.

Figure 2.13 Potential energy of a system of charges q1 and q2 is directly
proportional to the product of charges and inversely to the distance between them.

Since electrostatic force is conservative, this work gets stored in the
form of potential energy of the system. Thus, the potential energy of a
system of two charges q1 and q2 is

$$U = \frac{1}{4\pi\varepsilon_0}\frac{q_1 q_2}{r_{12}}$$ (2.22)

Obviously, if q2 was brought first to its present location and q1 brought
later, the potential energy U would be the same. More generally, the
potential energy expression, Eq. (2.22), is unaltered whatever way the
charges are brought to the specified locations, because of path-
independence of work for electrostatic force.
Equation (2.22) is true for any sign of q1 and q2. If q1q2 > 0, potential
energy is positive. This is as expected, since for like charges (q1q2 >
0), electrostatic force is repulsive and a positive amount of work is
needed to be done against this force to bring the charges from infinity
to a finite distance apart. For unlike charges (q1 q2 < 0), the
electrostatic force is attractive. In that case, a positive amount of work
is needed against this force to take the charges from the given
location to infinity. In other words, a negative amount of work is
needed for the reverse path (from infinity to the present locations), so
the potential energy is negative.

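Figure 2.14 Potential energy of a system of three charges is given by Eq. (2.26), with the
notation given in the figure.

Equation (2.22) is easily generalised for a system of any number of
point charges. Let us calculate the potential energy of a system of
three charges q1, q2 and q3 located at r1, r2, r3, respectively. To bring
q1 first from infinity to r1, no work is required. Next we bring q2 from
infinity to r2. As before, work done in this step is

$$q_2 V_1(\mathbf{r}_2) = \frac{1}{4\pi\varepsilon_0}\frac{q_1 q_2}{r_{12}}$$ (2.23)

The charges q1 and q2 produce a potential, which at any point P is
given by

$$V_{1,2} = \frac{1}{4\pi\varepsilon_0}\left(\frac{q_1}{r_{1P}} + \frac{q_2}{r_{2P}}\right)$$ (2.24)

Work done next in bringing q3 from infinity to the point r3 is q3 times
V1,2 at r3

$$q_3 V_{1,2}(\mathbf{r}_3) = \frac{1}{4\pi\varepsilon_0}\left(\frac{q_1 q_3}{r_{13}} + \frac{q_2 q_3}{r_{23}}\right)$$ (2.25)

The total work done in assembling the charges at the given locations
is obtained by adding the work done in different steps [Eq. (2.23) and
Eq. (2.25)],

$$U = \frac{1}{4\pi\varepsilon_0}\left(\frac{q_1 q_2}{r_{12}} + \frac{q_1 q_3}{r_{13}} + \frac{q_2 q_3}{r_{23}}\right)$$ (2.26)
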
Again, because of the conservative nature of the electrostatic force (or
equivalently, the path independence of work done), the final
expression for U, Eq. (2.26), is independent of the manner in which
the configuration is assembled. The potential energy is characteristic
of the present state of configuration, and not the way the state is
achieved.

Example 2.4 Four charges are arranged at the corners of a
square ABCD of side d, as shown in Fig. 2.15. (a) Find the work
required to put together this arrangement. (b) A charge q0 is
brought to the centre E of the square, the four charges being held
fixed at its corners. How much extra work is needed to do this?

Figure 2.15

Solution
(a) Since the work done depends on the final arrangement of the
charges, and not on how they are put together, we calculate work
needed for one way of putting the charges at A, B, C and D.
Suppose, first the charge +q is brought to A, and then the charges
−q, +q, and −q are brought to B, C and D, respectively. The total
work needed can be calculated in steps:

(i) Work needed to bring charge +q to A when no charge is
present elsewhere: this is zero.
(ii) Work needed to bring −q to B when +q is at A. This is given by
(charge at B) × (electrostatic potential at B due to charge +q at A)

$$= -q \times \left(\frac{q}{4\pi\varepsilon_0 d}\right) = -\frac{q^2}{4\pi\varepsilon_0 d}$$

(iii) Work needed to bring charge +q to C when +q is at A and −q is
at B. This is given by (charge at C) × (potential at C due to
charges at A and B)

$$= +q\left(\frac{+q}{4\pi\varepsilon_0 d\sqrt{2}} + \frac{-q}{4\pi\varepsilon_0 d}\right) = -\frac{q^2}{4\pi\varepsilon_0 d}\left(1 - \frac{1}{\sqrt{2}}\right)$$

(iv) Work needed to bring −q to D when +q is at A, −q at B, and +q at
C. This is given by (charge at D) × (potential at D due to charges
at A, B and C)

$$= -q\left(\frac{+q}{4\pi\varepsilon_0 d} + \frac{-q}{4\pi\varepsilon_0 d\sqrt{2}} + \frac{q}{4\pi\varepsilon_0 d}\right) = -\frac{q^2}{4\pi\varepsilon_0 d}\left(2 - \frac{1}{\sqrt{2}}\right)$$

Add the work done in steps (i), (ii), (iii) and (iv). The total
work required is

$$= -\frac{q^2}{4\pi\varepsilon_0 d}\left\{0 + 1 + \left(1 - \frac{1}{\sqrt{2}}\right) + \left(2 - \frac{1}{\sqrt{2}}\right)\right\} = -\frac{q^2}{4\pi\varepsilon_0 d}\left(4 - \sqrt{2}\right)$$

The work done depends only on the arrangement of the charges,
and not how they are assembled. By definition, this is the total
electrostatic energy of the charges.
(Students may try calculating same work/energy by taking charges
in any other order they desire and convince themselves that the
energy will remain the same.)
(b) The extra work necessary to bring a charge q0 to the point E
when the four charges are at A, B, C and D is q0 × (electrostatic
potential at E due to the charges at A, B, C and D). The
electrostatic potential at E is clearly zero since potential due to A
and C is cancelled by that due to B and D. Hence no work is
required to bring any charge to point E.
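
Both parts of Example 2.4 can be checked by summing over pairs of charges, which is the generalisation of Eq. (2.26) to any number of point charges. The sketch below assumes illustrative values for q and d and places the square in the x-y plane; the function names are not from the text. The closed form it compares against is $-(4 - \sqrt{2})\,q^2/(4\pi\varepsilon_0 d)$.

```python
import math
from itertools import combinations

K = 8.9875517923e9  # 1/(4*pi*epsilon_0) in N m^2 C^-2

def assembly_energy(charges):
    """Sum of q_i q_j / (4 pi eps0 r_ij) over all distinct pairs.
    charges is a list of (charge in C, (x, y) position in m)."""
    return sum(
        K * qi * qj / math.dist(pos_i, pos_j)
        for (qi, pos_i), (qj, pos_j) in combinations(charges, 2)
    )

def potential_at(point, charges):
    """Superposed potential of the charges at a point that holds no charge."""
    return sum(K * q / math.dist(point, pos) for q, pos in charges)

q, d = 1e-9, 0.1  # assumed: 1 nC charges on a square of side 10 cm
square = [(+q, (0.0, 0.0)), (-q, (d, 0.0)), (+q, (d, d)), (-q, (0.0, d))]  # A, B, C, D

U = assembly_energy(square)
U_formula = -K * q * q / d * (4 - math.sqrt(2))
V_centre = potential_at((d / 2, d / 2), square)
print(f"U (pair sum) = {U:.6e} J, U (closed form) = {U_formula:.6e} J")
print(f"potential at centre E = {V_centre:.3e} V")
```

Because the pair sum does not refer to any order of assembly, it also illustrates why the energy is the same whichever charge is brought in first.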

2.8 POTENTIAL ENERGY IN AN EXTERNAL FIELD

2.8.1 Potential energy of a single charge


In Section 2.7, the source of the electric field was specified (the
charges and their locations) and the potential energy of the system of
those charges was determined. In this section, we ask a related but
distinct question. What is the potential energy of a charge q in a given
field? This question was, in fact, the starting point that led us to the
notion of the electrostatic potential (Sections 2.1 and 2.2). But here we
address this question again to clarify in what way it is different from
the discussion in Section 2.7.
The main difference is that we are now concerned with the potential
energy of a charge (or charges) in an external field. The external field
E is not produced by the given charge(s) whose potential energy we
wish to calculate. E is produced by sources external to the given
charge(s). The external sources may be known, but often they are
unknown or unspecified; what is specified is the electric field E or the
electrostatic potential V due to the external sources. We assume that
the charge q does not significantly affect the sources producing the
external field. This is true if q is very small, or the external sources are
held fixed by other unspecified forces. Even if q is finite, its influence
on the external sources may still be ignored in the situation when very
strong sources far away at infinity produce a finite field E in the region
of interest. Note again that we are interested in determining the
potential energy of a given charge q (and later, a system of charges)
in the external field; we are not interested in the potential energy of the
sources producing the external electric field.
The external electric field E and the corresponding external potential V
may vary from point to point. By definition, V at a point P is the work
done in bringing a unit positive charge from infinity to the point P.
(We continue to take potential at infinity to be zero.) Thus, work done
in bringing a charge q from infinity to the point P in the external field is
qV. This work is stored in the form of potential energy of q. If the point
P has position vector r relative to some origin, we can write:
Potential energy of q at r in an external field

= qV(r) (2.27)
where V(r) is the external potential at the point r.
Thus, if an electron with charge $q = e = 1.6 \times 10^{-19}$ C is accelerated by
a potential difference of $\Delta V = 1$ volt, it would gain energy of
$q\,\Delta V = 1.6 \times 10^{-19}$ J. This unit of energy is defined as 1 electron volt or 1 eV, i.e.,
1 eV = $1.6 \times 10^{-19}$ J. The units based on eV are most commonly used in
atomic, nuclear and particle physics (1 keV = $10^{3}$ eV = $1.6 \times 10^{-16}$ J,
1 MeV = $10^{6}$ eV = $1.6 \times 10^{-13}$ J, 1 GeV = $10^{9}$ eV = $1.6 \times 10^{-10}$ J and
1 TeV = $10^{12}$ eV = $1.6 \times 10^{-7}$ J). [This has already been defined on Page 117, XI
Physics Part I, Table 6.1.]
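The conversions quoted above are plain multiplications by the electronic charge, so they are easy to check. The short sketch below assumes the rounded value e = 1.6 × 10^-19 C used in the text; the name ev_to_joule is only an illustrative choice.

```python
# Elementary charge in coulombs, rounded as in the text (an assumption of this sketch).
ELEMENTARY_CHARGE = 1.6e-19


def ev_to_joule(energy_ev):
    """Convert an energy in electron volts to joules, using 1 eV = e x (1 volt)."""
    return energy_ev * ELEMENTARY_CHARGE


# Multiples of the electron volt quoted in the text.
multiples = {"eV": 1.0, "keV": 1e3, "MeV": 1e6, "GeV": 1e9, "TeV": 1e12}

for unit, size_in_ev in multiples.items():
    print(f"1 {unit} = {ev_to_joule(size_in_ev):.1e} J")
```

The same function can be used in reverse (dividing by e) to express a small energy in joules as a more convenient number of eV.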

2.8.2 Potential energy of a system of two charges in an external field

Next, we ask: what is the potential energy of a system of two charges
q1 and q2 located at r1 and r2, respectively, in an external field? First,
we calculate the work done in bringing the charge q1 from infinity to r1.
Work done in this step is q1 V(r1), using Eq. (2.27). Next, we consider
the work done in bringing q2 to r2. In this step, work is done not only
against the external field E but also against the field due to q1.
Work done on q2 against the external field
$= q_2 V(\mathbf{r}_2)$
Work done on q2 against the field due to q1
$= \dfrac{q_1 q_2}{4\pi\varepsilon_0 r_{12}}$
where r12 is the distance between q1 and q2. We have made use of
Eqs. (2.27) and (2.22). By the superposition principle for fields, we add
up the work done on q2 against the two fields (E and that due to q1):
Work done in bringing q2 to r2
$= q_2 V(\mathbf{r}_2) + \dfrac{q_1 q_2}{4\pi\varepsilon_0 r_{12}}$ (2.28)
Thus,
Potential energy of the system
= the total work done in assembling the configuration
$= q_1 V(\mathbf{r}_1) + q_2 V(\mathbf{r}_2) + \dfrac{q_1 q_2}{4\pi\varepsilon_0 r_{12}}$ (2.29)

Example 2.5
(a) Determine the electrostatic potential energy of a system
consisting of two charges 7 μC and −2 μC (and with no external
field) placed at (−9 cm, 0, 0) and (9 cm, 0, 0) respectively.
(b) How much work is required to separate the two charges
infinitely away from each other?
(c) Suppose that the same system of charges is now placed in an
external electric field $E = A\,(1/r^2)$; $A = 9 \times 10^{5}$ N C$^{-1}$ m$^{2}$. What would
the electrostatic energy of the configuration be?

Solution

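(a) $U = \dfrac{1}{4\pi\varepsilon_0}\dfrac{q_1 q_2}{r} = 9 \times 10^{9} \times \dfrac{7 \times (-2) \times 10^{-12}}{0.18}$ J = −0.7 J.
(b) $W = U_2 - U_1 = 0 - U = 0 - (-0.7)$ J = 0.7 J.

(c) The mutual interaction energy of the two charges remains
unchanged. In addition, there is the energy of interaction of the
two charges with the external electric field. We find,
$q_1 V(\mathbf{r}_1) + q_2 V(\mathbf{r}_2) = A\,\dfrac{7\ \mu\text{C}}{0.09\ \text{m}} + A\,\dfrac{-2\ \mu\text{C}}{0.09\ \text{m}}$
and the net electrostatic energy is
$q_1 V(\mathbf{r}_1) + q_2 V(\mathbf{r}_2) + \dfrac{q_1 q_2}{4\pi\varepsilon_0 r_{12}} = 70\ \text{J} - 20\ \text{J} - 0.7\ \text{J} = 49.3\ \text{J}$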
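The arithmetic of Example 2.5 can be checked with a few lines of code. The sketch below is only illustrative: it assumes the rounded constant 1/(4πε0) = 9 × 10^9 N m² C^-2, takes A in N C^-1 m², and assumes that the potential of the external field E = A/r² is V(r) = A/r (zero at infinity). The function names are hypothetical.

```python
K_COULOMB = 9e9   # 1/(4*pi*eps0) in N m^2 C^-2, rounded as in the text
A_FIELD = 9e5     # constant A of the external field E = A/r^2 (assumed unit: N C^-1 m^2)


def external_potential(r):
    """Potential of the field E = A/r^2 with V = 0 at infinity (assumed form V = A/r)."""
    return A_FIELD / r


def mutual_energy(q1, q2, r12):
    """Interaction energy of two point charges a distance r12 apart."""
    return K_COULOMB * q1 * q2 / r12


def energy_in_external_field(q1, r1, q2, r2, r12):
    """Eq. (2.29): q1 V(r1) + q2 V(r2) + mutual interaction energy."""
    return (q1 * external_potential(r1)
            + q2 * external_potential(r2)
            + mutual_energy(q1, q2, r12))


q1, q2 = 7e-6, -2e-6      # charges in coulombs
r1 = r2 = 0.09            # distance of each charge from the origin, in metres
r12 = 0.18                # separation of the two charges, in metres

u_mutual = mutual_energy(q1, q2, r12)                    # part (a)
work_to_separate = 0.0 - u_mutual                        # part (b)
u_total = energy_in_external_field(q1, r1, q2, r2, r12)  # part (c)

print(f"(a) U = {u_mutual:.2f} J")
print(f"(b) W = {work_to_separate:.2f} J")
print(f"(c) U_total = {u_total:.1f} J")
```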

2.8.3 Potential energy of a dipole in an external field

Consider a dipole with charges q1 = +q and q2 = −q placed in a
uniform electric field E, as shown in Fig. 2.16.
Figure 2.16 Potential energy of a dipole in a uniform external field.

As seen in the last chapter, in a uniform electric field, the dipole
experiences no net force, but experiences a torque $\boldsymbol{\tau}$ given by
$\boldsymbol{\tau} = \mathbf{p} \times \mathbf{E}$ (2.30)
which will tend to rotate it (unless p is parallel or
antiparallel to E). Suppose an external torque $\boldsymbol{\tau}_{\text{ext}}$ is applied in such a
manner that it just neutralises this torque and rotates it in the plane of
paper from angle $\theta_0$ to angle $\theta_1$ at an infinitesimal angular speed and
without angular acceleration. The amount of work done by the external
torque will be given by
$W = \int_{\theta_0}^{\theta_1} \tau_{\text{ext}}(\theta)\, d\theta = \int_{\theta_0}^{\theta_1} pE \sin\theta \, d\theta = pE\,(\cos\theta_0 - \cos\theta_1)$ (2.31)
This work is stored as the potential energy of the system. We can then
associate potential energy $U(\theta)$ with an inclination $\theta$ of the dipole.
Similar to other potential energies, there is a freedom in choosing the
angle where the potential energy U is taken to be zero. A natural
choice is to take
$\theta_0 = \pi/2$. (An explanation for it is provided towards the end of
discussion.) We can then write,
$U(\theta) = pE\left(\cos\dfrac{\pi}{2} - \cos\theta\right) = -pE\cos\theta = -\mathbf{p}\cdot\mathbf{E}$ (2.32)

This expression can alternately be understood also from Eq. (2.29).
We apply Eq. (2.29) to the present system of two charges +q and −q.
The potential energy expression then reads
$U'(\theta) = q\,[V(\mathbf{r}_1) - V(\mathbf{r}_2)] - \dfrac{q^2}{4\pi\varepsilon_0 \times 2a}$ (2.33)
Here, r1 and r2 denote the position vectors of +q and −q. Now, the
potential difference between positions r1 and r2 equals the work
done in bringing a unit positive charge against field from r2 to r1. The
displacement parallel to the force is $2a\cos\theta$. Thus, $[V(\mathbf{r}_1) - V(\mathbf{r}_2)] = -E \times 2a\cos\theta$.
We thus obtain,
$U'(\theta) = -pE\cos\theta - \dfrac{q^2}{4\pi\varepsilon_0 \times 2a} = -\mathbf{p}\cdot\mathbf{E} - \dfrac{q^2}{4\pi\varepsilon_0 \times 2a}$ (2.34)
We note that $U'(\theta)$ differs from $U(\theta)$ by a quantity which is just a
constant for a given dipole. Since a constant is insignificant for
potential energy, we can drop the second term in Eq. (2.34) and it
then reduces to Eq. (2.32).
We can now understand why we took $\theta_0 = \pi/2$. In this case, the work
done against the external field E in bringing +q and −q are equal and
opposite and cancel out, i.e., $q\,[V(\mathbf{r}_1) - V(\mathbf{r}_2)] = 0$.
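The link between the work integral of Eq. (2.31) and the potential energy of Eq. (2.32) can also be checked numerically. The sketch below integrates the external torque pE sin θ from the reference angle π/2 to a chosen angle with a simple midpoint rule and compares the result with −pE cos θ. The values of p and E and the function names are illustrative assumptions.

```python
import math


def work_by_external_torque(p, E, theta0, theta1, steps=10_000):
    """Midpoint-rule estimate of the integral of p*E*sin(theta) from theta0 to theta1."""
    h = (theta1 - theta0) / steps
    total = 0.0
    for i in range(steps):
        theta_mid = theta0 + (i + 0.5) * h
        total += p * E * math.sin(theta_mid)
    return total * h


def dipole_energy(p, E, theta):
    """Closed form U(theta) = -p E cos(theta), with U = 0 at theta = pi/2."""
    return -p * E * math.cos(theta)


p, E = 1.0, 1.0   # illustrative magnitudes in SI units
for degrees in (0, 60, 90, 120, 180):
    theta = math.radians(degrees)
    numeric = work_by_external_torque(p, E, math.pi / 2, theta)
    print(f"theta = {degrees:3d} deg: integral = {numeric:+.6f}, "
          f"-pE cos(theta) = {dipole_energy(p, E, theta):+.6f}")
```

The two columns agree to within the accuracy of the numerical integration, which illustrates that choosing θ0 = π/2 as the zero of energy reproduces U(θ) = −p·E.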

Example 2.6 A molecule of a substance has a permanent electric
dipole moment of magnitude $10^{-29}$ C m. A mole of this substance
is polarised (at low temperature) by applying a strong electrostatic
field of magnitude $10^{6}$ V m$^{-1}$. The direction of the field is suddenly
changed by an angle of 60°. Estimate the heat released by the
substance in aligning its dipoles along the new direction of the
field. For simplicity, assume 100% polarisation of the sample.
Solution Here, dipole moment of each molecule = $10^{-29}$ C m
As 1 mole of the substance contains $6 \times 10^{23}$ molecules,
total dipole moment of all the molecules, $p = 6 \times 10^{23} \times 10^{-29}$ C m
$= 6 \times 10^{-6}$ C m
Initial potential energy (dipoles aligned with the field, $\theta = 0°$),
$U_i = -pE\cos\theta = -6 \times 10^{-6} \times 10^{6} \cos 0° = -6$ J
Final potential energy (when $\theta = 60°$), $U_f = -6 \times 10^{-6} \times 10^{6} \cos 60°$
$= -3$ J
Change in potential energy = −3 J − (−6 J) = 3 J
So, just after the field direction is changed, the potential energy of the
sample is 3 J higher than in the aligned state. As the dipoles realign
along the new direction of the field, this 3 J of potential energy is lost.
This must be the energy released by the substance in the form of
heat in aligning its dipoles.
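The estimate in Example 2.6 can be reproduced with U = −pE cos θ applied to the total dipole moment of the sample. This is only a sketch under the assumptions of the example (100% polarisation, the rounded value 6 × 10^23 molecules per mole); the variable names are illustrative.

```python
import math

MOLECULES_PER_MOLE = 6e23    # rounded Avogadro number, as in the example
p_molecule = 1e-29           # dipole moment of one molecule, in C m
field = 1e6                  # field magnitude, in V/m

# With 100% polarisation the molecular moments simply add.
p_total = MOLECULES_PER_MOLE * p_molecule


def dipole_energy(p, E, theta_degrees):
    """U = -p E cos(theta), with the angle given in degrees."""
    return -p * E * math.cos(math.radians(theta_degrees))


u_aligned = dipole_energy(p_total, field, 0)    # dipoles along the field
u_tilted = dipole_energy(p_total, field, 60)    # just after the field is turned by 60 degrees
heat_released = u_tilted - u_aligned            # energy given up on realignment

print(f"U(0 deg)  = {u_aligned:.1f} J")
print(f"U(60 deg) = {u_tilted:.1f} J")
print(f"heat released = {heat_released:.1f} J")
```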
2.9 ELECTROSTATICS OF CONDUCTORS
Conductors and insulators were described briefly in Chapter 1.
Conductors contain mobile charge carriers. In metallic conductors,
these charge carriers are electrons. In a metal, the outer (valence)
electrons part away from their atoms and are free to move. These
electrons are free within the metal but not free to leave the metal. The
free electrons form a kind of gas; they collide with each other and
with the ions, and move randomly in different directions. In an external
electric field, they drift against the direction of the field. The positive
ions made up of the nuclei and the bound electrons remain held in
their fixed positions. In electrolytic conductors, the charge carriers are
both positive and negative ions; but the situation in this case is more
involved: the movement of the charge carriers is affected both by the
external electric field and by the
so-called chemical forces (see Chapter 3). We shall restrict our
discussion to metallic solid conductors. Let us note important results
regarding electrostatics of conductors.

1. Inside a conductor, electrostatic field is zero


Consider a conductor, neutral or charged. There may also be an
external electrostatic field. In the static situation, when there is no
current inside or on the surface of the conductor, the electric field is
zero everywhere inside the conductor. This fact can be taken as the
defining property of a conductor. A conductor has free electrons. As
long as electric field is not zero, the free charge carriers would
experience force and drift. In the static situation, the free charges have
so distributed themselves that the electric field is zero everywhere
inside. Electrostatic field is zero inside a conductor.

2. At the surface of a charged conductor, electrostatic field must be normal to the surface at every point

If E were not normal to the surface, it would have some non-zero
component along the surface. Free charges on the surface of the
conductor would then experience force and move. In the static
situation, therefore, E should have no tangential component. Thus
electrostatic field at the surface of a charged conductor must be
normal to the surface at every point. (For a conductor without any
surface charge density, field is zero even at the surface.) See result 5.

3. The interior of a conductor can have no excess charge in the static situation

A neutral conductor has equal amounts of positive and negative
charges in every small volume or surface element. When the
conductor is charged, the excess charge can reside only on the
surface in the static situation. This follows from Gauss's law.
Consider any arbitrary volume element v inside a conductor. On the
closed surface S bounding the volume element v, electrostatic field is
zero. Thus the total electric flux through S is zero. Hence, by Gauss's
law, there is no net charge enclosed by S. But the surface S can be
made as small as you like, i.e., the volume v can be made vanishingly
small. This means there is no net charge at any point inside the
conductor, and any excess charge must reside at the surface.

4. Electrostatic potential is constant throughout the volume of the conductor and has the same value (as inside) on its surface

This follows from results 1 and 2 above. Since E = 0 inside the
conductor and has no tangential component on the surface, no work is
done in moving a small test charge within the conductor and on its
surface. That is, there is no potential difference between any two
points inside or on the surface of the conductor. Hence, the result. If
the conductor is charged, electric field normal to the surface exists;
this means potential will be different for the surface and a point just
outside the surface.
In a system of conductors of arbitrary size, shape and charge
configuration, each conductor is characterised by a constant value of
potential, but this constant may differ from one conductor to
the other.

5. Electric field at the surface of a charged conductor

$\mathbf{E} = \dfrac{\sigma}{\varepsilon_0}\,\hat{\mathbf{n}}$ (2.35)

where $\sigma$ is the surface charge density and $\hat{\mathbf{n}}$ is a unit vector normal to
the surface in the outward direction.
To derive the result, choose a pill box (a short cylinder) as the
Gaussian surface about any point P on the surface, as shown in Fig.
2.17. The pill box is partly inside and partly outside the surface of the
conductor. It has a small area of cross section $\delta S$ and negligible
height.
Figure 2.17 The Gaussian surface (a pill box) chosen to derive Eq. (2.35) for
electric field at surface of a charged conductor.

Just inside the surface, the electrostatic field is zero; just outside, the
field is normal to the surface with magnitude E. Thus, the contribution
to the total flux through the pill box comes only from the outside
(circular) cross-section of the pill box. This equals $\pm E\,\delta S$ (positive for
$\sigma > 0$, negative for $\sigma < 0$), since over the small area $\delta S$, E may be
considered constant and E and $\delta\mathbf{S}$ are parallel or antiparallel. The
charge enclosed by the pill box is $\sigma\,\delta S$.
By Gauss's law

$E\,\delta S = \dfrac{|\sigma|\,\delta S}{\varepsilon_0}$

$E = \dfrac{|\sigma|}{\varepsilon_0}$ (2.36)
Including the fact that electric field is normal to the surface, we get the
vector relation, Eq. (2.35), which is true for both signs of $\sigma$. For $\sigma > 0$,
electric field is normal to the surface outward; for $\sigma < 0$, electric field is
normal to the surface inward.
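As a consistency check on the relation E = σ/ε0, consider an isolated charged conducting sphere, for which the charge spreads uniformly over the surface by symmetry and the field just outside equals that of a point charge at the centre (Chapter 1). The sketch below compares the two expressions; the charge and radius are illustrative values, and the function names are hypothetical.

```python
import math

EPSILON_0 = 8.854e-12   # permittivity of vacuum, in C^2 N^-1 m^-2


def surface_field(sigma):
    """Field just outside a conductor from E = sigma / eps0 (sign gives outward or inward)."""
    return sigma / EPSILON_0


def point_charge_field(charge, r):
    """Coulomb field of a point charge at distance r."""
    return charge / (4 * math.pi * EPSILON_0 * r ** 2)


charge = 1e-9    # illustrative charge on the sphere, in coulombs
radius = 0.10    # illustrative radius, in metres

# Uniform surface charge density on the sphere.
sigma = charge / (4 * math.pi * radius ** 2)

print(f"sigma/eps0              = {surface_field(sigma):.1f} V/m")
print(f"Coulomb field at r = R  = {point_charge_field(charge, radius):.1f} V/m")
```

Both expressions give the same value, as they must; reversing the sign of the charge reverses the sign of σ and hence the direction of the field.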

6. Electrostatic shielding
Consider a conductor with a cavity, with no charges inside the cavity.
A remarkable result is that the electric field inside the cavity is zero,
whatever be the size and shape of the cavity and whatever be the
charge on the conductor and the external fields in which it might be
placed. We have proved a simple case of this result already: the
electric field inside a charged spherical shell is zero. The proof of the
result for the shell makes use of the spherical symmetry of the shell
(see Chapter 1). But the vanishing of electric field in the (charge-free)
cavity of a conductor is, as mentioned above, a very general result. A
related result is that even if the conductor is charged or charges are
induced on a neutral conductor by an external field, all charges reside
only on the outer surface of a conductor with cavity.
Figure 2.18 The electric field inside a cavity of any conductor is zero. All charges
reside only on the outer surface of a conductor with cavity. (There are no charges
placed in the cavity.)

The proofs of the results noted in Fig. 2.18 are omitted here, but we
note their important implication. Whatever be the charge and field
configuration outside, any cavity in a conductor remains shielded from
outside electric influence: the field inside the cavity is always zero.
This is known as electrostatic shielding. The effect can be made use
of in protecting sensitive instruments from outside electrical influence.
Figure 2.19 gives a summary of the important electrostatic properties
of a conductor.

Figure 2.19 Some important electrostatic properties of a conductor.


Example 2.7
(a) A comb run through one's dry hair attracts small bits of paper.
Why?
What happens if the hair is wet or if it is a rainy day? (Remember,
a paper does not conduct electricity.)
Solution
(a) This is because the comb gets charged by friction. The
molecules in the paper get polarised by the charged comb,
resulting in a net force of attraction. If the hair is wet, or if it is a rainy
day, friction between the hair and the comb reduces. The comb does
not get charged and thus it will not attract small bits of paper.
(b) To enable them to conduct charge (produced by friction) to the
ground, as too much accumulated static electricity may cause a
spark and result in fire.
(c) Reason similar to (b).
(d) Current passes only when there is difference in potential.

2.10 DIELECTRICS AND POLARISATION


Dielectrics are non-conducting substances. In contrast to conductors,
they have no (or negligible number of) charge carriers. Recall from
Section 2.9 what happens when a conductor is placed in an external
electric field. The free charge carriers move and charge distribution in
the conductor adjusts itself in such a way that the electric field due to
induced charges opposes the external field within the conductor. This
happens until, in the static situation, the two fields cancel each other
and the net electrostatic field in the conductor is zero. In a dielectric,
this free movement of charges is not possible. It turns out that the
external field induces dipole moment by stretching or re-orienting
molecules of the dielectric. The collective effect of all the molecular
dipole moments is net charges on the surface of the dielectric which
produce a field that opposes the external field. Unlike in a conductor,
however, the opposing field so induced does not exactly cancel the
external field. It only reduces it. The extent of the effect depends on
the nature of the dielectric. To understand the effect, we need to look
at the charge distribution of a dielectric at the molecular level.

Figure 2.20 Difference in behaviour of a conductor and a dielectric in an external
electric field.

The molecules of a substance may be polar or non-polar. In a
non-polar molecule, the centres of positive and negative charges coincide.
The molecule then has no permanent (or intrinsic) dipole moment.
Examples of non-polar molecules are oxygen (O2) and hydrogen (H2)
molecules which, because of their symmetry, have no dipole moment.
On the other hand, a polar molecule is one in which the centres of
positive and negative charges are separated (even when there is no
external field). Such molecules have a permanent dipole moment. An
ionic molecule such as HCl and a molecule of water (H2O) are
examples of polar molecules.

Figure 2.21 Some examples of polar and non-polar molecules.

In an external electric field, the positive and negative charges of a
non-polar molecule are displaced in opposite directions. The
displacement stops when the external force on the constituent
charges of the molecule is balanced by the restoring force (due to
internal fields in the molecule). The non-polar molecule thus develops
an induced dipole moment. The dielectric is said to be polarised by the
external field. We consider only the simple situation when the induced
dipole moment is in the direction of the field and is proportional to the
field strength. (Substances for which this assumption is true are called
linear isotropic dielectrics.) The induced dipole moments of different
molecules add up giving a net dipole moment of the dielectric in the
presence of the external field.

Figure 2.22 A dielectric develops a net dipole moment in an external electric field.
(a) Non-polar molecules, (b) Polar molecules.
A dielectric with polar molecules also develops a net dipole moment in
an external field, but for a different reason. In the absence of any
external field, the different permanent dipoles are oriented randomly
due to thermal agitation; so the total dipole moment is zero. When an
external field is applied, the individual dipole moments tend to align
with the field. When summed over all the molecules, there is then a
net dipole moment in the direction of the external field, i.e., the
dielectric is polarised. The extent of polarisation depends on the
relative strength of two mutually opposite factors: the dipole potential
energy in the external field tending to align the dipoles with the field
and thermal energy tending to disrupt the alignment. There may be, in
addition, the induced dipole moment effect as for non-polar
molecules, but generally the alignment effect is more important for
polar molecules.
Thus in either case, whether polar or non-polar, a dielectric develops a
net dipole moment in the presence of an external field. The dipole
moment per unit volume is called polarisation and is denoted by P. For
linear isotropic dielectrics,

$\mathbf{P} = \chi_e \mathbf{E}$ (2.37)
where $\chi_e$ is a constant characteristic of the dielectric and is known as
the electric susceptibility of the dielectric medium.
It is possible to relate $\chi_e$ to the molecular properties of the substance,
but we shall not pursue that here.
The question is: how does the polarised dielectric modify the original
external field inside it? Let us consider, for simplicity, a rectangular
dielectric slab placed in a uniform external field E0 parallel to two of its
faces. The field causes a uniform polarisation P of the dielectric. Thus
every volume element $\Delta v$ of the slab has a dipole moment P $\Delta v$ in the
direction of the field. The volume element $\Delta v$ is macroscopically small
but contains a very large number of molecular dipoles. Anywhere
inside the dielectric, the volume element $\Delta v$ has no net charge (though
it has net dipole moment). This is because the positive charge of one
dipole sits close to the negative charge of the adjacent dipole.
However, at the surfaces of the dielectric normal to the electric field,
there is evidently a net charge density. As seen in Fig 2.23, the
positive ends of the dipoles remain unneutralised at the right surface
and the negative ends at the left surface. The unbalanced charges are
the induced charges due to the external field.
Figure 2.23 A uniformly polarised dielectric amounts to induced surface
charge density, but no volume charge density.

Thus the polarised dielectric is equivalent to two charged surfaces
with induced surface charge densities, say $\sigma_p$ and $-\sigma_p$. Clearly, the
field produced by these surface charges opposes the external field.
The total field in the dielectric is, thereby, reduced from the case when
no dielectric is present. We should note that the surface charge
density $\pm\sigma_p$ arises from bound (not free) charges in the dielectric.

2.11 CAPACITORS AND CAPACITANCE


A capacitor is a system of two conductors separated by an insulator
(Fig. 2.24). The conductors have charges, say Q1 and Q2, and
potentials V1 and V2. Usually, in practice, the two conductors have
charges Q and −Q, with potential difference V = V1 − V2 between
them. We shall consider only this kind of charge configuration of the
capacitor. (Even a single conductor can be used as a capacitor by
assuming the other at infinity.) The conductors may be so charged by
connecting them to the two terminals of a battery. Q is called the
charge of the capacitor, though this, in fact, is the charge on one of
the conductors; the total charge of the capacitor is zero.
Figure 2.24 A system of two conductors separated by an insulator forms a
capacitor.

The electric field in the region between the conductors is proportional
to the charge Q. That is, if the charge on the capacitor is, say doubled,
the electric field will also be doubled at every point. (This follows from
the direct proportionality between field and charge implied by
Coulomb's law and the superposition principle.) Now, potential
difference V is the work done per unit positive charge in taking a small
test charge from the conductor 2 to 1 against the field. Consequently,
V is also proportional to Q, and the ratio Q/V is a constant:

$C = \dfrac{Q}{V}$ (2.38)
The constant C is called the capacitance of the capacitor. C is
independent of Q or V, as stated above. The capacitance C depends
only on the geometrical configuration (shape, size, separation) of the
system of two conductors. [As we shall see later, it also depends on
the nature of the insulator (dielectric) separating the two conductors.]
The SI unit of capacitance is 1 farad (= 1 coulomb volt$^{-1}$) or 1 F = 1 C
V$^{-1}$. A capacitor with fixed capacitance is symbolically shown as ---||---,
while one with variable capacitance is shown by the same symbol with an arrow drawn across it.
Equation (2.38) shows that for large C, V is small for a given Q. This
means a capacitor with large capacitance can hold large amount of
charge Q at a relatively small V. This is of practical importance. High
potential difference implies strong electric field around the conductors.
A strong electric field can ionise the surrounding air and accelerate the
charges so produced to the oppositely charged plates, thereby
neutralising the charge on the capacitor plates, at least partly. In other
words, the charge of the capacitor leaks away due to the reduction in
insulating power of the intervening medium.
The maximum electric field that a dielectric medium can withstand
without break-down (of its insulating property) is called its dielectric
strength; for air it is about $3 \times 10^{6}$ V m$^{-1}$. For a separation between
conductors of the order of 1 cm or so, this field corresponds to a
potential difference of $3 \times 10^{4}$ V between the conductors. Thus, for a
capacitor to store a large amount of charge without leaking, its
capacitance should be high enough so that the potential difference
and hence the electric field do not exceed the break-down limits. Put
differently, there is a limit to the amount of charge that can be stored
on a given capacitor without significant leaking. In practice, a farad is
a very big unit; the most common units are its sub-multiples 1 μF =
$10^{-6}$ F, 1 nF = $10^{-9}$ F, 1 pF = $10^{-12}$ F, etc. Besides its use in storing
charge, a capacitor is a key element of most ac circuits with important
functions, as described in Chapter 7.
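The limit on stored charge mentioned above can be made concrete with a rough estimate. The sketch below assumes an air gap with the dielectric strength quoted in the text and a uniform field, so that the largest potential difference is simply field × separation; the 1 nF capacitance is a purely illustrative value and the function names are hypothetical.

```python
AIR_DIELECTRIC_STRENGTH = 3e6   # V/m, value quoted in the text for air


def max_potential_difference(gap_m, strength=AIR_DIELECTRIC_STRENGTH):
    """Largest potential difference before break-down, assuming a uniform field."""
    return strength * gap_m


def max_charge(capacitance_f, gap_m, strength=AIR_DIELECTRIC_STRENGTH):
    """Largest charge Q = C V that can be held without break-down (rough estimate)."""
    return capacitance_f * max_potential_difference(gap_m, strength)


gap = 0.01             # separation of about 1 cm, as in the text
capacitance = 1e-9     # illustrative capacitance of 1 nF

print(f"V_max = {max_potential_difference(gap):.1e} V")
print(f"Q_max = {max_charge(capacitance, gap):.1e} C")
```

For the same separation, a larger capacitance holds proportionally more charge before the field reaches the break-down value.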
2.12 THE PARALLEL PLATE CAPACITOR
A parallel plate capacitor consists of two large plane
parallel conducting plates separated by a small distance (Fig. 2.25).

Figure 2.25 The parallel plate capacitor.

We first take the intervening medium between the plates to be
vacuum. The effect of a dielectric medium between the plates is
discussed in the next section. Let A be the area of each plate and d
the separation between them. The two plates have charges Q and −Q.
Since d is much smaller than the linear dimension of the plates ($d^2 \ll A$),
we can use the result on electric field by an infinite plane sheet of
uniform surface charge density (Section 1.15). Plate 1 has surface
charge density $\sigma = Q/A$ and plate 2 has a surface charge density $-\sigma$.
Using Eq. (1.33), the electric field in different regions is:
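Outer region I (region above the plate 1),
$E = \dfrac{\sigma}{2\varepsilon_0} - \dfrac{\sigma}{2\varepsilon_0} = 0$ (2.39)

Outer region II (region below the plate 2),

$E = \dfrac{\sigma}{2\varepsilon_0} - \dfrac{\sigma}{2\varepsilon_0} = 0$ (2.40)
In the inner region between the plates 1 and 2, the electric fields due
to the two charged plates add up, giving

$E = \dfrac{\sigma}{2\varepsilon_0} + \dfrac{\sigma}{2\varepsilon_0} = \dfrac{\sigma}{\varepsilon_0} = \dfrac{Q}{\varepsilon_0 A}$ (2.41)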
The direction of electric field is from the positive to the negative plate.
Thus, the electric field is localised between the two plates and is
uniform throughout. For plates with finite area, this will not be true
near the outer boundaries of the plates. The field lines bend outward
at the edges, an effect called 'fringing of the field'. By the same
token, $\sigma$ will not be strictly uniform on the entire plate. [E and $\sigma$ are
related by Eq. (2.35).] However, for $d^2 \ll A$, these effects can be
ignored in the regions sufficiently far from the edges, and the field
there is given by Eq. (2.41). Now for uniform electric field, potential
difference is simply the electric field times the distance between the
plates, that is,

$V = E\,d = \dfrac{1}{\varepsilon_0}\dfrac{Q\,d}{A}$ (2.42)
The capacitance C of the parallel plate capacitor is then
$C = \dfrac{Q}{V} = \dfrac{\varepsilon_0 A}{d}$ (2.43)

which, as expected, depends only on the geometry of the system. For
typical values like A = 1 m$^2$, d = 1 mm, we get

$C = \dfrac{8.85 \times 10^{-12}\ \text{C}^2\,\text{N}^{-1}\,\text{m}^{-2} \times 1\ \text{m}^2}{10^{-3}\ \text{m}} = 8.85 \times 10^{-9}\ \text{F}$ (2.44)

(You can check that 1 F = 1 C V$^{-1}$ = 1 C (N C$^{-1}$ m)$^{-1}$ = 1 C$^2$ N$^{-1}$ m$^{-1}$.)
This shows that 1 F is too big a unit in practice, as remarked earlier.
Another way of seeing the 'bigness' of 1 F is to calculate the area of
the plates needed to have C = 1 F for a separation of, say 1 cm:

$A = \dfrac{C\,d}{\varepsilon_0} = \dfrac{1\ \text{F} \times 10^{-2}\ \text{m}}{8.85 \times 10^{-12}\ \text{C}^2\,\text{N}^{-1}\,\text{m}^{-2}} \approx 10^{9}\ \text{m}^2$ (2.45)
which is a plate about 30 km in length and breadth!
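The two estimates above follow directly from C = ε0 A/d and can be reproduced in a few lines. The sketch below uses the value ε0 = 8.85 × 10^-12 C² N^-1 m^-2 from the text and ignores fringing; the function names are illustrative.

```python
import math

EPSILON_0 = 8.85e-12   # permittivity of vacuum, in C^2 N^-1 m^-2


def parallel_plate_capacitance(area_m2, separation_m):
    """C = eps0 * A / d for a vacuum-filled parallel plate capacitor (fringing ignored)."""
    return EPSILON_0 * area_m2 / separation_m


def area_for_capacitance(capacitance_f, separation_m):
    """Plate area needed for a given capacitance: A = C * d / eps0."""
    return capacitance_f * separation_m / EPSILON_0


c_typical = parallel_plate_capacitance(area_m2=1.0, separation_m=1e-3)
area_one_farad = area_for_capacitance(capacitance_f=1.0, separation_m=1e-2)
side_km = math.sqrt(area_one_farad) / 1000.0   # side of a square plate, in km

print(f"C for A = 1 m^2, d = 1 mm: {c_typical:.2e} F")
print(f"Area for C = 1 F, d = 1 cm: {area_one_farad:.2e} m^2")
print(f"Side of a square plate: about {side_km:.0f} km")
```

The side of the square plate comes out at a few tens of kilometres, in line with the rough figure of 30 km quoted above.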

Factors affecting capacitance, capacitors in action (interactive Java tutorial):
http://micro.magnet.fsu.edu/electromag/java/capacitance/
2.13 EFFECT OF DIELECTRIC ON CAPACITANCE
With the understanding of the behaviour of dielectrics in an external
field developed in Section 2.10, let us see how the capacitance of a
parallel plate capacitor is modified when a dielectric is present. As
before, we have two large plates, each of area A, separated by a
distance d. The charge on the plates is $\pm Q$, corresponding to the
charge density $\pm\sigma$ (with $\sigma = Q/A$). When there is vacuum between the
plates,

$E_0 = \dfrac{\sigma}{\varepsilon_0}$
and the potential difference V0 is

V0 = E0d
The capacitance C0 in this case is

$C_0 = \dfrac{Q}{V_0} = \varepsilon_0\dfrac{A}{d}$ (2.46)
Consider next a dielectric inserted between the plates fully occupying
the intervening region. The dielectric is polarised by the field and, as
explained in Section 2.10, the effect is equivalent to two charged
sheets (at the surfaces of the dielectric normal to the field) with
surface charge densities $\sigma_p$ and $-\sigma_p$. The electric field in the dielectric
then corresponds to the case when the net surface charge density on
the plates is $\pm(\sigma - \sigma_p)$.
That is,
$E = \dfrac{\sigma - \sigma_p}{\varepsilon_0}$ (2.47)

so that the potential difference across the plates is

$V = E\,d = \dfrac{\sigma - \sigma_p}{\varepsilon_0}\,d$ (2.48)
For linear dielectrics, we expect $\sigma_p$ to be proportional to E0, i.e., to $\sigma$.
Thus, $(\sigma - \sigma_p)$ is proportional to $\sigma$ and we can write

$\sigma - \sigma_p = \dfrac{\sigma}{K}$ (2.49)
where K is a constant characteristic of the dielectric. Clearly, K > 1.
We then have

$V = \dfrac{\sigma d}{\varepsilon_0 K} = \dfrac{Q d}{A \varepsilon_0 K}$ (2.50)
The capacitance C, with dielectric between the plates, is then

$C = \dfrac{Q}{V} = \dfrac{\varepsilon_0 K A}{d}$ (2.51)
The product $\varepsilon_0 K$ is called the permittivity of the medium and is
denoted by $\varepsilon$
$\varepsilon = \varepsilon_0 K$ (2.52)
For vacuum K = 1 and $\varepsilon = \varepsilon_0$; $\varepsilon_0$ is called the permittivity of the
vacuum. The dimensionless ratio
$K = \dfrac{\varepsilon}{\varepsilon_0}$ (2.53)

is called the dielectric constant of the substance. As remarked before,
from Eq. (2.49), it is clear that K is greater than 1. From Eqs. (2.46)
and (2.51)

$K = \dfrac{C}{C_0}$ (2.54)

Thus, the dielectric constant of a substance is the factor (>1) by which
the capacitance increases from its vacuum value, when the dielectric
is inserted fully between the plates of a capacitor. Though we arrived
at Eq. (2.54) for the case of a parallel plate capacitor, it holds good for
any type of capacitor and can, in fact, be viewed in general as a
definition of the dielectric constant of a substance.
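The chain of relations from Eq. (2.47) to Eq. (2.54) can be followed numerically for a particular case. The sketch below starts from the free charge density σ and a dielectric constant K, computes the induced charge density from Eq. (2.49), and then works out the field, potential difference and capacitance step by step. All numerical values (K = 5, plate area, separation, charge density) are illustrative assumptions.

```python
EPSILON_0 = 8.85e-12   # permittivity of vacuum, in C^2 N^-1 m^-2


def capacitor_with_dielectric(sigma, K, area, separation):
    """Follow Eqs. (2.46)-(2.51) for a parallel plate capacitor fully filled by a dielectric."""
    charge = sigma * area                     # free charge Q on the plates
    e0 = sigma / EPSILON_0                    # field without dielectric
    c0 = charge / (e0 * separation)           # Eq. (2.46)
    sigma_p = sigma * (1.0 - 1.0 / K)         # induced density, from Eq. (2.49)
    e = (sigma - sigma_p) / EPSILON_0         # Eq. (2.47)
    v = e * separation                        # Eq. (2.48)
    c = charge / v                            # Eq. (2.51)
    return {"E0": e0, "E": e, "sigma_p": sigma_p, "C0": c0, "C": c}


result = capacitor_with_dielectric(sigma=1e-6, K=5.0, area=1e-2, separation=1e-3)

print(f"E/E0 = {result['E'] / result['E0']:.2f}")   # field reduced by the factor K
print(f"C/C0 = {result['C'] / result['C0']:.2f}")   # capacitance increased by the factor K
```

The ratio C/C0 returned by the calculation equals K, which is the content of Eq. (2.54).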

ELECTRIC DISPLACEMENT
We have introduced the notion of dielectric constant and arrived at
Eq. (2.54), without giving the explicit relation between the induced
charge density $\sigma_p$ and the polarisation P. We take without proof
the result that
$\sigma_p = \mathbf{P}\cdot\hat{\mathbf{n}}$
where $\hat{\mathbf{n}}$ is a unit vector along the outward normal to the surface.
The above equation is general, true for any shape of the dielectric. For
the slab in Fig. 2.23, P is along $\hat{\mathbf{n}}$ at the right surface and
opposite to $\hat{\mathbf{n}}$ at the left surface. Thus at the right surface,
induced charge density is positive and at the left surface, it is
negative, as guessed already in our qualitative discussion before.
Putting the equation for electric field in vector form
$\mathbf{E}\cdot\hat{\mathbf{n}} = \dfrac{\sigma - \mathbf{P}\cdot\hat{\mathbf{n}}}{\varepsilon_0}$
or $(\varepsilon_0\mathbf{E} + \mathbf{P})\cdot\hat{\mathbf{n}} = \sigma$
The quantity $\varepsilon_0\mathbf{E} + \mathbf{P}$ is called the electric displacement and is
denoted by D. It is a vector quantity. Thus,

$\mathbf{D} = \varepsilon_0\mathbf{E} + \mathbf{P}$, $\mathbf{D}\cdot\hat{\mathbf{n}} = \sigma$,
The significance of D is this: in vacuum, E is related to the free
charge density $\sigma$. When a dielectric medium is present, the
corresponding role is taken up by D. For a dielectric medium, it
is D not E that is directly related to free charge density $\sigma$, as seen
in above equation. Since P is in the same direction as E, all the
three vectors P, E and D are parallel.
The ratio of the magnitudes of D and E is
$\dfrac{D}{E} = \dfrac{\sigma\,\varepsilon_0}{\sigma - \sigma_p} = \varepsilon_0 K$
Thus,
$\mathbf{D} = \varepsilon_0 K\,\mathbf{E}$
and $\mathbf{P} = \mathbf{D} - \varepsilon_0\mathbf{E} = \varepsilon_0 (K - 1)\,\mathbf{E}$
This gives for the electric susceptibility $\chi_e$ defined in Eq. (2.37)
$\chi_e = \varepsilon_0 (K - 1)$
Example 2.8 A slab of material of dielectric constant K has the
same area as the plates of a parallel-plate capacitor but has a
thickness (3/4)d, where d is the separation of the plates. How is
the capacitance changed when the slab is inserted between the
plates?

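Solution Let E0 = V0/d be the electric field between the plates
when there is no dielectric and the potential difference is V0. If the
dielectric is now inserted, the electric field in the dielectric will be E
= E0/K. The potential difference will then be
$V = E_0\left(\dfrac{1}{4}d\right) + \dfrac{E_0}{K}\left(\dfrac{3}{4}d\right) = E_0 d\left(\dfrac{1}{4} + \dfrac{3}{4K}\right) = V_0\,\dfrac{K+3}{4K}$
The potential difference decreases by the factor (K + 3)/4K while
the free charge Q0 on the plates remains unchanged. The
capacitance thus increases
$C = \dfrac{Q_0}{V} = \dfrac{4K}{K+3}\,\dfrac{Q_0}{V_0} = \dfrac{4K}{K+3}\,C_0$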
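The reasoning of Example 2.8 extends to a slab of any thickness: the field is E0 in the vacuum part of the gap and E0/K inside the slab, and the potential difference is the sum of the two contributions. The sketch below writes this as a function of the fraction of the gap filled by the slab; the function name and the sample value K = 3 are illustrative assumptions.

```python
def capacitance_ratio_with_slab(K, fraction):
    """Return C/C0 when a slab of dielectric constant K fills `fraction` of the gap.

    With fixed free charge, V/V0 = (1 - fraction) + fraction / K, and C/C0 = V0/V.
    """
    potential_ratio = (1.0 - fraction) + fraction / K
    return 1.0 / potential_ratio


K = 3.0
ratio = capacitance_ratio_with_slab(K, 0.75)   # slab of thickness (3/4) d

print(f"C/C0 for a (3/4)d slab with K = {K}: {ratio:.3f}")
print(f"4K/(K + 3) = {4 * K / (K + 3):.3f}")
```

For a fraction of 3/4 the general expression reduces to 4K/(K + 3), the result of the example; a fraction of 0 gives C = C0 and a fraction of 1 gives C = K C0.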

2.14 COMBINATION OF CAPACITORS


We can combine several capacitors of capacitance C1, C2, ..., Cn to
obtain a system with some effective capacitance C. The effective
capacitance depends on the way the individual capacitors are
combined. Two simple possibilities are discussed below.

2.14.1 Capacitors in series


Figure 2.26 shows capacitors C1 and C2 combined in series.
The left plate of C1 and the right plate of C2 are connected to two
terminals of a battery and have charges Q and −Q, respectively. It
then follows that the right plate of C1 has charge −Q and the left plate
of C2 has charge Q. If this were not so, the net charge on each
capacitor would not be zero. This would result in an electric field in the
conductor connecting C1 and C2. Charge would flow until the net
charge on both C1 and C2 is zero and there is no electric field in the
conductor connecting C1 and C2. Thus, in the series
combination, charges on the two plates ($\pm Q$) are the same on each
capacitor. The total potential drop V across the combination is the sum
of the potential drops V1 and V2 across C1 and C2, respectively.

$V = V_1 + V_2 = \dfrac{Q}{C_1} + \dfrac{Q}{C_2}$ (2.55)

i.e., $\dfrac{V}{Q} = \dfrac{1}{C_1} + \dfrac{1}{C_2}$, (2.56)
Figure 2.26 Combination of two capacitors in series.

Figure 2.27 Combination of n capacitors in series.

Now we can regard the combination as an effective capacitor with
charge Q and potential difference V. The effective capacitance of the
combination is

$C = \dfrac{Q}{V}$ (2.57)
We compare Eq. (2.57) with Eq. (2.56), and obtain

$\dfrac{1}{C} = \dfrac{1}{C_1} + \dfrac{1}{C_2}$ (2.58)
The proof clearly goes through for any number of capacitors arranged
in a similar way. Equation (2.55), for n capacitors arranged in series,
generalises to

$V = V_1 + V_2 + \dots + V_n = \dfrac{Q}{C_1} + \dfrac{Q}{C_2} + \dots + \dfrac{Q}{C_n}$ (2.59)
Following the same steps as for the case of two capacitors, we get the
general formula for effective capacitance of a series combination of n
capacitors:

$\dfrac{1}{C} = \dfrac{1}{C_1} + \dfrac{1}{C_2} + \dfrac{1}{C_3} + \dots + \dfrac{1}{C_n}$ (2.60)
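The series rule is easy to apply mechanically: add the reciprocals and invert the sum. The sketch below does this for a list of capacitances and also checks that the individual potential drops Q/Ci add up to the applied potential difference. The capacitor values and the 12 V supply are illustrative assumptions.

```python
def series_capacitance(capacitances):
    """Effective capacitance of capacitors in series: 1/C = sum of 1/Ci."""
    return 1.0 / sum(1.0 / c for c in capacitances)


capacitors = [2e-6, 3e-6, 6e-6]   # illustrative values, in farads
supply = 12.0                     # illustrative potential difference, in volts

c_series = series_capacitance(capacitors)
charge = c_series * supply                    # the same charge Q sits on every capacitor
drops = [charge / c for c in capacitors]      # potential drop across each capacitor

print(f"C (series) = {c_series:.2e} F")
print(f"Q = {charge:.2e} C")
print("potential drops:", [f"{v:.1f} V" for v in drops])
```

The effective capacitance is smaller than the smallest capacitance in the chain, and the smallest capacitor carries the largest share of the potential drop.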

2.14.2 Capacitors in parallel


Figure 2.28 Parallel combination of (a) two capacitors, (b) n capacitors.

Figure 2.28 (a) shows two capacitors arranged in parallel. In this case,
the same potential difference is applied across both the capacitors.
But the plate charges ($\pm Q_1$) on capacitor 1 and the plate charges
($\pm Q_2$) on the capacitor 2 are not necessarily the same:
Q1 = C1V, Q2 = C2V (2.61)
The equivalent capacitor is one with charge
Q = Q1 + Q2 (2.62)

and potential difference V.


Q = CV = C1V + C2V (2.63)
The effective capacitance C is, from Eq. (2.63),
C = C1 + C2 (2.64)

The general formula for effective capacitance C for parallel
combination of n capacitors [Fig. 2.28 (b)] follows similarly,
Q = Q1 + Q2 + ... + Qn (2.65)
i.e., CV = C1V + C2V + ... + CnV (2.66)
which gives
C = C1 + C2 + ... + Cn (2.67)
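
The parallel case can be sketched in the same spirit. The snippet below sums the capacitances as in Eq. (2.67) and checks that the individual charges Qi = CiV add up to CV. The function names and the sample values are assumptions made for illustration.

```python
def parallel_capacitance(capacitances):
    """Effective capacitance of a parallel combination: C = sum of C_i."""
    if any(c <= 0 for c in capacitances):
        raise ValueError("capacitances must be positive")
    return sum(capacitances)


def parallel_charges(capacitances, voltage):
    """Charge on each capacitor when all share the same voltage: Q_i = C_i V."""
    return [c * voltage for c in capacitances]


caps = [1e-6, 2e-6]          # farads, illustrative values
v = 10.0                     # volts, illustrative value
c_total = parallel_capacitance(caps)
charges = parallel_charges(caps, v)
print(c_total, charges, sum(charges), c_total * v)
```

The last two printed numbers agree, which is just Eq. (2.62) combined with Eq. (2.63).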

Example 2.9 A network of four 10 µF capacitors is connected to a
500 V supply, as shown in Fig. 2.29. Determine (a) the equivalent
capacitance of the network and (b) the charge on each capacitor.
(Note, the charge on a capacitor is the charge on the plate with
higher potential, equal and opposite to the charge on the plate with
lower potential.)
Figure 2.29

Solution
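(a) In the given network, C1, C2 and C3 are connected in series.
The effective capacitance C′ of these three capacitors is given by

$\dfrac{1}{C'} = \dfrac{1}{C_1} + \dfrac{1}{C_2} + \dfrac{1}{C_3}$

For C1 = C2 = C3 = 10 µF, C′ = (10/3) µF. The network has C′ and
C4 connected in parallel. Thus, the equivalent capacitance C of
the network is

$C = C' + C_4 = \left(\dfrac{10}{3} + 10\right)\ \mu\text{F} = 13.3\ \mu\text{F}$
(b) Clearly, from the figure, the charge on each of the capacitors,
C1, C2 and C3 is the same, say Q. Let the charge on C4 be Q′.
Now, since the potential difference across AB is Q/C1, across BC
is Q/C2, across CD is Q/C3, we have

$\dfrac{Q}{C_1} + \dfrac{Q}{C_2} + \dfrac{Q}{C_3} = 500\ \text{V}$.

Also, Q′/C4 = 500 V.
This gives for the given value of the capacitances,

$Q = 500\ \text{V} \times \dfrac{10}{3}\ \mu\text{F} = 1.7 \times 10^{-3}\ \text{C}$

and

$Q' = 500\ \text{V} \times 10\ \mu\text{F} = 5.0 \times 10^{-3}\ \text{C}$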
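
The arithmetic of Example 2.9 can be reproduced with a few lines of code. This is only a sketch of the calculation described in the solution: three capacitors in series, with that branch in parallel with the fourth. The variable names are illustrative assumptions.

```python
MICRO = 1e-6                      # 1 microfarad in farads

c1 = c2 = c3 = c4 = 10 * MICRO    # four 10 microfarad capacitors
supply = 500.0                    # volts

# Series branch of C1, C2, C3: reciprocals add.
c_branch = 1.0 / (1.0 / c1 + 1.0 / c2 + 1.0 / c3)

# The branch is in parallel with C4: capacitances add.
c_network = c_branch + c4

# Each series capacitor carries the same charge; the full 500 V is across C4.
q_branch = c_branch * supply
q_c4 = c4 * supply

print(c_network / MICRO)   # equivalent capacitance in microfarads
print(q_branch, q_c4)      # charges in coulombs
```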

2.15 ENERGY STORED IN A CAPACITOR


A capacitor, as we have seen above, is a system of two conductors
with charge Q and −Q. To determine the energy stored in this
configuration, consider initially two uncharged conductors 1 and 2.
Imagine next a process of transferring charge from conductor 2 to
conductor 1 bit by bit, so that at the end, conductor 1 gets charge Q.
By charge conservation, conductor 2 has charge −Q at the end (Fig.
2.30).
Figure 2.30 (a) Work done in a small step of building charge on conductor 1 from
Q′ to Q′ + δQ′. (b) Total work done in charging the capacitor may be viewed as
stored in the energy of electric field between the plates.

In transferring positive charge from conductor 2 to conductor 1, work
will be done externally, since at any stage conductor 1 is at a higher
potential than conductor 2. To calculate the total work done, we first
calculate the work done in a small step involving transfer of an
infinitesimal (i.e., vanishingly small) amount of charge. Consider the
intermediate situation when the conductors 1 and 2 have charges Q′
and −Q′ respectively. At this stage, the potential difference V′ between
conductors 1 and 2 is Q′/C, where C is the capacitance of the system.
Next imagine that a small charge δQ′ is transferred from conductor 2
to 1. Work done in this step (δW), resulting in charge Q′ on conductor
1 increasing to Q′ + δQ′, is given by

$\delta W = V'\,\delta Q' = \dfrac{Q'}{C}\,\delta Q'$ (2.68)
Since δQ′ can be made as small as we like, Eq. (2.68) can be written
as

$\delta W = \dfrac{1}{2C}\left[(Q' + \delta Q')^2 - Q'^2\right]$ (2.69)

Equations (2.68) and (2.69) are identical because the term of second
order in δQ′, i.e., δQ′²/2C, is negligible, since δQ′ is arbitrarily small.
The total work done (W) is the sum of the small work (δW) over the
very large number of steps involved in building the charge Q′ from
zero to Q.

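$W = \sum_{\text{all steps}} \delta W = \sum_{\text{all steps}} \dfrac{1}{2C}\left[(Q' + \delta Q')^2 - Q'^2\right]$ (2.70)

$= \dfrac{1}{2C}\left[\{\delta Q'^2 - 0\} + \{(2\delta Q')^2 - \delta Q'^2\} + \{(3\delta Q')^2 - (2\delta Q')^2\} + \dots + \{Q^2 - (Q - \delta Q')^2\}\right]$ (2.71)

$= \dfrac{1}{2C}\left[Q^2 - 0\right] = \dfrac{Q^2}{2C}$ (2.72)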
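The same result can be obtained directly from Eq. (2.68) by
integration

$W = \int_0^Q \dfrac{Q'}{C}\,\delta Q' = \dfrac{1}{C}\,\dfrac{Q'^2}{2}\Bigg|_0^Q = \dfrac{Q^2}{2C}$

This is not surprising since integration is nothing but summation of a
large number of small terms.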
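
The step-by-step charging argument can be imitated numerically. The sketch below adds up the small work terms of Eq. (2.68) for a finite number of equal charge transfers and compares the total with Q²/2C. The function name work_in_steps and the numerical values are illustrative assumptions.

```python
def work_in_steps(total_charge, capacitance, steps):
    """Sum the small work terms (Q'/C) * dQ' while the charge grows from 0 to Q."""
    dq = total_charge / steps
    work = 0.0
    for i in range(steps):
        q_now = i * dq                       # charge already on conductor 1
        work += (q_now / capacitance) * dq   # work for this small transfer
    return work


Q = 2.0e-6    # final charge in coulombs (illustrative)
C = 1.0e-6    # capacitance in farads (illustrative)
exact = Q**2 / (2 * C)

for n in (10, 1000, 100000):
    print(n, work_in_steps(Q, C, n), exact)
```

With this particular choice of steps the sum falls short of Q²/2C by a fraction 1/n, which is the second-order term neglected in going from Eq. (2.69) to Eq. (2.68); it vanishes as the steps are made smaller.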
We can write the final result, Eq. (2.72) in different ways

$W = \dfrac{Q^2}{2C} = \dfrac{1}{2}CV^2 = \dfrac{1}{2}QV$ (2.73)

Since electrostatic force is conservative, this work is stored in the form
of potential energy of the system. For the same reason, the final result
for potential energy [Eq. (2.73)] is independent of the manner in which
the charge configuration of the capacitor is built up. When the
capacitor discharges, this stored-up energy is released. It is possible
to view the potential energy of the capacitor as stored in the electric
field between the plates. To see this, consider for simplicity, a parallel
plate capacitor (of area A of each plate and separation d between the
plates).
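Energy stored in the capacitor

$= \dfrac{1}{2}\dfrac{Q^2}{C} = \dfrac{(A\sigma)^2}{2} \times \dfrac{d}{\varepsilon_0 A}$ (2.74)
The surface charge density σ is related to the electric field E between
the plates,

$E = \dfrac{\sigma}{\varepsilon_0}$ (2.75)
From Eqs. (2.74) and (2.75), we get
Energy stored in the capacitor

$U = \dfrac{1}{2}\varepsilon_0 E^2 \times A d$ (2.76)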
Note that Ad is the volume of the region between the plates (where
electric field alone exists). If we define energy density as energy
stored per unit volume of space, Eq. (2.76) shows that
Energy density of electric field,

$u = \dfrac{1}{2}\varepsilon_0 E^2$ (2.77)
Though we derived Eq. (2.77) for the case of a parallel plate capacitor,
the result on energy density of an electric field is, in fact, very general
and holds true for electric field due to any configuration of charges.
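
For a parallel plate capacitor, the two ways of counting the energy can be compared directly. The following sketch computes (1/2)CV² from the capacitance and, separately, the energy density of the field multiplied by the volume Ad. The plate dimensions, the voltage and the function names are assumptions chosen only for illustration.

```python
import math

EPSILON_0 = 8.854e-12   # permittivity of free space, in F/m


def energy_from_capacitance(area, separation, voltage):
    """Energy (1/2) C V^2 with C = epsilon_0 A / d (vacuum between the plates)."""
    capacitance = EPSILON_0 * area / separation
    return 0.5 * capacitance * voltage**2


def energy_from_field(area, separation, voltage):
    """Energy density (1/2) epsilon_0 E^2 times the volume A d, with E = V / d."""
    field = voltage / separation
    density = 0.5 * EPSILON_0 * field**2
    return density * area * separation


a, d, v = 1.0e-2, 1.0e-3, 50.0   # m^2, m, V (illustrative values)
print(energy_from_capacitance(a, d, v), energy_from_field(a, d, v))
assert math.isclose(energy_from_capacitance(a, d, v), energy_from_field(a, d, v))
```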

Example 2.10 (a) A 900 pF capacitor is charged by a 100 V battery
[Fig. 2.31(a)]. How much electrostatic energy is stored by the
capacitor? (b) The capacitor is disconnected from the battery and
connected to another 900 pF capacitor [Fig. 2.31(b)]. What is the
electrostatic energy stored by the system?
Figure 2.31

Solution
(a) The charge on the capacitor is
$Q = CV = 900 \times 10^{-12}\ \text{F} \times 100\ \text{V} = 9 \times 10^{-8}\ \text{C}$

The energy stored by the capacitor is
$= \dfrac{1}{2}CV^2 = \dfrac{1}{2}QV = \dfrac{1}{2} \times 9 \times 10^{-8}\ \text{C} \times 100\ \text{V} = 4.5 \times 10^{-6}\ \text{J}$
(b) In the steady situation, the two capacitors have their positive
plates at the same potential, and their negative plates at the same
potential. Let the common potential difference be V′. The charge
on each capacitor is then Q′ = CV′. By charge conservation, Q′ =
Q/2. This implies V′ = V/2. The total energy of the system is
$= 2 \times \dfrac{1}{2}Q'V' = \dfrac{1}{4}QV = 2.25 \times 10^{-6}\ \text{J}$
Thus in going from (a) to (b), though no charge is lost, the final
energy is only half the initial energy. Where has the remaining
energy gone?
There is a transient period before the system settles to the
situation (b). During this period, a transient current flows from the
first capacitor to the second. Energy is lost during this time in the
form of heat and electromagnetic radiation.
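
The numbers in Example 2.10 can be checked with a short calculation. The sketch below shares the original charge equally between two identical capacitors and compares the stored energy before and after. The variable names are illustrative assumptions.

```python
c = 900e-12     # capacitance of each capacitor, in farads
v = 100.0       # battery voltage, in volts

# (a) Single charged capacitor.
q = c * v
u_initial = 0.5 * q * v

# (b) Charge is conserved and shared by two identical capacitors in parallel.
v_common = q / (2 * c)
u_final = 2 * (0.5 * c * v_common**2)

print(q, u_initial)            # charge (C) and initial energy (J)
print(v_common, u_final)       # common voltage (V) and final energy (J)
print(u_initial - u_final)     # energy lost during the transient (J)
```

The final energy is half the initial energy for any pair of equal capacitors; the specific values only set the scale.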

Van de Graaff generator, principle and demonstration:


http://www.physics.gla.ac.uk/~kskeldon/PubSci/exhibits/E10/

2.16 VAN DE GRAAFF GENERATOR


This is a machine that can build up high voltages of the order of a few
million volts. The resulting large electric fields are used to accelerate
charged particles (electrons, protons, ions) to high energies needed
for experiments to probe the small scale structure of matter. The
principle underlying the machine is as follows.
Suppose we have a large spherical conducting shell of radius R, on
which we place a charge Q. This charge spreads itself uniformly all
over the sphere. As we have seen in Section 1.14, the field outside
the sphere is just that of a point charge Q at the centre; while the field
inside the sphere vanishes. So the potential outside is that of a point
charge; and inside it is constant, namely the value at the radius R. We
thus have:
Potential inside conducting spherical shell of radius R carrying charge
Q

= constant

$= \dfrac{1}{4\pi\varepsilon_0}\dfrac{Q}{R}$ (2.78)
Now, as shown in Fig. 2.32, let us suppose that in some way we
introduce a small sphere of radius r, carrying some charge q, into the
large one, and place it at the centre. The potential due to this new
charge clearly has the following values at the radii indicated:
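Potential due to small sphere of radius r carrying charge q

$= \dfrac{1}{4\pi\varepsilon_0}\dfrac{q}{r}$ at surface of small sphere

$= \dfrac{1}{4\pi\varepsilon_0}\dfrac{q}{R}$ at large shell of radius R. (2.79)

Taking both charges q and Q into account we have for the total
potential V and the potential difference the values

$V(R) = \dfrac{1}{4\pi\varepsilon_0}\left(\dfrac{Q}{R} + \dfrac{q}{R}\right)$

$V(r) = \dfrac{1}{4\pi\varepsilon_0}\left(\dfrac{Q}{R} + \dfrac{q}{r}\right)$

$V(r) - V(R) = \dfrac{q}{4\pi\varepsilon_0}\left(\dfrac{1}{r} - \dfrac{1}{R}\right)$ (2.80)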

Figure 2.32 Illustrating the principle of the electrostatic generator.

Assume now that q is positive. We see that, independent of the
amount of charge Q that may have accumulated on the larger sphere
and even if it is positive, the inner sphere is always at a higher
potential: the difference V(r) − V(R) is positive. The potential due to Q is
constant up to radius R and so cancels out in the difference!
This means that if we now connect the smaller and larger sphere by a
wire, the charge q on the former will immediately flow onto the latter,
even though the charge Q may be quite large. The natural tendency is
for positive charge to move from higher to lower potential. Thus,
provided we are somehow able to introduce the small charged sphere
into the larger one, we can in this way keep piling up larger and larger
amount of charge on the latter. The potential (Eq. 2.78) at the outer
sphere would also keep rising, at least until we reach the breakdown
field of air.
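
The cancellation of Q in the potential difference can be seen numerically. The sketch below evaluates the total potential at the small sphere and at the shell for several shell charges Q, and checks that the difference depends only on q, r and R. The function names, the rounded value of 1/4πε0 and the sample values are illustrative assumptions.

```python
import math

K = 8.9875e9   # 1 / (4 pi epsilon_0) in N m^2 C^-2 (rounded)


def potentials(q, big_q, r, big_r):
    """Total potential at the small sphere (radius r) and at the shell (radius R)."""
    v_inner = K * (big_q / big_r + q / r)
    v_shell = K * (big_q / big_r + q / big_r)
    return v_inner, v_shell


def potential_difference(q, r, big_r):
    """V(r) - V(R); the shell charge Q does not appear."""
    return K * q * (1.0 / r - 1.0 / big_r)


q, r, big_r = 1.0e-9, 0.1, 1.0          # C, m, m (illustrative values)
for big_q in (0.0, 1.0e-6, 1.0e-3):     # different charges on the large shell
    v_inner, v_shell = potentials(q, big_q, r, big_r)
    print(big_q, v_inner - v_shell, potential_difference(q, r, big_r))
```

For positive q and r smaller than R, the printed difference is positive in every row, whatever the value of Q.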

Figure 2.33 Principle of construction of Van de Graaff generator.

This is the principle of the Van de Graaff generator. It is a machine
capable of building up potential difference of a few million volts, and
fields close to the breakdown field of air which is about 3 × 10⁶ V/m. A
schematic diagram of the Van de Graaff generator is given in Fig.
2.33. A large spherical conducting shell (of a few metres radius) is
supported at a height several metres above the ground on an
insulating column. A long narrow endless belt of insulating material, like
rubber or silk, is wound around two pulleys, one at ground level and one
at the centre of the shell. This belt is kept continuously moving by a
motor driving the lower pulley. It continuously carries positive charge,
sprayed on to it by a brush at ground level, to the top. There it
transfers its positive charge to another conducting brush connected to
the large shell. Thus positive charge is transferred to the shell, where
it spreads out uniformly on the outer surface. In this way, voltage
differences of as much as 6 or 8 million volts (with respect to ground)
can be built up.
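
The link between the size of the shell, the breakdown field and the attainable voltage can be estimated with a rough sketch. It assumes an idealised isolated sphere, for which the surface field E and the potential V (relative to infinity) are related by V = ER, and it ignores the belt, the supporting column and the surroundings. The function name and the chosen radii are illustrative assumptions.

```python
def max_shell_potential(radius_m, breakdown_field_v_per_m):
    """Potential of an isolated charged sphere whose surface field equals the
    breakdown field: V = E * R, since V = kQ/R and E = kQ/R^2."""
    return breakdown_field_v_per_m * radius_m


E_AIR = 3e6   # breakdown field of air quoted in the text, in V/m

for radius in (0.5, 1.0, 2.0):   # shell radii in metres (illustrative)
    print(radius, max_shell_potential(radius, E_AIR))
```

Under these assumptions a shell of radius about 2 m reaches a few million volts before the surrounding air breaks down, which is the order of magnitude quoted above.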

SUMMARY

1. Electrostatic force is a conservative force. Work done by an
external force (equal and opposite to the electrostatic force) in
bringing a charge q from a point R to a point P is q(VP − VR), which is
the difference in potential energy of charge q between the final
and initial points.
2. Potential at a point is the work done per unit charge (by an
external agency) in bringing a charge from infinity to that point.
Potential at a point is arbitrary to within an additive constant, since
it is the potential difference between two points which is physically
significant. If potential at infinity is chosen to be zero, potential at a
point with position vector r due to a point charge Q placed at the
origin is given by

$V(\mathbf{r}) = \dfrac{1}{4\pi\varepsilon_0}\dfrac{Q}{r}$

3. The electrostatic potential at a point with position vector r due to
a point dipole of dipole moment p placed at the origin is

$V(\mathbf{r}) = \dfrac{1}{4\pi\varepsilon_0}\dfrac{\mathbf{p}\cdot\hat{\mathbf{r}}}{r^2}$

The result is true also for a dipole (with charges −q and q
separated by 2a) for r >> a.
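4. For a charge configuration q1, q2, ..., qn with position vectors r1,
r2, ..., rn, the potential at a point P is given by the superposition
principle

$V = \dfrac{1}{4\pi\varepsilon_0}\left(\dfrac{q_1}{r_{1P}} + \dfrac{q_2}{r_{2P}} + \dots + \dfrac{q_n}{r_{nP}}\right)$

where r1P is the distance between q1 and P, and so on.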


5. An equipotential surface is a surface over which potential has a
constant value. For a point charge, concentric spheres centered at
a location of the charge are equipotential surfaces. The electric
field E at a point is perpendicular to the equipotential surface
through the point. E is in the direction of the steepest decrease of
potential.
6. Potential energy stored in a system of charges is the work done
(by an external agency) in assembling the charges at their
locations. Potential energy of two charges q1, q2 at r1, r2 is given
by

$U = \dfrac{1}{4\pi\varepsilon_0}\dfrac{q_1 q_2}{r_{12}}$

where r12 is distance between q1 and q2.
7. The potential energy of a charge q in an external potential V(r)
is qV(r).
The potential energy of a dipole moment p in a uniform electric
field E is −p.E.

8. Electrostatic field E is zero in the interior of a conductor; just
outside the surface of a charged conductor, E is normal to the
surface, given by $\mathbf{E} = \dfrac{\sigma}{\varepsilon_0}\hat{\mathbf{n}}$, where $\hat{\mathbf{n}}$ is the unit vector along the
outward normal to the surface and σ is the surface charge density.
Charges in a conductor can reside only at its surface. Potential is
constant within and on the surface of a conductor. In a cavity
within a conductor (with no charges), the electric field is zero.
9. A capacitor is a system of two conductors separated by an
insulator. Its capacitance is defined by C = Q/V, where Q and −Q
are the charges on the two conductors and V is the potential
difference between them. C is determined purely geometrically, by
the shapes, sizes and relative positions of the two conductors. The
unit of capacitance is farad:
1 F = 1 C V⁻¹. For a parallel plate capacitor (with vacuum between
the plates),

$C = \varepsilon_0 \dfrac{A}{d}$
where A is the area of each plate and d the separation between
them.

10. If the medium between the plates of a capacitor is filled with an
insulating substance (dielectric), the electric field due to the
charged plates induces a net dipole moment in the dielectric. This
effect, called polarisation, gives rise to a field in the opposite
direction. The net electric field inside the dielectric and hence the
potential difference between the plates is thus reduced.
Consequently, the capacitance C increases from its value C0
when there is no medium (vacuum),
C = KC0
where K is the dielectric constant of the insulating substance.
11. For capacitors in the series combination, the total capacitance
C is given by

$\dfrac{1}{C} = \dfrac{1}{C_1} + \dfrac{1}{C_2} + \dfrac{1}{C_3} + \dots$

In the parallel combination, the total capacitance C is:

C = C1 + C2 + C3 + ...
where C1, C2, C3... are individual capacitances.
12. The energy U stored in a capacitor of capacitance C, with
charge Q and voltage V is

$U = \dfrac{1}{2}QV = \dfrac{1}{2}CV^2 = \dfrac{1}{2}\dfrac{Q^2}{C}$

The electric energy density (energy per unit volume) in a region
with electric field is (1/2)ε0E².
13. A Van de Graaff generator consists of a large spherical
conducting shell (a few metres in diameter). By means of a moving
belt and suitable brushes, charge is continuously transferred to the
shell and potential difference of the order of several million volts is
built up, which can be used for accelerating charged particles.

Points to Ponder
1. Electrostatics deals with forces between charges at rest. But if
there is a force on a charge, how can it be at rest? Thus, when we
are talking of electrostatic force between charges, it should be understood
that each charge is being kept at rest by some unspecified force
that opposes the net Coulomb force on the charge.
2. A capacitor is so configured that it confines the electric field
lines within a small region of space. Thus, even though field may
have considerable strength, the potential difference between the
two conductors of a capacitor is small.
3. Electric field is discontinuous across the surface of a spherical
charged shell. It is zero inside and $\dfrac{\sigma}{\varepsilon_0}\hat{\mathbf{n}}$ outside. Electric potential
is, however, continuous across the surface, equal to q/4πε0R at
the surface.
4. The torque p × E on a dipole causes it to oscillate about E. Only
if there is a dissipative mechanism, the oscillations are damped
and the dipole eventually aligns with E.

5. Potential due to a charge q at its own location is not defined; it
is infinite.
6. In the expression qV(r) for potential energy of a charge q, V(r) is
the potential due to external charges and not the potential due
to q. As seen in point 5, this expression will be ill-defined if V(r)
includes potential due to a charge q itself.
7. A cavity inside a conductor is shielded from outside electrical
influences. It is worth noting that electrostatic shielding does not
work the other way round; that is, if you put charges inside the
cavity, the exterior of the conductor is not shielded from the fields
by the inside charges.

EXERCISES
2.1 Two charges 5 × 10⁻⁸ C and −3 × 10⁻⁸ C are located 16 cm
apart. At what point(s) on the line joining the two charges is the
electric potential zero? Take the potential at infinity to be zero.
2.2 A regular hexagon of side 10 cm has a charge 5 µC at each of
its vertices. Calculate the potential at the centre of the hexagon.
2.3 Two charges 2 µC and −2 µC are placed at points A and B 6
cm apart.
(a) Identify an equipotential surface of the system.
(b) What is the direction of the electric field at every point on this
surface?
2.4 A spherical conductor of radius 12 cm has a charge of 1.6 ×
10⁻⁷ C distributed uniformly on its surface. What is the electric field
(a) inside the sphere
(b) just outside the sphere
(c) at a point 18 cm from the centre of the sphere?
2.5 A parallel plate capacitor with air between the plates has a
capacitance of 8 pF (1 pF = 10⁻¹² F). What will be the capacitance
if the distance between the plates is reduced by half, and the
space between them is filled with a substance of dielectric
constant 6?

2.6 Three capacitors each of capacitance 9 pF are connected in
series.
(a) What is the total capacitance of the combination?
(b) What is the potential difference across each capacitor if the
combination is connected to a 120 V supply?
2.7 Three capacitors of capacitances 2 pF, 3 pF and 4 pF are
connected in parallel.
(a) What is the total capacitance of the combination?
(b) Determine the charge on each capacitor if the combination is
connected to a 100 V supply.

2.8 In a parallel plate capacitor with air between the plates, each
plate has an area of 6 × 10⁻³ m² and the distance between the
plates is 3 mm. Calculate the capacitance of the capacitor. If this
capacitor is connected to a 100 V supply, what is the charge on
each plate of the capacitor?

2.9 Explain what would happen if in the capacitor given in Exercise
2.8, a 3 mm thick mica sheet (of dielectric constant = 6) were
inserted between the plates,
(a) while the voltage supply remained connected.
(b) after the supply was disconnected.
2.10 A 12 pF capacitor is connected to a 50 V battery. How much
electrostatic energy is stored in the capacitor?
2.11 A 600 pF capacitor is charged by a 200 V supply. It is then
disconnected from the supply and is connected to another
uncharged 600 pF capacitor. How much electrostatic energy is lost
in the process?

Additional Exercises
2.12 A charge of 8 mC is located at the origin. Calculate the work
done in taking a small charge of −2 × 10⁻⁹ C from a point P (0, 0, 3
cm) to a point Q (0, 4 cm, 0), via a point R (0, 6 cm, 9 cm).
2.13 A cube of side b has a charge q at each of its vertices.
Determine the potential and electric field due to this charge array
at the centre of the cube.
2.14 Two tiny spheres carrying charges 1.5 µC and 2.5 µC are
located 30 cm apart. Find the potential and electric field:
(a) at the mid-point of the line joining the two charges, and
(b) at a point 10 cm from this midpoint in a plane normal to the line
and passing through the mid-point.
2.15 A spherical conducting shell of inner radius r1 and outer
radius r2 has a charge Q.
(a) A charge q is placed at the centre of the shell. What is the
surface charge density on the inner and outer surfaces of the
shell?
(b) Is the electric field inside a cavity (with no charge) zero, even if
the shell is not spherical, but has any irregular shape? Explain.
2.16 (a) Show that the normal component of electrostatic field has
a discontinuity from one side of a charged surface to another given
by

$(\mathbf{E}_2 - \mathbf{E}_1)\cdot\hat{\mathbf{n}} = \dfrac{\sigma}{\varepsilon_0}$

where $\hat{\mathbf{n}}$ is a unit vector normal to the surface at a point and σ is
the surface charge density at that point. (The direction of $\hat{\mathbf{n}}$ is
from side 1 to side 2.) Hence show that just outside a conductor,
the electric field is $\sigma\hat{\mathbf{n}}/\varepsilon_0$.
(b) Show that the tangential component of electrostatic field is
continuous from one side of a charged surface to another. [Hint:
For (a), use Gauss's law. For (b), use the fact that work done by
electrostatic field on a closed loop is zero.]
2.17 A long charged cylinder of linear charge density λ is
surrounded by a hollow co-axial conducting cylinder. What is the
electric field in the space between the two cylinders?

2.18 In a hydrogen atom, the electron and proton are bound at a
distance of about 0.53 Å:
(a) Estimate the potential energy of the system in eV, taking the
zero of the potential energy at infinite separation of the electron
from proton.
(b) What is the minimum work required to free the electron, given
that its kinetic energy in the orbit is half the magnitude of potential
energy obtained in (a)?
(c) What are the answers to (a) and (b) above if the zero of
potential energy is taken at 1.06 Å separation?

2.19 If one of the two electrons of a H2 molecule is removed, we
get a hydrogen molecular ion H2+. In the ground state of an H2+,
the two protons are separated by roughly 1.5 Å, and the electron is
roughly 1 Å from each proton. Determine the potential energy of
the system. Specify your choice of the zero of potential energy.
2.20 Two charged conducting spheres of radii a and b are
connected to each other by a wire. What is the ratio of electric
fields at the surfaces of the two spheres? Use the result obtained
to explain why charge density on the sharp and pointed ends of a
conductor is higher than on its flatter portions.
2.21 Two charges −q and +q are located at points (0, 0, −a) and
(0, 0, a), respectively.
(a) What is the electrostatic potential at the points (0, 0, z) and
(x, y, 0) ?
(b) Obtain the dependence of potential on the distance r of a point
from the origin when r/a >> 1.

(c) How much work is done in moving a small test charge from the
point (5,0,0) to (7,0,0) along the x-axis? Does the answer change
if the path of the test charge between the same points is not along
the x-axis?
2.22 Figure 2.34 shows a charge array known as an electric
quadrupole. For a point on the axis of the quadrupole, obtain the
dependence
of potential on r for r/a >> 1, and contrast your results with that due
to an electric dipole, and an electric monopole (i.e., a single
charge).

Figure 2.34

2.23 An electrical technician requires a capacitance of 2 µF in a
circuit across a potential difference of 1 kV. A large number of 1
µF capacitors are available to him each of which can withstand a
potential difference of not more than 400 V. Suggest a possible
arrangement that requires the minimum number of capacitors.

2.24 What is the area of the plates of a 2 F parallel plate capacitor,
given that the separation between the plates is 0.5 cm? [You will
realise from your answer why ordinary capacitors are in the range
of µF or less. However, electrolytic capacitors do have a much
larger capacitance (0.1 F) because of very minute separation
between the conductors.]

2.25 Obtain the equivalent capacitance of the network in Fig. 2.35.
For a 300 V supply, determine the charge and voltage across each
capacitor.

Figure 2.35

2.26 The plates of a parallel plate capacitor have an area of 90
cm² each and are separated by 2.5 mm. The capacitor is charged
by connecting it to a 400 V supply.
(a) How much electrostatic energy is stored by the capacitor?
(b) View this energy as stored in the electrostatic field between the
plates, and obtain the energy per unit volume u. Hence arrive at a
relation between u and the magnitude of electric field E between
the plates.

2.27 A 4 µF capacitor is charged by a 200 V supply. It is then
disconnected from the supply, and is connected to another
uncharged 2 µF capacitor. How much electrostatic energy of the
first capacitor is lost in the form of heat and electromagnetic
radiation?
2.28 Show that the force on each plate of a parallel plate capacitor
has a magnitude equal to (½) QE, where Q is the charge on the
capacitor, and E is the magnitude of electric field between the
plates. Explain the origin of the factor ½.
2.29 A spherical capacitor consists of two concentric spherical
conductors, held in position by suitable insulating supports (Fig.
2.36). Show that the capacitance of a spherical capacitor is given by

$C = \dfrac{4\pi\varepsilon_0 r_1 r_2}{r_1 - r_2}$

where r1 and r2 are the radii of outer and inner spheres,
respectively.

Figure 2.36
2.30 A spherical capacitor has an inner sphere of radius 12 cm
and an outer sphere of radius 13 cm. The outer sphere is earthed
and the inner sphere is given a charge of 2.5 µC. The space
between the concentric spheres is filled with a liquid of dielectric
constant 32.
(a) Determine the capacitance of the capacitor.
(b) What is the potential of the inner sphere?
(c) Compare the capacitance of this capacitor with that of an
isolated sphere of radius 12 cm. Explain why the latter is much
smaller.
2.31 Answer carefully:
(a) Two large conducting spheres carrying charges Q1 and Q2 are
brought close to each other. Is the magnitude of electrostatic force
between them exactly given by Q1 Q2/4πε0r², where r is the
distance between their centres?
(b) If Coulomb's law involved 1/r³ dependence (instead of 1/r²),
would Gauss's law be still true?

(c) A small test charge is released at rest at a point in an
electrostatic field configuration. Will it travel along the field line
passing through that point?
(d) What is the work done by the field of a nucleus in a complete
circular orbit of the electron? What if the orbit is elliptical?
(e) We know that electric field is discontinuous across the surface
of a charged conductor. Is electric potential also discontinuous
there?
(f) What meaning would you give to the capacitance of a single
conductor?

(g) Guess a possible reason why water has a much greater
dielectric constant (= 80) than say, mica (= 6).
2.32 A cylindrical capacitor has two co-axial cylinders of length 15
cm and radii 1.5 cm and 1.4 cm. The outer cylinder is earthed and
the inner cylinder is given a charge of 3.5 µC. Determine the
capacitance of the system and the potential of the inner cylinder.
Neglect end effects (i.e., bending of field lines at the ends).
2.33 A parallel plate capacitor is to be designed with a voltage
rating 1 kV, using a material of dielectric constant 3 and dielectric
strength about 10⁷ V m⁻¹. (Dielectric strength is the maximum
electric field a material can tolerate without breakdown, i.e.,
without starting to conduct electricity through partial ionisation.)
For safety, we should like the field never to exceed, say 10% of
the dielectric strength. What minimum area of the plates is
required to have a capacitance of 50 pF?
2.34 Describe schematically the equipotential surfaces
corresponding to
(a) a constant electric field in the z-direction,
(b) a field that uniformly increases in magnitude but remains in a
constant (say, z) direction,
(c) a single positive charge at the origin, and
(d) a uniform grid consisting of long equally spaced parallel
charged wires in a plane.
2.35 In a Van de Graaff type generator a spherical metal shell is to
be a 15 × 10⁶ V electrode. The dielectric strength of the gas
surrounding the electrode is 5 × 10⁷ V m⁻¹. What is the minimum
radius of the spherical shell required? (You will learn from this
exercise why one cannot build an electrostatic generator using a
very small shell which requires a small charge to acquire a high
potential.)
2.36 A small sphere of radius r1 and charge q1 is enclosed by a
spherical shell of radius r2 and charge q2. Show that if q1 is
positive, charge will necessarily flow from the sphere to the shell
(when the two are connected by a wire) no matter what the charge
q2 on the shell is.
2.37 Answer the following:

(a) The top of the atmosphere is at about 400 kV with respect to
the surface of the earth, corresponding to an electric field that
decreases with altitude. Near the surface of the earth, the field is
about 100 V m⁻¹. Why then do we not get an electric shock as we
step out of our house into the open? (Assume the house to be a
steel cage so there is no field inside!)
(b) A man fixes outside his house one evening a two metre high
insulating slab carrying on its top a large aluminium sheet of area
1 m². Will he get an electric shock if he touches the metal sheet
next morning?
(c) The discharging current in the atmosphere due to the small
conductivity of air is known to be 1800 A on an average over the
globe. Why then does the atmosphere not discharge itself
completely in due course and become electrically neutral? In other
words, what keeps the atmosphere charged?
(d) What are the forms of energy into which the electrical energy of
the atmosphere is dissipated during a lightning?
(Hint: The earth has an electric field of about 100 V m⁻¹ at its
surface in the downward direction, corresponding to a surface
charge density = −10⁻⁹ C m⁻². Due to the slight conductivity of the
atmosphere up to about 50 km (beyond which it is good
conductor), about +1800 C is pumped every second into the earth
as a whole. The earth, however, does not get discharged since
thunderstorms and lightning occurring continually all over the
globe pump an equal amount of negative charge on the earth.)
Chapter Fourteen

SEMICONDUCTOR ELECTRONICS:
MATERIALS, DEVICES AND SIMPLE
CIRCUITS

14.1 INTRODUCTION

Devices in which a controlled flow of electrons can be obtained are the
basic building blocks of all the electronic circuits. Before the discovery
of transistor in 1948, such devices were mostly vacuum tubes (also
called valves) like the vacuum diode which has two electrodes, viz.,
anode (often called plate) and cathode; triode which has three
electrodes, namely cathode, plate and grid; tetrode and pentode
(respectively with 4 and 5 electrodes). In a vacuum tube, the electrons
are supplied by a heated cathode and the controlled flow of these
electrons in vacuum is obtained by varying the voltage between its
different electrodes. Vacuum is required in the inter-electrode space;
otherwise the moving electrons may lose their energy on collision with
the air molecules in their path. In these devices the electrons can flow
only from the cathode to the anode (i.e., only in one direction).
Therefore, such devices are generally referred to as valves. These
vacuum tube devices are bulky, consume high power, operate
generally at high voltages (~100 V) and have limited life and low
reliability. The seed of the development of modern solid-state
semiconductor electronics goes back to 1930s when it was realised
that some solid-state semiconductors and their junctions offer the
possibility of controlling the number and the direction of flow of charge
carriers through them. Simple excitations like light, heat or small
applied voltage can change the number of mobile charges in a
semiconductor. Note that the supply and flow of charge carriers in the
semiconductor devices are within the solid itself, while in the earlier
vacuum tubes/valves, the mobile electrons were obtained from a
heated cathode and they were made to flow in an evacuated space or
vacuum. No external heating or large evacuated space is required by
the semiconductor devices. They are small in size, consume low
power, operate at low voltages and have long life and high reliability.
Even the Cathode Ray Tubes (CRT) used in television and computer
monitors which work on the principle of vacuum tubes are being
replaced by Liquid Crystal Display (LCD) monitors with supporting
solid state electronics. Much before the full implications of the
semiconductor devices were formally understood, a naturally occurring
crystal of galena (Lead sulphide, PbS) with a metal point contact
attached to it was used as detector of radio waves.

In the following sections, we will introduce the basic concepts of
semiconductor physics and discuss some semiconductor devices like
junction diodes (a 2-electrode device) and bipolar junction transistor (a
3-electrode device). A few circuits illustrating their applications will
also be described.
14.2 CLASSIFICATION OF METALS,
CONDUCTORS AND SEMICONDUCTORS

On the basis of conductivity


On the basis of the relative values of electrical conductivity (σ) or
resistivity (ρ = 1/σ), the solids are broadly classified as:

(i) Metals: They possess very low resistivity (or high conductivity).
ρ ~ 10⁻² – 10⁻⁸ Ω m
σ ~ 10² – 10⁸ S m⁻¹
(ii) Semiconductors: They have resistivity or conductivity intermediate
to metals and insulators.
ρ ~ 10⁻⁵ – 10⁶ Ω m
σ ~ 10⁵ – 10⁻⁶ S m⁻¹
(iii) Insulators: They have high resistivity (or low conductivity).
ρ ~ 10¹¹ – 10¹⁹ Ω m
σ ~ 10⁻¹¹ – 10⁻¹⁹ S m⁻¹
The values of ρ and σ given above are indicative of magnitude and
could well go outside the ranges as well. Relative values of the
resistivity are not the only criteria for distinguishing metals, insulators
and semiconductors from each other. There are some other
differences, which will become clear as we go along in this chapter.
Our interest in this chapter is in the study of semiconductors which
could be:
(i) Elemental semiconductors: Si and Ge
(ii) Compound semiconductors: Examples are:
Inorganic: CdS, GaAs, CdSe, InP, etc.
Organic: anthracene, doped phthalocyanines, etc.
Organic polymers: polypyrrole, polyaniline, polythiophene, etc.
Most of the currently available semiconductor devices are based on
elemental semiconductors Si or Ge and compound inorganic
semiconductors. However, after 1990, a few semiconductor devices
using organic semiconductors and semiconducting polymers have
been developed signalling the birth of a futuristic technology of
polymer-electronics and molecular-electronics. In this chapter, we will
restrict ourselves to the study of inorganic semiconductors, particularly
elemental semiconductors Si and Ge. The general concepts
introduced here for discussing the elemental semiconductors, by-and-
large, apply to most of the compound semiconductors as well.

On the basis of energy bands


According to the Bohr atomic model, in an isolated atom the energy of
any of its electrons is decided by the orbit in which it revolves. But
when the atoms come together to form a solid they are close to each
other. So the outer orbits of electrons from neighbouring atoms would
come very close or could even overlap. This would make the nature of
electron motion in a solid very different from that in an isolated atom.
Inside the crystal each electron has a unique position and no two
electrons see exactly the same pattern of surrounding charges.
Because of this, each electron will have a different energy level.
These different energy levels with continuous energy variation form
what are called energy bands. The energy band which includes the
energy levels of the valence electrons is called the valence band. The
energy band above the valence band is called the conduction band.
With no external energy, all the valence electrons will reside in the
valence band. If the lowest level in the conduction band happens to be
lower than the highest level of the valence band, the electrons from
the valence band can easily move into the conduction band. Normally
the conduction band is empty. But when it overlaps on the valence
band electrons can move freely into it. This is the case with metallic
conductors.
If there is some gap between the conduction band and the valence
band, electrons in the valence band all remain bound and no free
electrons are available in the conduction band. This makes the
material an insulator. But some of the electrons from the valence band
may gain external energy to cross the gap between the conduction
band and the valence band. Then these electrons will move into the
conduction band. At the same time they will create vacant energy
levels in the valence band where other valence electrons can move.
Thus the process creates the possibility of conduction due to electrons
in conduction band as well as due to vacancies in the valence band.
Let us consider what happens in the case of Si or Ge crystal
containing N atoms. For Si, the outermost orbit is the third orbit (n =
3), while for Ge it is the fourth orbit (n = 4). The number of electrons in
the outermost orbit is 4 (2s and 2p electrons). Hence, the total number
of outer electrons in the crystal is 4N. The maximum possible number
of electrons in the outer orbit is 8 (2s + 6p electrons). So, for the 4N
valence electrons there are 8N available energy states. These 8N
discrete energy levels can either form a continuous band or they may
be grouped in different bands depending upon the distance between
the atoms in the crystal (see box on Band Theory of Solids).
At the distance between the atoms in the crystal lattices of Si and Ge,
the energy band of these 8N states is split apart into two which are
separated by an energy gap Eg (Fig. 14.1). The lower band which is
completely occupied by the 4N valence electrons at temperature of
absolute zero is the valence band. The other band consisting of 4N
energy states, called the conduction band, is completely empty at
absolute zero.

BAND THEORY OF SOLIDS

Consider that the Si or Ge crystal contains N atoms. Electrons of each atom
will have discrete energies in different orbits. The electron energy will be the same
if all the atoms are isolated, i.e., separated from each other by a large
distance. However, in a crystal, the atoms are close to each other (2 to 3 Å)
and therefore the electrons interact with each other and also with the
neighbouring atomic cores. The overlap (or interaction) will be more felt by the
electrons in the outermost orbit while the inner orbit or core electron energies
may remain unaffected. Therefore, for understanding electron energies in Si or
Ge crystal, we need to consider the changes in the energies of the electrons in
the outermost orbit only. For Si, the outermost orbit is the third orbit (n = 3),
while for Ge it is the fourth orbit (n = 4). The number of electrons in the
outermost orbit is 4 (2s and 2p electrons). Hence, the total number of outer
electrons in the crystal is 4N. The maximum possible number of outer
electrons in the orbit is 8 (2s + 6p electrons). So, out of the 4N electrons, 2N
electrons are in the 2N s-states (orbital quantum number l = 0) and 2N
electrons are in the available 6N p-states. Obviously, some p-electron states
are empty as shown in the extreme right of Figure. This is the case of well
separated or isolated atoms [region A of Figure].

Suppose these atoms start coming nearer to each other to form a solid. The
energies of these electrons in the outermost orbit may change (both increase
and decrease) due to the interaction between the electrons of different atoms.
The 6N states for l = 1, which originally had identical energies in the isolated
atoms, spread out and form an energy band [region B in Figure]. Similarly, the
2N states for l = 0, having identical energies in the isolated atoms, split into a
second band (carefully see the region B of Figure) separated from the first one
by an energy gap.
At still smaller spacing, however, there comes a region in which the bands
merge with each other. The lowest energy state that is a split from the upper
atomic level appears to drop below the upper state that has come from the
lower atomic level. In this region (region C in Figure), no energy gap exists
where the upper and lower energy states get mixed.
Finally, if the distance between the atoms further decreases, the energy bands
again split apart and are separated by an energy gap Eg (region D in Figure).
The total number of available energy states 8N has been re-apportioned
between the two bands (4N states each in the lower and upper energy bands).
Here the significant point is that there are exactly as many states in the lower
band (4N) as there are available valence electrons from the atoms (4N).
Therefore, this band (called the valence band) is completely filled while the
upper band is completely empty. The upper band is called the conduction
band.

The lowest energy level in the conduction band is shown as $E_C$ and the
highest energy level in the valence band is shown as $E_V$. Above $E_C$
and below $E_V$ there are a large number of closely spaced energy
levels, as shown in Fig. 14.1.

Figure 14.1 The energy band positions in a semiconductor at 0 K. The upper band,
called the conduction band, consists of an infinitely large number of closely spaced
energy states. The lower band, called the valence band, consists of closely spaced
completely filled energy states.
The gap between the top of the valence band and bottom of the
conduction band is called the energy band gap (Energy gap Eg). It
may be large, small, or zero, depending upon the material. These
different situations are depicted in Fig. 14.2 and discussed below:
Case I: This refers to the situation shown in Fig. 14.2(a). One can
have a metal either when the conduction band is partially filled and the
valence band is partially empty or when the conduction and valence
bands overlap. When there is overlap, electrons from the valence band
can easily move into the conduction band. This situation makes a
large number of electrons available for electrical conduction. When the
valence band is partially empty, electrons from its lower levels can
move to higher levels, making conduction possible. Therefore, the
resistance of such materials is low or the conductivity is high.

Figure 14.2 Difference between energy bands of (a) metals, (b) insulators and (c)
semiconductors.

Case II: In this case, as shown in Fig. 14.2(b), a large band gap Eg
exists (Eg > 3 eV). There are no electrons in the conduction band, and
therefore no electrical conduction is possible. Note that the energy
gap is so large that electrons cannot be excited from the valence band
to the conduction band by thermal excitation. This is the case of
insulators.
Case III: This situation is shown in Fig. 14.2(c). Here a finite but small
band gap (Eg < 3 eV) exists. Because of the small band gap, at room
temperature some electrons from valence band can acquire enough
energy to cross the energy gap and enter the conduction band. These
electrons (though small in numbers) can move in the conduction band.
Hence, the resistance of semiconductors is not as high as that of the
insulators.
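The three cases can be summarised as a simple rule on the band gap. The sketch below uses the 3 eV threshold quoted above and the band gap values given later in this chapter for C (diamond), Si, Ge and Sn. The function name is illustrative, and treating a gap of exactly 3 eV as a semiconductor is an arbitrary choice, since the text does not define that boundary. A gap alone also cannot capture the metallic case of a partially filled band.

```python
def classify_by_band_gap(eg_ev):
    """Classify a solid from its band gap Eg in eV (3 eV threshold from the text)."""
    if eg_ev <= 0:
        return "metal"          # Case I: bands overlap, no gap
    if eg_ev > 3:
        return "insulator"      # Case II: gap too large for thermal excitation
    return "semiconductor"      # Case III: finite but small gap


# Band gaps in eV as quoted later in this chapter.
band_gaps_ev = {"C (diamond)": 5.4, "Si": 1.1, "Ge": 0.7, "Sn": 0.0}

for material, eg in band_gaps_ev.items():
    print(material, "->", classify_by_band_gap(eg))
```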
In this section we have made a broad classification of metals,
insulators and semiconductors. In the section which follows you will
learn about the conduction process in semiconductors.

14.3 INTRINSIC SEMICONDUCTOR


Figure 14.3 Three-dimensional diamond-like crystal structure for Carbon, Silicon or
Germanium with respective lattice spacing a equal to 3.56, 5.43 and 5.66 Å.

We shall take the most common case of Ge and Si whose lattice
structure is shown in Fig. 14.3. These structures are called the
diamond-like structures. Each atom is surrounded by four nearest
neighbours. We know that Si and Ge have four valence electrons. In
its crystalline structure, every Si or Ge atom tends to share one of its
four valence electrons with each of its four nearest neighbour atoms,
and also to take share of one electron from each such neighbour.
These shared electron pairs are referred to as forming a covalent
bond or simply a valence bond. The two shared electrons can be
assumed to shuttle back-and-forth between the associated atoms
holding them together strongly. Figure 14.4 schematically shows the
2-dimensional representation of Si or Ge structure shown in Fig. 14.3
which overemphasises the covalent bond. It shows an idealised
picture in which no bonds are broken (all bonds are intact). Such a
situation arises at low temperatures. As the temperature increases,
more thermal energy becomes available to these electrons and some
of these electrons may break away (becoming free electrons
contributing to conduction). The thermal energy effectively ionises only
a few atoms in the crystalline lattice and creates a vacancy in the
bond as shown in Fig. 14.5(a). The neighbourhood from which the
free electron (with charge $-q$) has come out is left with a vacancy with an
effective charge ($+q$). This vacancy with the effective positive
electronic charge is called a hole. The hole behaves as an apparent
free particle with effective positive charge.
In intrinsic semiconductors, the number of free electrons, $n_e$, is equal
to the number of holes, $n_h$. That is
$n_e = n_h = n_i$ (14.1)
where $n_i$ is called the intrinsic carrier concentration.
Semiconductors possess the unique property in which, apart from
electrons, the holes also move. Suppose there is a hole at site 1 as
shown in Fig. 14.5(a). The movement of holes can be visualised as
shown in Fig. 14.5(b). An electron from the covalent bond at site 2
may jump to the vacant site 1 (hole).
Figure 14.4 Schematic two-dimensional representation of Si or Ge structure
showing covalent bonds at low temperature (all bonds intact). +4 symbol indicates
inner cores of Si or Ge.

Thus, after such a jump, the hole is at site 2 and the site 1 has now an
electron. Therefore, apparently, the hole has moved from site 1 to site
2. Note that the electron originally set free [Fig. 14.5(a)] is not involved
in this process of hole motion. The free electron moves completely
independently as conduction electron and gives rise to an electron
current, Ie under an applied electric field. Remember that the motion
of hole is only a convenient way of describing the actual motion of
bound electrons, whenever there is an empty bond anywhere in the
crystal. Under the action of an electric field, these holes move towards
negative potential giving the hole current, Ih. The total current, I is thus
the sum of the electron current Ie and the hole current Ih:

$I = I_e + I_h$ (14.2)
It may be noted that apart from the process of generation of
conduction electrons and holes, a simultaneous process of
recombination occurs in which the electrons recombine with the holes.
At equilibrium, the rate of generation is equal to the rate of
recombination of charge carriers. The recombination occurs due to an
electron colliding with a hole.

Figure 14.5 (a) Schematic model of generation of hole at site 1 and conduction
electron due to thermal energy at moderate temperatures. (b) Simplified
representation of possible thermal motion of a hole. The electron from the lower
left hand covalent bond (site 2) goes to the earlier hole site 1, leaving a hole at its
site, indicating an apparent movement of the hole from site 1 to site 2.

An intrinsic semiconductor will behave like an insulator at T = 0 K as
shown in Fig. 14.6(a). It is the thermal energy at higher temperatures
(T > 0 K), which excites some electrons from the valence band to the
conduction band. These thermally excited electrons at T > 0 K
partially occupy the conduction band. Therefore, the energy-band
diagram of an intrinsic semiconductor will be as shown in Fig.
14.6(b). Here, some electrons are shown in the conduction band.
These have come from the valence band leaving equal number of
holes there.

Figure 14.6 (a) An intrinsic semiconductor at T = 0 K behaves like an insulator. (b) At
T > 0 K, four thermally generated electron-hole pairs. The filled circles (●)
represent electrons and empty circles (○) represent holes.

Example 14.1 C, Si and Ge have the same lattice structure. Why is C an insulator
while Si and Ge are intrinsic semiconductors?
Solution The 4 bonding electrons of C, Si or Ge lie, respectively, in the
second, third and fourth orbit. Hence, the energy required to take out an electron
from these atoms (i.e., ionisation energy $E_g$) will be least for Ge, followed by Si
and highest for C. Hence, the number of free electrons for conduction in Ge and Si
is significant but negligibly small for C.
14.4 EXTRINSIC SEMICONDUCTOR

The conductivity of an intrinsic semiconductor depends on its
temperature, but at room temperature its conductivity is very low. As
such, no important electronic devices can be developed using these
semiconductors. Hence there is a necessity of improving their
conductivity. This can be done by making use of impurities.

When a small amount, say, a few parts per million (ppm), of a suitable
impurity is added to the pure semiconductor, the conductivity of the
semiconductor is increased manifold. Such materials are known as
extrinsic semiconductors or impurity semiconductors. The deliberate
addition of a desirable impurity is called doping and the impurity atoms
are called dopants. Such a material is also called a doped
semiconductor. The dopant has to be such that it does not distort the
original pure semiconductor lattice. It occupies only a very few of the
original semiconductor atom sites in the crystal. A necessary condition
to attain this is that the sizes of the dopant and the semiconductor
atoms should be nearly the same.

There are two types of dopants used in doping the tetravalent Si or
Ge:
(i) Pentavalent (valency 5); like Arsenic (As), Antimony (Sb),
Phosphorus (P), etc.
(ii) Trivalent (valency 3); like Indium (In), Boron (B), Aluminium (Al),
etc.
We shall now discuss how the doping changes the number of charge
carriers (and hence the conductivity) of semiconductors. Si or Ge
belongs to the fourth group in the Periodic table and, therefore, we
choose the dopant element from nearby fifth or third group, expecting
and taking care that the size of the dopant atom is nearly the same as
that of Si or Ge. Interestingly, the pentavalent and trivalent dopants in
Si or Ge give two entirely different types of semiconductors as
discussed below.
(i) n-type semiconductor
Figure 14.7 (a) Pentavalent donor atom (As, Sb, P, etc.) doped for tetravalent Si or
Ge giving n-type semiconductor, and (b) Commonly used schematic
representation of n-type material which shows only the fixed cores of the
substituent donors with one additional effective positive charge and its associated
extra electron.
Suppose we dope Si or Ge with a pentavalent element as shown in
Fig. 14.7. When an atom of +5 valency element occupies the position
of an atom in the crystal lattice of Si, four of its electrons bond with the
four silicon neighbours while the fifth remains very weakly bound to its
parent atom. This is because the four electrons participating in
bonding are seen as part of the effective core of the atom by the fifth
electron. As a result the ionisation energy required to set this electron
free is very small and even at room temperature it will be free to move
in the lattice of the semiconductor. For example, the energy required
is ~ 0.01 eV for germanium, and 0.05 eV for silicon, to separate this
electron from its atom. This is in contrast to the energy required to
jump the forbidden band (about 0.72 eV for germanium and about 1.1
eV for silicon) at room temperature in the intrinsic semiconductor.
Thus, the pentavalent dopant is donating one extra electron for
conduction and hence is known as donor impurity. The number of
electrons made available for conduction by dopant atoms depends
strongly upon the doping level and is independent of any increase in
ambient temperature. On the other hand, the number of free electrons
(with an equal number of holes) generated by Si atoms, increases
weakly with temperature.
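To see why the fifth electron is free even at room temperature while very few host atoms are ionised, it helps to express the energies quoted above in units of the thermal energy kT. The sketch below assumes room temperature to be 300 K (the text does not fix a value) and uses the standard Boltzmann constant. The variable names are illustrative.

```python
K_B_EV_PER_K = 8.617e-5  # Boltzmann constant in eV per kelvin


def thermal_energy_ev(temperature_k):
    """Return the thermal energy kT in eV at the given temperature in kelvin."""
    return K_B_EV_PER_K * temperature_k


# Assumption: room temperature is taken as 300 K.
room_kt = thermal_energy_ev(300.0)

# Energies in eV as quoted in the text above.
energies_ev = {
    "Ge donor ionisation": 0.01,
    "Si donor ionisation": 0.05,
    "Ge forbidden band": 0.72,
    "Si forbidden band": 1.1,
}

for label, energy in energies_ev.items():
    print(f"{label}: {energy} eV, about {energy / room_kt:.1f} kT")
```

The donor ionisation energies come out comparable to kT, whereas the forbidden band is tens of kT wide, so thermal agitation frees donor electrons far more readily than it breaks covalent bonds.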
In a doped semiconductor the total number of conduction electrons ne
is due to the electrons contributed by donors and those generated
intrinsically, while the total number of holes nh is only due to the holes
from the intrinsic source. But the rate of recombination of holes would
increase due to the increase in the number of electrons. As a result,
the number of holes would get reduced further.
Thus, with proper level of doping the number of conduction electrons
can be made much larger than the number of holes. Hence in an
extrinsic semiconductor doped with pentavalent impurity, electrons
become the majority carriers and holes the minority carriers. These
semiconductors are, therefore, known as n-type semiconductors. For
n-type semiconductors, we have,
$n_e \gg n_h$ (14.3)

(ii) p-type semiconductor


Figure 14.8 (a) Trivalent acceptor atom (In, Al, B etc.) doped in tetravalent Si or Ge
lattice giving p-type semiconductor. (b) Commonly used schematic representation
of p-type material which shows only the fixed core of the substituent acceptor with
one effective additional negative charge and its associated hole.
This is obtained when Si or Ge is doped with a trivalent impurity like
Al, B, In, etc. The dopant has one valence electron less than Si or Ge
and, therefore, this atom can form covalent bonds with neighbouring
three Si atoms but does not have any electron to offer to the fourth Si
atom. So the bond between the fourth neighbour and the trivalent
atom has a vacancy or hole as shown in Fig. 14.8. Since the
neighbouring Si atom in the lattice wants an electron in place of a
hole, an electron in the outer orbit of an atom in the neighbourhood
may jump to fill this vacancy, leaving a vacancy or hole at its own site.
Thus the hole is available for conduction. Note that the trivalent
foreign atom becomes effectively negatively charged when it shares
fourth electron with neighbouring Si atom. Therefore, the dopant atom
of p-type material can be treated as core of one negative charge along
with its associated hole as shown in Fig. 14.8(b). It is obvious that one
acceptor atom gives one hole. These holes are in addition to the
intrinsically generated holes while the source of conduction electrons
is only intrinsic generation. Thus, for such a material, the holes are the
majority carriers and electrons are minority carriers. Therefore,
extrinsic semiconductors doped with trivalent impurity are called p-type
semiconductors. For p-type semiconductors, the recombination
process will reduce the number ($n_i$) of intrinsically generated electrons
to $n_e$. We have, for p-type semiconductors
$n_h \gg n_e$ (14.4)
Note that the crystal maintains an overall charge neutrality as the
charge of additional charge carriers is just equal and opposite to that
of the ionised cores in the lattice.
In extrinsic semiconductors, because of the abundance of majority
current carriers, the minority carriers produced thermally have more
chance of meeting majority carriers and thus getting destroyed.
Hence, the dopant, by adding a large number of current carriers of
one type, which become the majority carriers, indirectly helps to
reduce the intrinsic concentration of minority carriers.
The semiconductor's energy band structure is affected by doping. In
the case of extrinsic semiconductors, additional energy states due to
donor impurities (ED) and acceptor impurities (EA) also exist. In the
energy band diagram of n-type Si semiconductor, the donor energy
level ED is slightly below the bottom EC of the conduction band and
electrons from this level move into the conduction band with very
small supply of energy. At room temperature, most of the donor atoms
get ionised but very few (a fraction ~$10^{-12}$) of the Si atoms get ionised. So the
conduction band will have most electrons coming from the donor
impurities, as shown in Fig. 14.9(a). Similarly, for p-type
semiconductor, the acceptor energy level EA is slightly above the top
EV of the valence band as shown in Fig. 14.9(b). With very small
supply of energy an electron from the valence band can jump to the
level EA and ionise the acceptor negatively. (Alternately, we can also
say that with very small supply of energy the hole from level EA sinks
down into the valence band. Electrons rise up and holes fall down
when they gain external energy.) At room temperature, most of the
acceptor atoms get ionised leaving holes in the valence band. Thus at
room temperature the density of holes in the valence band is
predominantly due to impurity in the extrinsic semiconductor. The
electron and hole concentration in a semiconductor in thermal
equilibrium is given by
$n_e n_h = n_i^2$ (14.5)
Though the above description is grossly approximate and
hypothetical, it helps in understanding the difference between metals,
insulators and semiconductors (extrinsic and intrinsic) in a simple
manner. The difference in the resistivity of C, Si and Ge depends upon
the energy gap between their conduction and valence bands. For C
(diamond), Si and Ge, the energy gaps are 5.4 eV, 1.1 eV and 0.7 eV,
respectively. Sn also is a group IV element but it is a metal because
the energy gap in its case is 0 eV.

Figure 14.9 Energy bands of (a) n-type semiconductor at T > 0K, (b) p-
type semiconductor at T > 0K.

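Example 14.2 Suppose a pure Si crystal has $5 \times 10^{28}$ atoms m$^{-3}$. It is doped
by 1 ppm concentration of pentavalent As. Calculate the number of electrons
and holes. Given that $n_i = 1.5 \times 10^{16}$ m$^{-3}$.

Solution A doping level of 1 ppm gives a donor concentration
$N_D = 5 \times 10^{28} \times 10^{-6} = 5 \times 10^{22}$ m$^{-3}$.
Note that thermally generated electrons ($n_i \sim 10^{16}$ m$^{-3}$) are negligibly
small as compared to those produced by doping.
Therefore, $n_e \approx N_D$.
Since $n_e n_h = n_i^2$, the number of holes is
$n_h = (2.25 \times 10^{32})/(5 \times 10^{22})$
$\sim 4.5 \times 10^{9}$ m$^{-3}$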
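The arithmetic of Example 14.2 can be checked with a few lines of code. This is a minimal sketch assuming, as in the solution, that every dopant atom donates one electron and that the donor concentration is much larger than $n_i$. The function and argument names are illustrative.

```python
def carrier_concentrations(atoms_per_m3, doping_fraction, n_i):
    """Return (n_e, n_h) per cubic metre for an n-type sample.

    Assumes n_e is approximately equal to the donor concentration N_D,
    which is valid only when N_D is much larger than n_i.
    """
    n_d = atoms_per_m3 * doping_fraction  # donor atoms per cubic metre
    n_e = n_d                             # majority carriers (electrons)
    n_h = n_i ** 2 / n_e                  # from n_e * n_h = n_i^2, Eq. (14.5)
    return n_e, n_h


# Values from Example 14.2: 1 ppm corresponds to a fraction of 1e-6.
n_e, n_h = carrier_concentrations(5e28, 1e-6, 1.5e16)
print(f"n_e = {n_e:.2e} m^-3, n_h = {n_h:.2e} m^-3")
```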

14.5 p-n JUNCTION

A p-n junction is the basic building block of many semiconductor
devices like diodes, transistors, etc. A clear understanding of the
junction behaviour is important to analyse the working of other
semiconductor devices.
We will now try to understand how a junction is formed and how the
junction behaves under the influence of external applied voltage (also
called bias).

14.5.1 p-n junction formation

Consider a thin p-type silicon (p-Si) semiconductor wafer. By adding
precisely a small quantity of pentavalent impurity, part of the p-Si
wafer can be converted into n-Si. There are several processes by
which a semiconductor can be formed. The wafer now contains a
p-region and an n-region and a metallurgical junction between the p- and
n-regions.
Two important processes occur during the formation of a p-n junction:
diffusion and drift. We know that in an n-type semiconductor, the
concentration of electrons (number of electrons per unit volume) is
more compared to the concentration of holes. Similarly, in a p-type
semiconductor, the concentration of holes is more than the
concentration of electrons. During the formation of p-n junction, and
due to the concentration gradient across the p- and n-sides, holes diffuse
from p-side to n-side (p → n) and electrons diffuse from n-side to
p-side (n → p). This motion of charge carriers gives rise to diffusion
current across the junction.

Figure 14.10 p-n junction formation process.

When an electron diffuses from n → p, it leaves behind an ionised
donor on n-side. This ionised donor (positive charge) is immobile as it
is bonded to the surrounding atoms. As the electrons continue to
diffuse from n → p, a layer of positive charge (or positive
space-charge region) on n-side of the junction is developed.
Similarly, when a hole diffuses from p → n due to the concentration
gradient, it leaves behind an ionised acceptor (negative charge) which
is immobile. As the holes continue to diffuse, a layer of negative
charge (or negative space-charge region) on the p-side of the junction
is developed. This space-charge region on either side of the junction
together is known as depletion region as the electrons and holes
taking part in the initial movement across the junction depleted the
region of its free charges (Fig. 14.10). The thickness of depletion
region is of the order of one-tenth of a micrometre. Due to the positive
space-charge region on n-side of the junction and negative space
charge region on p-side of the junction, an electric field directed from
positive charge towards negative charge develops. Due to this field,
an electron on p-side of the junction moves to n-side and a hole on n-
side of the junction moves to p-side. The motion of charge carriers
due to the electric field is called drift. Thus a drift current, which is
opposite in direction to the diffusion current (Fig. 14.10) starts.
Initially, diffusion current is large and drift current is small. As the
diffusion process continues, the space-charge regions on either side
of the junction extend, thus increasing the electric field strength and
hence drift current. This process continues until the diffusion current
equals the drift current. Thus a p-n junction is formed. In a p-n junction
under equilibrium there is no net current.
Figure 14.11 (a) Diode under equilibrium (V = 0), (b) Barrier potential under no
bias.

The loss of electrons from the n-region and the gain of electrons by the
p-region causes a difference of potential across the junction of the two
regions. The polarity of this potential is such as to oppose further flow
of carriers so that a condition of equilibrium exists. Figure 14.11 shows
the p-n junction at equilibrium and the potential across the junction.
The n-material has lost electrons, and the p-material has acquired electrons.
The n-material is thus positive relative to the p-material. Since this
potential tends to prevent the movement of electrons from the n-region
into the p-region, it is often called a barrier potential.
Example 14.3 Can we take one slab of p-type semiconductor and physically
join it to another n-type semiconductor to get p-n junction?
Solution No! Any slab, howsoever flat, will have roughness much larger than
the inter-atomic crystal spacing (~2 to 3 Å) and hence continuous contact at
the atomic level will not be possible. The junction will behave as a discontinuity
for the flowing charge carriers.

14.6 SEMICONDUCTOR DIODE

A semiconductor diode [Fig. 14.12(a)] is basically a p-n junction with
metallic contacts provided at the ends for the application of an
external voltage. It is a two terminal device. A p-n junction diode is
symbolically represented as shown in Fig. 14.12(b).
The direction of arrow indicates the conventional direction of current
(when the diode is under forward bias). The equilibrium barrier
potential can be altered by applying an external voltage V across the
diode. The situation of p-n junction diode under equilibrium (without
bias) is shown in Fig. 14.11(a) and (b).

14.6.1 p-n junction diode under forward bias


Figure 14.12 (a) Semiconductor diode, (b) Symbol for p-n junction diode.

When an external voltage V is applied across a semiconductor diode
such that p-side is connected to the positive terminal of the battery
and n-side to the negative terminal [Fig. 14.13(a)], it is said to be
forward biased.
The applied voltage mostly drops across the depletion region and the
voltage drop across the p-side and n-side of the junction is negligible.
(This is because the resistance of the depletion region, a region
where there are no charges, is very high compared to the resistance
of n-side and p-side.) The direction of the applied voltage (V) is
opposite to the built-in potential $V_0$. As a result, the depletion layer
width decreases and the barrier height is reduced [Fig. 14.13(b)]. The
effective barrier height under forward bias is ($V_0 - V$).
If the applied voltage is small, the barrier potential will be reduced only
slightly below the equilibrium value, and only a small number of
carriers in the material (those that happen to be in the uppermost
energy levels) will possess enough energy to cross the junction. So
the current will be small. If we increase the applied voltage
significantly, the barrier height will be reduced and more
carriers will have the required energy. Thus the current increases.

Figure 14.13 (a) p-n junction diode under forward bias, (b) Barrier potential (1)
without battery, (2) Low battery voltage, and (3) High voltage battery.

Due to the applied voltage, electrons from n-side cross the depletion
region and reach p-side (where they are minority carriers). Similarly,
holes from p-side cross the junction and reach the n-side (where they
are minority carriers). This process under forward bias is known as
minority carrier injection. At the junction boundary, on each side, the
minority carrier concentration increases significantly compared to the
locations far from the junction.

Due to this concentration gradient, the injected electrons on p-side
diffuse from the junction edge of p-side to the other end of p-side.
Likewise, the injected holes on n-side diffuse from the junction edge of
n-side to the other end of n-side (Fig. 14.14). This motion of charged
carriers on either side gives rise to current. The total diode forward
current is the sum of hole diffusion current and conventional current
due to electron diffusion. The magnitude of this current is usually in mA.

Figure 14.14 Forward bias minority carrier injection.

14.6.2 p-n junction diode under reverse bias

When an external voltage (V) is applied across the diode such that n-
side is positive and p-side is negative, it is said to be reverse biased
[Fig.14.15(a)]. The applied voltage mostly drops across the depletion
region. The direction of applied voltage is same as the direction of
barrier potential. As a result, the barrier height increases and the
depletion region widens due to the change in the electric field. The
effective barrier height under reverse bias is ($V_0 + V$) [Fig. 14.15(b)].
This suppresses the flow of electrons from n → p and holes from p → n.
Thus, the diffusion current decreases enormously compared to the
diode under forward bias.
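The effect of the two kinds of bias on the barrier can be put side by side in a short sketch. The barrier value of 0.7 V and the applied voltages used below are hypothetical numbers chosen only for illustration, and the function name is not from the text.

```python
def effective_barrier(v0, v_applied, bias):
    """Return the effective barrier height in volts.

    v0        : equilibrium barrier potential V0 (volts)
    v_applied : magnitude of the external voltage V (volts)
    bias      : "forward" gives V0 - V, "reverse" gives V0 + V
    """
    if bias == "forward":
        return v0 - v_applied
    if bias == "reverse":
        return v0 + v_applied
    raise ValueError("bias must be 'forward' or 'reverse'")


# Hypothetical values: V0 = 0.7 V with 0.2 V forward and 5 V reverse bias.
print(effective_barrier(0.7, 0.2, "forward"))
print(effective_barrier(0.7, 5.0, "reverse"))
```

Forward bias lowers the barrier so that more carriers can diffuse across, while reverse bias raises it and suppresses the diffusion current.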

The electric field direction of the junction is such that if electrons on
p-side or holes on n-side in their random motion come close to the
junction, they will be swept to their majority zone. This drift of carriers
gives rise to current. The drift current is of the order of a few μA. This
is quite low because it is due to the motion of carriers from their
minority side to their majority side across the junction. The drift current
is also there under forward bias but it is negligible (μA) when
compared with current due to injected carriers which is usually in mA.
The diode reverse current is not very much dependent on the applied
voltage. Even a small voltage is sufficient to sweep the minority
carriers from one side of the junction to the other side of the junction.
The current is not limited by the magnitude of the applied voltage but
is limited due to the concentration of the minority carrier on either side
of the junction.
The current under reverse bias is essentially voltage independent upto
a critical reverse bias voltage, known as breakdown voltage (Vbr).
When V = Vbr, the diode reverse current increases sharply. Even a
slight increase in the bias voltage causes large change in the current.
If the reverse current is not limited by an external circuit below the
rated value (specified by the manufacturer) the p-n junction will get
destroyed. Once it exceeds the rated value, the diode gets destroyed
due to overheating. This can happen even for the diode under forward
bias, if the forward current exceeds the rated value.

Figure 14.15 (a) Diode under reverse bias, (b) Barrier potential under reverse bias.

The circuit arrangements for studying the V-I characteristics of a diode
(i.e., the variation of current as a function of applied voltage) are
shown in Fig. 14.16(a) and (b). The battery is connected to the diode
through a potentiometer (or rheostat) so that the applied voltage to
the diode can be changed. For different values of voltage, the value
of the current is noted. A graph between V and I is obtained as in Fig.
14.16(c). Note that in forward bias measurement we use a
milliammeter, since the expected current is large (as explained in the
earlier section), while a microammeter is used in reverse bias to measure
the current. You can see in Fig. 14.16(c) that in forward bias, the
current first increases very slowly, almost negligibly, till the voltage
across the diode crosses a certain value. After this characteristic
voltage, the diode current increases significantly (exponentially), even
for a very small increase in the diode bias voltage. This voltage is
called the threshold voltage or cut-in voltage (~0.2 V for germanium
diode and ~0.7 V for silicon diode).

Figure 14.16 Experimental circuit arrangement for studying V-I characteristics of a
p-n junction diode (a) in forward bias, (b) in reverse bias. (c) Typical V-I
characteristics of a silicon diode.

For the diode in reverse bias, the current is very small (~µA) and
remains almost constant with change in bias. It is called reverse
saturation current. However, for special cases, at very high reverse
bias (breakdown voltage), the current suddenly increases. This
special action of the diode is discussed later in Section 14.8.
General-purpose diodes are not used beyond the reverse saturation
current region.
The above discussion shows that the p-n junction diode primarily
allows the flow of current only in one direction (forward bias). The
forward bias resistance is low as compared to the reverse bias
resistance. This property is used for rectification of ac voltages as
discussed in the next section. For diodes, we define a quantity called
dynamic resistance as the ratio of a small change in voltage ΔV to a
small change in current ΔI:

$r_d = \dfrac{\Delta V}{\Delta I}$ (14.6)

Example 14.4 The V-I characteristic of a silicon diode is shown in the Fig.
14.17. Calculate the resistance of the diode at (a) ID = 15 mA and (b) VD =
-10 V.
Figure 14.17
Solution Considering the diode characteristics as a straight line between I =
10 mA to I = 20 mA passing through the origin, we can calculate the resistance
using Ohm's law.
(a) From the curve, at I = 20 mA, V = 0.8 V; at I = 10 mA, V = 0.7 V.
rfb = ΔV/ΔI = 0.1 V/10 mA = 10 Ω
(b) From the curve at V = -10 V, I = -1 µA.
Therefore,
rrb = 10 V/1 µA = 1.0 × 10⁷ Ω
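
The arithmetic of this example can be checked with a short Python sketch. The function name dynamic_resistance is only an illustrative choice; the numbers are the ones read from the curve in the solution above.

```python
def dynamic_resistance(v1, i1, v2, i2):
    """Return the ratio (change in V)/(change in I), in ohms,
    between two points on a V-I curve."""
    return (v2 - v1) / (i2 - i1)

# (a) Forward bias: the two points quoted in the solution.
r_fb = dynamic_resistance(0.7, 10e-3, 0.8, 20e-3)

# (b) Reverse bias: a single point, V = -10 V and I = -1 microampere,
# so the resistance is simply V/I.
r_rb = (-10.0) / (-1e-6)

print(f"r_fb = {r_fb:.1f} ohm")
print(f"r_rb = {r_rb:.1e} ohm")
```

The forward value comes out close to 10 Ω and the reverse value close to 10⁷ Ω, a ratio of about a million, which is why the diode is treated as a one-way device.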

14.7 APPLICATION OF JUNCTION DIODE AS A RECTIFIER

From the V-I characteristic of a junction diode we see that it allows
current to pass only when it is forward biased. So if an alternating
voltage is applied across a diode, the current flows only in that part of
the cycle when the diode is forward biased. This property is used to
rectify alternating voltages and the circuit used for this purpose is
called a rectifier.

Figure 14.18 (a) Half-wave rectifier circuit, (b) Input ac voltage and output voltage
waveforms from the rectifier circuit.

If an alternating voltage is applied across a diode in series with a load,
a pulsating voltage will appear across the load only during the half
cycles of the ac input during which the diode is forward biased. Such a
rectifier circuit, as shown in Fig. 14.18, is called a half-wave rectifier.
The secondary of a transformer supplies the desired ac voltage across
terminals A and B. When the voltage at A is positive, the diode is
forward biased and it conducts. When A is negative, the diode is
reverse-biased and it does not conduct. The reverse saturation current
of a diode is negligible and can be considered equal to zero for
practical purposes. (The reverse breakdown voltage of the diode must
be sufficiently higher than the peak ac voltage at the secondary of the
transformer to protect the diode from reverse breakdown.)
Therefore, in the positive half-cycle of ac there is a current through the
load resistor RL and we get an output voltage, as shown in Fig.
14.18(b), whereas there is no current in the negative half-cycle. In the
next positive half-cycle, again we get the output voltage. Thus, the
output voltage, though still varying, is restricted to only one direction
and is said to be rectified. Since the rectified output of this circuit is
only for half of the input ac wave, it is called a half-wave rectifier.
The circuit using two diodes, shown in Fig. 14.19(a), gives output
rectified voltage corresponding to both the positive as well as negative
half of the ac cycle. Hence, it is known as full-wave rectifier. Here
the p-sides of the two diodes are connected to the ends of the
secondary of the transformer. The n-sides of the diodes are connected
together and the output is taken between this common point of diodes
and the midpoint of the secondary of the transformer. So for a
full-wave rectifier the secondary of the transformer is provided with a
centre tapping and so it is called a centre-tap transformer. As can be
seen from Fig. 14.19(c), the voltage rectified by each diode is only half
the total secondary voltage. Each diode rectifies only for half the cycle,
but the two do so for alternate cycles. Thus, the output between their
common terminals and the centre-tap of the transformer becomes a
full-wave rectifier output. (Note that there is another circuit of full-wave
rectifier which does not need a centre-tap transformer but needs four
diodes.) Suppose the input voltage to A with respect to the centre tap
at any instant is positive. It is clear that, at that instant, the voltage at B,
being out of phase, will be negative as shown in Fig. 14.19(b). So,
diode D1 gets forward biased and conducts (while D2, being reverse
biased, is not conducting). Hence, during this positive half cycle we get
an output current (and an output voltage across the load resistor RL) as
shown in Fig. 14.19(c). In the course of the ac cycle, when the voltage
at A becomes negative with respect to the centre tap, the voltage at B
would be positive. In this part of the cycle diode D1 would not conduct
but diode D2 would, giving an output current and output voltage
(across RL) during the negative half cycle of the input ac. Thus, we
get output voltage during both the positive as well as the negative half
of the cycle. Obviously, this is a more efficient circuit for getting
rectified voltage or current than the half-wave rectifier.

Figure 14.19 (a) A Full-wave rectifier circuit; (b) Input wave forms given to the
diode D1 at A and to the diode D2 at B; (c) Output waveform across the load RL
connected in the full-wave rectifier circuit.
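
The difference between the two rectifiers can be illustrated numerically. The following Python sketch assumes ideal diodes (zero forward drop and zero reverse current, a simplification of the real characteristic) and a sinusoidal input of peak value V_PEAK, which for the centre-tap circuit stands for the peak voltage of each half of the secondary. All names and values are illustrative assumptions.

```python
import math

def half_wave(v):
    """Ideal half-wave rectifier: passes only the positive half cycle."""
    return v if v > 0 else 0.0

def full_wave(v):
    """Ideal full-wave rectifier: D1 conducts for v > 0 and D2 for v < 0,
    so the load always sees the magnitude of the input."""
    return abs(v)

V_PEAK = 10.0   # assumed peak input voltage in volts
N = 1000        # number of samples over one full ac cycle

samples = [V_PEAK * math.sin(2 * math.pi * k / N) for k in range(N)]

# Average (dc) value of each rectified output over one cycle.
avg_half = sum(half_wave(v) for v in samples) / N
avg_full = sum(full_wave(v) for v in samples) / N

print(f"half-wave average = {avg_half:.3f} V")
print(f"full-wave average = {avg_full:.3f} V")
```

Under these assumptions the full-wave average is twice the half-wave average (2V_PEAK/π against V_PEAK/π), which is one way of seeing why the full-wave circuit is the more efficient of the two.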

The rectified voltage is in the form of pulses of the shape of half
sinusoids. Though it is unidirectional it does not have a steady value.
To get steady dc output from the pulsating voltage, normally a
capacitor is connected across the output terminals (parallel to the load
RL). One can also use an inductor in series with RL for the same
purpose. Since these additional circuits appear to filter out the ac
ripple and give a pure dc voltage, they are called filters.
Now we shall discuss the role of capacitor in filtering. When the
voltage across the capacitor is rising, it gets charged. If there is no
external load, it remains charged to the peak voltage of the rectified
output. When there is a load, it gets discharged through the load and
the voltage across it begins to fall. In the next half-cycle of rectified
output it again gets charged to the peak value (Fig. 14.20). The rate of
fall of the voltage across the capacitor depends inversely upon the
product of the capacitance C and the effective resistance RL used in the
circuit; this product is called the time constant. To make the time
constant large, the value of C should be large. So capacitor input filters
use large capacitors. The output voltage obtained by using capacitor
input filter is nearer to the peak voltage of the rectified voltage. This
type of filter is most widely used in power supplies.
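
The effect of the time constant can be seen in a rough numerical sketch. It assumes the capacitor discharges exponentially through RL for the whole interval between two successive peaks, which slightly overestimates the fall, and it assumes a 50 Hz supply with full-wave rectification so that the peaks are 10 ms apart. The component values are illustrative only.

```python
import math

def voltage_after_discharge(v_peak, r_load, capacitance, t):
    """Capacitor voltage after discharging through r_load for time t,
    using V = V_peak * exp(-t / (R * C))."""
    return v_peak * math.exp(-t / (r_load * capacitance))

V_PEAK = 10.0    # assumed peak of the rectified output, in volts
R_LOAD = 1000.0  # assumed load resistance, in ohms
T_GAP = 0.01     # assumed time between peaks, in seconds

for c in (10e-6, 100e-6, 1000e-6):
    v_min = voltage_after_discharge(V_PEAK, R_LOAD, c, T_GAP)
    print(f"C = {c * 1e6:6.0f} microfarad: time constant = "
          f"{R_LOAD * c * 1e3:7.1f} ms, voltage falls to {v_min:.2f} V")
```

With the smallest capacitor the time constant equals the gap between peaks and the voltage drops to roughly a third of its peak, whereas with the largest one it stays within about one per cent of the peak.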

14.8 SPECIAL PURPOSE P-N JUNCTION DIODES


In this section, we shall discuss some devices which are basically
junction diodes but are developed for different applications.

14.8.1 Zener diode


It is a special purpose semiconductor diode, named after its
inventor C. Zener. It is designed to operate under reverse bias in the
breakdown region and used as a voltage regulator. The symbol for
Zener diode is shown in Fig. 14.21(a).

Figure 14.20 (a) A full-wave rectifier with capacitor filter, (b) Input and
output voltage of rectifier in (a).

Zener diode is fabricated by heavily doping both p- and n-sides of the
junction. Due to this, the depletion region formed is very thin (<10⁻⁶ m)
and the electric field of the junction is extremely high (~5 × 10⁶ V/m)
even for a small reverse bias voltage of about 5 V. The I-V
characteristics of a Zener diode are shown in Fig. 14.21(b). It is seen
that when the applied reverse bias voltage (V) reaches the breakdown
voltage (Vz) of the Zener diode, there is a large change in the current.
Note that after the breakdown voltage Vz, a large change in the
current can be produced by an almost insignificant change in the reverse
bias voltage. In other words, Zener voltage remains constant, even
though current through the Zener diode varies over a wide range. This
property of the Zener diode is used for regulating supply voltages so
that they are constant.
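
The order of magnitude of the junction field quoted above can be checked with a one-line estimate. The sketch below assumes the field is uniform across the depletion region, so that E = V/d; this is only a rough approximation, and the names are illustrative.

```python
def junction_field(reverse_voltage, depletion_width):
    """Rough estimate of the junction field, E = V/d, in V/m,
    assuming a uniform field across the depletion region."""
    return reverse_voltage / depletion_width

# About 5 V of reverse bias across a depletion region 10^-6 m thick.
field = junction_field(5.0, 1e-6)
print(f"E is roughly {field:.1e} V/m")
```

A thinner depletion region gives an even larger field for the same voltage, which is why heavy doping is needed for breakdown at only a few volts.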
Let us understand how reverse current suddenly increases at the
breakdown voltage. We know that reverse current is due to the flow of
electrons (minority carriers) from p → n and holes from n → p. As the
reverse bias voltage is increased, the electric field at the junction
becomes significant. When the reverse bias voltage V = Vz, then the
electric field strength is high enough to pull valence electrons from the
host atoms on the p-side which are accelerated to n-side. These
electrons account for the high current observed at the breakdown. The
emission of electrons from the host atoms due to the high electric field
is known as internal field emission or field ionisation. The electric field
required for field ionisation is of the order of 10⁶ V/m.
Figure 14.21 Zener diode, (a) symbol, (b) I-V characteristics.

Zener diode as a voltage regulator


We know that when the ac input voltage of a rectifier fluctuates, its
rectified output also fluctuates. To get a constant dc voltage from the
dc unregulated output of a rectifier, we use a Zener diode. The circuit
diagram of a voltage regulator using a Zener diode is shown in Fig.
14.22.
The unregulated dc voltage (filtered output of a rectifier) is connected
to the Zener diode through a series resistance Rs such that the Zener
diode is reverse biased. If the input voltage increases, the current
through Rs and Zener diode also increases. This increases the
voltage drop across Rs without any change in the voltage across the
Zener diode. This is because in the breakdown region, Zener voltage
remains constant even though the current through the Zener diode
changes. Similarly, if the input voltage decreases, the current through
Rs and Zener diode also decreases. The voltage drop across Rs
decreases without any change in the voltage across the Zener diode.
Thus any increase/decrease in the input voltage results in an
increase/decrease of the voltage drop across Rs without any change
in voltage across the Zener diode. Thus the Zener diode acts as a
voltage regulator. We have to select the Zener diode according to the
required output voltage and accordingly the series resistance Rs.

Figure 14.22 Zener diode as DC voltage regulator


Example 14.5 In a Zener regulated power supply a Zener diode with VZ = 6.0
V is used for regulation. The load current is to be 4.0 mA and the unregulated
input is 10.0 V. What should be the value of series resistor RS?
Solution
The value of RS should be such that the current through the Zener diode is
much larger than the load current. This is to have good load regulation.
Choose Zener current as five times the load current, i.e., IZ = 20 mA. The total
current through RS is, therefore, 24 mA. The voltage drop across RS is 10.0 -
6.0 = 4.0 V. This gives
RS = 4.0 V/(24 × 10⁻³ A) = 167 Ω. The nearest value of carbon resistor is 150
Ω. So, a series resistor of 150 Ω is appropriate. Note that slight variation in the
value of the resistor does not matter; what is important is that the current IZ
should be sufficiently larger than IL.
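
The same calculation, and the way the Zener current absorbs changes in the input, can be sketched in Python. The sketch assumes an ideal Zener diode whose voltage stays exactly at VZ in breakdown; the function names and the 11 V input used at the end are illustrative assumptions.

```python
def series_resistor(v_in, v_z, i_load, zener_factor=5.0):
    """Series resistance (ohms) for a Zener regulator, choosing the Zener
    current as zener_factor times the load current."""
    i_z = zener_factor * i_load
    return (v_in - v_z) / (i_z + i_load)

def zener_current(v_in, v_z, r_s, i_load):
    """Current through an ideal Zener diode (amperes): the total current
    through r_s minus the load current."""
    return (v_in - v_z) / r_s - i_load

r_s = series_resistor(10.0, 6.0, 4.0e-3)
print(f"calculated RS = {r_s:.0f} ohm")

# With the nearest standard value of 150 ohm, compare two input voltages.
for v_in in (10.0, 11.0):
    i_z = zener_current(v_in, 6.0, 150.0, 4.0e-3)
    print(f"input {v_in:.1f} V: Zener current = {i_z * 1e3:.1f} mA")
```

When the input rises, the extra current flows through the Zener diode while the load still sees 6.0 V and 4.0 mA, which is the regulating action described in the text.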

14.8.2 Optoelectronic junction devices

We have seen so far how a semiconductor diode behaves under
applied electrical inputs. In this section, we learn about semiconductor
diodes in which carriers are generated by photons (photo-excitation).
All these devices are called optoelectronic devices. We shall study the
functioning of the following optoelectronic devices:
(i) Photodiodes used for detecting optical signal (photodetectors).
(ii) Light emitting diodes (LED) which convert electrical energy into
light.
(iii) Photovoltaic devices which convert optical radiation into electricity
(solar cells).
(i) Photodiode
A Photodiode is again a special purpose p-n junction diode fabricated
with a transparent window to allow light to fall on the diode. It is
operated under reverse bias. When the photodiode is illuminated with
light (photons) with energy (hν) greater than the energy gap (Eg) of
the semiconductor, then electron-hole pairs are generated due to the
absorption of photons. The diode is fabricated such that the
generation of e-h pairs takes place in or near the depletion region of
the diode. Due to electric field of the junction, electrons and holes are
separated before they recombine. The direction of the electric field is
such that electrons reach n-side and holes reach p-side. Electrons are
collected on n-side and holes are collected on p-side giving rise to an
emf. When an external load is connected, current flows. The
magnitude of the photocurrent depends on the intensity of incident
light (photocurrent is proportional to incident light intensity).
It is easier to observe the change in the current with change in the
light intensity if a reverse bias is applied. Thus a photodiode can be
used as a photodetector to detect optical signals. The circuit diagram
used for the measurement of I-V characteristics of a photodiode is
shown in Fig. 14.23(a) and typical I-V characteristics in Fig. 14.23(b).

Figure 14.23 (a) An illuminated photodiode under reverse bias, (b) I-V
characteristics of a photodiode for different illumination intensity I4 > I3 > I2 > I1.

Example 14.6 The current in the forward bias is known to be more (~mA) than
the current in the reverse bias (~µA). What is the reason then to operate the
photodiodes in reverse bias?
Solution Consider the case of an n-type semiconductor. Obviously, the
majority carrier density (n) is considerably larger than the minority hole density
p (i.e., n >> p). On illumination, let the excess electrons and holes generated
be Δn and Δp, respectively:
n′ = n + Δn
p′ = p + Δp
Here n′ and p′ are the electron and hole concentrations* at any particular
illumination and n and p are the carrier concentrations when there is no
illumination. Remember Δn = Δp and n >> p. Hence, the fractional change in
the majority carriers (i.e., Δn/n) would be much less than that in the minority
carriers (i.e., Δp/p). In general, we can state that the fractional change due to
the photo-effects on the minority carrier dominated reverse bias current is
more easily measurable than the fractional change in the forward bias current.
Hence, photodiodes are preferably used in the reverse bias condition for
measuring light intensity.
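
A numerical illustration makes the argument concrete. The concentrations below are hypothetical round numbers chosen only to satisfy n >> p and Δn = Δp; they are not taken from the text.

```python
# Hypothetical carrier concentrations (per cubic metre) in an n-type
# sample with no illumination.
n_dark = 1.0e22   # majority carriers (electrons)
p_dark = 1.0e10   # minority carriers (holes)

# Illumination creates electrons and holes in pairs, so the excess
# concentrations are equal. The value is again an assumption.
delta_n = delta_p = 1.0e16

frac_majority = delta_n / n_dark   # fractional change in electrons
frac_minority = delta_p / p_dark   # fractional change in holes

print(f"fractional change in majority carriers = {frac_majority:.1e}")
print(f"fractional change in minority carriers = {frac_minority:.1e}")
```

With these assumed values the majority carrier concentration changes by about one part in a million, while the minority carrier concentration increases about a million-fold. The reverse current, which depends on minority carriers, is therefore far more sensitive to light.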

(ii) Light emitting diode


It is a heavily doped p-n junction which under forward bias emits
spontaneous radiation. The diode is encapsulated with a transparent
cover so that emitted light can come out.

* Note that, to create an e-h pair, we spend some energy (photoexcitation, thermal
excitation, etc.). Therefore when an electron and hole recombine the energy is
released in the form of light (radiative recombination) or heat (non-radiative
recombination). It depends on semiconductor and the method of fabrication of the
p-n junction. For the fabrication of LEDs, semiconductors like GaAs, GaAs-GaP
are used in which radiative recombination dominates.
When the diode is forward biased, electrons are sent from n → p
(where they are minority carriers) and holes are sent from p → n
(where they are minority carriers). At the junction boundary the
concentration of minority carriers increases compared to the
equilibrium concentration (i.e., when there is no bias). Thus at the
junction boundary on either side of the junction, excess minority
carriers are there which recombine with majority carriers near the
junction. On recombination, the energy is released in the form of
photons. Photons with energy equal to or slightly less than the band
gap are emitted. When the forward current of the diode is small, the
intensity of light emitted is small. As the forward current increases,
intensity of light increases and reaches a maximum. Further increase
in the forward current results in decrease of light intensity. LEDs are
biased such that the light emitting efficiency is maximum.
The V-I characteristics of an LED are similar to those of a Si junction
diode. But the threshold voltages are much higher and slightly different
for each colour. The reverse breakdown voltages of LEDs are very low,
typically around 5 V. So care should be taken that high reverse
voltages do not appear across them.

LEDs that can emit red, yellow, orange, green and blue light are
commercially available. The semiconductor used for fabrication of
visible LEDs must at least have a band gap of 1.8 eV (spectral range
of visible light is from about 0.4 µm to 0.7 µm, i.e., from about 3 eV to
1.8 eV). The compound semiconductor Gallium Arsenide Phosphide
(GaAs1-xPx) is used for making LEDs of different colours. GaAs0.6P0.4
(Eg ~ 1.9 eV) is used for red LED. GaAs (Eg ~ 1.4 eV) is used for
making infrared LED. These LEDs find extensive use in remote
controls, burglar alarm systems, optical communication, etc. Extensive
research is being done for developing white LEDs which can replace
incandescent lamps.
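
The correspondence between wavelength and photon energy quoted above follows from E = hc/λ. The Python sketch below uses rounded values of the constants; the function names are illustrative.

```python
H = 6.626e-34    # Planck's constant, J s
C = 2.998e8      # speed of light, m/s
E_CHARGE = 1.602e-19  # joules per electron volt

def photon_energy_ev(wavelength_m):
    """Photon energy in eV for a wavelength given in metres (E = hc/lambda)."""
    return H * C / (wavelength_m * E_CHARGE)

def wavelength_um(energy_ev):
    """Wavelength in micrometres of a photon of the given energy in eV."""
    return H * C / (energy_ev * E_CHARGE) * 1e6

# Limits of the visible range quoted in the text.
print(f"0.4 micrometre -> {photon_energy_ev(0.4e-6):.2f} eV")
print(f"0.7 micrometre -> {photon_energy_ev(0.7e-6):.2f} eV")

# Approximate emission wavelengths for the two band gaps in the text,
# taking the photon energy equal to the band gap.
print(f"Eg = 1.9 eV -> {wavelength_um(1.9):.2f} micrometre")
print(f"Eg = 1.4 eV -> {wavelength_um(1.4):.2f} micrometre")
```

A band gap of 1.9 eV corresponds to a wavelength near 0.65 µm, in the red, while 1.4 eV corresponds to a wavelength longer than 0.7 µm, which lies in the infrared.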

LEDs have the following advantages over conventional incandescent
low power lamps:
(i) Low operational voltage and less power.
(ii) Fast action and no warm-up time required.
(iii) The bandwidth of emitted light is 100 Å to 500 Å or in other words
it is nearly (but not exactly) monochromatic.
(iv) Long life and ruggedness.
(v) Fast on-off switching capability.
(iii) Solar cell
A solar cell is basically a p-n junction which generates emf when solar
radiation falls on the p-n junction. It works on the same principle
(photovoltaic effect) as the photodiode, except that no external bias is
applied and the junction area is kept much larger for solar radiation to
be incident because we are interested in more power.

A simple p-n junction solar cell is shown in Fig. 14.24.
A p-Si wafer of about 300 µm is taken over which a thin layer (~0.3
µm) of n-Si is grown on one side by diffusion process. The other side
of p-Si is coated with a metal (back contact). On the top of n-Si layer,
metal finger electrode (or metallic grid) is deposited. This acts as a
front contact. The metallic grid occupies only a very small fraction of
the cell area (<15%) so that light can be incident on the cell from the
top.
The generation of emf by a solar cell, when light falls on it, is due to
the following three basic processes: generation, separation and
collection. (i) Generation of e-h pairs due to light (with hν > Eg) close
to the junction; (ii) separation of electrons and holes due to electric
field of the depletion region. Electrons are swept to n-side and holes to
p-side; (iii) the electrons reaching the n-side are collected by the front
contact and holes reaching p-side are collected by the back contact.
Thus p-side becomes positive and n-side becomes negative giving rise to
photovoltage.

Figure 14.24 (a) Typical p-n junction solar cell; (b) Cross-sectional view.

When an external load is connected as shown in the Fig. 14.25(a) a
photocurrent IL flows through the load. A typical I-V characteristics of
a solar cell is shown in the Fig. 14.25(b).

Note that the I-V characteristics of solar cell is drawn in the fourth
quadrant of the coordinate axes. This is because a solar cell does not
draw current but supplies the same to the load.
Semiconductors with band gap close to 1.5 eV are ideal materials for
solar cell fabrication. Solar cells are made with semiconductors like Si
(Eg = 1.1 eV), GaAs (Eg = 1.43 eV), CdTe (Eg = 1.45 eV), CuInSe2 (Eg
= 1.04 eV), etc. The important criteria for the selection of a material for
solar cell fabrication are (i) band gap (~1.0 to 1.8 eV), (ii) high optical
absorption (~10⁴ cm⁻¹), (iii) electrical conductivity, (iv) availability of
the raw material, and (v) cost. Note that sunlight is not always required
for a solar cell. Any light with photon energies greater than the
bandgap will do. Solar cells are used to power electronic devices in
satellites and space vehicles and also as power supply to some
calculators. Production of low-cost photovoltaic cells for large-scale
solar energy is a topic for research.

Figure 14.25 (a) A typical illuminated p-n junction solar cell; (b) I-V characteristics
of a solar cell.

Example 14.7 Why are Si and GaAs preferred materials for solar cells?
Solution The solar radiation spectrum received by us is shown in Fig. 14.26.
Figure 14.26

The maximum is near 1.5 eV. For photo-excitation, hν > Eg. Hence, a
semiconductor with band gap ~1.5 eV or lower is likely to give better solar
conversion efficiency. Silicon has Eg ~ 1.1 eV while for GaAs it is ~1.43 eV. In
fact, GaAs is better (in spite of its higher band gap) than Si because of its
relatively higher absorption coefficient. If we choose materials like CdS or
CdSe (Eg ~ 2.4 eV), we can use only the high energy component of the solar
energy for photo-conversion and a significant part of energy will be of no use.
The question arises: why do we not use a material like PbS (Eg ~ 0.4 eV) which
satisfies the condition hν > Eg for the maximum of the solar radiation
spectrum? If we do so, most of the solar radiation will be absorbed on the
top layer of the solar cell and will not reach in or near the depletion region. For
effective electron-hole separation, due to the junction field, we want the
photo-generation to occur in the junction region only.

14.9 JUNCTION TRANSISTOR

The credit of inventing the transistor in the year 1947 goes to J.
Bardeen and W.H. Brattain of Bell Telephone Laboratories, U.S.A.
That transistor was a point-contact transistor. The first junction
transistor consisting of two back-to-back p-n junctions was invented by
William Shockley in 1951.
As long as only the junction transistor was known, it was known simply
as transistor. But over the years new types of transistors were
invented and to differentiate it from the new ones it is now called the
Bipolar Junction Transistor (BJT). Even now, often the word transistor
is used to mean BJT when there is no confusion. Since our study is
limited to only BJT, we shall use the word transistor for BJT without
any ambiguity.

14.9.1 Transistor: structure and action

A transistor has three doped regions forming two p-n junctions
between them. Obviously, there are two types of transistors, as shown
in Fig. 14.27.
(i) n-p-n transistor: Here two segments of n-type semiconductor
(emitter and collector) are separated by a segment of p-type
semiconductor (base).
(ii) p-n-p transistor: Here two segments of p-type semiconductor
(termed as emitter and collector) are separated by a segment of
n-type semiconductor (termed as base).
The schematic representations of an n-p-n and a p-n-p configuration
are shown in Fig. 14.27(a). All the three segments of a transistor have
different thickness and their doping levels are also different. In the
schematic symbols used for representing p-n-p and n-p-n transistors
[Fig. 14.27(b)] the arrowhead shows the direction of conventional
current in the transistor. A brief description of the three segments of a
transistor is given below:
Emitter: This is the segment on one side of the transistor shown in
Fig. 14.27(a). It is of moderate size and heavily doped. It supplies a
large number of majority carriers for the current flow through the
transistor.
Base: This is the central segment. It is very thin and lightly doped.
Collector: This segment collects a major portion of the majority
carriers supplied by the emitter. The collector side is moderately
doped and larger in size as compared to the emitter.
Figure 14.27 (a) Schematic representations of an n-p-n transistor and a p-n-p
transistor, and (b) Symbols for n-p-n and p-n-p transistors.

We have seen earlier in the case of a p-n junction, that there is a
formation of depletion region across the junction. In case of a
transistor, depletion regions are formed at the emitter-base junction
and the base-collector junction. For understanding the action of a
transistor, we have to consider the nature of depletion regions formed
at these junctions. The charge carriers move across different regions
of the transistor when proper voltages are applied across its terminals.
The biasing of the transistor is done differently for different uses. The
transistor can be used in two distinct ways. Basically, it was invented
to function as an amplifier, a device which produces an enlarged copy
of a signal. But later its use as a switch acquired equal importance.
We shall study both these functions and the ways the transistor is
biased to achieve these mutually exclusive functions.

First we shall see what gives the transistor its amplifying capabilities.
The transistor works as an amplifier, with its emitter-base junction
forward biased and the base-collector junction reverse biased. This
situation is shown in Fig. 14.28, where VCC and VEE are used for
creating the respective biasing. When the transistor is biased in this
way it is said to be in active state. We represent the voltage between
emitter and base as VEB and that between the collector and the base
as VCB. In Fig. 14.28, base is a common terminal for the two power
supplies whose other terminals are connected to emitter and collector,
respectively. So the two power supplies are represented as VEE, and
VCC, respectively. In circuits, where emitter is the common terminal,
the power supply between the base and the emitter is represented as
VBB and that between collector and emitter as VCC.
Let us see now the paths of current carriers in the transistor with
emitter-base junction forward biased and base-collector junction
reverse biased. The heavily doped emitter has a high concentration of
majority carriers, which will be holes in a p-n-p transistor and electrons
in an n-p-n transistor. These majority carriers enter the base region in
large numbers. The base is thin and lightly doped. So the majority
carriers there would be few. In a p-n-p transistor the majority carriers
in the base are electrons since base is of n-type semiconductor. The
large number of holes entering the base from the emitter swamps the
small number of electrons there. As the base-collector junction is
reverse-biased, these holes, which appear as minority carriers at the
junction, can easily cross the junction and enter the collector. The
holes in the base could move either towards the base terminal to
combine with the electrons entering from outside or cross the junction
to enter into the collector and reach the collector terminal. The base is
made thin so that most of the holes find themselves near the reverse-
biased base-collector junction and so cross the junction instead of
moving to the base terminal.
Figure 14.28 Bias Voltage applied on: (a) p-n-p transistor and (b) n-p-n transistor.

It is interesting to note that due to forward bias a large current enters
the emitter-base junction, but most of it is diverted to the adjacent
reverse-biased base-collector junction and the current coming out of the
base becomes a very small fraction of the current that entered the junction.
If we represent the hole current and the electron current crossing the
forward biased junction by Ih and Ie respectively, then the total current
in a forward biased diode is the sum Ih + Ie. We see that the emitter
current IE = Ih + Ie but the base current IB << Ih + Ie, because a major
part of IE goes to the collector instead of coming out of the base terminal.
The base current is thus a small fraction of the emitter current.
The current entering into the emitter from outside is equal to the
emitter current IE. Similarly, the current emerging from the base
terminal is IB and that from the collector terminal is IC. It is obvious from
the above description and also from a straightforward application of
Kirchhoff's law to Fig. 14.28(a) that the emitter current is the sum of
collector current and base current:
IE = IC + IB (14.7)
We also see that IC ≈ IE.
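
Equation (14.7) can be illustrated with a small numerical sketch. The emitter current and the fraction of carriers reaching the collector used below are hypothetical values chosen for illustration; the text only states that this fraction is large.

```python
def split_emitter_current(i_e, fraction_to_collector):
    """Divide the emitter current between collector and base terminals.
    fraction_to_collector is an assumed illustrative parameter."""
    i_c = fraction_to_collector * i_e
    i_b = i_e - i_c   # what remains comes out of the base terminal
    return i_c, i_b

I_E = 10.0e-3   # assumed emitter current: 10 mA
i_c, i_b = split_emitter_current(I_E, 0.99)

print(f"IC = {i_c * 1e3:.2f} mA, IB = {i_b * 1e3:.2f} mA")
print(f"IC + IB = {(i_c + i_b) * 1e3:.2f} mA")
```

With 99 per cent of the carriers assumed to reach the collector, the base current is only about a hundredth of the emitter current, so IC is nearly equal to IE.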
Our description of the direction of motion of the holes is identical with
the direction of the conventional current. But the direction of motion of
electrons is just opposite to that of the current. Thus in a p-n-p
transistor the current enters from emitter into base whereas in an n-p-n
transistor it enters from the base into the emitter. The arrowhead in
the emitter shows the direction of the conventional current.
The description about the paths followed by the majority and minority
carriers in an n-p-n is exactly the same as that for the p-n-p transistor.
But the current paths are exactly opposite, as shown in Fig. 14.28. In
Fig. 14.28(b) the electrons are the majority carriers supplied by the
n-type emitter region. They cross the thin p-base region and are able to
reach the collector to give the collector current, IC. From the above
description we can conclude that in the active state of the transistor
the emitter-base junction acts as a low resistance while the
base-collector junction acts as a high resistance.

14.9.2 Basic transistor circuit configurations and transistor characteristics

In a transistor, only three terminals are available, viz., Emitter (E),
Base (B) and Collector (C). Therefore, in a circuit the input/output
connections have to be such that one of these (E, B or C) is common
to both the input and the output. Accordingly, the transistor can be
connected in any one of the following three configurations:
Common Emitter (CE), Common Base (CB), Common Collector (CC)
The transistor is most widely used in the CE configuration and we
shall restrict our discussion to only this configuration. Since more
commonly used transistors are n-p-n Si transistors, we shall confine
our discussion to such transistors only. With p-n-p transistors the
polarities of the external power supplies are to be inverted.
Common emitter transistor characteristics

Figure 14.29 Circuit arrangement for studying the input and output characteristics
of n-p-n transistor in CE configuration.

When a transistor is used in CE configuration, the input is between the
base and the emitter and the output is between the collector and the
emitter. The variation of the base current IB with the base-emitter
voltage VBE is called the input characteristic. Similarly, the variation of
the collector current IC with the collector-emitter voltage VCE is called
the output characteristic. You will see that the output characteristics
are controlled by the input characteristics. This implies that the
collector current changes with the base current.
The input and the output characteristics of an n-p-n transistor can be
studied by using the circuit shown in Fig. 14.29.
To study the input characteristics of the transistor in CE configuration,
a curve is plotted between the base current IB against the base-
emitter voltage VBE. The collector-emitter voltage VCE is kept fixed
while studying the dependence of IB on VBE. We are interested in
obtaining the input characteristic when the transistor is in active state. So
the collector-emitter voltage VCE is kept large enough to make the
base-collector junction reverse biased. Since VCE = VCB + VBE and for
Si transistor VBE is 0.6 to 0.7 V, VCE must be sufficiently larger than
0.7 V. Since the transistor is operated as an amplifier over a large range
of VCE, the reverse bias across the base-collector junction is high
most of the time. Therefore, the input characteristics may be obtained
for VCE somewhere in the range of 3 V to 20 V. Since the increase in
VCE appears as increase in VCB, its effect on IB is negligible. As a
consequence, input characteristics for various values of VCE will give
almost identical curves. Hence, it is enough to determine only one
input characteristic. The input characteristic of a transistor is
shown in Fig. 14.30(a).
The output characteristic is obtained by observing the variation of IC
as VCE is varied keeping IB constant. It is obvious that if VBE is
increased by a small amount, both hole current from the emitter region
and the electron current from the base region will increase. As a
consequence both IB and IC will increase proportionately. This shows
that when IB increases IC also increases. The plot of IC versus VCE for
different fixed values of IB gives one output characteristic. So there will
be different output characteristics corresponding to different values of
IB as shown in Fig. 14.30(b).
Figure 14.30 (a) Typical input characteristics, and (b) Typical output
characteristics.

The linear segments of both the input and output characteristics can
be used to calculate some important ac parameters of transistors as
shown below.
(i) Input resistance (ri): This is defined as the ratio of change in
base-emitter voltage (ΔVBE) to the resulting change in base current
(ΔIB) at constant collector-emitter voltage (VCE). This is the dynamic (ac)
resistance and, as can be seen from the input characteristic, its value
varies with the operating current in the transistor:

$r_i = \left( \dfrac{\Delta V_{BE}}{\Delta I_B} \right)_{V_{CE}}$ (14.8)
The value of ri can be anything from a few hundred to a few
thousand ohms.
(ii) Output resistance (ro): This is defined as the ratio of change in
collector-emitter voltage (ΔVCE) to the change in collector current (ΔIC)
at a constant base current IB.

$r_o = \left( \dfrac{\Delta V_{CE}}{\Delta I_C} \right)_{I_B}$ (14.9)
The output characteristics show that initially for very small values of
VCE, IC increases almost linearly. This happens because the base-
collector junction is not reverse biased and the transistor is not in
active state. In fact, the transistor is in the saturation state and the
current is controlled by the supply voltage VCC (=VCE) in this part of
the characteristic. When VCE is more than that required to reverse bias
the base-collector junction, IC increases very little with VCE. The
reciprocal of the slope of the linear part of the output characteristic
gives the values of ro. The output resistance of the transistor is mainly
controlled by the bias of the base-collector junction. The high
magnitude of the output resistance (of the order of 100 kΩ) is due to
the reverse-biased state of this diode. This also explains why the
resistance at the initial part of the characteristic, when the transistor is
in saturation state, is very low.
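
The two dynamic resistances defined above are simply slopes of the characteristics, so they can be estimated from a pair of nearby readings. The sketch below does this with finite differences. The numerical readings are hypothetical values chosen for illustration only; they are not taken from Fig. 14.30.

```python
def dynamic_resistance(delta_v, delta_i):
    """Return delta_v / delta_i in ohms (volts and amperes in)."""
    if delta_i == 0:
        raise ValueError("current change must be non-zero")
    return delta_v / delta_i


# Hypothetical input-side readings at fixed VCE:
# a 0.02 V rise in VBE produces a 10 microampere rise in IB.
r_i = dynamic_resistance(0.02, 10e-6)

# Hypothetical output-side readings at fixed IB (active region):
# a 5 V rise in VCE produces a 0.05 mA rise in IC.
r_o = dynamic_resistance(5.0, 0.05e-3)

print(f"r_i is about {r_i:.0f} ohm")
print(f"r_o is about {r_o / 1e3:.0f} kilo-ohm")
```

With these assumed readings ri falls in the range of a few thousand ohms and ro is of the order of 100 kΩ, in line with the magnitudes quoted in the text.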
(iii) Current amplification factor (β): This is defined as the ratio of
the change in collector current to the change in base current at a
constant collector-emitter voltage (VCE) when the transistor is in active
state.

$\beta_{ac} = \left( \dfrac{\Delta I_C}{\Delta I_B} \right)_{V_{CE}}$ (14.10)
This is also known as small signal current gain and its value is very
large.

If we simply find the ratio of IC and IB we get what is called βdc of the
transistor. Hence,

$\beta_{dc} = \dfrac{I_C}{I_B}$ (14.11)
Since IC increases with IB almost linearly and IC = 0 when IB = 0, the
values of both βdc and βac are nearly equal. So, for most calculations
βdc can be used. Both βac and βdc vary with VCE and IB (or IC) slightly.
Example 14.8 From the output characteristics shown in Fig. 14.30(b),
calculate the values of βac and βdc of the transistor when VCE is 10 V and IC
= 4.0 mA.
Solution

$\beta_{ac} = \left( \dfrac{\Delta I_C}{\Delta I_B} \right)_{V_{CE}}$, $\beta_{dc} = \dfrac{I_C}{I_B}$
For determining βac and βdc at the stated values of VCE and IC one can
proceed as follows. Consider any two characteristics for two values of IB which
lie above and below the given value of IC. Here IC = 4.0 mA. (Choose
characteristics for IB = 30 and 20 μA.) At VCE = 10 V we read the two values of
IC from the graph. Then
ΔIB = (30 − 20) μA = 10 μA, ΔIC = (4.5 − 3.0) mA = 1.5 mA
Therefore, βac = 1.5 mA / 10 μA = 150
For determining βdc, either estimate the value of IB corresponding to
IC = 4.0 mA at VCE = 10 V or calculate the two values of βdc for the two
characteristics chosen and find their mean.
Therefore, for IC = 4.5 mA and IB = 30 μA,
βdc = 4.5 mA / 30 μA = 150
and for IC = 3.0 mA and IB = 20 μA
βdc = 3.0 mA / 20 μA = 150
Hence, βdc = (150 + 150) / 2 = 150
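
The arithmetic of this example can be checked with a few lines of Python. The sketch below uses only the two readings quoted in the solution (IB = 30 μA and 20 μA with IC = 4.5 mA and 3.0 mA at VCE = 10 V). The function names and the linear interpolation between the two curves are illustrative choices, not part of the original solution.

```python
# Readings at VCE = 10 V quoted in the example: (IB in A, IC in A).
upper = (30e-6, 4.5e-3)
lower = (20e-6, 3.0e-3)


def beta_ac(p_high, p_low):
    """Small-signal gain: change in IC divided by change in IB."""
    return (p_high[1] - p_low[1]) / (p_high[0] - p_low[0])


def beta_dc(point):
    """Large-signal gain: IC divided by IB at a single point."""
    return point[1] / point[0]


def interpolate_ib(ic_target, p_high, p_low):
    """Linearly interpolate the base current that would give ic_target."""
    fraction = (ic_target - p_low[1]) / (p_high[1] - p_low[1])
    return p_low[0] + fraction * (p_high[0] - p_low[0])


ib_at_4mA = interpolate_ib(4.0e-3, upper, lower)
mean_beta_dc = (beta_dc(upper) + beta_dc(lower)) / 2

print(f"beta_ac = {beta_ac(upper, lower):.0f}")
print(f"beta_dc (mean of the two curves) = {mean_beta_dc:.0f}")
print(f"beta_dc (interpolated IB) = {4.0e-3 / ib_at_4mA:.0f}")
```

Both routes to βdc suggested in the solution (the mean of the two curves and the estimated IB at IC = 4.0 mA) give the same value here because the two readings lie on a straight line through the origin.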

14.9.3 Transistor as a device

The transistor can be used in different device applications depending on the
configuration used (namely CB, CC and CE), the biasing of the E-B
and B-C junctions and the operation region, namely cutoff, active region
and saturation. As mentioned earlier we have confined ourselves to the CE
configuration and will be concentrating on the biasing and the
operation region to understand the working of a device.
When the transistor is used in the cutoff or saturation state it acts as a
switch. On the other hand for using the transistor as an amplifier, it
has to operate in the active region.

(i) Transistor as a switch
We shall try to understand the operation of the transistor as a switch
by analysing the behaviour of the base-biased transistor in CE
configuration as shown in Fig. 14.31(a).
Applying Kirchhoff's voltage rule to the input and output sides of this
circuit, we get
VBB = IBRB + VBE (14.12)
and
VCE = VCC − ICRC. (14.13)

We shall treat VBB as the dc input voltage Vi and VCE as the dc output
voltage VO. So, we have
Vi = IBRB + VBE and
Vo = VCC − ICRC.
Let us see how Vo changes as Vi increases from zero onwards. In the
case of Si transistor, as long as input Vi is less than 0.6 V, the
transistor will be in cut off state and current IC will be zero.
Hence Vo = VCC
When Vi becomes greater than 0.6 V the transistor is in active state
with some current IC in the output path and the output Vo decreases as
the term ICRC increases. With increase of Vi , IC increases almost
linearly and so Vo decreases linearly till its value becomes less than
about 1.0 V.
Beyond this, the change becomes non-linear and the transistor goes into
saturation state. With further increase in Vi the output voltage is found
to decrease further towards zero though it may never become zero. If
we plot the Vo vs Vi curve [also called the transfer characteristic of
the base-biased transistor, Fig. 14.31(b)], we see that between cut off
state and active state and also between active state and saturation
state there are regions of non-linearity showing that the transition from
cutoff state to active state and from active state to saturation state are
not sharply defined.
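
The three regions of the transfer characteristic can be sketched with a deliberately idealised piecewise model. The code below assumes a constant VBE of 0.6 V once the transistor conducts, a sharp clamp of Vo at 0 V in saturation, and component values borrowed from Example 14.9 (RB = 100 kΩ, RC = 1 kΩ, VCC = 5.0 V, β = 250). A real transistor shows the rounded, non-linear transitions described above, which this model ignores.

```python
def transfer(v_i, v_cc=5.0, r_b=100e3, r_c=1e3, beta=250.0, v_be=0.6):
    """Idealised output voltage Vo (volts) for input voltage v_i (volts)."""
    if v_i <= v_be:                 # cutoff: no base current, so IC = 0
        return v_cc
    i_b = (v_i - v_be) / r_b        # from Vi = IB*RB + VBE
    i_c = beta * i_b                # active-region relation IC = beta*IB
    v_o = v_cc - i_c * r_c          # from Vo = VCC - IC*RC
    return max(v_o, 0.0)            # saturation: Vo is clamped near 0 V


for v_i in (0.3, 1.6, 2.6, 4.0):
    print(f"Vi = {v_i:.1f} V -> Vo = {transfer(v_i):.2f} V")
```

In this model the output stays at VCC below 0.6 V, falls linearly with Vi in the active region, and sits at the clamp once ICRC would exceed VCC.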
Figure 14.31 (a) Base-biased transistor in CE configuration, (b) Transfer
characteristic.

Let us see now how the transistor is operated as a switch. As long as
Vi is low and unable to forward-bias the transistor, Vo is high (at VCC).
If Vi is high enough to drive the transistor into saturation, then Vo is
low, very near to zero. When the transistor is not conducting it is said
to be switched off and when it is driven into saturation it is said to be
switched on. This shows that if we define low and high states as below
and above certain voltage levels corresponding to cutoff and
saturation of the transistor, then we can say that a low input switches
the transistor off and a high input switches it on. Alternatively, we can
say that a low input to the transistor gives a high output and a high
input gives a low output. The switching circuits are designed in such a
way that the transistor does not remain in active state.
(ii) Transistor as an amplifier
For using the transistor as an amplifier we will use the active region of
the Vo versus Vi curve. The slope of the linear part of the curve
represents the rate of change of the output with the input. It is
negative because the output is VCC − ICRC and not ICRC. That is why
as input voltage of the CE amplifier increases its output voltage
decreases and the output is said to be out of phase with the input. If
we consider ΔVo and ΔVi as small changes in the output and input
voltages then ΔVo/ΔVi is called the small signal voltage gain AV of the
amplifier.

If the VBB voltage has a fixed value corresponding to the mid point of
the active region, the circuit will behave as a CE amplifier with voltage
gain ΔVo/ΔVi. We can express the voltage gain AV in terms of the
resistors in the circuit and the current gain of the transistor as follows.
We have, Vo = VCC − ICRC
Therefore, ΔVo = 0 − RC ΔIC
Similarly, from Vi = IBRB + VBE
ΔVi = RB ΔIB + ΔVBE
But ΔVBE is negligibly small in comparison to ΔIBRB in this circuit.

So, the voltage gain of this CE amplifier (Fig. 14.32) is given by
AV = −RC ΔIC / RB ΔIB
= −βac (RC/RB) (14.14)
where βac is equal to ΔIC/ΔIB from Eq. (14.10). Thus the linear
portion of the active region of the transistor can be exploited for the
use in amplifiers. Transistor as an amplifier (CE configuration) is
discussed in detail in the next section.
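
As a quick numerical check of Eq. (14.14), the sketch below computes the gain in two ways: from the closed-form expression, and by following the small changes ΔIB, ΔIC, ΔVo and ΔVi step by step with ΔVBE neglected. The component values (βac = 100, RC = 2 kΩ, RB = 100 kΩ) are assumptions chosen for illustration.

```python
def voltage_gain(beta_ac, r_c, r_b):
    """Closed-form small-signal gain of the base-biased CE stage."""
    return -beta_ac * r_c / r_b


def gain_from_small_change(beta_ac, r_c, r_b, delta_ib):
    """Same gain, obtained by propagating a small base-current change."""
    delta_vi = r_b * delta_ib        # change in VBE neglected
    delta_ic = beta_ac * delta_ib    # definition of beta_ac
    delta_vo = -r_c * delta_ic       # from Vo = VCC - IC*RC with VCC fixed
    return delta_vo / delta_vi


a_v = voltage_gain(beta_ac=100, r_c=2e3, r_b=100e3)
a_v_check = gain_from_small_change(100, 2e3, 100e3, delta_ib=1e-6)
print(f"A_V = {a_v:.2f} (closed form), {a_v_check:.2f} (step by step)")
```

The negative sign survives in both routes, which is the phase reversal described in the text.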

14.9.4 Transistor as an Amplifier (CE-Configuration)

To operate the transistor as an amplifier it is necessary to fix its
operating point somewhere in the middle of its active region. If we fix
the value of VBB corresponding to a point in the middle of the linear
part of the transfer curve then the dc base current IB would be
constant and corresponding collector current IC will also be constant.
The dc voltage VCE = VCC - ICRC would also remain constant. The
operating values of VCE and IB determine the operating point of the
amplifier.
If a small sinusoidal voltage with amplitude vs is superposed on the dc
base bias by connecting the source of that signal in series with the
VBB supply, then the base current will have sinusoidal variations
superimposed on the value of IB. As a consequence the collector
current also will have sinusoidal variations superimposed on the value
of IC, producing in turn corresponding change in the value of VO. We
can measure the ac variations across the input and output terminals
by blocking the dc voltages by large capacitors.
In the description of the amplifier given above we have not considered
any ac signal. In general, amplifiers are used to amplify alternating
signals. Now let us superimpose an ac input signal vi (to be amplified)
on the bias VBB (dc) as shown in Fig. 14.32. The output is taken
between the collector and the ground.

Figure 14.32 A simple circuit of a CE-transistor amplifier.

The working of an amplifier can be easily understood, if we first
assume that vi = 0. Then applying Kirchhoff's law to the output loop,
we get
VCC = VCE + ICRL (14.15)
Likewise, the input loop gives
VBB = VBE + IB RB (14.16)
When vi is not zero, we get
VBB + vi = VBE + IB RB + ΔIB (RB + ri)
The change in VBE can be related to the input resistance ri [see Eq.
(14.8)] and the change in IB. Hence
vi = ΔIB (RB + ri)
= r ΔIB
The change in IB causes a change in IC. We define a parameter βac,
which is similar to the βdc defined in Eq. (14.11), as

$\beta_{ac} = \dfrac{\Delta I_C}{\Delta I_B} = \dfrac{i_c}{i_b}$ (14.17)
which is also known as the ac current gain Ai. Usually βac is close to
βdc in the linear region of the output characteristics.
The change in IC due to a change in IB causes a change in VCE and
the voltage drop across the resistor RL because VCC is fixed.
These changes can be given by Eq. (14.15) as
ΔVCC = ΔVCE + RL ΔIC = 0
or ΔVCE = −RL ΔIC
The change in VCE is the output voltage v0. From Eq. (14.10), we get
v0 = ΔVCE = −βac RL ΔIB
The voltage gain of the amplifier is
$A_v = \dfrac{v_0}{v_i} = \dfrac{\Delta V_{CE}}{r\,\Delta I_B} = -\dfrac{\beta_{ac} R_L}{r}$ (14.18)
The negative sign indicates that the output voltage is opposite in
phase to the input voltage.

From the discussion of the transistor characteristics you have seen
that there is a current gain βac in the CE configuration. Here we have
also seen the voltage gain Av. Therefore the power gain Ap can be
expressed as the product of the current gain and voltage gain.
Mathematically
Ap = βac × Av (14.19)
Since βac and the magnitude of Av are greater than 1, we get ac power gain. However
it should be realised that the transistor is not a power generating device.
The energy for the higher ac power at the output is supplied by the
battery.
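
The gain expressions of this section can be put together in a short calculation. The sketch below takes r = RB + ri as in the derivation above, uses Av = −βac RL / r for the voltage gain, and reports the magnitude of the power gain βac × Av, since the sign of Av only encodes the phase reversal. All component values are hypothetical and chosen to give round numbers.

```python
def ce_amplifier_gains(beta_ac, r_load, r_b, r_i):
    """Return (voltage gain, magnitude of power gain) of the CE amplifier."""
    r = r_b + r_i                      # total resistance seen by the signal
    a_v = -beta_ac * r_load / r        # voltage gain, negative: phase reversal
    a_p = beta_ac * abs(a_v)           # magnitude of current gain x voltage gain
    return a_v, a_p


# Hypothetical values: beta_ac = 100, RL = 2 kilo-ohm, RB = ri = 1 kilo-ohm.
a_v, a_p = ce_amplifier_gains(beta_ac=100, r_load=2e3, r_b=1e3, r_i=1e3)
print(f"A_v = {a_v:.0f}, |A_p| = {a_p:.0f}")
```

The extra ac power at the output is drawn from the dc supply, as the text points out; the calculation only describes how the signal is scaled.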

Example 14.9 In Fig. 14.31(a), the VBB supply can be varied from 0 V to 5.0 V.
The Si transistor has βdc = 250 and RB = 100 kΩ, RC = 1 kΩ, VCC = 5.0 V.
Assume that when the transistor is saturated, VCE = 0 V and VBE = 0.8 V.
Calculate (a) the minimum base current for which the transistor will reach
saturation. Hence, (b) determine Vi when the transistor is switched on, and (c)
find the ranges of Vi for which the transistor is switched off and switched on.
Solution
Given at saturation VCE = 0 V, VBE = 0.8 V
VCE = VCC − ICRC
IC = VCC/RC = 5.0 V / 1.0 kΩ = 5.0 mA
Therefore IB = IC/β = 5.0 mA / 250 = 20 μA
The input voltage at which the transistor will go into saturation is given by
VIH = VBB = IBRB + VBE
= 20 μA × 100 kΩ + 0.8 V = 2.8 V
The value of input voltage below which the transistor remains cutoff is given by
VIL = 0.6 V, VIH = 2.8 V
Between 0.0 V and 0.6 V, the transistor will be in the switched off state.
Between 2.8 V and 5.0 V, it will be in switched on state.
Note that the transistor is in active state when IB varies from 0.0 μA to 20 μA.
In this range, IC = βIB is valid. In the saturation range,
IC ≤ βIB.
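
The steps of this example translate directly into code. The sketch below uses only the values given in the problem statement; the function switch_state and its three labels are illustrative additions, and the 0.6 V cutoff threshold is the Si value used in the solution.

```python
BETA_DC = 250
R_B = 100e3        # ohm
R_C = 1e3          # ohm
V_CC = 5.0         # volt
V_CE_SAT = 0.0     # volt, as assumed in the problem
V_BE_SAT = 0.8     # volt, as assumed in the problem
V_IL = 0.6         # volt, cutoff threshold for a Si transistor

i_c_sat = (V_CC - V_CE_SAT) / R_C     # collector current at saturation
i_b_min = i_c_sat / BETA_DC           # minimum base current for saturation
v_ih = i_b_min * R_B + V_BE_SAT       # input voltage that just saturates


def switch_state(v_i):
    """Classify the transistor state for a given input voltage in volts."""
    if v_i < V_IL:
        return "off"
    if v_i >= v_ih:
        return "on"
    return "active"


print(f"IC(sat) = {i_c_sat * 1e3:.1f} mA, IB(min) = {i_b_min * 1e6:.0f} uA")
print(f"VIH = {v_ih:.1f} V")
for v in (0.3, 2.0, 4.0):
    print(f"Vi = {v:.1f} V -> {switch_state(v)}")
```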
Example 14.10 For a CE transistor amplifier, the audio signal voltage across
the collector resistance of 2.0 kΩ is 2.0 V. Suppose the current amplification
factor of the transistor is 100. What should be the value of RB in series with
the VBB supply of 2.0 V if the dc base current has to be 10 times the signal
current? Also calculate the dc drop across the collector resistance. (Refer to
Fig. 14.33).
Solution The output ac voltage is 2.0 V. So, the ac collector current iC =
2.0 V / 2000 Ω = 1.0 mA. The signal current through the base is, therefore, given by
iB = iC/β = 1.0 mA / 100 = 0.010 mA. The dc base current has to be 10 × 0.010
= 0.10 mA.
From Eq. (14.16), RB = (VBB − VBE)/IB. Assuming VBE = 0.6 V,
RB = (2.0 − 0.6) V / 0.10 mA = 14 kΩ.
The dc collector current IC = 100 × 0.10 mA = 10 mA.
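
The same chain of steps can be written as a short script. The sketch below uses the values of the example and the assumed VBE = 0.6 V; its last step evaluates the dc drop across the collector resistance as IC × RC, which is the quantity the question asks for. The function name and the dictionary keys are illustrative.

```python
def bias_design(v_out_ac, r_c, beta, v_bb, v_be=0.6, dc_to_signal_ratio=10):
    """Follow the steps of the example; volts, ohms and amperes throughout."""
    i_c_ac = v_out_ac / r_c                  # signal collector current
    i_b_ac = i_c_ac / beta                   # signal base current
    i_b_dc = dc_to_signal_ratio * i_b_ac     # required dc base current
    r_b = (v_bb - v_be) / i_b_dc             # from VBB = VBE + IB*RB
    i_c_dc = beta * i_b_dc                   # dc collector current
    v_rc_dc = i_c_dc * r_c                   # dc drop across RC
    return {"i_b_dc": i_b_dc, "r_b": r_b, "i_c_dc": i_c_dc, "v_rc_dc": v_rc_dc}


result = bias_design(v_out_ac=2.0, r_c=2.0e3, beta=100, v_bb=2.0)
print(f"RB = {result['r_b'] / 1e3:.0f} kilo-ohm")
print(f"IC(dc) = {result['i_c_dc'] * 1e3:.0f} mA")
print(f"dc drop across RC = {result['v_rc_dc']:.0f} V")
```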

14.9.5 Feedback amplifier and transistor oscillator


In an amplifier, we have seen that a sinusoidal input is given which
appears as an amplified signal in the output. This means that an
external input is necessary to sustain ac signal in the output for an
amplifier. In an oscillator, we get ac output without any external input
signal. In other words, the output in an oscillator is self-sustained. To
attain this, an amplifier is taken. A portion of the output power is
returned back (feedback) to the input in phase with the starting power
(this process is termed positive feedback) as shown in Fig. 14.33(a).
The feedback can be achieved by inductive coupling (through mutual
inductance) or LC or RC networks. Different types of oscillators
essentially use different methods of coupling the output to the input
(feedback network), apart from the resonant circuit for obtaining
oscillation at a particular frequency. For understanding the oscillator
action, we consider the circuit shown in Fig. 14.33(b) in which the
feedback is accomplished by inductive coupling from one coil winding
(T1) to another coil winding (T2). Note that the coils T2 and T1 are
wound on the same core and hence are inductively coupled through
their mutual inductance. As in an amplifier, the base-emitter junction is
forward biased while the base-collector junction is reverse biased.
Detailed biasing circuits actually used have been omitted for simplicity.
Figure 14.33 (a) Principle of a transistor amplifier with positive feedback working
as an oscillator and (b) Tuned collector oscillator, (c) Rise and fall (or build up) of
current Ic and Ie due to the inductive coupling.
Let us try to understand how oscillations are built. Suppose switch S1
is put on to apply proper bias for the first time. Obviously, a surge of
collector current flows in the transistor. This current flows through the
coil T2 whose terminals are numbered 3 and 4 [Fig. 14.33(b)]. This
current does not reach full amplitude instantaneously but increases
from X to Y, as shown in Fig. 14.33(c)(i). The inductive coupling
between coil T2 and coil T1 now causes a current to flow in the emitter
circuit (note that this actually is the feedback from output to input). As
a result of this positive feedback, this current (in T1; emitter current)
also increases from X to Y [Fig. 14.33(c)(ii)]. The current in T2
(collector current) connected in the collector circuit acquires the value
Y when the transistor becomes saturated. This means that maximum
collector current is flowing and can increase no further. Since there is
no further change in collector current, the magnetic field around T2
ceases to grow. As soon as the field becomes static, there will be no
further feedback from T2 to T1. Without continued feedback, the
emitter current begins to fall. Consequently, collector current
decreases from Y towards Z [Fig. 14.33(c)(i)]. However, a decrease of
collector current causes the magnetic field to decay around the coil T2.
Thus, T1 is now seeing a decaying field in T2 (opposite from what it
saw when the field was growing at the initial start operation). This
causes a further decrease in the emitter current till it reaches Z when
the transistor is cut-off. This means that both IE and IC cease to flow.
Therefore, the transistor has reverted back to its original state (when
the power was first switched on). The whole process now repeats
itself. That is, the transistor is driven to saturation, then to cut-off, and
then back to saturation. The time for change from saturation to cut-off
and back is determined by the constants of the tank circuit or tuned
circuit (inductance L of coil T2 and C connected in parallel to it). The
resonance frequency (ν) of this tuned circuit determines the frequency
at which the oscillator will oscillate.

$\nu = \dfrac{1}{2\pi\sqrt{LC}}$ (14.20)
In the circuit of Fig. 14.33(b), the tank or tuned circuit is connected in
the collector side. Hence, it is known as tuned collector oscillator. If
the tuned circuit is on the base side, it will be known as tuned base
oscillator. There are many other types of tank circuits (say RC) or
feedback circuits giving different types of oscillators like Colpitts
oscillator, Hartley oscillator, RC-oscillator.
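
For a tuned circuit made of an inductance L in parallel with a capacitance C, the oscillation frequency follows the standard LC resonance relation ν = 1/(2π√(LC)). The sketch below evaluates it for one pair of component values; L = 1 mH and C = 100 nF are assumptions chosen for illustration and do not come from Fig. 14.33.

```python
import math


def resonant_frequency(inductance_h, capacitance_f):
    """Resonance frequency in hertz of an ideal LC tank circuit."""
    if inductance_h <= 0 or capacitance_f <= 0:
        raise ValueError("L and C must be positive")
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


f0 = resonant_frequency(1e-3, 100e-9)   # assumed L = 1 mH, C = 100 nF
print(f"Oscillation frequency is about {f0 / 1e3:.1f} kHz")
```

Because the frequency varies as 1/√(LC), making C four times larger halves the frequency.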

14.10 DIGITAL ELECTRONICS AND LOGIC GATES

In electronic circuits like amplifiers and oscillators, introduced to you in
earlier sections, the signal (current or voltage) has been in the form of
continuous, time-varying voltage or current. Such signals are called
continuous or analogue signals. A typical analogue signal is shown in
Fig. 14.34(a). Fig. 14.34(b) shows a pulse waveform in which only
discrete values of voltages are possible. It is convenient to use binary
numbers to represent such signals. A binary number has only two
digits 0 (say, 0V) and 1 (say, 5V). In digital electronics we use only
these two levels of voltage as shown in Fig. 14.34(b). Such signals are
called Digital Signals. In digital circuits only two values (represented
by 0 or 1) of the input and output voltage are permissible.
This section is intended to provide the first step in our understanding
of digital electronics. We shall restrict our study to some basic building
blocks of digital electronics (called Logic Gates) which process the
digital signals in a specific manner. Logic gates are used in
calculators, digital watches, computers, robots, industrial control
systems, and in telecommunications.
A light switch in your house can be used as an example of a digital
circuit. The light is either ON or OFF depending on the switch position.
When the light is ON, the output value is 1. When the light is OFF the
output value is 0. The inputs are the position of the light switch. The
switch is placed either in the ON or OFF position to activate the light.

14.10.1 Logic gates

Figure 14.34 (a) Analogue signal, (b) Digital signal.

A gate is a digital circuit that follows a certain logical relationship
between the input and output voltages. Therefore, they are generally
known as logic gates (gates because they control the flow of
information). The five common logic gates used are NOT, AND, OR,
NAND, NOR. Each logic gate is indicated by a symbol and its function
is defined by a truth table that shows all the possible input logic level
combinations with their respective output logic levels. Truth tables
help understand the behaviour of logic gates.

Figure 14.35 (a) Logic symbol, (b) Truth table of NOT gate.

These logic gates can be realised using semiconductor devices.


(i) NOT gate
This is the most basic gate, with one input and one output. It produces
a 1 output if the input is 0 and vice-versa. That is, it produces an
inverted version of the input at its output. This is why it is also known
as an inverter. The commonly used symbol together with the truth
table for this gate is given in Fig. 14.35.
(ii) OR Gate
An OR gate has two or more inputs with one output. The logic symbol
and truth table are shown in Fig. 14.36. The output Y is 1 when either
input A or input B or both are 1s, that is, if any of the inputs is high, the
output is high.

Figure 14.36 (a) Logic symbol (b) Truth table of OR gate.

Apart from carrying out the above mathematical logic operation, this
gate can be used for modifying the pulse waveform as explained in
the following example.
Example 14.11 Justify the output waveform (Y) of the OR gate for the
following inputs A and B given in Fig. 14.37.
Solution Note the following:
At t < t1; A = 0, B = 0; Hence Y = 0
For t1 to t2; A = 1, B = 0; Hence Y = 1
For t2 to t3; A = 1, B = 1; Hence Y = 1
For t3 to t4; A = 0, B = 1; Hence Y = 1
For t4 to t5; A = 0, B = 0; Hence Y = 0
For t5 to t6; A = 1, B = 0; Hence Y = 1
For t > t6; A = 0, B = 1; Hence Y = 1
Therefore the waveform Y will be as shown in the Fig. 14.37.
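
The interval-by-interval reasoning of this example can be reproduced by applying the OR operation to each pair of input levels. The sketch below lists the (A, B) values exactly as given in the solution; the function and variable names are illustrative.

```python
def or_gate(a, b):
    """Two-input OR: output is 1 if either input (or both) is 1."""
    return 1 if (a or b) else 0


intervals = ["t < t1", "t1 to t2", "t2 to t3", "t3 to t4",
             "t4 to t5", "t5 to t6", "t > t6"]
inputs = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0), (1, 0), (0, 1)]

or_output = [or_gate(a, b) for a, b in inputs]
for label, (a, b), y in zip(intervals, inputs, or_output):
    print(f"{label:9s} A = {a}, B = {b} -> Y = {y}")
```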

Figure 14.37

(iii) AND Gate
An AND gate has two or more inputs and one output. The output Y of
AND gate is 1 only when input A and input B are both 1. The logic
symbol and truth table for this gate are given in Fig. 14.38.

Figure 14.38 (a) Logic symbol, (b) Truth table of AND gate.

Example 14.12 Take A and B input waveforms similar to that in Example
14.11. Sketch the output waveform obtained from AND gate.
Solution
For t < t1; A = 0, B = 0; Hence Y = 0
For t1 to t2; A = 1, B = 0; Hence Y = 0
For t2 to t3; A = 1, B = 1; Hence Y = 1
For t3 to t4; A = 0, B = 1; Hence Y = 0
For t4 to t5; A = 0, B = 0; Hence Y = 0
For t5 to t6; A = 1, B = 0; Hence Y = 0
For t > t6; A = 0, B = 1; Hence Y = 0
Based on the above, the output waveform for AND gate can be drawn as given
below.
Figure 14.39
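
The AND waveform of Example 14.12 can be generated in the same interval-by-interval way. In the sketch below the two input waveforms are stored as separate lists of levels, one entry per time interval, using the values listed in the solution; the names are illustrative.

```python
def and_gate(a, b):
    """Two-input AND: output is 1 only when both inputs are 1."""
    return a & b


# Levels of A and B in the seven intervals (t < t1, t1 to t2, ..., t > t6).
wave_a = [0, 1, 1, 0, 0, 1, 0]
wave_b = [0, 0, 1, 1, 0, 0, 1]

and_output = [and_gate(a, b) for a, b in zip(wave_a, wave_b)]
print("A:", wave_a)
print("B:", wave_b)
print("Y:", and_output)
```

The output is high only in the interval t2 to t3, the one interval in which both inputs are high together.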

(iv) NAND Gate
This is an AND gate followed by a NOT gate. If inputs A and B are
both 1, the output Y is not 1. The gate gets its name from this NOT
AND behaviour. Figure 14.40 shows the symbol and truth table of
NAND gate.
NAND gates are also called Universal Gates since by using these
gates you can realise other basic gates like OR, AND and NOT
(Exercises 14.16 and 14.17).
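
The claim that NAND gates are universal can be checked exhaustively for two inputs. The sketch below builds NOT, AND and OR from a single nand function using the usual constructions (NOT by tying both inputs together, AND by inverting a NAND, OR by inverting both inputs first) and compares each against its definition for every input combination. The function names are illustrative.

```python
from itertools import product


def nand(a, b):
    """Two-input NAND: output is 0 only when both inputs are 1."""
    return 0 if (a and b) else 1


def not_from_nand(a):
    return nand(a, a)                      # both inputs tied together


def and_from_nand(a, b):
    return not_from_nand(nand(a, b))       # NAND followed by NOT


def or_from_nand(a, b):
    return nand(not_from_nand(a), not_from_nand(b))   # invert inputs first


print(" A B | NOT A  AND  OR")
for a, b in product((0, 1), repeat=2):
    print(f" {a} {b} |   {not_from_nand(a)}     "
          f"{and_from_nand(a, b)}    {or_from_nand(a, b)}")
```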

Figure 14.40 (a) Logic symbol, (b) Truth table of NAND gate.
Example 14.13 Sketch the output Y from a NAND gate having inputs A and B
given below:
Solution
For t < t1; A = 1, B = 1; Hence Y = 0
For t1 to t2; A = 0, B = 0; Hence Y = 1
For t2 to t3; A = 0, B = 1; Hence Y = 1
For t3 to t4; A = 1, B = 0; Hence Y = 1
For t4 to t5; A = 1, B = 1; Hence Y = 0
For t5 to t6; A = 0, B = 0; Hence Y = 1
For t > t6; A = 0, B = 1; Hence Y = 1
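
The same table can be produced by evaluating the NAND operation on the input levels of each interval. The sketch below uses the (A, B) values listed in the solution above; the names are illustrative.

```python
def nand_gate(a, b):
    """Two-input NAND: the inverted output of an AND gate."""
    return 1 - (a & b)


intervals = ["t < t1", "t1 to t2", "t2 to t3", "t3 to t4",
             "t4 to t5", "t5 to t6", "t > t6"]
inputs = [(1, 1), (0, 0), (0, 1), (1, 0), (1, 1), (0, 0), (0, 1)]

nand_output = [nand_gate(a, b) for a, b in inputs]
for label, (a, b), y in zip(intervals, inputs, nand_output):
    print(f"{label:9s} A = {a}, B = {b} -> Y = {y}")
```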

Figure 14.41

(v) NOR Gate
It has two or more inputs and one output. A NOT operation applied
after OR gate gives a NOT-OR gate (or simply NOR gate). Its output Y
is 1 only when both inputs A and B are 0, i.e., neither one input nor
the other is 1. The symbol and truth table for NOR gate are given
in Fig. 14.42.

Figure 14.42 (a) Logic symbol, (b) Truth table of NOR gate.

NOR gates are considered as universal gates because you can obtain
all the gates like AND, OR, NOT by using only NOR gates (Exercises
14.18 and 14.19).
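
The universality of the NOR gate can be verified in the same exhaustive way. The sketch below builds NOT, OR and AND from a single nor function (NOT by tying the inputs together, OR by inverting a NOR, AND by inverting both inputs before the NOR) and checks every two-input combination. The function names are illustrative.

```python
from itertools import product


def nor(a, b):
    """Two-input NOR: output is 1 only when both inputs are 0."""
    return 0 if (a or b) else 1


def not_from_nor(a):
    return nor(a, a)                       # both inputs tied together


def or_from_nor(a, b):
    return not_from_nor(nor(a, b))         # NOR followed by NOT


def and_from_nor(a, b):
    return nor(not_from_nor(a), not_from_nor(b))   # invert inputs first


nor_checks = [
    (not_from_nor(a) == 1 - a,
     or_from_nor(a, b) == (a | b),
     and_from_nor(a, b) == (a & b))
    for a, b in product((0, 1), repeat=2)
]
print("All constructions agree:", all(all(row) for row in nor_checks))
```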

14.11 INTEGRATED CIRCUITS

The conventional method of making circuits is to choose components
like diodes, transistors, R, L, C etc., and connect them by soldering
wires in the desired manner. In spite of the miniaturisation introduced
by the discovery of transistors, such circuits were still bulky. Apart
from this, such circuits were less reliable and less shock proof. The
concept of fabricating an entire circuit (consisting of many passive
components like R and C and active devices like diode and transistor)
on a small single block (or chip) of a semiconductor has revolutionised
the electronics technology. Such a circuit is known as Integrated
Circuit (IC). The most widely used technology is the Monolithic
Integrated Circuit. The word monolithic is a combination of two Greek
words: monos means single and lithos means stone. This, in effect,
means that the entire circuit is formed on a single silicon crystal (or
chip). The chip dimensions are as small as 1 mm × 1 mm or it could
even be smaller. Figure 14.43 shows a chip in its protective plastic
case, partly removed to reveal the connections coming out from the
chip to the pins that enable it to make external connections.
Depending on nature of input signals, ICs can be grouped in two
categories: (a) linear or analogue ICs and (b) digital ICs. The linear
ICs process analogue signals which change smoothly and
continuously over a range of values between a maximum and a
minimum. The output is more or less directly proportional to the input,
i.e., it varies linearly with the input. One of the most useful linear ICs
is the operational amplifier.

Figure 14.43 The casing and connection of a chip.


The digital ICs process signals that have only two values. They
contain circuits such as logic gates. Depending upon the level of
integration (i.e., the number of circuit components or logic gates), the
ICs are termed as Small Scale Integration, SSI (logic gates ≤ 10);
Medium Scale Integration, MSI (logic gates ≤ 100); Large Scale
Integration, LSI (logic gates ≤ 1000); and Very Large Scale
Integration, VLSI (logic gates > 1000). The technology of fabrication is
very involved but large scale industrial production has made them very
inexpensive.
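
The classification by level of integration amounts to a simple threshold lookup. The sketch below treats each upper bound (10, 100 and 1000 gates) as inclusive, which is an assumption made so that every gate count falls into exactly one class; the function name is illustrative.

```python
def integration_level(gate_count):
    """Return the integration class for a given number of logic gates."""
    if gate_count < 1:
        raise ValueError("gate count must be at least 1")
    if gate_count <= 10:       # assumed inclusive upper bound
        return "SSI"
    if gate_count <= 100:      # assumed inclusive upper bound
        return "MSI"
    if gate_count <= 1000:     # assumed inclusive upper bound
        return "LSI"
    return "VLSI"


for n in (4, 64, 800, 50000):
    print(f"{n} gates -> {integration_level(n)}")
```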

FASTER AND SMALLER: THE FUTURE OF COMPUTER TECHNOLOGY

The Integrated Chip (IC) is at the heart of all computer systems. In fact ICs are
found in almost all electrical devices like cars, televisions, CD players, cell
phones etc. The miniaturisation that made the modern personal computer
possible could never have happened without the IC. ICs are electronic devices
that contain many transistors, resistors, capacitors, connecting wires all in
one package. You must have heard of the microprocessor. The
microprocessor is an IC that processes all information in a computer, like
keeping track of what keys are pressed, running programmes, games etc. The
IC was first invented by Jack Kilby at Texas Instruments in 1958 and he was
awarded the Nobel Prize for this in 2000. ICs are produced on a piece of
semiconductor crystal (or chip) by a process called photolithography. Thus, the
entire Information Technology (IT) industry hinges on semiconductors. Over
the years, the complexity of ICs has increased while the size of its features
continued to shrink. In the past five decades, a dramatic miniaturisation in
computer technology has made modern day computers faster and smaller. In
the 1970s, Gordon Moore, co-founder of INTEL, pointed out that the memory
capacity of a chip (IC) approximately doubled every one and a half years. This
is popularly known as Moore's law. The number of transistors per chip has
risen exponentially and each year computers are becoming more powerful, yet
cheaper than the year before. It is estimated from current trends that the
computers available in 2020 will operate at 40 GHz (40,000 MHz) and would
be much smaller, more efficient and less expensive than present day
computers. The explosive growth in the semiconductor industry and computer
technology is best expressed by a famous quote from Gordon Moore: "If the
auto industry advanced as rapidly as the semiconductor industry, a Rolls
Royce would get half a million miles per gallon, and it would be cheaper to
throw it away than to park it."

SUMMARY

1. Semiconductors are the basic materials used in the present solid state
electronic devices like diode, transistor, ICs, etc.
2. Lattice structure and the atomic structure of constituent elements decide
whether a particular material will be insulator, metal or semiconductor.

3. Metals have low resistivity (10⁻² to 10⁻⁸ Ω m), insulators have very high
resistivity (>10⁸ Ω m), while semiconductors have intermediate values of
resistivity.
4. Semiconductors are elemental (Si, Ge) as well as compound (GaAs, CdS,
etc.).
5. Pure semiconductors are called intrinsic semiconductors. The presence of
charge carriers (electrons and holes) is an intrinsic property of the material
and these are obtained as a result of thermal excitation. The number of
electrons (ne) is equal to the number of holes (nh) in intrinsic semiconductors.
Holes are essentially electron vacancies with an effective positive charge.
6. The number of charge carriers can be changed by doping of a suitable
impurity in pure semiconductors. Such semiconductors are known as extrinsic
semiconductors. These are of two types (n-type and p-type).
7. In n-type semiconductors, ne >> nh while in p-type semiconductors nh >>
ne.
8. n-type semiconducting Si or Ge is obtained by doping with pentavalent
atoms (donors) like As, Sb, P, etc., while p-type Si or Ge can be obtained by
doping with trivalent atom (acceptors) like B, Al, In etc.

9. ne nh = ni² in all cases. Further, the material possesses an overall charge
neutrality.
10. There are two distinct bands of energies (called valence band and
conduction band) in which the electrons in a material lie. Valence band
energies are low as compared to conduction band energies. All energy levels
in the valence band are filled while energy levels in the conduction band may
be fully empty or partially filled. The electrons in the conduction band are free
to move in a solid and are responsible for the conductivity. The extent of
conductivity depends upon the energy gap (Eg) between the top of valence
band (EV) and the bottom of the conduction band EC. The electrons from
valence band can be excited by heat, light or electrical energy to the
conduction band and thus, produce a change in the current flowing in a
semiconductor.
11. For insulators Eg > 3 eV, for semiconductors Eg is 0.2 eV to 3 eV, while for
metals Eg ≈ 0.
12. p-n junction is the key to all semiconductor devices. When such a junction
is made, a depletion layer is formed consisting of immobile ion-cores devoid
of their electrons or holes. This is responsible for a junction potential barrier.
13. By changing the external applied voltage, junction barriers can be
changed. In forward bias (n-side is connected to negative terminal of the
battery and p-side is connected to the positive), the barrier is decreased while
the barrier increases in reverse bias. Hence, forward bias current is more (mA)
while reverse bias current is very small (μA) in a p-n junction diode.
14. Diodes can be used for rectifying an ac voltage (restricting the ac voltage
to one direction). With the help of a capacitor or a suitable filter, a dc voltage
can be obtained.
15. There are some special purpose diodes.
16. Zener diode is one such special purpose diode. In reverse bias, after a
certain voltage, the current suddenly increases (breakdown voltage) in a Zener
diode. This property has been used to obtain voltage regulation.
17. p-n junctions have also been used to obtain many photonic
or optoelectronic devices where one of the participating entities is the photon: (a)
Photodiodes in which photon excitation results in a change of reverse
saturation current which helps us to measure light intensity; (b) Solar cells
which convert photon energy into electricity; (c) Light Emitting Diode and Diode
Laser in which electron excitation by a bias voltage results in the generation of
light.
18. Transistor is an n-p-n or p-n-p junction device. The central block (thin and
lightly doped) is called Base while the other electrodes are Emitter and
Collector. The emitter-base junction is forward biased while collector-base
junction is reverse biased.
19. The transistors can be connected in such a manner that either C or E or B
is common to both the input and output. This gives the three configurations in
which a transistor is used: Common Emitter (CE), Common Collector (CC) and
Common Base (CB). The plot between IC and VCE for fixed IB is called output
characteristics while the plot between IB and VBE with fixed VCE is called input
characteristics. The important transistor parameters for CE-configuration are:

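input resistance, $r_i = \left( \dfrac{\Delta V_{BE}}{\Delta I_B} \right)_{V_{CE}}$

output resistance, $r_o = \left( \dfrac{\Delta V_{CE}}{\Delta I_C} \right)_{I_B}$

current amplification factor, $\beta = \left( \dfrac{\Delta I_C}{\Delta I_B} \right)_{V_{CE}}$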

20. Transistor can be used as an amplifier and oscillator. In fact, an oscillator
can also be considered as a self-sustained amplifier in which a part of output is
fed-back to the input in the same phase (positive feedback). The voltage gain
of a transistor amplifier in common emitter configuration is:

$A_v = \dfrac{v_o}{v_i} = \beta \dfrac{R_C}{R_B}$, where RC and RB are respectively the resistances in
collector and base sides of the circuit.
21. When the transistor is used in the cutoff or saturation state, it acts as a
switch.
22. There are some special circuits which handle the digital data consisting of
0 and 1 levels. This forms the subject of Digital Electronics.
23. The important digital circuits performing special logic operations are called
logic gates. These are: OR, AND, NOT, NAND, and NOR gates.
24. In modern day circuits, many logic gates or circuits are integrated in one
single chip. These are known as Integrated Circuits (IC).
Points to Ponder
1. The energy bands (EC or EV) in the semiconductors are space delocalised
which means that these are not located in any specific place inside the solid.
The energies are the overall averages. When you see a picture in which EC or
EV are drawn as straight lines, then they should be respectively taken simply
as the bottom of conduction band energy levels and top of valence band
energy levels.
2. In elemental semiconductors (Si or Ge), the n-type or p-type
semiconductors are obtained by introducing dopants as defects. In compound
semiconductors, the change in relative stoichiometric ratio can also change the
type of semiconductor. For example, in ideal GaAs the ratio of Ga:As is 1:1 but
in Ga-rich or As-rich GaAs it could respectively be Ga1.1 As0.9 or Ga0.9 As1.1.
In general, the presence of defects controls the properties of semiconductors in
many ways.
3. In transistors, the base region is both narrow and lightly doped, otherwise
the electrons or holes coming from the input side (say, emitter in CE-
configuration) will not be able to reach the collector.
4. We have described an oscillator as a positive feedback amplifier. For stable
oscillations, the voltage feedback (Vfb) from the output voltage (Vo) should be
such that after amplification (A) it should again become Vo. If a fraction β′ is
fed back, then Vfb = Vo β′ and after amplification its value A(Vo β′) should be
equal to Vo. This means that the criterion for stable oscillations to be sustained
is A β′ = 1. This is known as Barkhausen's Criterion.
5. In an oscillator, the feedback is in the same phase (positive feedback). If the
feedback voltage is in opposite phase (negative feedback), the gain is less
than 1 and it can never work as oscillator. It will be an amplifier with reduced
gain. However, the negative feedback also reduces noise and distortion in an
amplifier which is an advantageous feature.

Exercises

14.1 In an n-type silicon, which of the following statements is true:


(a) Electrons are majority carriers and trivalent atoms are the
dopants.

(b) Electrons are minority carriers and pentavalent atoms are the
dopants.
(c) Holes are minority carriers and pentavalent atoms are the
dopants.
(d) Holes are majority carriers and trivalent atoms are the dopants.

14.2 Which of the statements given in Exercise 14.1 is true for p-type
semiconductors?
14.3 Carbon, silicon and germanium have four valence electrons
each. These are characterised by valence and conduction bands
separated by energy band gap respectively equal to (Eg)C, (Eg)Si
and (Eg)Ge. Which of the following statements is true?
(a) (Eg)Si < (Eg)Ge < (Eg)C
(b) (Eg)C < (Eg)Ge > (Eg)Si
(c) (Eg)C > (Eg)Si > (Eg)Ge

(d) (Eg)C = (Eg)Si = (Eg)Ge


14.4 In an unbiased p-n junction, holes diffuse from the p-region to
n-region because
(a) free electrons in the n-region attract them.
(b) they move across the junction by the potential difference.
(c) hole concentration in p-region is more as compared to n-
region.
(d) All the above.
14.5 When a forward bias is applied to a p-n junction, it
(a) raises the potential barrier.

(b) reduces the majority carrier current to zero.


(c) lowers the potential barrier.
(d) None of the above.
14.6 For transistor action, which of the following statements are
correct:
(a) Base, emitter and collector regions should have similar size
and doping concentrations.
(b) The base region must be very thin and lightly doped.
(c) The emitter junction is forward biased and collector junction is
reverse biased.
(d) Both the emitter junction as well as the collector junction are
forward biased.
14.7 For a transistor amplifier, the voltage gain
(a) remains constant for all frequencies.

(b) is high at high and low frequencies and constant in the middle
frequency range.
(c) is low at high and low frequencies and constant at mid
frequencies.
(d) None of the above.
14.8 In half-wave rectification, what is the output frequency if the
input frequency is 50 Hz? What is the output frequency of a full-wave
rectifier for the same input frequency?
14.9 For a CE-transistor amplifier, the audio signal voltage across
the collector resistance of 2 kΩ is 2 V. Suppose the current
amplification factor of the transistor is 100, find the input signal
voltage and base current, if the base resistance is 1 kΩ.
14.10 Two amplifiers are connected one after the other in series
(cascaded). The first amplifier has a voltage gain of 10 and the
second has a voltage gain of 20. If the input signal is 0.01 volt,
calculate the output ac signal.
14.11 A p-n photodiode is fabricated from a semiconductor with
band gap of 2.8 eV. Can it detect a wavelength of 6000 nm?
Additional Exercises
14.12 The number of silicon atoms per m³ is 5 × 10²⁸. This is
doped simultaneously with 5 × 10²² atoms per m³ of Arsenic and 5 ×
10²⁰ atoms per m³ of Indium. Calculate the number of electrons
and holes. Given that ni = 1.5 × 10¹⁶ m⁻³. Is the material n-type or
p-type?

14.13 In an intrinsic semiconductor the energy gap Eg is 1.2 eV. Its
hole mobility is much smaller than electron mobility and
independent of temperature. What is the ratio between
conductivity at 600 K and that at 300 K? Assume that the
temperature dependence of intrinsic carrier concentration ni is
given by

$n_i = n_0 \exp\left( -\dfrac{E_g}{2 k_B T} \right)$

where n0 is a constant.
14.14 In a p-n junction diode, the current I can be expressed as

$I = I_0 \left[ \exp\left( \dfrac{eV}{k_B T} \right) - 1 \right]$

where I0 is called the reverse saturation current, V is the voltage
across the diode and is positive for forward bias and negative for
reverse bias, and I is the current through the diode, kB is the
Boltzmann constant (8.6 × 10⁻⁵ eV/K) and T is the absolute
temperature. If for a given diode I0 = 5 × 10⁻¹² A and T = 300 K,
then
(a) What will be the forward current at a forward voltage of 0.6 V?
(b) What will be the increase in the current if the voltage across
the diode is increased to 0.7 V?
(c) What is the dynamic resistance?
(d) What will be the current if reverse bias voltage changes from 1
V to 2 V?
14.15 You are given the two circuits as shown in Fig. 14.44. Show
that circuit (a) acts as OR gate while the circuit (b) acts as AND
gate.

Figure 14.44

14.16 Write the truth table for a NAND gate connected as given
in Fig. 14.45.

Figure 14.45

Hence identify the exact logic operation carried out by this circuit.
14.17 You are given two circuits as shown in Fig. 14.46, which
consist of NAND gates. Identify the logic operation carried out by
the two circuits.

Figure 14.46

14.18 Write the truth table for circuit given in Fig. 14.47 below
consisting of NOR gates and identify the logic operation (OR,
AND, NOT) which this circuit is performing.
Figure 14.47

(Hint: A = 0, B = 1 then A and B inputs of second NOR gate will be
0 and hence Y=1. Similarly work out the values of Y for other
combinations of A and B. Compare with the truth table of OR,
AND, NOT gates and find the correct one.)
14.19 Write the truth table for the circuits given in Fig. 14.48
consisting of NOR gates only. Identify the logic operations (OR,
AND, NOT) performed by the two circuits.

Figure 14.48
Chapter Six

Work, Energy and Power

6.1 Introduction
6.2 Notions of work and kinetic energy : The work-energy theorem
6.3 Work
6.4 Kinetic energy
6.5 Work done by a variable force
6.6 The work-energy theorem for a variable force
6.7 The concept of potential energy
6.8 The conservation of mechanical energy
6.9 The potential energy of a spring
6.10 Various forms of energy : the law of conservation of energy
6.11 Power
6.12 Collisions
Summary
Points to ponder
Exercises
Additional exercises
Appendix 6.1

6.1 Introduction
The terms work, energy and power are frequently used in everyday
language. A farmer ploughing the field, a construction worker carrying
bricks, a student studying for a competitive examination, an artist
painting a beautiful landscape, all are said to be working. In physics,
however, the word Work covers a definite and precise meaning.
Somebody who has the capacity to work for 14-16 hours a day is said
to have a large stamina or energy. We admire a long distance runner
for her stamina or energy. Energy is thus our capacity to do work. In
Physics too, the term energy is related to work in this sense, but as
said above the term work itself is defined much more precisely. The
word power is used in everyday life with different shades of meaning.
In karate or boxing we talk of powerful punches. These are delivered
at a great speed. This shade of meaning is close to the meaning of the
word power used in physics. We shall find that there is at best a
loose correlation between the physical definitions and the
physiological pictures these terms generate in our minds. The aim of
this chapter is to develop an understanding of these three physical
quantities. Before we proceed to this task, we need to develop a
mathematical prerequisite, namely the scalar product of two vectors.

6.1.1 The Scalar Product


We have learnt about vectors and their use in Chapter 4. Physical
quantities like displacement, velocity, acceleration, force etc. are
vectors. We have also learnt how vectors are added or subtracted. We
now need to know how vectors are multiplied. There are two ways of
multiplying vectors which we shall come across : one way known as
the scalar product gives a scalar from two vectors and the other
known as the vector product produces a new vector from two vectors.
We shall look at the vector product in Chapter 7. Here we take up the
scalar product of two vectors. The scalar product or dot product of any
two vectors A and B, denoted as A.B (read A dot B) is defined as
A.B = A B cos θ (6.1a)
where θ is the angle between the two vectors as shown in Fig. 6.1(a).
Since A, B and cos θ are scalars, the dot product of A and B is a
scalar quantity. Each vector, A and B, has a direction but their scalar
product does not have a direction.
From Eq. (6.1a), we have
A.B = A (B cos θ)
= B (A cos θ)
Geometrically, B cos θ is the projection of B onto A in Fig. 6.1(b) and
A cos θ is the projection of A onto B in Fig. 6.1(c). So, A.B is the
product of the magnitude of A and the component of B along A.
Alternatively, it is the product of the magnitude of B and the
component of A along B.
Equation (6.1a) shows that the scalar product follows the commutative
law :
A.B = B.A
Scalar product obeys the distributive law:
A. (B + C) = A.B + A.C
Further, A.(λB) = λ(A.B)
where λ is a real number.
The proofs of the above equations are left to you as an exercise.

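For unit vectors $\hat{i}$, $\hat{j}$, $\hat{k}$ we have

$\hat{i} \cdot \hat{i} = \hat{j} \cdot \hat{j} = \hat{k} \cdot \hat{k} = 1$
$\hat{i} \cdot \hat{j} = \hat{j} \cdot \hat{k} = \hat{k} \cdot \hat{i} = 0$

Given two vectors

$\mathbf{A} = A_x \hat{i} + A_y \hat{j} + A_z \hat{k}$
$\mathbf{B} = B_x \hat{i} + B_y \hat{j} + B_z \hat{k}$

their scalar product is

$\mathbf{A} \cdot \mathbf{B} = (A_x \hat{i} + A_y \hat{j} + A_z \hat{k}) \cdot (B_x \hat{i} + B_y \hat{j} + B_z \hat{k})$
$= A_x B_x + A_y B_y + A_z B_z$ (6.1b)
From the definition of scalar product and (Eq. 6.1b) we have :
(i) $\mathbf{A} \cdot \mathbf{A} = A_x A_x + A_y A_y + A_z A_z$

Or, $A^2 = A_x^2 + A_y^2 + A_z^2$ (6.1c)
since A.A = |A||A| cos 0 = A².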
(ii) A.B = 0, if A and B are perpendicular.

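Example 6.1 Find the angle between force F = $(3\hat{i} + 4\hat{j} - 5\hat{k})$ unit
and displacement d = $(5\hat{i} + 4\hat{j} + 3\hat{k})$ unit. Also
find the projection of F on d.

Answer F.d = $F_x d_x + F_y d_y + F_z d_z$
= 3 (5) + 4 (4) + (−5) (3)
= 16 unit

Hence F.d = F d cos θ = 16 unit

Now F.F = F² = $F_x^2 + F_y^2 + F_z^2$
= 9 + 16 + 25
= 50 unit

and d.d = d² = $d_x^2 + d_y^2 + d_z^2$
= 25 + 16 + 9
= 50 unit

∴ cos θ = $\dfrac{16}{\sqrt{50}\,\sqrt{50}} = \dfrac{16}{50} = 0.32$,
θ = cos⁻¹ 0.32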
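The arithmetic of Example 6.1 can be checked numerically. The following is a minimal sketch in Python, assuming F = (3, 4, −5) and d = (5, 4, 3), which are the components that appear in the worked product F.d; the helper name `dot` is ours and is introduced only for illustration.

```python
import math

def dot(a, b):
    """Scalar product of two 3-component vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

F = (3, 4, -5)   # force components (assumed from the worked product)
d = (5, 4, 3)    # displacement components (assumed likewise)

F_dot_d = dot(F, d)                                       # 3*5 + 4*4 + (-5)*3
cos_theta = F_dot_d / math.sqrt(dot(F, F) * dot(d, d))    # A.B / (A B)
theta_deg = math.degrees(math.acos(cos_theta))            # angle between F and d
projection_F_on_d = F_dot_d / math.sqrt(dot(d, d))        # F cos(theta)

print(F_dot_d, cos_theta, theta_deg, projection_F_on_d)
```

Here `F_dot_d` is 16 and `cos_theta` is 0.32, in agreement with the worked answer; `projection_F_on_d` is the component of F along d, that is F cos θ = 16/√50 unit.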
Fig. 6.1 (a) The scalar product of two vectors A and B is a scalar : A.B = A B cos θ.
(b) B cos θ is the projection of B onto A. (c) A cos θ is the projection of A onto B.

6.2 Notions of work and kinetic energy: the work-energy theorem
The following relation for rectilinear motion under constant
acceleration a has been encountered in Chapter 3,
v² − u² = 2 as
where u and v are the initial and final speeds and s the distance
traversed. Multiplying both sides by m/2, we have

$\dfrac{1}{2} m v^2 - \dfrac{1}{2} m u^2 = m a s = F s$ (6.2a)
where the last step follows from Newton's Second Law. We can
generalise Eq. (6.2a) to three dimensions by employing vectors
v² − u² = 2 a.d
Once again multiplying both sides by m/2 , we obtain

$\dfrac{1}{2} m v^2 - \dfrac{1}{2} m u^2 = m\, \mathbf{a} \cdot \mathbf{d} = \mathbf{F} \cdot \mathbf{d}$ (6.2b)
The above equation provides a motivation for the definitions of work
and kinetic energy. The left side of the equation is the difference in the
quantity half the mass times the square of the speed from its initial
value to its final value. We call each of these quantities the kinetic
energy, denoted by K. The right side is a product of the displacement
and the component of the force along the displacement. This quantity
is called work and is denoted by W. Eq. (6.2b) is then
Kf − Ki = W (6.3)
where Ki and Kf are respectively the initial and final kinetic energies of
the object. Work refers to the force and the displacement over which it
acts. Work is done by a force on the body over a certain
displacement.
Equation (6.2) is also a special case of the work-energy (WE) theorem
: The change in kinetic energy of a particle is equal to the work
done on it by the net force. We shall generalise the above derivation
to a varying force in a later section.

Example 6.2 It is well known that a raindrop falls under the influence of the
downward gravitational force and the opposing resistive force. The latter is
known to be proportional to the speed of the drop but is otherwise undetermined.
Consider a drop of mass 1.00 g falling from a height 1.00 km. It hits the ground
with a speed of 50.0 m s⁻¹. (a) What is the work done by the gravitational force ?
(b) What is the work done by the unknown resistive force?

Answer (a) The change in kinetic energy of the drop is

$\Delta K = \dfrac{1}{2} m v^2 - 0 = \dfrac{1}{2} \times 10^{-3} \times 50 \times 50$
= 1.25 J
where we have assumed that the drop is initially at rest.
Assuming that g is a constant with a value 10 m/s², the work done by
the gravitational force is,
Wg = mgh
= 10⁻³ × 10 × 10³
= 10.0 J
(b) From the work-energy theorem

$\Delta K = W_g + W_r$
where Wr is the work done by the resistive force on the raindrop. Thus
Wr = ΔK − Wg
= 1.25 − 10
= −8.75 J
is negative.
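The bookkeeping in Example 6.2 is short enough to reproduce directly. The sketch below is only an illustration in Python; the variable names are ours, and g = 10 m/s² is the value assumed in the example itself.

```python
m = 1.00e-3   # mass of the drop in kg (1.00 g)
h = 1.00e3    # height of fall in m (1.00 km)
v = 50.0      # speed on reaching the ground in m/s
g = 10.0      # m/s^2, treated as constant as in the example

delta_K = 0.5 * m * v**2   # change in kinetic energy; the drop starts at rest
W_g = m * g * h            # work done by the gravitational force
W_r = delta_K - W_g        # work done by the resistive force (work-energy theorem)

print(f"delta_K = {delta_K:.2f} J, W_g = {W_g:.1f} J, W_r = {W_r:.2f} J")
```

The resistive force does negative work: most of the 10 J supplied by gravity is removed by air resistance, leaving only 1.25 J as kinetic energy.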

6.3 Work
As seen earlier, work is related to force and the displacement over
which it acts. Consider a constant force F acting on an object of mass
m. The object undergoes a displacement d in the positive x-direction
as shown in Fig. 6.2.

Fig. 6.2 An object undergoes a displacement d under the influence of the force F.

The work done by the force is defined to be the product of
component of the force in the direction of the displacement and
the magnitude of this displacement. Thus
W = (F cos θ)d = F.d (6.4)
We see that if there is no displacement, there is no work done even if
the force is large. Thus, when you push hard against a rigid brick wall,
the force you exert on the wall does no work. Yet your muscles are
alternately contracting and relaxing and internal energy is being
used up and you do get tired. Thus, the meaning of work in physics is
different from its usage in everyday language.
No work is done if :
(i) the displacement is zero as seen in the example above. A
weightlifter holding a 150 kg mass steadily on his shoulder for 30 s
does no work on the load during this time.
(ii) the force is zero. A block moving on a smooth horizontal table is
not acted upon by a horizontal force (since there is no friction), but
may undergo a large displacement.
(iii) the force and displacement are mutually perpendicular. This is so
since, for θ = π/2 rad (= 90°), cos (π/2) = 0. For the block moving on a
smooth horizontal table, the gravitational force mg does no work since
it acts at right angles to the displacement. If we assume that the
moon's orbit around the earth is perfectly circular then the earth's
gravitational force does no work. The moon's instantaneous
displacement is tangential while the earth's force is radially inwards
and θ = π/2.
Work can be both positive and negative. If θ is between 0° and 90°,
cos θ in Eq. (6.4) is positive. If θ is between 90° and 180°, cos θ is
negative. In many examples the frictional force opposes displacement
and θ = 180°. Then the work done by friction is negative (cos 180° =
−1).
From Eq. (6.4) it is clear that work and energy have the same
dimensions, [ML²T⁻²]. The SI unit of these is joule (J), named after the
famous British physicist James Prescott Joule (1818-1889). Since
work and energy are so widely used as physical concepts, alternative
units abound and some of these are listed in Table 6.1.

Table 6.1 Alternative Units of Work/Energy in J


Example 6.3 A cyclist comes to a skidding stop in 10 m. During this process, the
force on the cycle due to the road is 200 N and is directly opposed to the motion.
(a) How much work does the road do on the cycle ? (b) How much work does
the cycle do on the road ?

Answer Work done on the cycle by the road is the work done by the
stopping (frictional) force on the cycle due to the road.
(a) The stopping force and the displacement make an angle of 180° (π
rad) with each other. Thus, work done by the road,
Wr = Fd cos θ
= 200 × 10 × cos π
= −2000 J
It is this negative work that brings the cycle to a halt in accordance
with WE theorem.
(b) From Newton's Third Law an equal and opposite force acts on the
road due to the cycle. Its magnitude is 200 N. However, the road
undergoes no displacement. Thus, work done by cycle on the road is
zero.
The lesson of Example 6.3 is that though the force on a body A
exerted by the body B is always equal and opposite to that on B by A
(Newton's Third Law); the work done on A by B is not necessarily
equal and opposite to the work done on B by A.
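The asymmetry in Example 6.3 follows directly from W = F d cos θ. As a small hedged illustration in Python (the function name `work_done` is ours, not from the text), the same force magnitude gives different work because the two displacements differ:

```python
import math

def work_done(force, displacement, angle_rad):
    """Work by a constant force: W = F d cos(theta), in joules."""
    return force * displacement * math.cos(angle_rad)

# Road on cycle: 200 N opposing a 10 m displacement (theta = pi rad).
W_road_on_cycle = work_done(200.0, 10.0, math.pi)

# Cycle on road: same magnitude of force, but the road does not move.
W_cycle_on_road = work_done(200.0, 0.0, 0.0)

print(W_road_on_cycle, W_cycle_on_road)
```

The first value is negative and the second is zero, even though the two forces are equal and opposite.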

6.4 Kinetic energy


As noted earlier, if an object of mass m has velocity v, its kinetic
energy K is

$K = \dfrac{1}{2} m\, \mathbf{v} \cdot \mathbf{v} = \dfrac{1}{2} m v^2$ (6.5)
Kinetic energy is a scalar quantity. The kinetic energy of an object is a
measure of the work an object can do by the virtue of its motion. This
notion has been intuitively known for a long time.
Table 6.2 Typical kinetic energies (K)

The kinetic energy of a fast flowing stream has been used to grind
corn. Sailing ships employ the kinetic energy of the wind. Table 6.2
lists the kinetic energies for various objects.

Example 6.4 In a ballistics demonstration a police officer fires a bullet of mass
50.0 g with speed 200 m s⁻¹ (see Table 6.2) on soft plywood of thickness 2.00
cm. The bullet emerges with only 10% of its initial kinetic energy. What is the
emergent speed of the bullet ?

Answer The initial kinetic energy of the bullet is mv²/2 = 1000 J. It has
a final kinetic energy of 0.1 × 1000 = 100 J. If vf is the emergent speed
of the bullet,

$\dfrac{1}{2} m v_f^2 = 100\ \text{J}$
$v_f = \sqrt{\dfrac{2 \times 100\ \text{J}}{0.05\ \text{kg}}}$
= 63.2 m s⁻¹
The speed is reduced by approximately 68% (not 90%).
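Because kinetic energy goes as the square of the speed, a 90% loss of energy is not a 90% loss of speed. A minimal Python sketch of the numbers in Example 6.4 (variable names are ours) makes this explicit:

```python
import math

m = 0.050     # bullet mass in kg (50.0 g)
v_i = 200.0   # initial speed in m/s

K_i = 0.5 * m * v_i**2             # initial kinetic energy
K_f = 0.10 * K_i                   # bullet keeps 10% of its kinetic energy
v_f = math.sqrt(2.0 * K_f / m)     # emergent speed from K = (1/2) m v^2
speed_reduction = 1.0 - v_f / v_i  # fractional loss of speed

print(f"K_i = {K_i:.0f} J, K_f = {K_f:.0f} J, v_f = {v_f:.1f} m/s, "
      f"reduction = {100 * speed_reduction:.0f}%")
```

The ratio v_f / v_i equals √0.1, which is why the speed falls by roughly 68%.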

6.5 Work done by a variable force


A constant force is rare. It is the variable force, which is more
commonly encountered. Fig. 6.3 is a plot of a varying force in one
dimension.
If the displacement Δx is small, we can take the force F(x) as
approximately constant and the work done is then
ΔW = F(x) Δx
This is illustrated in Fig. 6.3(a). Adding successive rectangular areas
in Fig. 6.3(a) we get the total work done as

$W \cong \sum_{x_i}^{x_f} F(x)\, \Delta x$ (6.6)
where the summation is from the initial position xi to the final position
xf.
If the displacements are allowed to approach zero, then the number of
terms in the sum increases without limit, but the sum approaches a
definite value equal to the area under the curve in Fig. 6.3(b). Then
the work done is

$W = \lim_{\Delta x \to 0} \sum_{x_i}^{x_f} F(x)\, \Delta x = \int_{x_i}^{x_f} F(x)\, dx$ (6.7)
where 'lim' stands for the limit of the sum when Δx tends to zero.
Thus, for a varying force the work done can be expressed as a definite
integral of force over displacement (see also Appendix 3.1).

Fig. 6.3 (a) The shaded rectangle represents the work done by the varying force
F(x), over the small displacement Δx, ΔW = F(x) Δx. (b) adding the areas of all the
rectangles we find that for Δx → 0, the area under the curve is exactly equal to the
work done by F(x).

Example 6.5 A woman pushes a trunk on a railway platform which has a rough
surface. She applies a force of 100 N over a distance of 10 m. Thereafter, she
gets progressively tired and her applied force reduces linearly with distance to
50 N. The total distance through which the trunk has been moved is 20 m. Plot
the force applied by the woman and the frictional force, which is 50 N versus
displacement. Calculate the work done by the two forces over 20 m.
Answer

Fig. 6.4 Plot of the force F applied by the woman and the opposing frictional force f
versus displacement.

The plot of the applied force is shown in Fig. 6.4. At x = 20 m, F = 50
N (≠ 0). We are given that the frictional force f is |f| = 50 N. It opposes
motion and acts in a direction opposite to F. It is therefore, shown on
the negative side of the force axis.
The work done by the woman is
WF → area of the rectangle ABCD + area of the trapezium CEID

$W_F = 100 \times 10 + \dfrac{1}{2} (100 + 50) \times 10$
= 1000 + 750
= 1750 J
The work done by the frictional force is
Wf → area of the rectangle AGHI
Wf = (−50) × 20
= −1000 J
The area on the negative side of the force axis has a negative sign.
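The area calculation in Example 6.5 is the sum of F(x) Δx over thin strips, and it can be imitated numerically. The sketch below is an illustration only: the function names and the choice of 2000 strips are ours, and the force profile is our reading of the problem statement (100 N for the first 10 m, then falling linearly to 50 N at 20 m).

```python
def applied_force(x):
    """Force applied by the woman in N at position x (in m)."""
    if x <= 10.0:
        return 100.0
    return 100.0 - 5.0 * (x - 10.0)   # linear drop from 100 N to 50 N

def friction_force(x):
    """Frictional force in N; constant and opposite to the motion."""
    return -50.0

def work(force, x_i, x_f, n=2000):
    """Sum of F(x) * dx over n strips, with F sampled at each strip's midpoint."""
    dx = (x_f - x_i) / n
    return sum(force(x_i + (k + 0.5) * dx) * dx for k in range(n))

W_F = work(applied_force, 0.0, 20.0)
W_f = work(friction_force, 0.0, 20.0)
print(f"W_F = {W_F:.1f} J, W_f = {W_f:.1f} J")
```

Since the force is piecewise linear, the strip sum reproduces the rectangle-plus-trapezium areas, 1750 J and −1000 J, to within rounding error.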

6.6 The work-energy theorem for a variable force


We are now familiar with the concepts of work and kinetic energy to
prove the work-energy theorem for a variable force. We confine
ourselves to one dimension. The time rate of change of kinetic energy
is

$\dfrac{dK}{dt} = \dfrac{d}{dt}\left( \dfrac{1}{2} m v^2 \right) = m \dfrac{dv}{dt}\, v = F v$ (from Newton's Second Law)
$= F \dfrac{dx}{dt}$

Thus
dK = F dx
Integrating from the initial position (xi) to final position (xf), we have

$\int_{K_i}^{K_f} dK = \int_{x_i}^{x_f} F\, dx$

where, Ki and Kf are the initial and final kinetic energies
corresponding to xi and xf.

or $K_f - K_i = \int_{x_i}^{x_f} F\, dx$ (6.8a)
From Eq. (6.7), it follows that
Kf − Ki = W (6.8b)
Thus, the WE theorem is proved for a variable force.
While the WE theorem is useful in a variety of problems, it does not, in
general, incorporate the complete dynamical information of Newton's
second law. It is an integral form of Newton's second law. Newton's
second law is a relation between acceleration and force at any instant
of time. Work-energy theorem involves an integral over an interval of
time. In this sense, the temporal (time) information contained in the
statement of Newton's second law is 'integrated over' and is not
available explicitly. Another observation is that Newton's second law
for two or three dimensions is in vector form whereas the work-energy
theorem is in scalar form. In the scalar form, information with respect
to directions contained in Newton's second law is not present.
Example 6.6 A block of mass m = 1 kg, moving on a horizontal surface with
speed vi = 2 m s⁻¹ enters a rough patch ranging from x = 0.10 m to x = 2.01 m.
The retarding force Fr on the block in this range is inversely proportional to x
over this range,

$F_r = -\dfrac{k}{x}$ for 0.1 < x < 2.01 m
= 0 for x < 0.1 m and x > 2.01 m
where k = 0.5 J. What is the final kinetic energy and speed vf of the
block as it crosses this patch ?

Answer From Eq. (6.8a)

$K_f = K_i + \int_{0.1}^{2.01} \dfrac{(-k)}{x}\, dx$
$= \dfrac{1}{2} m v_i^2 - k \ln(x) \Big|_{0.1}^{2.01}$
$= \dfrac{1}{2} m v_i^2 - k \ln(2.01/0.1)$
= 2 − 0.5 ln (20.1)
= 2 − 1.5 = 0.5 J
$v_f = \sqrt{2 K_f / m} = 1$ m s⁻¹
Here, note that ln is a symbol for the natural logarithm to the base e
and not the logarithm to the base 10 [ln X = loge X = 2.303 log10 X].
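The result of Example 6.6 can be cross-checked by evaluating the work integral both in closed form and as a strip sum. This is a sketch under a stated assumption: the retarding force is taken to be Fr = −k/x inside the patch, which is what "inversely proportional to x" and the logarithm in the answer imply. The variable names and the number of strips are ours.

```python
import math

m = 1.0                # mass in kg
v_i = 2.0              # initial speed in m/s
k = 0.5                # constant in J
x1, x2 = 0.10, 2.01    # extent of the rough patch in m

K_i = 0.5 * m * v_i**2

# Closed form: integral of (-k/x) dx from x1 to x2 is -k ln(x2/x1).
K_f = K_i - k * math.log(x2 / x1)
v_f = math.sqrt(2.0 * K_f / m)

# Numerical check: sum F(x) dx over many thin strips (midpoint sampling).
n = 100_000
dx = (x2 - x1) / n
W_r = sum(-k / (x1 + (j + 0.5) * dx) * dx for j in range(n))

print(f"K_f = {K_f:.3f} J, v_f = {v_f:.3f} m/s, W_r = {W_r:.4f} J")
```

`math.log` is the natural logarithm, as the note in the example requires. The strip sum agrees with K_f − K_i, which is the work-energy theorem for this variable force.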

6.7 The concept of potential energy


The word potential suggests possibility or capacity for action. The term
potential energy brings to one's mind 'stored' energy. A stretched
bow-string possesses potential energy. When it is released, the arrow
flies off at a great speed. The earth's crust is not uniform, but has
discontinuities and dislocations that are called fault lines. These fault
lines in the earth's crust are like 'compressed springs'. They possess
a large amount of potential energy. An earthquake results when these
fault lines readjust. Thus, potential energy is the stored energy by
virtue of the position or configuration of a body. The body left to itself
releases this stored energy in the form of kinetic energy. Let us make
our notion of potential energy more concrete.
The gravitational force on a ball of mass m is mg. g may be treated as
a constant near the earth surface. By 'near' we imply that the height h
of the ball above the earth's surface is very small compared to the
earth's radius RE (h << RE) so that we can ignore the variation of g
near the earth's surface*. In what follows we have taken the upward
direction to be positive. Let us raise the ball up to a height h. The work
done by the external agency against the gravitational force is mgh.
This work gets stored as potential energy. Gravitational potential
energy of an object, as a function of the height h, is denoted by V(h)
and it is the negative of work done by the gravitational force in raising
the object to that height.
V (h) = mgh
If h is taken as a variable, it is easily seen that the gravitational force F
equals the negative of the derivative of V(h) with respect to h. Thus,

$F = -\dfrac{d}{dh} V(h) = -m g$
The negative sign indicates that the gravitational force is downward.
When released, the ball comes down with an increasing speed. Just
before it hits the ground, its speed is given by the kinematic relation,
v² = 2gh
This equation can be written as

$\dfrac{1}{2} m v^2 = m g h$
which shows that the gravitational potential energy of the object at
height h, when the object is released, manifests itself as kinetic energy
of the object on reaching the ground.
* The variation of g with height is discussed in Chapter 8 on Gravitation.
Physically, the notion of potential energy is applicable only to the class
of forces where work done against the force gets 'stored up' as
energy. When external constraints are removed, it manifests itself as
kinetic energy. Mathematically, (for simplicity, in one dimension) the
potential energy V(x) is defined if the force F(x) can be written as

$F(x) = -\dfrac{dV}{dx}$

This implies that

$\int_{x_i}^{x_f} F(x)\, dx = -\int_{V_i}^{V_f} dV = V_i - V_f$

The work done by a conservative force such as gravity depends on
the initial and final positions only. In the previous chapter we have
worked on examples dealing with inclined planes. If an object of mass
m is released from rest, from the top of a smooth (frictionless) inclined
plane of height h, its speed at the bottom is $\sqrt{2gh}$ irrespective of the
angle of inclination. Thus, at the bottom of the inclined plane it
acquires a kinetic energy, mgh. If the work done or the kinetic energy
did depend on other factors such as the velocity or the particular path
taken by the object, the force would be called non-conservative.
The dimensions of potential energy are [ML²T⁻²] and the unit is joule
(J), the same as kinetic energy or work. To reiterate, the change in
potential energy, for a conservative force, ΔV is equal to the negative
of the work done by the force
ΔV = −F(x) Δx (6.9)
In the example of the falling ball considered in this section we saw
how potential energy was converted to kinetic energy. This hints at an
important principle of conservation in mechanics, which we now
proceed to examine.

6.8 The conservation of mechanical energy


For simplicity we demonstrate this important principle for one-dimensional
motion. Suppose that a body undergoes displacement Δx
under the action of a conservative force F. Then from the WE theorem
we have,
ΔK = F(x) Δx
If the force is conservative, the potential energy function V(x) can be
defined such that
−ΔV = F(x) Δx
The above equations imply that
ΔK + ΔV = 0
Δ(K + V ) = 0 (6.10)
which means that K + V, the sum of the kinetic and potential energies
of the body is a constant. Over the whole path, xi to xf, this means that
Ki + V(xi ) = Kf + V(xf ) (6.11)
The quantity K +V(x), is called the total mechanical energy of the
system. Individually the kinetic energy K and the potential energy V(x)
may vary from point to point, but the sum is a constant. The aptness of
the term conservative force is now clear.
Let us consider some of the definitions of a conservative force.
A force F(x) is conservative if it can be derived from a scalar quantity
V(x) by the relation given by Eq. (6.9). The three-dimensional
generalisation requires the use of a vector derivative, which is outside
the scope of this book.
The work done by the conservative force depends only on the end
points. This can be seen from the relation,
W = Kf − Ki = V(xi) − V(xf)
which depends on the end points.
A third definition states that the work done by this force in a closed
path is zero. This is once again apparent from Eq. (6.11) since xi = xf .
Thus, the principle of conservation of total mechanical energy can be
stated as
The total mechanical energy of a system is conserved if the
forces, doing work on it, are conservative.
The above discussion can be made more concrete by considering the
example of the gravitational force once again and that of the spring
force in the next section. Fig. 6.5 depicts a ball of mass m being
dropped from a cliff of height H.

Fig. 6.5 The conversion of potential energy to kinetic energy for a ball of mass m
dropped from a height H.
The total mechanical energies E0, Eh, and EH of the ball at the
indicated heights zero (ground level), h and H, are
EH = mgH (6.11 a)

$E_h = m g h + \dfrac{1}{2} m v_h^2$ (6.11 b)
E0 = (1/2) mvf² (6.11 c)
The constant force is a special case of a spatially dependent force
F(x). Hence, the mechanical energy is conserved. Thus
EH = E0

or, $m g H = \dfrac{1}{2} m v_f^2$

$v_f = \sqrt{2 g H}$

a result that was obtained in section 3.7 for a freely falling body.
Further,
EH = Eh
which implies,

$v_h^2 = 2 g (H - h)$ (6.11 d)
and is a familiar result from kinematics.
At the height H, the energy is purely potential. It is partially converted
to kinetic at height h and is fully kinetic at ground level. This illustrates
the conservation of mechanical energy.

Example 6.7 A bob of mass m is suspended by a light string of length L. It is
imparted a horizontal velocity $v_0$ at the lowest point A such that it completes a
semi-circular trajectory in the vertical plane with the string becoming slack only
on reaching the topmost point, C. This is shown in Fig. 6.6. Obtain an expression
for (i) $v_0$; (ii) the speeds at points B and C; (iii) the ratio of the kinetic energies
($K_B/K_C$) at B and C. Comment on the nature of the trajectory of the bob after it
reaches the point C.

Fig. 6.6

Answer (i) There are two external forces on the bob : gravity and the
tension (T) in the string. The latter does no work since the
displacement of the bob is always normal to the string. The potential
energy of the bob is thus associated with the gravitational force only.
The total mechanical energy E of the system is conserved. We take
the potential energy of the system to be zero at the lowest point A.
Thus, at A :
$E = \frac{1}{2} m v_0^2$ (6.12)
$T_A - mg = \frac{m v_0^2}{L}$ [Newton's Second Law]
where TA is the tension in the string at A. At the highest point C, the
string slackens, as the tension in the string (TC) becomes zero.
Thus, at C
$E = \frac{1}{2} m v_C^2 + 2mgL$ (6.13)
$mg = \frac{m v_C^2}{L}$ [Newton's Second Law] (6.14)
where $v_C$ is the speed at C. From Eqs. (6.13) and (6.14)
$E = \frac{5}{2} mgL$
Equating this to the energy at A
$\frac{5}{2} mgL = \frac{m}{2} v_0^2$
or, $v_0 = \sqrt{5gL}$
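(ii) It is clear from Eq. (6.14)
$v_C = \sqrt{gL}$
At B, the energy is
$E = \frac{1}{2} m v_B^2 + mgL$
Equating this to the energy at A and employing the result from (i),
namely $v_0^2 = 5gL$,
$\frac{1}{2} m v_B^2 + mgL = \frac{1}{2} m v_0^2 = \frac{5}{2} mgL$
$\therefore v_B = \sqrt{3gL}$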
(iii) The ratio of the kinetic energies at B and C is :
$\frac{K_B}{K_C} = \frac{\frac{1}{2} m v_B^2}{\frac{1}{2} m v_C^2} = \frac{3}{1}$
At point C, the string becomes slack and the velocity of the bob is
horizontal and to the left. If the connecting string is cut at this instant,
the bob will execute a projectile motion with horizontal projection akin
to a rock kicked horizontally from the edge of a cliff. Otherwise the bob
will continue on its circular path and complete the revolution.
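The results of Example 6.7 can be checked numerically. The sketch below is only an illustration: the values g = 9.8 m s$^{-2}$ and L = 1.0 m, and the function name `bob_speeds`, are assumptions that do not appear in the text. It evaluates the three speeds and confirms that the energy per unit mass is the same at A, B and C.

```python
import math

G = 9.8  # assumed value of g in m/s^2

def bob_speeds(L, g=G):
    """Speeds of the bob at A, B and C for a string of length L (in m)."""
    v0 = math.sqrt(5 * g * L)   # lowest point A, from energy conservation
    vB = math.sqrt(3 * g * L)   # point B, level with the point of suspension
    vC = math.sqrt(g * L)       # topmost point C, where the tension vanishes
    return v0, vB, vC

L = 1.0                         # assumed string length in m
v0, vB, vC = bob_speeds(L)

# Energy per unit mass, with the potential energy taken as zero at A.
E_A = 0.5 * v0**2
E_B = 0.5 * vB**2 + G * L
E_C = 0.5 * vC**2 + 2 * G * L
ratio = vB**2 / vC**2           # K_B / K_C

print(f"v0 = {v0:.2f}, vB = {vB:.2f}, vC = {vC:.2f} m/s, KB/KC = {ratio:.1f}")
```

The ratio $K_B/K_C$ does not depend on the assumed values of g and L.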

6.9 The potential energy of a spring


The spring force is an example of a variable force which is
conservative. Fig. 6.7 shows a block attached to a spring and resting
on a smooth horizontal surface. The other end of the spring is
attached to a rigid wall. The spring is light and may be treated as
massless. In an ideal spring, the spring force Fs is proportional to x
where x is the displacement of the block from the equilibrium position.
The displacement could be either positive [Fig. 6.7(b)] or negative
[Fig. 6.7(c)]. This force law for the spring is called Hooke's law and is
mathematically stated as
$F_s = -kx$
The constant k is called the spring constant. Its unit is N m$^{-1}$. The
spring is said to be stiff if k is large and soft if k is small.
Suppose that we pull the block outwards as in Fig. 6.7(b). If the
extension is xm, the work done by the spring force is
$W_s = \int_0^{x_m} F_s\,dx = -\int_0^{x_m} kx\,dx = -\frac{k x_m^2}{2}$ (6.15)
This expression may also be obtained by considering the area of the
triangle as in Fig. 6.7(d). Note that the work done by the external
pulling force F is positive since it overcomes the spring force.
$W = +\frac{k x_m^2}{2}$ (6.16)

Fig. 6.7 Illustration of the spring force with a block attached to the free end of the
spring. (a) The spring force $F_s$ is zero when the displacement x from the
equilibrium position is zero. (b) For the stretched spring x > 0 and $F_s$ < 0. (c) For
the compressed spring x < 0 and $F_s$ > 0. (d) The plot of $F_s$ versus x. The area of
the shaded triangle represents the work done by the spring force. Due to the
opposing signs of $F_s$ and x, this work done is negative, $W_s = -k x_m^2/2$.

The same is true when the spring is compressed with a displacement
$x_c$ (< 0). The spring force does work $W_s = -k x_c^2/2$ while the external
force F does work $+k x_c^2/2$. If the block is moved from an initial
displacement $x_i$ to a final displacement $x_f$, the work done by the spring
force $W_s$ is
$W_s = -\int_{x_i}^{x_f} kx\,dx = \frac{k x_i^2}{2} - \frac{k x_f^2}{2}$ (6.17)
Thus the work done by the spring force depends only on the end
points. Specifically, if the block is pulled from $x_i$ and allowed to return
to $x_i$ ;
$W_s = -\int_{x_i}^{x_i} kx\,dx = \frac{k x_i^2}{2} - \frac{k x_i^2}{2} = 0$ (6.18)
The work done by the spring force in a cyclic process is zero. We
have explicitly demonstrated that the spring force (i) is position
dependent only as first stated by Hooke, ($F_s = -kx$); (ii) does work
which only depends on the initial and final positions, e.g. Eq. (6.17).
Thus, the spring force is a conservative force.
We define the potential energy V(x) of the spring to be zero when the
block and spring system is in the equilibrium position. For an
extension (or compression) x the above analysis suggests that
$V(x) = \frac{1}{2} k x^2$ (6.19)
You may easily verify that $-dV/dx = -kx$, the spring force. If the block
of mass m in Fig. 6.7 is extended to $x_m$ and released from rest, then
its total mechanical energy at any arbitrary point x, where x lies
between $-x_m$ and $+x_m$, will be given by
$\frac{1}{2} k x_m^2 = \frac{1}{2} k x^2 + \frac{1}{2} m v^2$
where we have invoked the conservation of mechanical energy. This
suggests that the speed and the kinetic energy will be maximum at the
equilibrium position, x = 0, i.e.,
$\frac{1}{2} m v_m^2 = \frac{1}{2} k x_m^2$
where $v_m$ is the maximum speed,
or $v_m = \sqrt{\frac{k}{m}}\, x_m$
Note that k/m has the dimensions of [T$^{-2}$] and our equation is
dimensionally correct. The kinetic energy gets converted to potential
energy and vice versa, however, the total mechanical energy remains
constant. This is graphically depicted in Fig. 6.8.

Fig. 6.8 Parabolic plots of the potential energy V and kinetic energy K of a block
attached to a spring obeying Hooke's law. The two plots are complementary, one
decreasing as the other increases. The total mechanical energy E = K + V remains
constant.
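The exchange between kinetic and potential energy described above can be tabulated with a short Python sketch. The numerical values (k = 100 N m$^{-1}$, m = 1 kg, $x_m$ = 0.1 m) and the function name `spring_energies` are assumptions chosen only for illustration.

```python
import math

def spring_energies(k, m, x_m, x):
    """Return (V, K, v) at displacement x for a block released from rest at x_m."""
    V = 0.5 * k * x**2                       # potential energy, Eq. (6.19)
    K = max(0.5 * k * x_m**2 - V, 0.0)       # kinetic energy from conservation
    v = math.sqrt(2 * K / m)                 # speed of the block
    return V, K, v

k, m, x_m = 100.0, 1.0, 0.1                  # assumed values: N/m, kg, m
for x in (-0.1, -0.05, 0.0, 0.05, 0.1):
    V, K, v = spring_energies(k, m, x_m, x)
    print(f"x = {x:+.2f} m  V = {V:.3f} J  K = {K:.3f} J  E = {V + K:.3f} J")

# The speed is largest at the equilibrium position x = 0.
v_max = spring_energies(k, m, x_m, 0.0)[2]
```

For these assumed values the maximum speed agrees with $v_m = \sqrt{k/m}\, x_m$ = 1 m s$^{-1}$, and the column E stays fixed while V and K vary.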

Example 6.8 To simulate car accidents, auto manufacturers study the collisions
of moving cars with mounted springs of different spring constants. Consider a
typical simulation with a car of mass 1000 kg moving with a speed 18.0 km/h on
a smooth road and colliding with a horizontally mounted spring of spring
constant $6.25 \times 10^{3}$ N m$^{-1}$. What is the maximum compression of the spring ?
Answer At maximum compression the kinetic energy of the car is
converted entirely into the potential energy of the spring.
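The kinetic energy of the moving car is
$K = \frac{1}{2} m v^2 = \frac{1}{2} \times 10^3 \times 5 \times 5$
$K = 1.25 \times 10^4$ J
where we have converted 18 km h$^{-1}$ to 5 m s$^{-1}$ [It is useful to
remember that 36 km h$^{-1}$ = 10 m s$^{-1}$]. At maximum compression $x_m$,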
the potential energy V of the spring is equal to the kinetic energy K of
the moving car from the principle of conservation of mechanical
energy.
$V = \frac{1}{2} k x_m^2 = 1.25 \times 10^4$ J
We obtain
$x_m = 2.00$ m
We note that we have idealised the situation. The spring is
considered to be massless. The surface has been considered to
possess negligible friction.
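The arithmetic of Example 6.8 can be reproduced with a few lines of Python. This is a minimal sketch under the same idealisations (massless spring, no friction); the function name `max_compression` is an assumption made for illustration.

```python
import math

def max_compression(m, v, k):
    """Maximum spring compression (m) when all kinetic energy is stored in the spring."""
    kinetic = 0.5 * m * v**2            # kinetic energy of the car, in J
    return math.sqrt(2 * kinetic / k)   # from (1/2) k x_m^2 = K

v = 18.0 * 1000 / 3600                  # 18 km/h converted to m/s
x_m = max_compression(m=1000.0, v=v, k=6.25e3)
print(f"v = {v:.1f} m/s, x_m = {x_m:.2f} m")
```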
We conclude this section by making a few remarks on conservative
forces.
(i) Information on time is absent from the above discussions. In the
example considered above, we can calculate the compression, but not
the time over which the compression occurs. A solution of Newton's
Second Law for this system is required for temporal information.
(ii) Not all forces are conservative. Friction, for example, is a non-
conservative force. The principle of conservation of energy will have to
be modified in this case. This is illustrated in Example 6.9.
(iii) The zero of the potential energy is arbitrary. It is set according to
convenience. For the spring force we took V(x) = 0, at x = 0, i.e. the
unstretched spring had zero potential energy. For the constant
gravitational force mg, we took V = 0 on the earth's surface. In a later
chapter we shall see that for the force due to the universal law of
gravitation, the zero is best defined at an infinite distance from the
gravitational source. However, once the zero of the potential energy is
fixed in a given discussion, it must be consistently adhered to
throughout the discussion. You cannot change horses in midstream !

Example 6.9 Consider Example 6.8 taking the coefficient of friction, $\mu$, to be 0.5
and calculate the maximum compression of the spring.

Answer In presence of friction, both the spring force and the frictional
force act so as to oppose the compression of the spring as shown in
Fig. 6.9.
We invoke the work-energy theorem, rather than the conservation of
mechanical energy.
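The change in kinetic energy is
$\Delta K = K_f - K_i = 0 - \frac{1}{2} m v^2$

Fig. 6.9 The forces acting on the car.

The work done by the net force is
$W = -\frac{1}{2} k x_m^2 - \mu m g\, x_m$
Equating we have
$\frac{1}{2} m v^2 = \frac{1}{2} k x_m^2 + \mu m g\, x_m$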
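Now $\mu m g = 0.5 \times 10^3 \times 10 = 5 \times 10^3$ N (taking g = 10.0 m s$^{-2}$). After
rearranging the above equation we obtain the following quadratic
equation in the unknown $x_m$.
$k x_m^2 + 2\mu m g\, x_m - m v^2 = 0$
$x_m = \frac{-\mu m g + \sqrt{\mu^2 m^2 g^2 + m k v^2}}{k}$
where we take the positive square root since $x_m$ is positive. Putting in
numerical values we obtain
$x_m = 1.35$ m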
which, as expected, is less than the result in Example 6.8.
If the two forces on the body consist of a conservative force $F_c$ and a
non-conservative force $F_{nc}$, the conservation of mechanical energy
formula will have to be modified. By the WE theorem
$(F_c + F_{nc})\,\Delta x = \Delta K$
But $F_c\,\Delta x = -\Delta V$
Hence, $\Delta(K + V) = F_{nc}\,\Delta x$
$\Delta E = F_{nc}\,\Delta x$
where E is the total mechanical energy. Over the path this assumes
the form
$E_f - E_i = W_{nc}$
where $W_{nc}$ is the total work done by the non-conservative forces over
the path. Note that unlike the conservative force, $W_{nc}$ depends on the
particular path i to f.
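Example 6.9 can be checked in the same way. The sketch below solves the quadratic for the compression and then verifies that the change in mechanical energy equals the work done by friction. The function name `max_compression_with_friction` is an assumption; g = 10 m s$^{-2}$ follows the example.

```python
import math

def max_compression_with_friction(m, v, k, mu, g=10.0):
    """Positive root of k x^2 + 2 mu m g x - m v^2 = 0."""
    f = mu * m * g                                   # frictional force, in N
    return (-f + math.sqrt(f**2 + m * k * v**2)) / k

m, v, k, mu, g = 1000.0, 5.0, 6.25e3, 0.5, 10.0
x_m = max_compression_with_friction(m, v, k, mu, g)

E_i = 0.5 * m * v**2            # initial mechanical energy (all kinetic)
E_f = 0.5 * k * x_m**2          # final mechanical energy (all in the spring)
W_nc = -mu * m * g * x_m        # work done by friction over the compression

print(f"x_m = {x_m:.2f} m, E_f - E_i = {E_f - E_i:.0f} J, W_nc = {W_nc:.0f} J")
```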
6.10 Various forms of energy : the law of
conservation of energy
In the previous section we have discussed mechanical energy. We
have seen that it can be classified into two distinct categories : one
based on motion, namely kinetic energy; the other on configuration
(position), namely potential energy. Energy comes in many forms
which transform into one another in ways which may not often be clear
to us.

6.10.1 Heat
We have seen that the frictional force is not a conservative force.
However, work is associated with the force of friction, Example 6.5. A
block of mass m sliding on a rough horizontal surface with speed v0
comes to a halt over a distance $x_0$. The work done by the force of
kinetic friction f over $x_0$ is $-f x_0$. By the work-energy theorem
$m v_0^2/2 = f x_0$. If we confine our scope to mechanics, we would say
that the kinetic energy of the block is lost due to the frictional force.
On examination of the block and the table we would detect a slight
increase in their temperatures. The work done by friction is not lost,
but is transferred as heat energy. This raises the internal energy of the
block and the table. In winter, in order to feel warm, we generate heat
by vigorously rubbing our palms together. We shall see later that the
internal energy is associated with the ceaseless, often random, motion
of molecules. A quantitative idea of the transfer of heat energy is
obtained by noting that 1 kg of water releases about 42000 J of
energy when it cools by 10 °C.

6.10.2 Chemical Energy


One of the greatest technical achievements of humankind occurred
when we discovered how to ignite and control fire. We learnt to rub
two flint stones together (mechanical energy), got them to heat up and
to ignite a heap of dry leaves (chemical energy), which then provided
sustained warmth. A matchstick ignites into a bright flame when struck
against a specially prepared chemical surface. The lighted matchstick,
when applied to a firecracker, results in a spectacular display of sound
and light.
Chemical energy arises from the fact that the molecules participating
in the chemical reaction have different binding energies. A stable
chemical compound has less energy than the separated parts. A
chemical reaction is basically a rearrangement of atoms. If the total
energy of the reactants is more than the products of the reaction, heat
is released and the reaction is said to be an exothermic reaction. If
the reverse is true, heat is absorbed and the reaction is endothermic.
Coal consists of carbon and a kilogram of it when burnt releases about
$3 \times 10^{7}$ J of energy.
Chemical energy is associated with the forces that give rise to the
stability of substances. These forces bind atoms into molecules,
molecules into polymeric chains, etc. The chemical energy arising
from the combustion of coal, cooking gas, wood and petroleum is
indispensable to our daily existence.

6.10.3 Electrical Energy


The flow of electrical current causes bulbs to glow, fans to rotate and
bells to ring. There are laws governing the attraction and repulsion of
charges and currents, which we shall learn later. Energy is associated
with an electric current. An urban Indian household consumes about
200 J of energy per second on an average.

6.10.4 The Equivalence of Mass and Energy


Till the end of the nineteenth century, physicists believed that in every
physical and chemical process, the mass of an isolated system is
conserved. Matter might change its phase, e.g. glacial ice could melt
into a gushing stream, but matter is neither created nor destroyed.
Albert Einstein (1879-1955), however, showed that mass and energy
are equivalent and are related by the relation
$E = m c^2$ (6.20)
where c, the speed of light in vacuum, is approximately $3 \times 10^8$ m s$^{-1}$.
Thus, a staggering amount of energy is associated with a mere
kilogram of matter
$E = 1 \times (3 \times 10^8)^2$ J $= 9 \times 10^{16}$ J.
This is equivalent to the annual electrical output of a large (3000 MW)
power generating station.
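The comparison with a power station can be made explicit. The following sketch assumes that the station runs at its full 3000 MW for a 365-day year; the names `rest_energy` and `SECONDS_PER_YEAR` are illustrative and not from the text.

```python
C = 3.0e8                            # speed of light in m/s (approximate value used above)
SECONDS_PER_YEAR = 365 * 24 * 3600   # assumed: continuous operation for 365 days

def rest_energy(mass_kg):
    """Energy equivalent of a mass, E = m c^2, in joules."""
    return mass_kg * C**2

E = rest_energy(1.0)
annual_output = 3000e6 * SECONDS_PER_YEAR   # output of a 3000 MW station, in J
print(f"E = {E:.1e} J, station output = {annual_output:.1e} J per year")
```

Under these assumptions the two numbers agree to within about 5 per cent.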

6.10.5 Nuclear Energy


The most destructive weapons made by man, the fission and fusion
bombs are manifestations of the above equivalence of mass and
energy [Eq. (6.20)].
Table 6.3 Approximate energy associated with various
phenomena
On the other hand the explanation of the life-nourishing energy output
of the sun is also based on the above equation. In this case effectively
four light hydrogen nuclei fuse to form a helium nucleus whose mass
is less than the sum of the masses of the reactants. This mass
difference, called the mass defect $\Delta m$, is the source of energy $(\Delta m)c^2$.

In fission, a heavy nucleus like uranium is split by a neutron into
lighter nuclei. Once again the final mass is less than the initial mass
and the mass difference translates into energy, which can be tapped
to provide electrical energy as in nuclear power plants (controlled
nuclear fission) or can be employed in making nuclear weapons
(uncontrolled nuclear fission). Strictly, the energy E released in a
chemical reaction can also be related to the mass defect $\Delta m = E/c^2$.
However, for a chemical reaction, this mass defect is much smaller
than for a nuclear reaction. Table 6.3 lists the total energies for a
variety of events and phenomena.
Example 6.10 Examine Tables 6.1-6.3 and express (a) The energy
required to break one bond in DNA in eV; (b) The kinetic energy of an
air molecule ($10^{-21}$ J) in eV; (c) The daily intake of a human adult in
kilocalories.
Answer (a) Energy required to break one bond of DNA is
$\frac{10^{-20}\ \text{J}}{1.6 \times 10^{-19}\ \text{J/eV}} \simeq 0.06$ eV
Note 0.1 eV = 100 meV (100 millielectron volt).
(b) The kinetic energy of an air molecule is
$\frac{10^{-21}\ \text{J}}{1.6 \times 10^{-19}\ \text{J/eV}} \simeq 0.0062$ eV
This is the same as 6.2 meV.
(c) The average human consumption in a day is
$\frac{10^{7}\ \text{J}}{4.2 \times 10^{3}\ \text{J/kcal}} \simeq 2400$ kcal
We point out a common misconception created by newspapers and
magazines. They mention food values in calories and urge us to
restrict diet intake to below 2400 calories. What they should be saying
is kilocalories (kcal) and not calories. A person consuming 2400
calories a day will soon starve to death! 1 food calorie is 1 kcal.
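The unit conversions in Example 6.10 follow one pattern: divide the energy in joules by the size of the new unit in joules. A minimal sketch, assuming the rounded conversion factors 1 eV = $1.6 \times 10^{-19}$ J and 1 kcal = $4.2 \times 10^{3}$ J (the function names are illustrative):

```python
EV_IN_JOULES = 1.6e-19     # 1 eV in joules (rounded)
KCAL_IN_JOULES = 4.2e3     # 1 kcal in joules (rounded)

def joules_to_ev(energy_j):
    """Convert an energy in joules to electron volts."""
    return energy_j / EV_IN_JOULES

def joules_to_kcal(energy_j):
    """Convert an energy in joules to kilocalories."""
    return energy_j / KCAL_IN_JOULES

air_molecule_mev = joules_to_ev(1e-21) * 1e3     # part (b), in meV
diet_joules = 2400 * KCAL_IN_JOULES              # a 2400 kcal diet, in J
print(f"{air_molecule_mev:.2f} meV, 2400 kcal = {diet_joules:.2e} J")
```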

6.10.6 The Principle of Conservation of Energy


We have seen that the total mechanical energy of the system is
conserved if the forces doing work on it are conservative. If some of
the forces involved are non-conservative, part of the mechanical
energy may get transformed into other forms such as heat, light and
sound. However, the total energy of an isolated system does not
change, as long as one accounts for all forms of energy. Energy may
be transformed from one form to another but the total energy of an
isolated system remains constant. Energy can neither be created, nor
destroyed.
Since the universe as a whole may be viewed as an isolated system,
the total energy of the universe is constant. If one part of the universe
loses energy, another part must gain an equal amount of energy.
The principle of conservation of energy cannot be proved. However,
no violation of this principle has been observed. The concept of
conservation and transformation of energy into various forms links
together various branches of physics, chemistry and life sciences. It
provides a unifying, enduring element in our scientific pursuits. From an
engineering point of view, all electronic, communication and
mechanical devices rely on some forms of energy transformation.

6.11 Power
Often it is interesting to know not only the work done on an object, but
also the rate at which this work is done. We say a person is physically
fit if he not only climbs four floors of a building but climbs them fast.
Power is defined as the time rate at which work is done or energy is
transferred.
The average power of a force is defined as the ratio of the work, W, to
the total time t taken
$P_{av} = \frac{W}{t}$
The instantaneous power is defined as the limiting value of the
average power as time interval approaches zero,
$P = \frac{dW}{dt}$ (6.21)
The work dW done by a force F for a displacement dr is $dW = \mathbf{F} \cdot d\mathbf{r}$.
The instantaneous power can also be expressed as
$P = \mathbf{F} \cdot \frac{d\mathbf{r}}{dt} = \mathbf{F} \cdot \mathbf{v}$ (6.22)
where v is the instantaneous velocity when the force is F.
Power, like work and energy, is a scalar quantity. Its dimensions are
[ML$^2$T$^{-3}$]. In the SI, its unit is called a watt (W). The watt is 1 J s$^{-1}$. The
unit of power is named after James Watt, one of the innovators of the
steam engine in the eighteenth century.
There is another unit of power, namely the horse-power (hp)
1 hp = 746 W
This unit is still used to describe the output of automobiles,
motorbikes, etc.
We encounter the unit watt when we buy electrical goods such as
bulbs, heaters and refrigerators. A 100 watt bulb which is on for 10
hours uses 1 kilowatt hour (kWh) of energy.
100 (watt) $\times$ 10 (hour)
= 1000 watt hour
= 1 kilowatt hour (kWh)
= $10^3$ (W) $\times$ 3600 (s)
= $3.6 \times 10^6$ J
Our electricity bills carry the energy consumption in units of kWh. Note
that kWh is a unit of energy and not of power.
Example 6.11 An elevator which can carry a maximum load of 1800 kg
(elevator + passengers) is moving up with a constant speed of 2 m s$^{-1}$.
The frictional force opposing the motion is 4000 N. Determine the
minimum power delivered by the motor to the elevator in watts as well
as in horse power.
Answer The downward force on the elevator is
$F = mg + F_f = (1800 \times 10) + 4000 = 22000$ N
The motor must supply enough power to balance this force. Hence,
$P = \mathbf{F} \cdot \mathbf{v} = 22000 \times 2 = 44000$ W $= 59$ hp
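The calculation in Example 6.11 can be written as a small function. This is a sketch only; the function name `motor_power` is an assumption, and g = 10 m s$^{-2}$ is the value used in the example.

```python
HP_IN_WATTS = 746.0    # 1 hp = 746 W

def motor_power(mass, speed, friction, g=10.0):
    """Minimum power (W) to raise a load at constant speed against weight and friction."""
    force = mass * g + friction     # total downward force to be balanced, in N
    return force * speed            # P = F.v with F and v parallel

P = motor_power(mass=1800.0, speed=2.0, friction=4000.0)
print(f"P = {P:.0f} W = {P / HP_IN_WATTS:.0f} hp")
```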
6.12 Collisions
In physics we study motion (change in position). At the same time, we
try to discover physical quantities, which do not change in a physical
process. The laws of momentum and energy conservation are typical
examples. In this section we shall apply these laws to a commonly
encountered phenomenon, namely collisions. Several games such as
billiards, marbles or carrom involve collisions. We shall study the
collision of two masses in an idealised form.
Consider two masses m1 and m2. The particle m1 is moving with
speed $v_{1i}$, the subscript 'i' implying initial. We can consider $m_2$ to be at
rest. No loss of generality is involved in making such a selection. In
this situation the mass m1 collides with the stationary mass m2 and
this is depicted in Fig. 6.10.

Fig. 6.10 Collision of mass m1, with a stationary mass m2.

The masses m1 and m2 fly off in different directions. We shall see that
there are relationships, which connect the masses, the velocities and
the angles.

6.12.1 Elastic and Inelastic Collisions


In all collisions the total linear momentum is conserved; the initial
momentum of the system is equal to the final momentum of the
system. One can argue this as follows. When two objects collide, the
mutual impulsive forces acting over the collision time $\Delta t$ cause a
change in their respective momenta :
$\Delta \mathbf{p}_1 = \mathbf{F}_{12}\,\Delta t$
$\Delta \mathbf{p}_2 = \mathbf{F}_{21}\,\Delta t$
where $\mathbf{F}_{12}$ is the force exerted on the first particle by the second
particle. $\mathbf{F}_{21}$ is likewise the force exerted on the second particle by the
first particle. Now from Newton's third law, $\mathbf{F}_{12} = -\mathbf{F}_{21}$. This implies
$\Delta \mathbf{p}_1 + \Delta \mathbf{p}_2 = 0$
The above conclusion is true even though the forces vary in a
complex fashion during the collision time $\Delta t$. Since the third law is true
at every instant, the total impulse on the first object is equal and
opposite to that on the second.
On the other hand, the total kinetic energy of the system is not
necessarily conserved. The impact and deformation during collision
may generate heat and sound. Part of the initial kinetic energy is
transformed into other forms of energy. A useful way to visualise the
deformation during collision is in terms of a compressed spring. If the
spring connecting the two masses regains its original shape without
loss in energy, then the initial kinetic energy is equal to the final kinetic
energy but the kinetic energy during the collision time $\Delta t$ is not
constant. Such a collision is called an elastic collision. On the other
hand the deformation may not be relieved and the two bodies could
move together after the collision. A collision in which the two particles
move together after the collision is called a completely inelastic
collision. The intermediate case where the deformation is partly
relieved and some of the initial kinetic energy is lost is more common
and is appropriately called an inelastic collision.

6.12.2 Collisions in One Dimension


Consider first a completely inelastic collision in one dimension.
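Then, in Fig. 6.10,
$\theta_1 = \theta_2 = 0$
$m_1 v_{1i} = (m_1 + m_2) v_f$ (momentum conservation)
$v_f = \frac{m_1}{m_1 + m_2} v_{1i}$ (6.23)
The loss in kinetic energy on collision is
$\Delta K = \frac{1}{2} m_1 v_{1i}^2 - \frac{1}{2}(m_1 + m_2) v_f^2$
$= \frac{1}{2} m_1 v_{1i}^2 - \frac{1}{2} \frac{m_1^2}{m_1 + m_2} v_{1i}^2$ [using Eq. (6.23)]
$= \frac{1}{2} m_1 v_{1i}^2 \left[1 - \frac{m_1}{m_1 + m_2}\right]$
$= \frac{1}{2} \frac{m_1 m_2}{m_1 + m_2} v_{1i}^2$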

An experiment on head-on collision


In performing an experiment on collision on a horizontal surface, we face three
difficulties. One, there will be friction and bodies will not travel with uniform
velocities. Two, if two bodies of different sizes collide on a table, it would be
difficult to arrange them for a head-on collision unless their centres of mass are
at the same height above the surface. Three, it will be fairly difficult to measure
velocities of the two bodies just before and just after collision.
By performing this experiment in a vertical direction, all the three difficulties
vanish. Take two balls, one of which is heavier (basketball/football/volleyball)
and the other lighter (tennis ball/rubber ball/table tennis ball). First take only the
heavier ball and drop it vertically from some height, say 1 m. Note the height to which it
rises. This gives the velocities near the floor or ground, just before and just after
the bounce (by using $v^2 = 2gh$). Hence you will get the coefficient of
restitution.
Now take the big ball and a small ball and hold them in your hands one over the
other, with the heavier ball below the lighter one, as shown here. Drop them
together, taking care that they remain together while falling, and see what
happens. You will find that the heavier ball rises less than when it was dropped
alone, while the lighter one shoots up to about 3 m. With practice, you will be
able to hold the ball properly so that the lighter ball rises vertically up and does
not fly sideways. This is head-on collision.
You can try to find the best combination of balls which gives you the best effect.
You can measure the masses on a standard balance. We leave it to you to think
how you can determine the initial and final velocities of the balls.

which is a positive quantity as expected.
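The loss in kinetic energy for a completely inelastic collision can be verified numerically. The sketch below uses assumed values ($m_1$ = 2 kg, $m_2$ = 3 kg, $v_{1i}$ = 5 m s$^{-1}$) and an illustrative function name; it compares the closed-form loss with the direct difference of kinetic energies.

```python
def completely_inelastic(m1, m2, v1i):
    """Common final velocity and kinetic energy lost when m1 hits m2 at rest and sticks."""
    vf = m1 * v1i / (m1 + m2)                    # Eq. (6.23)
    loss = 0.5 * m1 * m2 / (m1 + m2) * v1i**2    # closed-form loss in kinetic energy
    return vf, loss

m1, m2, v1i = 2.0, 3.0, 5.0                      # assumed values (kg, kg, m/s)
vf, loss = completely_inelastic(m1, m2, v1i)

# Direct evaluation: initial kinetic energy minus final kinetic energy.
direct = 0.5 * m1 * v1i**2 - 0.5 * (m1 + m2) * vf**2
print(f"vf = {vf:.1f} m/s, loss = {loss:.1f} J, direct = {direct:.1f} J")
```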


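Consider next an elastic collision. Using the above nomenclature with
$\theta_1 = \theta_2 = 0$, the momentum and kinetic energy conservation equations
are
$m_1 v_{1i} = m_1 v_{1f} + m_2 v_{2f}$ (6.24)
$m_1 v_{1i}^2 = m_1 v_{1f}^2 + m_2 v_{2f}^2$ (6.25)
From Eqs. (6.24) and (6.25) it follows that,
$m_1 v_{1i}(v_{2f} - v_{1i}) = m_1 v_{1f}(v_{2f} - v_{1f})$
or, $v_{2f}(v_{1i} - v_{1f}) = v_{1i}^2 - v_{1f}^2 = (v_{1i} - v_{1f})(v_{1i} + v_{1f})$
Hence, $v_{2f} = v_{1i} + v_{1f}$ (6.26)
Substituting this in Eq. (6.24), we obtain
$v_{1f} = \frac{(m_1 - m_2)}{m_1 + m_2} v_{1i}$ (6.27)
and $v_{2f} = \frac{2 m_1 v_{1i}}{m_1 + m_2}$ (6.28)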
Thus, the unknowns {v1f, v2f} are obtained in terms of the knowns
{m1, m2, v1i}. Special cases of our analysis are interesting.
Case I : If the two masses are equal
v1f = 0
v2f = v1i
The first mass comes to rest and pushes off the second mass with its
initial speed on collision.
Case II : If one mass dominates, e.g. $m_2 \gg m_1$
$v_{1f} \simeq -v_{1i}$
$v_{2f} \simeq 0$
The heavier mass is undisturbed while the lighter mass reverses its
velocity.
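Equations (6.27) and (6.28) and the two special cases can be explored with a short Python sketch. The function name `elastic_1d` and the numerical values are assumptions chosen for illustration.

```python
def elastic_1d(m1, m2, v1i):
    """Final velocities for a head-on elastic collision with m2 initially at rest."""
    v1f = (m1 - m2) / (m1 + m2) * v1i    # Eq. (6.27)
    v2f = 2 * m1 / (m1 + m2) * v1i       # Eq. (6.28)
    return v1f, v2f

print(elastic_1d(1.0, 1.0, 4.0))         # Case I: equal masses
print(elastic_1d(1.0, 1.0e6, 4.0))       # Case II: m2 >> m1

# A general case: check momentum and kinetic energy conservation.
m1, m2, v1i = 2.0, 3.0, 5.0
v1f, v2f = elastic_1d(m1, m2, v1i)
p_after = m1 * v1f + m2 * v2f
k_after = 0.5 * m1 * v1f**2 + 0.5 * m2 * v2f**2
```

In Case I the first mass stops and the second moves off with the initial speed; in Case II the light mass rebounds with almost its initial speed.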
Example 6.12 Slowing down of neutrons: In a nuclear reactor a
neutron of high speed (typically $10^7$ m s$^{-1}$) must be slowed to $10^3$ m s$^{-1}$
so that it can have a high probability of interacting with a fissile isotope
and causing it to fission. Show that a neutron can lose most of its
kinetic energy in an elastic collision with a light nucleus like deuterium or
carbon which has a mass of only a few times the neutron mass. The
material making up the light nuclei, usually heavy water (D$_2$O) or
graphite, is called a moderator.
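Answer The initial kinetic energy of the neutron is
$K_{1i} = \frac{1}{2} m_1 v_{1i}^2$
while its final kinetic energy from Eq. (6.27)
$K_{1f} = \frac{1}{2} m_1 v_{1f}^2 = \frac{1}{2} m_1 \left(\frac{m_1 - m_2}{m_1 + m_2}\right)^2 v_{1i}^2$
The fraction of the kinetic energy retained by the neutron is
$f_1 = \frac{K_{1f}}{K_{1i}} = \left(\frac{m_1 - m_2}{m_1 + m_2}\right)^2$
while the fractional kinetic energy gained by the moderating nuclei $K_{2f}/K_{1i}$ is
$f_2 = 1 - f_1$ (elastic collision)
$= \frac{4 m_1 m_2}{(m_1 + m_2)^2}$
One can also verify this result by substituting from Eq. (6.28).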
For deuterium $m_2 = 2m_1$ and we obtain $f_1 = 1/9$ while $f_2 = 8/9$. Almost
90% of the neutron's energy is transferred to deuterium. For carbon $f_1$
= 71.6% and $f_2$ = 28.4%. In practice, however, this number is smaller
since head-on collisions are rare.
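The fractions quoted for deuterium and carbon can be reproduced as follows. The sketch assumes head-on elastic collisions and takes the mass ratios $m_2/m_1$ to be 2 for deuterium and 12 for carbon; the function name `energy_fractions` is illustrative.

```python
def energy_fractions(mass_ratio):
    """Return (f1, f2) for a head-on elastic collision, with mass_ratio = m2 / m1."""
    f1 = ((1 - mass_ratio) / (1 + mass_ratio)) ** 2   # fraction kept by the neutron
    f2 = 1 - f1                                       # fraction given to the nucleus
    return f1, f2

for name, ratio in (("deuterium", 2), ("carbon", 12)):
    f1, f2 = energy_fractions(ratio)
    print(f"{name}: f1 = {f1:.3f}, f2 = {f2:.3f}")
```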
If the initial velocities and final velocities of both the bodies are along
the same straight line, then it is called a one-dimensional collision, or
head-on collision. In the case of small spherical bodies, this is
possible if the direction of travel of body 1 passes through the centre
of body 2 which is at rest. In general, the collision is two-dimensional,
where the initial velocities and the final velocities lie in a plane.
6.12.3 Collisions in Two Dimensions
Fig. 6.10 also depicts the collision of a moving mass m1 with the
stationary mass m2. Linear momentum is conserved in such a
collision. Since momentum is a vector this implies three equations for
the three directions {x, y, z}. Consider the plane determined by the
final velocity directions of m1 and m2 and choose it to be the x-y plane.
The conservation of the z-component of the linear momentum implies
that the entire collision is in the x-y plane. The x- and y-component
equations are
$m_1 v_{1i} = m_1 v_{1f} \cos\theta_1 + m_2 v_{2f} \cos\theta_2$ (6.29)
$0 = m_1 v_{1f} \sin\theta_1 - m_2 v_{2f} \sin\theta_2$ (6.30)
One knows {$m_1$, $m_2$, $v_{1i}$} in most situations. There are thus four
unknowns {$v_{1f}$, $v_{2f}$, $\theta_1$ and $\theta_2$}, and only two equations. If $\theta_1 = \theta_2 = 0$,
we regain Eq. (6.24) for one dimensional collision.
If, further, the collision is elastic,
$\frac{1}{2} m_1 v_{1i}^2 = \frac{1}{2} m_1 v_{1f}^2 + \frac{1}{2} m_2 v_{2f}^2$ (6.31)
We obtain an additional equation. That still leaves us one equation
short. At least one of the four unknowns, say $\theta_1$, must be made
known for the problem to be solvable. For example, $\theta_1$ can be
determined by moving a detector in an angular fashion from the x to
the y axis. Given {$m_1$, $m_2$, $v_{1i}$, $\theta_1$} we can determine {$v_{1f}$, $v_{2f}$, $\theta_2$} from
Eqs. (6.29)-(6.31).
Example 6.13 Consider the collision depicted in Fig. 6.10 to be
between two billiard balls with equal masses m1 = m2. The first ball is
called the cue while the second ball is called the target. The billiard
player wants to sink the target ball in a corner pocket, which is at an
angle $\theta_2 = 37^\circ$. Assume that the collision is elastic and that friction and
rotational motion are not important. Obtain $\theta_1$.
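Answer From momentum conservation, since the masses are equal
$\mathbf{v}_{1i} = \mathbf{v}_{1f} + \mathbf{v}_{2f}$
or $v_{1i}^2 = (\mathbf{v}_{1f} + \mathbf{v}_{2f}) \cdot (\mathbf{v}_{1f} + \mathbf{v}_{2f})$
$= v_{1f}^2 + v_{2f}^2 + 2\,\mathbf{v}_{1f} \cdot \mathbf{v}_{2f}$
$= v_{1f}^2 + v_{2f}^2 + 2 v_{1f} v_{2f} \cos(\theta_1 + 37^\circ)$ (6.32)
Since the collision is elastic and $m_1 = m_2$ it follows from conservation
of kinetic energy that
$v_{1i}^2 = v_{1f}^2 + v_{2f}^2$ (6.33)
Comparing Eqs. (6.32) and (6.33), we get
$\cos(\theta_1 + 37^\circ) = 0$
or $\theta_1 + 37^\circ = 90^\circ$
Thus, $\theta_1 = 53^\circ$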
This proves the following result : when two equal masses undergo a
glancing elastic collision with one of them at rest, after the collision,
they will move at right angles to each other.
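The right-angle result of Example 6.13 can be checked against the component equations. The sketch below assumes equal masses and an initial cue speed of 1 m s$^{-1}$ (an arbitrary choice); the final speeds $v_{1f} = v_{1i}\cos\theta_1$ and $v_{2f} = v_{1i}\cos\theta_2$ follow from resolving the initial velocity along the two perpendicular final directions. The function name is illustrative.

```python
import math

def equal_mass_glancing(v1i, theta2_deg):
    """Cue-ball angle (degrees) and final speeds for an elastic collision of equal masses."""
    theta2 = math.radians(theta2_deg)
    theta1 = math.pi / 2 - theta2          # the two final velocities are perpendicular
    v1f = v1i * math.cos(theta1)
    v2f = v1i * math.cos(theta2)
    return math.degrees(theta1), v1f, v2f

theta1_deg, v1f, v2f = equal_mass_glancing(v1i=1.0, theta2_deg=37.0)
t1, t2 = math.radians(theta1_deg), math.radians(37.0)

px = v1f * math.cos(t1) + v2f * math.cos(t2)   # x-momentum per unit mass, Eq. (6.29)
py = v1f * math.sin(t1) - v2f * math.sin(t2)   # y-momentum per unit mass, Eq. (6.30)
ke = v1f**2 + v2f**2                           # twice the kinetic energy per unit mass
print(f"theta1 = {theta1_deg:.0f} deg, px = {px:.3f}, py = {py:.3f}, ke = {ke:.3f}")
```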
The matter simplifies greatly if we consider spherical masses with
smooth surfaces, and assume that collision takes place only when the
bodies touch each other. This is what happens in the games of
marbles, carrom and billiards.
In our everyday world, collisions take place only when two bodies
touch each other. But consider a comet coming from far distances to
the sun, or an alpha particle coming towards a nucleus and going away in
some direction. Here we have to deal with forces involving action at a
distance. Such an event is called scattering. The velocities and
directions in which the two particles go away depend on their initial
velocities as well as the type of interaction between them, their
masses, shapes and sizes.
SUMMARY
1. The work-energy theorem states that the change in kinetic energy of a body is
the work done by the net force on the body.
Kf - Ki = Wnet
2. A force is conservative if (i) work done by it on an object is path independent
and depends only on the end points {xi, xj}, or (ii) the work done by the force is
zero for an arbitrary closed path taken by the object such that it returns to its
initial position.
3. For a conservative force in one dimension, we may define a potential energy
function V(x) such that
$F(x) = -\frac{dV(x)}{dx}$
or $V_i - V_f = \int_{x_i}^{x_f} F(x)\,dx$
4. The principle of conservation of mechanical energy states that the total
mechanical energy of a body remains constant if the only forces that act on the
body are conservative.
5. The gravitational potential energy of a particle of mass m at a height x above
the earth's surface is
V(x) = m g x
where the variation of g with height is ignored.
6. The elastic potential energy of a spring of force constant k and extension x is
$V(x) = \frac{1}{2} k x^2$
7. The scalar or dot product of two vectors A and B is written as A.B and is a
scalar quantity given by : A.B = AB cos $\theta$, where $\theta$ is the angle between A and
B. It can be positive, negative or zero depending upon the value of $\theta$. The scalar
product of two vectors can be interpreted as the product of magnitude of one
vector and component of the other vector along the first vector. For unit vectors :
$\hat{i} \cdot \hat{i} = \hat{j} \cdot \hat{j} = \hat{k} \cdot \hat{k} = 1$
and $\hat{i} \cdot \hat{j} = \hat{j} \cdot \hat{k} = \hat{k} \cdot \hat{i} = 0$
Scalar products obey the commutative and the distributive laws.

POINTS TO PONDER
1. The phrase 'calculate the work done' is incomplete. We should refer (or
imply clearly by context) to the work done by a specific force or a group of
forces on a given body over a certain displacement.
2. Work done is a scalar quantity. It can be positive or negative unlike mass
and kinetic energy which are positive scalar quantities. The work done by the
friction or viscous force on a moving body is negative.
3. For two bodies, the sum of the mutual forces exerted between them is zero
from Newton's Third Law,
$\mathbf{F}_{12} + \mathbf{F}_{21} = 0$
But the sum of the work done by the two forces need not always cancel, i.e.
$W_{12} + W_{21} \neq 0$
However, it may sometimes be true.
4. The work done by a force can be calculated sometimes even if the exact
nature of the force is not known. This is clear from Example 6.2 where the WE
theorem is used in such a situation.
5. The WE theorem is not independent of Newton's Second Law. The WE
theorem may be viewed as a scalar form of the Second Law. The principle of
conservation of mechanical energy may be viewed as a consequence of the
WE theorem for conservative forces.
6. The WE theorem holds in all inertial frames. It can also be extended to non-
inertial frames provided we include the pseudoforces in the calculation of the
net force acting on the body under consideration.
7. The potential energy of a body subjected to a conservative force is always
undetermined up to a constant. For example, the point where the potential
energy is zero is a matter of choice. For the gravitational potential energy mgh,
the zero of the potential energy is chosen to be the ground. For the spring
potential energy kx2/2, the zero of the potential energy is the equilibrium
position of the oscillating mass.
8. Not every force encountered in mechanics has an associated
potential energy. For example, work done by friction over a closed path is not
zero and no potential energy can be associated with friction.
9. During a collision : (a) the total linear momentum is conserved at each
instant of the collision ; (b) the kinetic energy conservation (even if the
collision is elastic) applies after the collision is over and does not hold at every
instant of the collision. In fact the two colliding objects are deformed and may
be momentarily at rest with respect to each other.

Exercises

6.1 The sign of work done by a force on a body is important to
understand. State carefully if the following quantities are positive
or negative:
(a) work done by a man in lifting a bucket out of a well by means
of a rope tied to the bucket.
(b) work done by gravitational force in the above case,
(c) work done by friction on a body sliding down an inclined plane,
(d) work done by an applied force on a body moving on a rough
horizontal plane with uniform velocity,
(e) work done by the resistive force of air on a vibrating pendulum
in bringing it to rest.
6.2 A body of mass 2 kg initially at rest moves under the action of
an applied horizontal force of 7 N on a table with coefficient of
kinetic friction = 0.1. Compute the
(a) work done by the applied force in 10 s,
(b) work done by friction in 10 s,
(c) work done by the net force on the body in 10 s,
(d) change in kinetic energy of the body in 10 s, and interpret your
results.
6.3 Given in Fig. 6.11 are examples of some potential energy
functions in one dimension. The total energy of the particle is
indicated by a cross on the ordinate axis. In each case, specify the
regions, if any, in which the particle cannot be found for the given
energy. Also, indicate the minimum total energy the particle must
have in each case. Think of simple physical contexts for which
these potential energy shapes are relevant.
Fig. 6.11
6.4 The potential energy function for a particle executing linear
simple harmonic motion is given by V(x) = kx²/2, where k is the
force constant of the oscillator. For k = 0.5 N m⁻¹, the graph of V(x)
versus x is shown in Fig. 6.12. Show that a particle of total energy
1 J moving under this potential must turn back when it reaches x
= ± 2 m.
Fig. 6.12
6.5 Answer the following :
(a) The casing of a rocket in flight burns up due to friction. At
whose expense is the heat energy required for burning obtained?
The rocket or the atmosphere?
(b) Comets move around the sun in highly elliptical orbits. The
gravitational force on the comet due to the sun is not normal to the
comet's velocity in general. Yet the work done by the gravitational
force over every complete orbit of the comet is zero. Why ?
(c) An artificial satellite orbiting the earth in very thin atmosphere
loses its energy gradually due to dissipation against atmospheric
resistance, however small. Why then does its speed increase
progressively as it comes closer and closer to the earth ?
(d) In Fig. 6.13(i) the man walks 2 m carrying a mass of 15 kg on
his hands. In Fig. 6.13(ii), he walks the same distance pulling the
rope behind him. The rope goes over a pulley, and a mass of 15
kg hangs at its other end. In which case is the work done greater ?

Fig. 6.13

6.6 Underline the correct alternative :


(a) When a conservative force does positive work on a body, the
potential energy of the body increases/decreases/remains
unaltered.
(b) Work done by a body against friction always results in a loss of
its kinetic/potential energy.
(c) The rate of change of total momentum of a many-particle
system is proportional to the external force/sum of the internal
forces on the system.
(d) In an inelastic collision of two bodies, the quantities which do
not change after the collision are the total kinetic energy/total
linear momentum/total energy of the system of two bodies.
6.7 State if each of the following statements is true or false.
Give reasons for your answer.
(a) In an elastic collision of two bodies, the momentum and energy
of each body is conserved.
(b) Total energy of a system is always conserved, no matter what
internal and external forces on the body are present.
(c) Work done in the motion of a body over a closed loop is zero
for every force in nature.
(d) In an inelastic collision, the final kinetic energy is always less
than the initial kinetic energy of the system.
6.8 Answer carefully, with reasons :
(a) In an elastic collision of two billiard balls, is the total kinetic
energy conserved during the short time of collision of the balls (i.e.
when they are in contact) ?
(b) Is the total linear momentum conserved during the short time of
an elastic collision of two balls ?
(c) What are the answers to (a) and (b) for an inelastic collision ?
(d) If the potential energy of two billiard balls depends only on the
separation distance between their centres, is the collision elastic
or inelastic ? (Note, we are talking here of potential energy
corresponding to the force during collision, not gravitational
potential energy).
6.9 A body is initially at rest. It undergoes one-dimensional motion
with constant acceleration. The power delivered to it at time t is
proportional to
(i) $t^{1/2}$ (ii) $t$ (iii) $t^{3/2}$ (iv) $t^2$
6.10 A body is moving unidirectionally under the influence of a
source of constant power. Its displacement in time t is proportional
to
(i) $t^{1/2}$ (ii) $t$ (iii) $t^{3/2}$ (iv) $t^2$
6.11 A body constrained to move along the z-axis of a coordinate
system is subject to a constant force F given by

where $\hat{i}$, $\hat{j}$, $\hat{k}$ are unit vectors along the x-, y- and z-axis of the
system respectively. What is the work done by this force in moving
the body a distance of 4 m along the z-axis ?
6.12 An electron and a proton are detected in a cosmic ray
experiment, the first with kinetic energy 10 keV, and the second
with 100 keV. Which is faster, the electron or the proton ? Obtain
the ratio of their speeds. (electron mass = 9.11 × 10⁻³¹ kg, proton
mass = 1.67 × 10⁻²⁷ kg, 1 eV = 1.60 × 10⁻¹⁹ J).
6.13 A rain drop of radius 2 mm falls from a height of 500 m above
the ground. It falls with decreasing acceleration (due to viscous
resistance of the air) until at half its original height, it attains its
maximum (terminal) speed, and moves with uniform speed
thereafter. What is the work done by the gravitational force on the
drop in the first and second half of its journey ? What is the work
done by the resistive force in the entire journey if its speed on
reaching the ground is 10 m s⁻¹ ?
6.14 A molecule in a gas container hits a horizontal wall with
speed 200 m s⁻¹ and angle 30° with the normal, and rebounds
with the same speed. Is momentum conserved in the collision ? Is
the collision elastic or inelastic ?
6.15 A pump on the ground floor of a building can pump up water
to fill a tank of volume 30 m³ in 15 min. If the tank is 40 m above
the ground, and the efficiency of the pump is 30%, how much
electric power is consumed by the pump ?
6.16 Two identical ball bearings in contact with each other and
resting on a frictionless table are hit head-on by another ball
bearing of the same mass moving initially with a speed V. If the
collision is elastic, which of the following (Fig. 6.14) is a possible
result after collision ?

Fig. 6.14

6.17 The bob A of a pendulum released from 30° to the vertical
hits another bob B of the same mass at rest on a table as shown
in Fig. 6.15. How high does the bob A rise after the collision?
Neglect the size of the bobs and assume the collision to be elastic.
Fig. 6.15
6.18 The bob of a pendulum is released from a horizontal position.
If the length of the pendulum is 1.5 m, what is the speed with
which the bob arrives at the lowermost point, given that it
dissipated 5% of its initial energy against air resistance ?
6.19 A trolley of mass 300 kg carrying a sandbag of 25 kg is
moving uniformly with a speed of 27 km/h on a frictionless track.
After a while, sand starts leaking out of a hole on the floor of the
trolley at the rate of 0.05 kg s⁻¹. What is the speed of the trolley
after the entire sand bag is empty ?
6.20 A body of mass 0.5 kg travels in a straight line with velocity
$v = a\,x^{3/2}$ where $a = 5\ \mathrm{m^{-1/2}\,s^{-1}}$. What is the work done by the net
force during its displacement from x = 0 to
x = 2 m ?
6.21 The blades of a windmill sweep out a circle of area A. (a) If
the wind flows at a velocity v perpendicular to the circle, what is
the mass of the air passing through it in time t ? (b) What is the
kinetic energy of the air ? (c) Assume that the windmill converts
25% of the wind's energy into electrical energy, and that A = 30
m², v = 36 km/h and the density of air is 1.2 kg m⁻³. What is the
electrical power produced ?
6.22 A person trying to lose weight (dieter) lifts a 10 kg mass, one
thousand times, to a height of 0.5 m each time. Assume that the
potential energy lost each time she lowers the mass is dissipated.
(a) How much work does she do against the gravitational force ?
(b) Fat supplies 3.8 × 10⁷ J of energy per kilogram which is
converted to mechanical energy with a 20% efficiency rate. How
much fat will the dieter use up?
6.23 A family uses 8 kW of power. (a) Direct solar energy is
incident on the horizontal surface at an average rate of 200 W per
square meter. If 20% of this energy can be converted to useful
electrical energy, how large an area is needed to supply 8 kW? (b)
Compare this area to that of the roof of a typical house.

Additional Exercises
6.24 A bullet of mass 0.012 kg and horizontal speed 70 m s⁻¹
strikes a block of wood of mass 0.4 kg and instantly comes to rest
with respect to the block. The block is suspended from the ceiling
by means of thin wires. Calculate the height to which the block
rises. Also, estimate the amount of heat produced in the block.
6.25 Two inclined frictionless tracks, one gradual and the other
steep meet at A from where two stones are allowed to slide down
from rest, one on each track (Fig. 6.16). Will the stones reach the
bottom at the same time? Will they reach there with the same
speed? Explain. Given θ₁ = 30°, θ₂ = 60°, and h = 10 m, what are
the speeds and times taken by the two stones ?

Fig. 6.16
6.26 A 1 kg block situated on a rough incline is connected to a
spring of spring constant 100 N m⁻¹ as shown in Fig. 6.17. The
block is released from rest with the spring in the unstretched
position. The block moves 10 cm down the incline before coming
to rest. Find the coefficient of friction between the block and the
incline. Assume that the spring has a negligible mass and the
pulley is frictionless.

Fig. 6.17
6.27 A bolt of mass 0.3 kg falls from the ceiling of an elevator
moving down with a uniform speed of 7 m s⁻¹. It hits the floor of
the elevator (length of the elevator = 3 m) and does not rebound.
What is the heat produced by the impact ? Would your answer be
different if the elevator were stationary ?
6.28 A trolley of mass 200 kg moves with a uniform speed of 36
km/h on a frictionless track. A child of mass 20 kg runs on the
trolley from one end to the other (10 m away) with a speed of 4 m
s⁻¹ relative to the trolley in a direction opposite to its motion,
and jumps out of the trolley. What is the final speed of the trolley ?
How much has the trolley moved from the time the child begins to
run ?
6.29 Which of the following potential energy curves in Fig. 6.18
cannot possibly describe the elastic collision of two billiard balls ?
Here r is the distance between centres of the balls.
Fig. 6.18

6.30 Consider the decay of a free neutron at rest : n → p + e⁻
Show that the two-body decay of this type must necessarily give
an electron of fixed energy and, therefore, cannot account for the
observed continuous energy distribution in the β-decay of a
neutron or a nucleus (Fig. 6.19).
Fig. 6.19
[Note: The simple result of this exercise was one among the
several arguments advanced by W. Pauli to predict the existence
of a third particle in the decay products of β-decay. This particle is
known as neutrino. We now know that it is a particle of intrinsic
spin ½ (like e⁻, p or n), but is neutral, and either massless or
having an extremely small mass (compared to the mass of
electron) and which interacts very weakly with matter. The correct
decay process of neutron is : n → p + e⁻ + ν ]

Appendix 6.1 : Power consumption in walking


The table below lists the approximate power expended by an adult
human of mass 60 kg.
Table 6.4 Approximate power consumption
Mechanical work must not be confused with the everyday usage of the
term work. A woman standing with a very heavy load on her head may
get very tired. But no mechanical work is involved. That is not to say
that mechanical work cannot be estimated in ordinary human activity.
Consider a person walking with constant speed v0. The mechanical
work he does may be estimated simply with the help of the work-
energy theorem. Assume :
(a) The major work done in walking is due to the acceleration and
deceleration of the legs with each stride (See Fig. 6.20).
(b) Neglect air resistance.
(c) Neglect the small work done in lifting the legs against gravity.
(d) Neglect the swinging of hands etc. as is common in walking.
As we can see in Fig. 6.20, in each stride the leg is brought from rest
to a speed, approximately equal to the speed of walking, and then
brought to rest again.

Fig. 6.20 An illustration of a single stride in walking. While the first leg is maximally
off the ground, the second leg is on the ground and vice-versa.

The work done by one leg in each stride is $m_l v_0^2$ by the work-energy
theorem. Here $m_l$ is the mass of the leg. Note $m_l v_0^2/2$ energy is
expended by one set of leg muscles to bring the foot from rest to
speed $v_0$ while an additional $m_l v_0^2/2$ is expended by a
complementary set of leg muscles to bring the foot to rest from speed
$v_0$. Hence work done by both legs in one stride is (study Fig. 6.20
carefully)

$W_s = 2\,m_l v_0^2$ (6.34)
Assuming $m_l$ = 10 kg and slow running of a nine-minute mile which
translates to 3 m s⁻¹ in SI units, we obtain

$W_s$ = 180 J / stride
If we take a stride to be 2 m long, the person covers 1.5 strides per
second at his speed of 3 m s⁻¹. Thus the power expended

$P = 180\ \text{J/stride} \times 1.5\ \text{stride/second}$ = 270 W
We must bear in mind that this is a lower estimate since several
avenues of power loss (e.g. swinging of hands, air resistance etc.)
have been ignored. The interesting point is that we did not worry about
the forces involved. The forces, mainly friction and those exerted on
the leg by the muscles of the rest of the body, are hard to estimate.
Static friction does no work and we bypassed the impossible task of
estimating the work done by the muscles by taking recourse to the
work-energy theorem. We can also see the advantage of a wheel. The
wheel permits smooth locomotion without the continual starting and
stopping in mammalian locomotion.
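
The estimate above can be checked numerically. The following is a minimal sketch, assuming the stride work of Eq. (6.34) is 2 m_l v0² and using the leg mass, speed and stride length quoted in the text; the function names are illustrative and not from the book.

```python
def stride_work(leg_mass_kg, speed_m_s):
    """Work done by both legs in one stride, W = 2 * m_l * v0**2.

    Each leg is accelerated to v0 and brought back to rest once per
    stride, costing m_l * v0**2 / 2 for each of those two steps.
    """
    return 2.0 * leg_mass_kg * speed_m_s ** 2


def walking_power(leg_mass_kg, speed_m_s, stride_length_m):
    """Power expended = work per stride * strides per second."""
    strides_per_second = speed_m_s / stride_length_m
    return stride_work(leg_mass_kg, speed_m_s) * strides_per_second


# Values quoted in the appendix: m_l = 10 kg, v0 = 3 m/s, stride = 2 m.
work = stride_work(10.0, 3.0)
power = walking_power(10.0, 3.0, 2.0)
print(f"work per stride = {work} J, power = {power} W")
```

Changing the stride length or the speed shows how strongly the estimate depends on v0, since the power scales as the cube of the speed for a fixed stride length.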
Chapter Nine

RAY OPTICS AND OPTICAL INSTRUMENTS

9.1 Introduction

Nature has endowed the human eye (retina) with the sensitivity to
detect electromagnetic waves within a small range of the
electromagnetic spectrum. Electromagnetic radiation belonging to this
region of the spectrum (wavelength of about 400 nm to 750 nm) is
called light. It is mainly through light and the sense of vision that we
know and interpret the world around us.
There are two things that we can intuitively mention about light from
common experience. First, that it travels with enormous speed and
second, that it travels in a straight line. It took some time for people to
realise that the speed of light is finite and measurable. Its presently
accepted value in vacuum is c = 2.99792458 × 10⁸ m s⁻¹. For many
purposes, it suffices to take c = 3 × 10⁸ m s⁻¹. The speed of light in
vacuum is the highest speed attainable in nature.
The intuitive notion that light travels in a straight line seems to
contradict what we have learnt in Chapter 8, that light is an
electromagnetic wave of wavelength belonging to the visible part of
the spectrum. How to reconcile the two facts? The answer is that the
wavelength of light is very small compared to the size of ordinary
objects that we encounter commonly (generally of the order of a few
cm or larger). In this situation, as you will learn in Chapter 10, a light
wave can be considered to travel from one point to another, along a
straight line joining them. The path is called a ray of light, and a
bundle of such rays constitutes a beam of light.
In this chapter, we consider the phenomena of reflection, refraction
and dispersion of light, using the ray picture of light. Using the basic
laws of reflection and refraction, we shall study the image formation by
plane and spherical reflecting and refracting surfaces. We then go on
to describe the construction and working of some important optical
instruments, including the human eye.

Particle model of light


Newton's fundamental contributions to mathematics, mechanics, and
gravitation often blind us to his deep experimental and theoretical study of
light. He made pioneering contributions in the field of optics. He further
developed the corpuscular model of light proposed by Descartes. It presumes
that light energy is concentrated in tiny particles called corpuscles. He further
assumed that corpuscles of light were massless elastic particles. With his
understanding of mechanics, he could come up with a simple model of
reflection and refraction. It is a common observation that a ball bouncing from
a smooth plane surface obeys the laws of reflection. When this is an elastic
collision, the magnitude of the velocity remains the same. As the surface is
smooth, there is no force acting parallel to the surface, so the component of
momentum in this direction also remains the same. Only the component
perpendicular to the surface, i.e., the normal component of the momentum,
gets reversed in reflection. Newton argued that smooth surfaces like mirrors
reflect the corpuscles in a similar manner.
In order to explain the phenomena of refraction, Newton postulated that the
speed of the corpuscles was greater in water or glass than in air. However,
later on it was discovered that the speed of light is less in water or glass than
in air.
In the field of optics, Newton the experimenter, was greater than Newton
the theorist. He himself observed many phenomena, which were difficult to
understand in terms of the particle nature of light. One example is the colours
observed due to a thin film of oil on water. The property of partial reflection of light
is yet another. Everyone who has looked into the water in a
pond sees an image of their face in it, but also sees the bottom of the pond.
Newton argued that some of the corpuscles, which fall on the water, get
reflected and some get transmitted. But what property could distinguish these
two kinds of corpuscles? Newton had to postulate some kind of unpredictable,
chance phenomenon, which decided whether an individual corpuscle would be
reflected or not. In explaining other phenomena, however, the corpuscles were
presumed to behave as if they are identical. Such a dilemma does not occur in
the wave picture of light. An incoming wave can be divided into two weaker
waves at the boundary between air and water.

9.2 Reflection of Light by Spherical Mirrors

We are familiar with the laws of reflection. The angle of reflection (i.e.,
the angle between reflected ray and the normal to the reflecting
surface or the mirror) equals the angle of incidence (angle between
incident ray and the normal). Also that the incident ray, reflected ray
and the normal to the reflecting surface at the point of incidence lie in
the same plane
(Fig. 9.1). These laws are valid at each point on any reflecting surface
whether plane or curved. However, we shall restrict our discussion to
the special case of curved surfaces, that is, spherical surfaces. The
normal in this case is to be taken as normal to the tangent to surface
at the point of incidence. That is, the normal is along the radius, the
line joining the centre of curvature of the mirror to the point of
incidence.
Figure 9.1 The incident ray, reflected ray and the normal to the reflecting surface
lie in the same plane.

We have already studied that the geometric centre of a spherical
mirror is called its pole while that of a spherical lens is called its optical
centre. The line joining the pole and the centre of curvature of the
spherical mirror is known as the principal axis. In the case of spherical
lenses, the principal axis is the line joining the optical centre with its
principal focus as you will see later.
Figure 9.2 The Cartesian Sign Convention.

9.2.1 Sign convention

To derive the relevant formulae for reflection by spherical mirrors and
refraction by spherical lenses, we must first adopt a sign convention
for measuring distances. In this book, we shall follow the Cartesian
sign convention. According to this convention, all distances are
measured from the pole of the mirror or the optical centre of the lens.
The distances measured in the same direction as the incident light are
taken as positive and those measured in the direction opposite to the
direction of incident light are taken as negative (Fig. 9.2). The heights
measured upwards with respect to x-axis and normal to the principal
axis (x-axis) of the mirror/lens are taken as positive (Fig. 9.2). The
heights measured downwards are taken as negative.
With a commonly accepted convention, it turns out that a single formula
for spherical mirrors and a single formula for spherical lenses can
handle all different cases.

9.2.2 Focal length of spherical mirrors

Figure 9.3 shows what happens when a parallel beam of light is
incident on (a) a concave mirror, and (b) a convex mirror. We assume
that the rays are paraxial, i.e., they are incident at points close to the
pole P of the mirror and make small angles with the principal axis. The
reflected rays converge at a point F on the principal axis of a concave
mirror [Fig. 9.3(a)].
For a convex mirror, the reflected rays appear to diverge from a point
F on its principal axis [Fig. 9.3(b)]. The point F is called the principal
focus of the mirror. If the parallel paraxial beam of light were incident,
making some angle with the principal axis, the reflected rays would
converge to (or appear to diverge from) a point in a plane through F
normal to the principal axis. This is called the focal plane of the mirror
[Fig. 9.3(c)].
The distance between the focus F and the pole P of the mirror is
called the focal length of the mirror, denoted by f. We now show that f
= R/2, where R is the radius of curvature of the mirror. The geometry
of reflection of an incident ray is shown in Fig. 9.4.
Figure 9.3 Focus of a concave and convex mirror.
Let C be the centre of curvature of the mirror. Consider a ray parallel
to the principal axis striking the mirror at M. Then CM will be
perpendicular to the mirror at M. Let θ be the angle of incidence, and
MD be the perpendicular from M on the principal axis. Then,
∠MCP = θ and ∠MFP = 2θ
Now,

$\tan\theta = \dfrac{MD}{CD}$ and $\tan 2\theta = \dfrac{MD}{FD}$ (9.1)

For small θ, which is true for paraxial rays, tan θ ≈ θ,
tan 2θ ≈ 2θ. Therefore, Eq. (9.1) gives

$\dfrac{MD}{FD} = 2\,\dfrac{MD}{CD}$

or, $FD = \dfrac{CD}{2}$ (9.2)
Now, for small θ, the point D is very close to the point P. Therefore,
FD = f and CD = R. Equation (9.2) then gives

f = R/2 (9.3)
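
The role of the paraxial approximation in this result can be illustrated numerically. The sketch below is not part of the book's derivation: it assumes a concave mirror and uses the exact geometry of triangle CFM, which is isosceles with CF = FM because the angles at C and M both equal θ, so that CM = R = 2 CF cos θ. The function name and the sample heights are hypothetical.

```python
import math


def axis_crossing_from_pole(R, h):
    """Distance PF from the pole to the point where a ray, incident
    parallel to the principal axis at height h, crosses the axis after
    reflection from a concave spherical mirror of radius R.
    """
    theta = math.asin(h / R)          # angle of incidence (= angle MCP)
    cf = R / (2.0 * math.cos(theta))  # exact, from isosceles triangle CFM
    return R - cf                     # PF = PC - CF


R = 2.0  # radius of curvature in metres (illustrative value)
for h in (0.5, 0.1, 0.01):
    print(f"h = {h:5.2f} m  ->  PF = {axis_crossing_from_pole(R, h):.6f} m")
```

As h shrinks, PF approaches R/2 = 1 m. Rays far from the axis cross it slightly closer to the mirror, which is why the text restricts itself to paraxial rays.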

9.2.3 The mirror equation


Figure 9.4 Geometry of reflection of an incident ray on (a) concave spherical
mirror, and (b) convex spherical mirror.

If rays emanating from a point actually meet at another point after
reflection and/or refraction, that point is called the image of the first
point. The image is real if the rays actually converge to the point; it is
virtual if the rays do not actually meet but appear to diverge from the
point when produced backwards. An image is thus a point-to-point
correspondence with the object established through reflection and/or
refraction.
Figure 9.5 Ray diagram for image formation by a concave mirror.

In principle, we can take any two rays emanating from a point on an
object, trace their paths, find their point of intersection and thus, obtain
the image of the point due to reflection at a spherical mirror. In
practice, however, it is convenient to choose any two of the following
rays:
(i) The ray from the point which is parallel to the principal axis. The
reflected ray goes through the focus of the mirror.
(ii) The ray passing through the centre of curvature of a concave
mirror or appearing to pass through it for a convex mirror. The
reflected ray simply retraces the path.
(iii) The ray passing through (or directed towards) the focus of the
concave mirror or appearing to pass through (or directed towards) the
focus of a convex mirror. The reflected ray is parallel to the principal
axis.

(iv) The ray incident at any angle at the pole. The reflected ray follows
laws of reflection.
Figure 9.5 shows the ray diagram considering three rays. It shows the
image A′B′ (in this case, real) of an object AB formed by a concave
mirror. It does not mean that only three rays emanate from the point A.
An infinite number of rays emanate from any source, in all directions.
Thus, point A′ is image point of A if every ray originating at point A and
falling on the concave mirror after reflection passes through the point
A′.
We now derive the mirror equation or the relation between the object
distance (u), image distance (v) and the focal length (f ).
From Fig. 9.5, the two right-angled triangles A′B′F and MPF are
similar. (For paraxial rays, MP can be considered to be a straight line
perpendicular to CP.) Therefore,

$\dfrac{B'A'}{PM} = \dfrac{B'F}{FP}$

or $\dfrac{B'A'}{BA} = \dfrac{B'F}{FP}$ (∵ PM = AB) (9.4)
Since ∠APB = ∠A′PB′, the right angled triangles A′B′P and ABP are
also similar. Therefore,

$\dfrac{B'A'}{BA} = \dfrac{B'P}{BP}$ (9.5)
Comparing Eqs. (9.4) and (9.5), we get

$\dfrac{B'F}{FP} = \dfrac{B'P - FP}{FP} = \dfrac{B'P}{BP}$ (9.6)

Equation (9.6) is a relation involving magnitude of distances. We now
apply the sign convention. We note that light travels from the object to
the mirror MPN. Hence this is taken as the positive direction. To reach
the object AB, image A′B′ as well as the focus F from the pole P, we
have to travel opposite to the direction of incident light. Hence, all the
three will have negative signs. Thus,
B′P = −v, FP = −f, BP = −u
Using these in Eq. (9.6), we get

$\dfrac{-v + f}{-f} = \dfrac{-v}{-u}$

or $\dfrac{v - f}{f} = \dfrac{v}{u}$, that is, $\dfrac{v}{f} = 1 + \dfrac{v}{u}$. Dividing by v, we get

$\dfrac{1}{v} + \dfrac{1}{u} = \dfrac{1}{f}$ (9.7)

This relation is known as the mirror equation.


The size of the image relative to the size of the object is another
important quantity to consider. We define linear magnification (m) as
the ratio of the height of the image (h′) to the height of the object (h):

$m = \dfrac{h'}{h}$ (9.8)
h and h′ will be taken positive or negative in accordance with the
accepted sign convention. In triangles A′B′P and ABP, we have,

$\dfrac{B'A'}{BA} = \dfrac{B'P}{BP}$

With the sign convention, this becomes

$\dfrac{-h'}{h} = \dfrac{-v}{-u}$

so that

$m = \dfrac{h'}{h} = -\dfrac{v}{u}$ (9.9)
We have derived here the mirror equation, Eq. (9.7), and the
magnification formula, Eq. (9.9), for the case of real, inverted image
formed by a concave mirror. With the proper use of sign convention,
these are, in fact, valid for all the cases of reflection by a spherical
mirror (concave or convex) whether the image formed is real or virtual.
Figure 9.6 shows the ray diagrams for virtual image formed by a
concave and convex mirror. You should verify that Eqs. (9.7) and (9.9)
are valid for these cases as well.

Figure 9.6 Image formation by (a) a concave mirror with object between
P and F, and (b) a convex mirror.
Example 9.1 Suppose that the lower half of the concave mirror's reflecting
surface in Fig. 9.5 is covered with an opaque (non-reflective) material. What
effect will this have on the image of an object placed in front of the mirror?
Solution You may think that the image will now show only half of the object,
but taking the laws of reflection to be true for all points of the remaining part of
the mirror, the image will be that of the whole object. However, as the area of
the reflecting surface has been reduced, the intensity of the image will be low
(in this case, half).
Example 9.2 A mobile phone lies along the principal axis of a concave mirror,
as shown in Fig. 9.7. Show by suitable diagram, the formation of its image.
Explain why the magnification is not uniform. Will the distortion of image
depend on the location of the phone with respect to the mirror?

Figure 9.7

Solution The ray diagram for the formation of the image of the phone is shown
in Fig. 9.7. The image of the part which is on the plane perpendicular to
principal axis will be on the same plane. It will be of the same size, i.e., B′C =
BC. You can yourself realise why the image is distorted.
Example 9.3 An object is placed at (i) 10 cm, (ii) 5 cm in front of a concave
mirror of radius of curvature 15 cm. Find the position, nature, and
magnification of the image in each case.
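Solution The focal length f = −15/2 cm = −7.5 cm
(i) The object distance u = −10 cm. Then Eq. (9.7) gives

$\dfrac{1}{v} + \dfrac{1}{-10} = \dfrac{1}{-7.5}$

or $v = \dfrac{10 \times 7.5}{-2.5}$ = −30 cm

The image is 30 cm from the mirror on the same side as the object.

Also, magnification $m = -\dfrac{v}{u} = -\dfrac{(-30)}{(-10)} = -3$

The image is magnified, real and inverted.

(ii) The object distance u = −5 cm. Then from Eq. (9.7),

$\dfrac{1}{v} + \dfrac{1}{-5} = \dfrac{1}{-7.5}$

or $v = \dfrac{5 \times 7.5}{7.5 - 5}$ = 15 cm

This image is formed at 15 cm behind the mirror. It is a virtual image.

Magnification $m = -\dfrac{v}{u} = -\dfrac{15}{(-5)} = 3$

The image is magnified, virtual and erect.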
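
The arithmetic of this example can be reproduced with a short script. This is a minimal sketch, assuming the mirror equation 1/v + 1/u = 1/f and the magnification m = −v/u with the Cartesian sign convention; the helper `mirror_image` is a hypothetical name introduced only for illustration.

```python
def mirror_image(u, f):
    """Solve 1/v + 1/u = 1/f for v and return (v, m) with m = -v/u.

    Distances follow the Cartesian sign convention: measured from the
    pole, negative when opposite to the direction of incident light.
    """
    if u == f:
        raise ValueError("object at the focus: the image is at infinity")
    v = u * f / (u - f)
    m = -v / u
    return v, m


f = -15.0 / 2  # concave mirror, R = 15 cm, so f = -7.5 cm
for u in (-10.0, -5.0):
    v, m = mirror_image(u, f)
    nature = "real" if v < 0 else "virtual"
    orientation = "inverted" if m < 0 else "erect"
    print(f"u = {u} cm: v = {v} cm, m = {m} ({nature}, {orientation})")
```

For a mirror, a negative v places the image in front of the mirror (real), while a positive v places it behind the mirror (virtual), which matches the two cases worked out above.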


Example 9.4 Suppose while sitting in a parked car, you notice a jogger
approaching towards you in the side view mirror of R = 2 m. If the jogger is
running at a speed of 5 m s⁻¹, how fast does the image of the jogger appear to
move when the jogger is (a) 39 m, (b) 29 m, (c) 19 m, and (d) 9 m away?
Solution From the mirror equation, Eq. (9.7), we get

$v = \dfrac{fu}{u - f}$

For convex mirror, since R = 2 m, f = 1 m. Then

for u = −39 m, $v = \dfrac{(-39) \times 1}{-39 - 1} = \dfrac{39}{40}$ m

Since the jogger moves at a constant speed of 5 m s⁻¹, after 1 s the position of
the image v (for u = −39 + 5 = −34 m) is (34/35) m.
The shift in the position of image in 1 s is

$\dfrac{39}{40} - \dfrac{34}{35} = \dfrac{1365 - 1360}{1400} = \dfrac{5}{1400} = \dfrac{1}{280}$ m

Therefore, the average speed of the image when the jogger is between
39 m and 34 m from the mirror, is (1/280) m s⁻¹.
Similarly, it can be seen that for u = −29 m, −19 m and −9 m, the speed with
which the image appears to move is

$\dfrac{1}{150}$ m s⁻¹, $\dfrac{1}{60}$ m s⁻¹ and $\dfrac{1}{10}$ m s⁻¹

respectively.
Although the jogger has been moving with a constant speed, the speed of
his/her image appears to increase substantially as he/she moves closer to the
mirror. This phenomenon can be noticed by any person sitting in a stationary
car or a bus. In case of moving vehicles, a similar phenomenon could be
observed if the vehicle in the rear is moving closer with a constant speed.
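
The speeds in Example 9.4 can be checked with exact fractions. The sketch below assumes v = fu/(u − f), obtained by rearranging the mirror equation, with f = +1 m for the convex mirror and the average taken over a 1 s interval as in the solution; the function names are illustrative only.

```python
from fractions import Fraction


def image_position(u, f):
    """Image distance from the mirror equation, v = f*u / (u - f)."""
    return f * u / (u - f)


def average_image_speed(u_start, jogger_speed, f, dt=1):
    """Average speed of the image over dt seconds, starting at u_start.

    The jogger approaches the mirror, so u (negative by the sign
    convention) increases by jogger_speed * dt.
    """
    u_end = u_start + jogger_speed * dt
    shift = image_position(u_end, f) - image_position(u_start, f)
    return abs(shift) / dt


f = Fraction(1)  # convex mirror with R = 2 m
speeds = [average_image_speed(Fraction(-d), Fraction(5), f)
          for d in (39, 29, 19, 9)]
for d, s in zip((39, 29, 19, 9), speeds):
    print(f"jogger at {d} m: image speed = {s} m/s")
```

Using `Fraction` avoids rounding, so the values can be compared directly with the fractions quoted in the solution.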

9.3 Refraction

When a beam of light encounters another transparent medium, a part
of light gets reflected back into the first medium while the rest enters
the other. A ray of light represents a beam. The direction of
propagation of an obliquely incident ray of light that enters the other
medium, changes at the interface of the two media. This phenomenon
is called refraction of light. Snell experimentally obtained the following
laws of refraction:
(i) The incident ray, the refracted ray and the normal to the interface at
the point of incidence, all lie in the same plane.
(ii) The ratio of the sine of the angle of incidence to the sine of angle of
refraction is constant. Remember that the angles of incidence (i ) and
refraction (r ) are the angles that the incident and its refracted ray
make with the normal, respectively. We have

$\dfrac{\sin i}{\sin r} = n_{21}$ (9.10)
where n₂₁ is a constant, called the refractive index of the second
medium with respect to the first medium. Equation (9.10) is the well-known
Snell's law of refraction. We note that n₂₁ is a characteristic of
the pair of media (and also depends on the wavelength of light), but is
independent of the angle of incidence.
Figure 9.8 Refraction and reflection of light.

From Eq. (9.10), if n₂₁ > 1, r < i, i.e., the refracted ray bends towards
the normal. In such a case medium 2 is said to be optically denser (or
denser, in short) than medium 1. On the other hand, if n₂₁ < 1, r > i, the
refracted ray bends away from the normal. This is the case when
incident ray in a denser medium refracts into a rarer medium.
Note: Optical density should not be confused with mass density, which
is mass per unit volume. It is possible that mass density of an optically
denser medium may be less than that of an optically rarer medium
(optical density is the ratio of the speed of light in two media). For
example, turpentine and water. Mass density of turpentine is less than
that of water but its optical density is higher.
If n₂₁ is the refractive index of medium 2 with respect to medium 1
and n₁₂ the refractive index of medium 1 with respect to medium 2,
then it should be clear that

$n_{12} = \dfrac{1}{n_{21}}$ (9.11)
It also follows that if n₃₂ is the refractive index of medium 3 with
respect to medium 2 then n₃₂ = n₃₁ × n₁₂, where n₃₁ is the refractive
index of medium 3 with respect to medium 1.
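
These relations can be tried out numerically. The following is a minimal sketch, assuming Snell's law in the form sin i / sin r = n21 and illustrative refractive indices (1.33 for water and 1.5 for glass, relative to air taken as 1.0) that are not quoted in this passage. It also assumes that a relative index is the ratio of the two indices with respect to air.

```python
import math


def refraction_angle_deg(i_deg, n21):
    """Angle of refraction r (degrees) from sin i / sin r = n21.

    Returns None when sin r would exceed 1, i.e. when no refracted
    ray exists for this angle of incidence.
    """
    s = math.sin(math.radians(i_deg)) / n21
    if abs(s) > 1.0:
        return None
    return math.degrees(math.asin(s))


# Assumed illustrative indices: medium 1 = air, 2 = water, 3 = glass.
n1, n2, n3 = 1.0, 1.33, 1.5
n21, n12 = n2 / n1, n1 / n2
n31, n32 = n3 / n1, n3 / n2

r = refraction_angle_deg(30.0, n21)   # air -> water: bends towards normal
back = refraction_angle_deg(r, n12)   # water -> air: path is reversible
print(f"r = {r:.3f} deg, reversed ray emerges at {back:.3f} deg")
print(f"n31 * n12 = {n31 * n12:.6f}, n32 = {n32:.6f}")
```

Sending the refracted ray back through the interface with n12 = 1/n21 recovers the original angle of incidence, which is one way to see why the reciprocal relation holds.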

Figure 9.9 Lateral shift of a ray refracted through a parallel-sided slab.

Some elementary results based on the laws of refraction follow


immediately. For a rectangular slab, refraction takes place at two
interfaces (air-glass and glass-air). It is easily seen from Fig. 9.9 that
r2 = i1, i.e., the emergent ray is parallel to the incident raythere is no
deviation, but it does suffer lateral displacement/shift with respect to
the incident ray. Another familiar observation is that the bottom of a
tank filled with water appears to be raised (Fig. 9.10). For viewing near
the normal direction, it can be shown that the apparent depth, (h1) is
real depth (h2) divided by the refractive index of the medium (water).
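As a quick illustration of the apparent-depth result for near-normal viewing, here is a small sketch. The function name and the sample numbers (a tank 2.0 m deep, water of refractive index 1.33) are assumptions made for the example.

```python
def apparent_depth(real_depth, n):
    """Apparent depth for near-normal viewing: real depth divided by n."""
    return real_depth / n

h2 = 2.0    # real depth in metres (assumed value)
n = 1.33    # refractive index of water with respect to air
h1 = apparent_depth(h2, n)
print("apparent depth (m):", h1)
```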

The refraction of light through the atmosphere is responsible for many
interesting phenomena. For example, the sun is visible a little before
the actual sunrise and until a little after the actual sunset due to
refraction of light through the atmosphere (Fig. 9.11). By actual
sunrise we mean the actual crossing of the horizon by the sun. Figure
9.11 shows the actual and apparent positions of the sun with respect
to the horizon. The figure is highly exaggerated to show the effect.
The refractive index of air with respect to vacuum is 1.00029. Due to
this, the apparent shift in the direction of the sun is by about half a
degree and the corresponding time difference between actual sunset
and apparent sunset is about 2 minutes (see Example 9.5). The
apparent flattening (oval shape) of the sun at sunset and sunrise is
also due to the same phenomenon.
Figure 9.10 Apparent depth for (a) normal, and (b) oblique viewing.

Example 9.5 The earth takes 24 h to rotate once about its axis. How much
time does the sun take to shift by 1° when viewed from
the earth?
Solution
Time taken for 360° shift = 24 h
Time taken for 1° shift = 24/360 h = 4 min.
9.4 Total Internal Reflection

When light travels from an optically denser medium to a rarer medium,
it is partly reflected back into the same medium at the interface and
partly refracted to the second medium. This reflection is called
internal reflection.

Figure 9.11 Advance sunrise and delayed sunset due to atmospheric refraction.

When a ray of light enters from a denser medium to a rarer medium, it
bends away from the normal, for example, the ray AO1 B in Fig. 9.12.
The incident ray AO1 is partially reflected (O1C) and partially
transmitted (O1B) or refracted, the angle of refraction (r) being larger
than the angle of incidence (i). As the angle of incidence increases, so
does the angle of refraction, till for the ray AO3, the angle of refraction
is π/2. The refracted ray is bent so much away from the normal that it
grazes the surface at the interface between the two media. This is
shown by the ray AO3 D in Fig. 9.12. If the angle of incidence is
increased still further (e.g., the ray AO4), refraction is not possible,
and the incident ray is totally reflected. This is called total internal
reflection. When light gets reflected by a surface, normally some
fraction of it gets transmitted. The reflected ray, therefore, is always
less intense than the incident ray, howsoever smooth the reflecting
surface may be. In total internal reflection, on the other hand, no
transmission of light takes place.

The drowning child, lifeguard and Snell's law

Consider a rectangular swimming pool PQSR; see figure here. A lifeguard
sitting at G outside the pool notices a child drowning at a point C.
The guard wants to reach the child in the shortest possible time. Let SR be the
side of the pool between G and C. Should he/she take a straight line path GAC
between G and C or GBC in which the path BC in water would be the shortest,
or some other path GXC? The guard knows that his/her running speed v1 on
ground is higher than his/her swimming speed v2.
Suppose the guard enters water at X. Let GX = l1 and XC = l2. Then the time
taken to reach from G to C would be
$t = \dfrac{l_1}{v_1} + \dfrac{l_2}{v_2}$
To make this time minimum, one has to differentiate it (with respect to the
coordinate of X) and find the point X when t is a minimum. On doing all this
algebra (which we skip here), we find that the guard should enter water at a
point where Snell's law is satisfied. To understand this, draw a perpendicular
LM to side SR at X. Let ∠GXM = i and ∠CXL = r. Then it can be seen that t is
minimum when
$\dfrac{\sin i}{\sin r} = \dfrac{v_1}{v_2}$
In the case of light v1/v2, the ratio of the velocity of light in vacuum to that in
the medium, is the refractive index n of the medium.
In short, whether it is a wave or a particle or a human being, whenever two
media and two velocities are involved, one must follow Snell's law if one
wants to take the shortest time.
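The algebra skipped in the lifeguard problem can be checked numerically. The sketch below is one possible approach, not the method of the text: it minimises the travel time with a ternary search (which works because the time is a convex function of the entry point) and then compares sin i / sin r with v1/v2. The coordinates (G at distance a from the pool edge, C at distance b into the water, separation d along the edge) and all numerical values are hypothetical.

```python
import math

def travel_time(x, a, b, d, v1, v2):
    """Time to run from G to X on land, then swim from X to C.

    G is a metres from the pool edge, C is b metres into the water,
    d is their separation measured along the edge, and x locates X.
    """
    l1 = math.hypot(x, a)        # GX, covered at running speed v1
    l2 = math.hypot(d - x, b)    # XC, covered at swimming speed v2
    return l1 / v1 + l2 / v2

def best_entry_point(a, b, d, v1, v2, steps=200):
    """Ternary search for the x in [0, d] that minimises travel_time."""
    lo, hi = 0.0, d
    for _ in range(steps):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if travel_time(m1, a, b, d, v1, v2) < travel_time(m2, a, b, d, v1, v2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

# Hypothetical numbers: distances in metres, speeds in metres per second.
a, b, d, v1, v2 = 10.0, 10.0, 20.0, 6.0, 2.0
x = best_entry_point(a, b, d, v1, v2)
sin_i = x / math.hypot(x, a)            # angle with the normal on land
sin_r = (d - x) / math.hypot(d - x, b)  # angle with the normal in water
print("sin i / sin r =", sin_i / sin_r, "  v1 / v2 =", v1 / v2)
```

With these numbers, the optimal path is faster than both the straight line from G to C and the path that keeps the swim as short as possible.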

The angle of incidence corresponding to an angle of refraction 90°,
say ∠AO3N, is called the critical angle (ic) for the given pair of
media. We see from Snell's law [Eq. (9.10)] that if the relative
refractive index is less than one then, since the maximum value
of sin r is unity, there is an upper limit
to the value of sin i for which the law can be satisfied, that is, i = ic
such that
sin ic = n21 (9.12)

For values of i larger than ic, Snell's law of refraction cannot be
satisfied, and hence no refraction is possible.
Figure 9.12 Refraction and internal reflection of rays from a point A in the denser
medium (water) incident at different angles at the interface with a rarer medium
(air).

The refractive index of denser medium 1 with respect to rarer medium
2 will be n12 = 1/sin ic. Some typical critical angles are listed in Table
9.1.
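The relation n12 = 1/sin ic can be turned around to estimate critical angles. This is a minimal sketch: the refractive indices for water (1.33) and glass (1.5) are the values used elsewhere in this chapter, while the value 2.42 for diamond is an assumed typical figure, chosen to be consistent with the critical angle of about 24.4° quoted for the diamond-air interface.

```python
import math

def critical_angle_deg(n12):
    """Critical angle in degrees from sin ic = 1/n12.

    n12 is the refractive index of the denser medium with respect to
    the rarer one, so it must be greater than 1.
    """
    return math.degrees(math.asin(1.0 / n12))

media = {"water": 1.33, "glass": 1.5, "diamond": 2.42}  # diamond value assumed
angles = {name: critical_angle_deg(n) for name, n in media.items()}
for name, ic in angles.items():
    print(name, round(ic, 2))
```

A critical angle below 45°, as for the glass here, is what allows a right-angled prism to turn a ray by total internal reflection.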

A demonstration for total internal reflection

All optical phenomena can be demonstrated very easily with the use
of a laser torch or pointer, which is easily available nowadays. Take a
glass beaker with clear water in it. Stir the water a few times with a
piece of soap, so that it becomes a little turbid. Take a laser pointer
and shine its beam through the turbid water. You will find that the path
of the beam inside the water shines brightly.
Shine the beam from below the beaker such that it strikes at the upper
water surface at the other end. Do you find that it undergoes partial
reflection (which is seen as a spot on the table below) and partial
refraction [which comes out in the air and is seen as a spot on the
roof; Fig. 9.13(a)]? Now direct the laser beam from one side of the
beaker such that it strikes the upper surface of water more obliquely
[Fig. 9.13(b)]. Adjust the direction of laser beam until you find the
angle for which the refraction above the water surface is totally absent
and the beam is totally reflected back to water. This is total internal
reflection at its simplest.
Pour this water in a long test tube and shine the laser light from top,
as shown in Fig. 9.13(c). Adjust the direction of the laser beam such
that it is totally internally reflected every time it strikes the walls of the
tube. This is similar to what happens in optical fibres.
Take care not to look into the laser beam directly and not to point it at
anybody's face.

9.4.1 Total internal reflection in nature and its technological applications

(i) Mirage: On hot summer days, the air near the ground becomes
hotter than the air at higher levels. The refractive index of air
increases with its density. Hotter air is less dense, and has smaller
refractive index than the cooler air. If the air currents are small, that is,
the air is still, the optical density at different layers of air increases with
height. As a result, light from a tall object such as a tree, passes
through a medium whose refractive index decreases towards the
ground. Thus, a ray of light from such an object successively bends
away from the normal and undergoes total internal reflection, if the
angle of incidence for the air near the ground exceeds the critical
angle. This is shown in Fig. 9.14(b). To a distant observer, the light
appears to be coming from somewhere below the ground. The
observer naturally assumes that light is being reflected from the
ground, say, by a pool of water near the tall object. Such inverted
images of distant tall objects cause an optical illusion to the observer.
This phenomenon is called mirage. This type of mirage is especially
common in hot deserts. Some of you might have noticed that while
moving in a bus or a car during a hot summer day, a distant patch of
road, especially on a highway, appears to be wet. But, you do not find
any evidence of wetness when you reach that spot. This is also due to
mirage.
Figure 9.13 Observing total internal reflection in water with a laser beam (refraction
due to glass of beaker neglected being very thin).

(ii) Diamond: Diamonds are known for their spectacular brilliance.


Figure 9.14 (a) A tree is seen by an observer at its place when the air above the
ground is at uniform temperature, (b) When the layers of air close to the ground
have varying temperature with hottest layers near the ground, light from a distant
tree may undergo total internal reflection, and the apparent image of the tree may
create an illusion to the observer that the tree is near a pool of water.

Their brilliance is mainly due to the total internal reflection of light
inside them. The critical angle for diamond-air interface (≅ 24.4°) is
very small, therefore once light enters a diamond, it is very likely to
undergo total internal reflection inside it. Diamonds found in nature
rarely exhibit the brilliance for which they are known. It is the technical
skill of a diamond cutter which makes diamonds sparkle so
brilliantly. By cutting the diamond suitably, multiple total internal
reflections can be made to occur.

Figure 9.15 Prisms designed to bend rays by 90° and 180° or to invert image
without changing its size make use of total internal reflection.

(iii) Prism: Prisms designed to bend light by 90° or by 180° make use
of total internal reflection [Fig. 9.15(a) and (b)]. Such a prism is also
used to invert images without changing their size [Fig. 9.15(c)].
In the first two cases, the critical angle ic for the material of the prism
must be less than 45°. We see from Table 9.1 that this is true for both
crown glass and dense flint glass.
(iv) Optical fibres: Nowadays optical fibres are extensively used for
transmitting audio and video signals through long distances. Optical
fibres too make use of the phenomenon of total internal reflection.
Optical fibres are fabricated with high quality composite glass/quartz
fibres. Each fibre consists of a core and cladding. The refractive index
of the material of the core is higher than that of the cladding.
When a signal in the form of light is directed at one end of the fibre at
a suitable angle, it undergoes repeated total internal reflections along
the length of the fibre and finally comes out at the other end (Fig.
9.16). Since light undergoes total internal reflection at each stage,
there is no appreciable loss in the intensity of the light signal. Optical
fibres are fabricated such that light reflected at one side of inner
surface strikes the other at an angle larger than the critical angle.
Even if the fibre is bent, light can easily travel along its length. Thus,
an optical fibre can be used to act as an optical pipe.

Figure 9.16 Light undergoes successive total internal reflections as it moves
through an optical fibre.

A bundle of optical fibres can be put to several uses. Optical fibres are
extensively used for transmitting and receiving electrical signals which
are converted to light by suitable transducers. Obviously, optical fibres
can also be used for transmission of optical signals. For example,
these are used as a light pipe to facilitate visual examination of
internal organs like esophagus, stomach and intestines. You might
have seen a commonly available decorative lamp with fine plastic
fibres with their free ends forming a fountain-like structure. The other
end of the fibres is fixed over an electric lamp. When the lamp is
switched on, the light travels from the bottom of each fibre and
appears at the tip of its free end as a dot of light. The fibres in such
decorative lamps are optical fibres.
The main requirement in fabricating optical fibres is that there should
be very little absorption of light as it travels for long distances inside
them. This has been achieved by purification and special preparation
of materials such as quartz. In silica glass fibres, it is possible to
transmit more than 95% of the light over a fibre length of 1 km.
(Compare with what you expect for a block of ordinary window glass 1
km thick.)

9.5 Refraction at Spherical Surfaces and by Lenses
We have so far considered refraction at a plane interface. We shall
now consider refraction at a spherical interface between two
transparent media. An infinitesimal part of a spherical surface can be
regarded as planar and the same laws of refraction can be applied at
every point on the surface. Just as for reflection by a spherical mirror,
the normal at the point of incidence is perpendicular to the tangent
plane to the spherical surface at that point and, therefore, passes
through its centre of curvature. We first consider refraction by a single
spherical surface and follow it by thin lenses. A thin lens is a
transparent optical medium bounded by two surfaces; at least one of
which should be spherical. Applying the formula for image formation
by a single spherical surface successively at the two surfaces of a
lens, we shall obtain the lens maker's formula and then the lens
formula.

9.5.1 Refraction at a spherical surface


Figure 9.17 Refraction at a spherical surface separating two media

Figure 9.17 shows the geometry of formation of image I of an object O
on the principal axis of a spherical surface with centre of curvature C,
and radius of curvature R. The rays are incident from a medium of
refractive index n1, to another of refractive index n2. As before, we
take the aperture (or the lateral size) of the surface to be small
compared to other distances involved, so that small angle
approximation can be made. In particular, NM will be taken to be
nearly equal to the length of the perpendicular from the point N on the
principal axis. We have, for small angles,

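$\tan \angle NOM = \dfrac{MN}{OM}$

$\tan \angle NCM = \dfrac{MN}{MC}$

$\tan \angle NIM = \dfrac{MN}{MI}$

Now, for ΔNOC, i is the exterior angle. Therefore, i = ∠NOM + ∠NCM

$i = \dfrac{MN}{OM} + \dfrac{MN}{MC}$ (9.13)
Similarly,
r = ∠NCM − ∠NIM

i.e., $r = \dfrac{MN}{MC} - \dfrac{MN}{MI}$ (9.14)
Now, by Snell's law
n1 sin i = n2 sin r

or for small angles
n1 i = n2 r

Substituting i and r from Eqs. (9.13) and (9.14), we get

$\dfrac{n_1}{OM} + \dfrac{n_2}{MI} = \dfrac{n_2 - n_1}{MC}$ (9.15)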
Here, OM, MI and MC represent magnitudes of distances. Applying
the Cartesian sign convention,

Light sources and photometry

It is known that a body above absolute zero temperature emits electromagnetic
radiation. The wavelength region in which the body emits the radiation
depends on its absolute temperature. Radiation emitted by a hot body, for
example, a tungsten filament lamp having temperature 2850 K, is partly
visible and mostly in the infrared (or heat) region. As the temperature of the body
increases, radiation emitted by it is in the visible region. The sun with temperature
of about 5500 K emits radiation whose energy versus wavelength graph peaks
approximately at 550 nm corresponding to green light and is almost in the
middle of the visible region. The energy versus wavelength distribution graph
for a given body peaks at some wavelength, which is inversely proportional to
the absolute temperature of that body.
The measurement of light as perceived by human eye is called photometry.
Photometry is measurement of a physiological phenomenon, being the
stimulus of light as received by the human eye, transmitted by the optic nerves
and analysed by the brain. The main physical quantities in photometry are (i)
the luminous intensity of the source, (ii) the luminous flux or flow of light from
the source, and (iii) illuminance of the surface. The SI unit of luminous intensity
(I) is candela (cd). The candela is the luminous intensity, in a given direction,
of a source that emits monochromatic radiation of frequency 540 × 10¹² Hz
and that has a radiant intensity in that direction of 1/683 watt per steradian. If a
light source emits one candela of luminous intensity into a solid angle of one
steradian, the total luminous flux emitted into that solid angle is one lumen
(lm). A standard 100 watt incandescent light bulb emits approximately 1700
lumens.
In photometry, the only parameter, which can be measured directly is
illuminance. It is defined as luminous flux incident per unit area on a surface
(lm/m² or lux). Most light meters measure this quantity. The illuminance E,
produced by a source of luminous intensity I, is given by E = I/r², where r is the
normal distance of the surface from the source. A quantity named luminance
(L), is used to characterise the brightness of emitting or reflecting flat surfaces.
Its unit is cd/m² (sometimes called nit in industry). A good LCD computer
monitor has a brightness of about 250 nits.

OM = −u, MI = +v, MC = +R
Substituting these in Eq. (9.15), we get

$\dfrac{n_2}{v} - \dfrac{n_1}{u} = \dfrac{n_2 - n_1}{R}$ (9.16)
Equation (9.16) gives us a relation between object and image distance
in terms of refractive index of the medium and the radius of curvature
of the curved spherical surface. It holds for any curved spherical
surface.

Example 9.6 Light from a point source in air falls on a spherical glass surface
(n = 1.5 and radius of curvature = 20 cm). The distance of the light source from
the glass surface is 100 cm. At what position is the image formed?
Solution
We use the relation given by Eq. (9.16). Here
u = −100 cm, v = ?, R = +20 cm, n1 = 1, and n2 = 1.5.
We then have
$\dfrac{1.5}{v} + \dfrac{1}{100} = \dfrac{0.5}{20}$
or v = +100 cm
The image is formed at a distance of 100 cm from the glass surface, in the
direction of incident light.
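The arithmetic of Example 9.6 can be reproduced with a short sketch. It assumes Eq. (9.16) in the form n2/v − n1/u = (n2 − n1)/R with the Cartesian sign convention; the function name `image_distance` is an illustrative choice.

```python
import math

def image_distance(u, R, n1, n2):
    """Solve n2/v - n1/u = (n2 - n1)/R for the image distance v.

    Distances follow the Cartesian sign convention. Returns math.inf
    when the refracted rays emerge parallel to the principal axis.
    """
    rhs = (n2 - n1) / R + n1 / u
    if rhs == 0.0:
        return math.inf
    return n2 / rhs

# Data of Example 9.6: object 100 cm in front of the surface, R = +20 cm.
v = image_distance(u=-100.0, R=20.0, n1=1.0, n2=1.5)
print("v (cm):", v)
```

A positive v means the image lies on the far side of the surface, in the direction of the incident light.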

9.5.2 Refraction by a lens

Figure 9.18(a) shows the geometry of image formation by a double
convex lens. The image formation can be seen in terms of two steps:
(i) The first refracting surface forms the image I1 of the object O
[Fig. 9.18(b)]. The image I1 acts as a virtual object for the second
surface that forms the image at I [Fig. 9.18(c)]. Applying Eq. (9.15) to
the first interface ABC, we get

$\dfrac{n_1}{OB} + \dfrac{n_2}{BI_1} = \dfrac{n_2 - n_1}{BC_1}$ (9.17)
A similar procedure applied to the second interface* ADC gives,

$-\dfrac{n_2}{DI_1} + \dfrac{n_1}{DI} = \dfrac{n_2 - n_1}{DC_2}$ (9.18)
* Note that now the refractive index of the medium on the right side of
ADC is n1 while on its left it is n2. Further DI1 is negative as the
distance is measured against the direction of incident light.
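For a thin lens, BI1 = DI1. Adding
Eqs. (9.17) and (9.18), we get

$\dfrac{n_1}{OB} + \dfrac{n_1}{DI} = (n_2 - n_1)\left(\dfrac{1}{BC_1} + \dfrac{1}{DC_2}\right)$ (9.19)
Suppose the object is at infinity, i.e.,
OB → ∞ and DI = f, Eq. (9.19) gives

$\dfrac{n_1}{f} = (n_2 - n_1)\left(\dfrac{1}{BC_1} + \dfrac{1}{DC_2}\right)$ (9.20)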
The point where image of an object placed at infinity is formed is
called the focus F, of the lens and the distance f gives its focal length.
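A lens has two foci, F and F′, on either side of it (Fig. 9.19). By the
sign convention,
BC1 = +R1,
DC2 = −R2

So Eq. (9.20) can be written as

$\dfrac{1}{f} = (n_{21} - 1)\left(\dfrac{1}{R_1} - \dfrac{1}{R_2}\right)$, where $n_{21} = \dfrac{n_2}{n_1}$ (9.21)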
Equation (9.21) is known as the lens maker's formula. It is useful to
design lenses of desired focal length using surfaces of suitable radii of
curvature. Note that the formula is true for a concave lens also. In that
case R1 is negative, R2 positive and therefore, f is negative.
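The sign behaviour described above can be seen in a small sketch of the lens maker's formula, assuming Eq. (9.21) in the form 1/f = (n21 − 1)(1/R1 − 1/R2). The function name is illustrative; the radii 10 cm and 15 cm and the index 1.5 are those of Example 9.8(ii), and the concave case simply reverses the signs of the radii.

```python
import math

def lens_maker_focal_length(n21, R1, R2):
    """Focal length from 1/f = (n21 - 1)(1/R1 - 1/R2), Cartesian signs.

    Returns math.inf when the lens has no converging or diverging
    effect, for example when n21 = 1.
    """
    power = (n21 - 1.0) * (1.0 / R1 - 1.0 / R2)
    return math.inf if power == 0.0 else 1.0 / power

f_convex = lens_maker_focal_length(1.5, +10.0, -15.0)   # double convex, cm
f_concave = lens_maker_focal_length(1.5, -10.0, +15.0)  # double concave, cm
f_matched = lens_maker_focal_length(1.0, +10.0, -15.0)  # lens index equals medium index
print(f_convex, f_concave, f_matched)
```

The last case corresponds to a lens immersed in a liquid of the same refractive index, which then behaves like a plane sheet.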
Figure 9.18 (a) The position of object, and the image formed by a double convex
lens,
(b) Refraction at the first spherical surface and
(c) Refraction at the second spherical surface.

From Eqs. (9.19) and (9.20), we get

$\dfrac{n_1}{OB} + \dfrac{n_1}{DI} = \dfrac{n_1}{f}$ (9.22)
Again, in the thin lens approximation, B and D are both close to the
optical centre of the lens. Applying the sign convention,
BO = −u, DI = +v, we get

$\dfrac{1}{v} - \dfrac{1}{u} = \dfrac{1}{f}$ (9.23)
Equation (9.23) is the familiar thin lens formula. Though we derived it
for a real image formed by a convex lens, the formula is valid for both
convex as well as concave lenses and for both real and virtual
images.
It is worth mentioning that the two foci, F and F′, of a double convex or
concave lens are equidistant from the optical centre. The focus on the
side of the (original) source of light is called the first focal point,
whereas the other is called the second focal point.
To find the image of an object by a lens, we can, in principle, take any
two rays emanating from a point on an object; trace their paths using
the laws of refraction and find the point where the refracted rays meet
(or appear to meet). In practice, however, it is convenient to choose
any two of the following rays:
(i) A ray emanating from the object parallel to the principal axis of the
lens after refraction passes through the second principal focus F′ (in a
convex lens) or appears to diverge (in a concave lens) from the first
principal focus F.
(ii) A ray of light, passing through the optical centre of the lens,
emerges without any deviation after refraction.
(iii) A ray of light passing through the first principal focus (for a convex
lens) or appearing to meet at it (for a concave lens) emerges parallel
to the principal axis after refraction.

Figure 9.19 Tracing rays through (a) convex lens (b) concave lens.

Figures 9.19(a) and (b) illustrate these rules for a convex and a
concave lens, respectively. You should practice drawing similar ray
diagrams for different positions of the object with respect to the lens
and also verify that the lens formula, Eq. (9.23), holds good for all
cases.

Here again it must be remembered that each point on an object gives
out an infinite number of rays. All these rays will pass through the same
image point after refraction at the lens.
Magnification (m) produced by a lens is defined, like that for a mirror,
as the ratio of the size of the image to that of the object. Proceeding in
the same way as for spherical mirrors, it is easily seen that for a lens

$m = \dfrac{h'}{h} = \dfrac{v}{u}$ (9.24)
When we apply the sign convention, we see that, for erect (and virtual)
image formed by a convex or concave lens, m is positive, while for an
inverted (and real) image, m is negative.
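The sign rules for m can be verified with the thin lens formula, Eq. (9.23), and m = v/u from Eq. (9.24). The following is a rough sketch; the function names, the focal length of 10 cm and the object positions are hypothetical values picked for illustration.

```python
import math

def thin_lens_image(u, f):
    """Image distance from 1/v - 1/u = 1/f (Cartesian sign convention)."""
    inv_v = 1.0 / f + 1.0 / u
    return math.inf if inv_v == 0.0 else 1.0 / inv_v

def magnification(u, v):
    """Linear magnification m = v/u."""
    return v / u

# Convex lens, f = +10 cm, object beyond the focus: real, inverted image.
v_real = thin_lens_image(-30.0, 10.0)
m_real = magnification(-30.0, v_real)

# Same lens, object inside the focus: virtual, erect, enlarged image.
v_virtual = thin_lens_image(-5.0, 10.0)
m_virtual = magnification(-5.0, v_virtual)

# Concave lens, f = -10 cm: virtual, erect, diminished image.
v_concave = thin_lens_image(-30.0, -10.0)
m_concave = magnification(-30.0, v_concave)
print(m_real, m_virtual, m_concave)
```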

Example 9.7 A magician during a show makes a glass lens with
n = 1.47 disappear in a trough of liquid. What is the refractive index of the
liquid? Could the liquid be water?
Solution
The refractive index of the liquid must be equal to 1.47 in order to make the
lens disappear. This means n1 = n2. This gives 1/f = 0 or
f → ∞. The lens in the liquid will act like a plane sheet of glass. No, the liquid is
not water. It could be glycerine.

9.5.3 Power of a lens


Power of a lens is a measure of the convergence or divergence, which
a lens introduces in the light falling on it. Clearly, a lens of shorter
focal length bends the incident light more, while converging it in case
of a convex lens and diverging it in case of a concave lens. The power
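P of a lens is defined as the tangent of the angle by which it
converges or diverges a beam of light falling at unit distance from the
optical centre (Fig. 9.20).

$\tan\delta = \dfrac{h}{f}$; if $h = 1$, $\tan\delta = \dfrac{1}{f}$
or $\delta = \dfrac{1}{f}$ for small value of δ.

Thus,

$P = \dfrac{1}{f}$ (9.25)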
The SI unit for power of a lens is dioptre (D): 1 D = 1 m⁻¹. The power of
a lens of focal length of 1 metre is one dioptre. Power of a lens is
positive for a converging lens and negative for a diverging lens. Thus,
when an optician prescribes a corrective lens of power + 2.5 D, the
required lens is a convex lens of focal length + 40 cm. A lens of power
of − 4.0 D means a concave lens of focal length − 25 cm.
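The conversion between power in dioptres and focal length in centimetres is easy to get wrong by a factor of 100, so a small sketch may help. It assumes P = 1/f with f in metres, as in Eq. (9.25); the function names are illustrative.

```python
def power_dioptre(f_cm):
    """Power in dioptres of a lens whose focal length is given in cm."""
    return 100.0 / f_cm

def focal_length_cm(power_d):
    """Focal length in cm of a lens whose power is given in dioptres."""
    return 100.0 / power_d

f_plus = focal_length_cm(+2.5)   # corrective convex lens
f_minus = focal_length_cm(-4.0)  # concave lens
print(f_plus, f_minus, power_dioptre(50.0))
```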
Figure 9.20 Power of a lens.

Example 9.8 (i) If f = 0.5 m for a glass lens, what is the power of the lens? (ii)
The radii of curvature of the faces of a double convex lens are 10 cm and 15
cm. Its focal length is 12 cm. What is the refractive index of glass? (iii) A
convex lens has 20 cm focal length in air. What is focal length in water?
(Refractive index of air-water = 1.33, refractive index for air-glass = 1.5.)
Solution
(i) Power = +2 dioptre.
(ii) Here, we have f = +12 cm, R1 = +10 cm, R2 = −15 cm.
Refractive index of air is taken as unity.
We use the lens maker's formula of Eq. (9.21). The sign convention has to be applied
for f, R1 and R2.
Substituting the values, we have
$\dfrac{1}{12} = (n - 1)\left(\dfrac{1}{10} - \dfrac{1}{-15}\right)$
This gives n = 1.5.
(iii) For a glass lens in air, n2 = 1.5, n1 = 1, f = +20 cm. Hence, the lens
formula gives
$\dfrac{1}{20} = 0.5\left(\dfrac{1}{R_1} - \dfrac{1}{R_2}\right)$
For the same glass lens in water, n2 = 1.5, n1 = 1.33. Therefore,
$\dfrac{1.33}{f} = (1.5 - 1.33)\left(\dfrac{1}{R_1} - \dfrac{1}{R_2}\right)$ (9.26)
Combining these two equations, we find f = + 78.2 cm.
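Part (iii) can be checked without knowing the radii, because the factor (1/R1 − 1/R2) cancels when the two equations are divided. The sketch below assumes Eq. (9.20) in the form n1/f = (n2 − n1)(1/R1 − 1/R2); the function name is an illustrative choice.

```python
def focal_length_in_medium(f_air, n_glass, n_medium):
    """Focal length of a lens immersed in a medium, given its value in air.

    In air:     1/f_air      = (n_glass - 1) K
    In medium:  n_medium / f = (n_glass - n_medium) K
    where K = 1/R1 - 1/R2 is the same in both cases.
    """
    return f_air * n_medium * (n_glass - 1.0) / (n_glass - n_medium)

# Data of Example 9.8 (iii): f = +20 cm in air, glass 1.5, water 1.33.
f_water = focal_length_in_medium(20.0, 1.5, 1.33)
print("f in water (cm):", round(f_water, 1))
```

The lens is much weaker in water because the relative refractive index of glass with respect to water is closer to one.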

9.5.4 Combination of thin lenses in contact

Consider two lenses A and B of focal length f1 and f2 placed in
contact with each other. Let the object be placed at a point O beyond
the focus of the first lens A (Fig. 9.21). The first lens produces an
image at I1. Since image I1 is real, it serves as a virtual object for the
second lens B, producing the final image at I. It must, however, be
borne in mind that formation of image by the first lens is presumed
only to facilitate determination of the position of the final image. In fact,
the direction of rays emerging from the first lens gets modified in
accordance with the angle at which they strike the second lens. Since
the lenses are thin, we assume the optical centres of the lenses to be
coincident. Let this central point be denoted by P.
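For the image formed by the first lens A, we get

$\dfrac{1}{v_1} - \dfrac{1}{u} = \dfrac{1}{f_1}$ (9.27)

For the image formed by the second lens B, we get

$\dfrac{1}{v} - \dfrac{1}{v_1} = \dfrac{1}{f_2}$ (9.28)
Adding Eqs. (9.27) and (9.28), we get

$\dfrac{1}{v} - \dfrac{1}{u} = \dfrac{1}{f_1} + \dfrac{1}{f_2}$ (9.29)
If the two lens-system is regarded as equivalent to a single lens of
focal length f, we have

$\dfrac{1}{v} - \dfrac{1}{u} = \dfrac{1}{f}$
so that we get

$\dfrac{1}{f} = \dfrac{1}{f_1} + \dfrac{1}{f_2}$ (9.30)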

Figure 9.21 Image formation by a combination of two thin lenses in contact.


The derivation is valid for any number of thin lenses in contact. If
several thin lenses of focal length f1, f2, f3,... are in contact, the
effective focal length of their combination is given by

$\dfrac{1}{f} = \dfrac{1}{f_1} + \dfrac{1}{f_2} + \dfrac{1}{f_3} + \ldots$ (9.31)
In terms of power, Eq. (9.31) can be written as

P = P1 + P2 + P3 + … (9.32)

where P is the net power of the lens combination. Note that the sum in
Eq. (9.32) is an algebraic sum of individual powers, so some of the
terms on the right side may be positive (for convex lenses) and some
negative (for concave lenses). Combination of lenses helps to obtain
diverging or converging lenses of desired magnification. It also
enhances sharpness of the image. Since the image formed by the first
lens becomes the object for the second, Eq. (9.24) implies that the
total magnification m of the combination is a product of magnification
(m1, m2, m3,...) of individual lenses

m = m1 m2 m3 ... (9.33)

Such a system of combination of lenses is commonly used in
designing lenses for cameras, microscopes, telescopes and other
optical instruments.
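The algebraic sum of Eq. (9.32) is illustrated by the sketch below for thin lenses in contact. The function names and the two focal lengths (a convex lens of +20 cm and a concave lens of −50 cm) are hypothetical, and the net power is assumed to be non-zero.

```python
def combined_power(focal_lengths_m):
    """Net power in dioptres of thin lenses in contact: P = P1 + P2 + ..."""
    return sum(1.0 / f for f in focal_lengths_m)

def combined_focal_length(focal_lengths_m):
    """Effective focal length in metres from 1/f = 1/f1 + 1/f2 + ...

    Assumes the net power is non-zero.
    """
    return 1.0 / combined_power(focal_lengths_m)

lenses = [0.20, -0.50]               # focal lengths in metres (assumed)
P = combined_power(lenses)           # powers +5 D and -2 D add algebraically
f = combined_focal_length(lenses)
print("P (D):", P, " f (m):", f)
```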

Example 9.9 Find the position of the image formed by the lens combination
given in the Fig. 9.22.
Figure 9.22

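Solution Image formed by the first lens
$\dfrac{1}{v_1} - \dfrac{1}{u_1} = \dfrac{1}{f_1}$, i.e., $\dfrac{1}{v_1} - \dfrac{1}{-30} = \dfrac{1}{10}$
or v1 = 15 cm
The image formed by the first lens serves as the object for the second. This is
at a distance of (15 − 5) cm = 10 cm to the right of the second lens. Though
the image is real, it serves as a virtual object for the second lens, which means
that the rays appear to come from it for the second lens.
$\dfrac{1}{v_2} - \dfrac{1}{10} = \dfrac{1}{-10}$
or v2 = ∞
The virtual image is formed at an infinite distance to the left of the second lens.
This acts as an object for the third lens.
$\dfrac{1}{v_3} - \dfrac{1}{u_3} = \dfrac{1}{f_3}$
or $\dfrac{1}{v_3} = \dfrac{1}{\infty} + \dfrac{1}{30}$
or v3 = 30 cm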
The final image is formed 30 cm to the right of the third lens.
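The step-by-step procedure of Example 9.9 can be sketched as a loop in which the image of one lens becomes the object of the next. Fig. 9.22 is not reproduced in the text, so the data used here (object 30 cm in front of the first lens, focal lengths +10 cm, −10 cm and +30 cm, separations 5 cm and 10 cm) are assumptions chosen to be consistent with the intermediate results quoted in the solution. The function names are illustrative.

```python
import math

def image_distance(u, f):
    """Thin lens formula 1/v - 1/u = 1/f; distances in cm, Cartesian signs."""
    inv_v = 1.0 / f + 1.0 / u     # 1/u is 0.0 for an object at infinity
    return math.inf if inv_v == 0.0 else 1.0 / inv_v

def trace(u_first, lenses):
    """Image distances for separated thin lenses on a common axis.

    lenses is a list of (focal_length, gap_to_next_lens) pairs; each
    image distance is measured from the lens that forms the image.
    """
    images = []
    u = u_first
    for f, gap in lenses:
        v = image_distance(u, f)
        images.append(v)
        u = v - gap               # object distance for the next lens
    return images

# Assumed data for Fig. 9.22 (see the lead-in above); the last gap is unused.
system = [(10.0, 5.0), (-10.0, 10.0), (30.0, 0.0)]
v1, v2, v3 = trace(-30.0, system)
print(v1, v2, v3)
```

In floating-point arithmetic the intermediate image may come out as a very large finite distance rather than exactly infinite; the final image position is unaffected.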

9.6 Refraction through a Prism

Figure 9.23 shows the passage of light through a triangular prism
ABC. The angles of incidence and refraction at the first face AB are i
and r1, while the angle of incidence (from glass to air) at the second
face AC is r2 and the angle of refraction or emergence e. The angle
between the emergent ray RS and the direction of the incident ray PQ
is called the angle of deviation, δ.
In the quadrilateral AQNR, two of the angles (at the vertices Q and R)
are right angles. Therefore, the sum of the other angles of the
quadrilateral is 180°.
∠A + ∠QNR = 180°
From the triangle QNR,
r1 + r2 + ∠QNR = 180°

Comparing these two equations, we get


r1 + r2 = A (9.34)

The total deviation δ is the sum of deviations at the two faces,
δ = (i − r1) + (e − r2)
that is,
δ = i + e − A (9.35)
Thus, the angle of deviation depends on the angle of incidence. A plot
between the angle of deviation and angle of incidence is shown in Fig.
9.24.

Figure 9.23 A ray of light passing through a triangular glass prism.

You can see that, in general, any given value of δ, except for i = e,
corresponds to two values i and hence of e. This, in fact, is expected
from the symmetry of i and e in Eq. (9.35), i.e., δ remains the same if i
and e are interchanged. Physically, this is related to the fact that the
path of ray in Fig. 9.23 can be traced back, resulting in the same angle
of deviation. At the minimum deviation Dm, the refracted ray inside the
prism becomes parallel to its base. We have
δ = Dm, i = e which implies r1 = r2.

Equation (9.34) gives
2r = A or r = A/2 (9.36)
In the same way, Eq. (9.35) gives
Dm = 2i − A, or i = (A + Dm)/2 (9.37)
The refractive index of the prism is

$n_{21} = \dfrac{n_2}{n_1} = \dfrac{\sin[(A + D_m)/2]}{\sin(A/2)}$ (9.38)
The angles A and Dm can be measured experimentally. Equation
(9.38) thus provides a method of determining refractive index of the
material of the prism.
For a small angle prism, i.e., a thin prism, Dm is also very small, and
we get
$n_{21} = \dfrac{\sin[(A + D_m)/2]}{\sin(A/2)} \simeq \dfrac{(A + D_m)/2}{A/2}$
Dm = (n21 − 1)A
It implies that thin prisms do not deviate light much.
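The prism relations can be explored numerically. The sketch below computes δ = i + e − A by applying Snell's law at both faces, scans i to locate the minimum deviation, and then recovers the refractive index from Eq. (9.38). The prism angle of 60°, the index 1.5, the scan step of 0.01° and the function names are all assumptions made for the example.

```python
import math

def deviation(i_deg, A_deg, n):
    """Deviation i + e - A in degrees for a prism of angle A and index n.

    Returns None if the ray is totally internally reflected at the
    second face and therefore does not emerge.
    """
    r1 = math.asin(math.sin(math.radians(i_deg)) / n)   # first face
    r2 = math.radians(A_deg) - r1                       # from r1 + r2 = A
    s = n * math.sin(r2)                                # second face
    if abs(s) > 1.0:
        return None
    e_deg = math.degrees(math.asin(s))
    return i_deg + e_deg - A_deg

def index_from_min_deviation(A_deg, Dm_deg):
    """Refractive index from sin[(A + Dm)/2] / sin(A/2)."""
    return (math.sin(math.radians((A_deg + Dm_deg) / 2.0))
            / math.sin(math.radians(A_deg / 2.0)))

A, n = 60.0, 1.5    # hypothetical prism
scan = [(deviation(k * 0.01, A, n), k * 0.01) for k in range(1, 9000)]
Dm, i_min = min((d, i) for d, i in scan if d is not None)
print("Dm =", Dm, "at i =", i_min)
print("recovered n =", index_from_min_deviation(A, Dm))

# Thin prism check: for A = 5 degrees, Dm should be close to (n - 1) A.
thin = min(d for d in (deviation(k * 0.01, 5.0, n) for k in range(1, 1000))
           if d is not None)
print("thin prism Dm =", thin)
```

Interchanging i and e leaves the deviation unchanged, which is why the curve of δ against i gives two angles of incidence for every deviation above the minimum.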


Figure 9.24 Plot of angle of deviation (δ) versus angle of incidence (i) for a
triangular prism.

9.7 Dispersion by a Prism

It has been known for a long time that when a narrow beam of
sunlight, usually called white light, is incident on a glass prism, the
emergent light is seen to be consisting of several colours. There is
actually a continuous variation of colour, but broadly, the different
component colours that appear in sequence are: violet, indigo, blue,
green, yellow, orange and red (given by the acronym VIBGYOR). The
red light bends the least, while the violet light bends the most (Fig.
9.25).

The phenomenon of splitting of light into its component colours is
known as dispersion. The pattern of colour components of light is
called the spectrum of light. The word spectrum is now used in a much
more general sense: we discussed in Chapter 8 the electromagnetic
spectrum over the large range of wavelengths, from γ-rays to radio
waves, of which the spectrum of light (visible spectrum) is only a small
part.
Though the reason for appearance of spectrum is now common
knowledge, it was a matter of much debate in the history of physics.
Does the prism itself create colour in some way or does it only
separate the colours already present in white light?

Figure 9.25 Dispersion of sunlight or white light on passing through a glass prism.
The relative deviation of different colours shown is highly exaggerated.
In a classic experiment known for its simplicity but great significance,
Isaac Newton settled the issue once and for all. He put another similar
prism, but in an inverted position, and let the emergent beam from the
first prism fall on the second prism (Fig. 9.26). The resulting emergent
beam was found to be white light. The explanation was clear: the
first prism splits the white light into its component colours, while the
inverted prism recombines them to give white light. Thus, white light
itself consists of light of different colours, which are separated by the
prism.
It must be understood here that a ray of light, as defined
mathematically, does not exist. An actual ray is really a beam of many
rays of light. Each ray splits into component colours when it enters the
glass prism. When those coloured rays come out on the other side,
they again produce a white beam.

Figure 9.26 Schematic diagram of Newton's classic experiment on
dispersion of white light.
We now know that colour is associated with wavelength of light. In the
visible spectrum, red light is at the long wavelength end (~700 nm)
while the violet light is at the short wavelength end (~ 400 nm).
Dispersion takes place because the refractive index of medium for
different wavelengths (colours) is different. For example, the bending
of red component of white light is least while it is most for the violet.
Equivalently, red light travels faster than violet light in a glass prism.
Table 9.2 gives the refractive indices for different wavelengths for
crown glass and flint glass. Thick lenses could be assumed as made
of many prisms, therefore, thick lenses show chromatic aberration due
to dispersion of light.

The variation of refractive index with wavelength may be more
pronounced in some media than in others. In vacuum, of course, the
speed of light is independent of wavelength. Thus, vacuum (or air
approximately) is a non-dispersive medium in which all colours travel
with the same speed. This also follows from the fact that sunlight
reaches us in the form of white light and not as its components. On
the other hand, glass is a dispersive medium.

9.8 Some Natural Phenomena due to Sunlight

The interplay of light with things around us gives rise to several
beautiful phenomena. The spectacle of colour that we see around us
all the time is possible only due to sunlight. The blue of the sky, white
clouds, the red-hue at sunrise and sunset, the rainbow, the brilliant
colours of some pearls, shells, and wings of birds, are just a few of the
natural wonders we are used to. We describe some of them here from
the point of view of physics.

9.8.1 The rainbow

The rainbow is an example of the dispersion of sunlight by the water
drops in the atmosphere. This is a phenomenon due to combined
effect of dispersion, refraction and reflection of sunlight by spherical
water droplets of rain. The conditions for observing a rainbow are that
the sun should be shining in one part of the sky (say near western
horizon) while it is raining in the opposite part of the sky (say eastern
horizon).
An observer can therefore see a rainbow only when his back is
towards the sun.
In order to understand the formation of rainbows, consider
Fig. 9.27(a). Sunlight is first refracted as it enters a raindrop, which
causes the different wavelengths (colours) of white light to separate.
Longer wavelengths of light (red) are bent the least while the shorter
wavelengths (violet) are bent the most. Next, these component rays
strike the inner surface of the water drop and get internally reflected if
the angle between the refracted ray and the normal to the drop surface is
greater than the critical angle (48°, in this case). The reflected light is
refracted again as it comes out of the drop, as shown in the figure. It is
found that the violet light emerges at an angle of 40° relative to the
incoming sunlight and red light emerges at an angle of 42°. For other
colours, the angles lie in between these two values.
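The angles quoted above can be checked numerically. The following is a minimal sketch, not part of the original text. It assumes refractive indices of water of about 1.331 for red and 1.343 for violet light (illustrative values), and it uses the geometric result that a ray entering a spherical drop at angle of incidence i, refracted at angle r and reflected once inside, emerges at an angle 4r - 2i to the incoming sunlight. The rainbow is taken to appear at the largest such angle. The function names are hypothetical.

```python
import math

# Assumed, illustrative refractive indices of water (not given in the text)
N_RED = 1.331
N_VIOLET = 1.343


def critical_angle_deg(n):
    """Critical angle (degrees) for light going from a medium of index n into air."""
    return math.degrees(math.asin(1.0 / n))


def rainbow_angle_deg(n, steps=9000):
    """Largest angle (degrees) between the incoming sunlight and a ray that
    leaves a spherical drop after one internal reflection."""
    best = 0.0
    for k in range(1, steps):
        i = math.radians(90.0 * k / steps)   # angle of incidence on the drop
        r = math.asin(math.sin(i) / n)       # Snell's law at the first surface
        best = max(best, math.degrees(4 * r - 2 * i))
    return best


if __name__ == "__main__":
    print("critical angle, n = 1.33:", round(critical_angle_deg(1.33), 1))
    print("red   :", round(rainbow_angle_deg(N_RED), 1))
    print("violet:", round(rainbow_angle_deg(N_VIOLET), 1))
```

With these assumed indices the sketch gives an angle between 42° and 43° for red and between 40° and 41° for violet, in line with the values stated above.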
Figure 9.27(b) explains the formation of the primary rainbow. We see that
red light from drop 1 and violet light from drop 2 reach the observer's
eye. The violet from drop 1 and red light from drop 2 are directed at a
level above or below the observer. Thus the observer sees a rainbow
with red colour on the top and violet on the bottom. Thus, the primary
rainbow is a result of a three-step process, that is, refraction, reflection
and refraction.
When light rays undergo two internal reflections inside a raindrop,
instead of one as in the primary rainbow, a secondary rainbow is
formed as shown in Fig. 9.27(c). It is due to a four-step process. The
intensity of light is reduced at the second reflection and hence the
secondary rainbow is fainter than the primary rainbow. Further, the
order of the colours is reversed in it as is clear from Fig. 9.27(c).
Figure 9.27 Rainbow: (a) The sun's rays incident on a water drop get refracted twice
and reflected internally by the drop; (b) Enlarged view of internal reflection and
refraction of a ray of light inside a drop forming the primary rainbow; and (c) the secondary
rainbow is formed by rays undergoing internal reflection twice inside the drop.

9.8.2 Scattering of light


As sunlight travels through the Earth's atmosphere, it gets scattered
(changes its direction) by the atmospheric particles. Light of shorter
wavelengths is scattered much more than light of longer wavelengths.
(The amount of scattering is inversely proportional to the fourth power
of the wavelength. This is known as Rayleigh scattering.) Hence, the
bluish colour predominates in a clear sky, since blue has a shorter
wavelength than red and is scattered much more strongly. In fact,
violet gets scattered even more than blue, having a shorter
wavelength. But since our eyes are more sensitive to blue than violet,
we see the sky as blue.
Large particles like dust and water droplets present in the atmosphere
behave differently. The relevant quantity here is the relative size of the
wavelength of light $\lambda$ and the scatterer (of typical size, say, a). For
$a \ll \lambda$, one has Rayleigh scattering, which is proportional to $1/\lambda^4$. For
$a \gg \lambda$, i.e., large scattering objects (for example, raindrops, large dust
or ice particles), this is not true; all wavelengths are scattered nearly
equally. Thus, clouds, which have droplets of water with $a \gg \lambda$, are
generally white.
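To get a feel for the fourth-power dependence, the short sketch below compares how strongly two wavelengths are scattered in the Rayleigh regime. The 400 nm and 700 nm values are the ends of the visible spectrum quoted earlier; the 450 nm figure for blue is an assumed, illustrative value, and the function name is hypothetical.

```python
def rayleigh_ratio(short_nm, long_nm):
    """How many times more strongly the shorter wavelength is scattered,
    assuming the scattering is proportional to 1 / wavelength**4."""
    return (long_nm / short_nm) ** 4


if __name__ == "__main__":
    print("violet (400 nm) vs red (700 nm):", rayleigh_ratio(400, 700))
    print("blue (450 nm, assumed) vs red (700 nm):", rayleigh_ratio(450, 700))
```

On this estimate violet light at 400 nm is scattered about 9.4 times more strongly than red light at 700 nm.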

At sunset or sunrise, the sun's rays have to pass through a larger
distance in the atmosphere (Fig. 9.28). Most of the blue and other
shorter wavelengths are removed by scattering. The least scattered
light (red) is what reaches our eyes; therefore, the sun looks reddish. This explains
the reddish appearance of the sun and full moon near the horizon.

9.9 Optical Instruments


Figure 9.28 Sunlight travels through a longer distance in the atmosphere at sunset
and sunrise.

A number of optical devices and instruments have been designed
utilising reflecting and refracting properties of mirrors, lenses and
prisms. Periscope, kaleidoscope, binoculars, telescopes, microscopes
are some examples of optical devices and instruments that are in
common use. Our eye is, of course, one of the most important optical
devices that nature has endowed us with. Starting with the eye, we then
go on to describe the principles of working of the microscope and the
telescope.

9.9.1 The eye

Figure 9.29 (a) shows the eye. Light enters the eye through a curved
front surface, the cornea. It passes through the pupil which is the
central hole in the iris. The size of the pupil can change under control
of muscles. The light is further focussed by the eye lens on the retina.
The retina is a film of nerve fibres covering the curved back surface of
the eye. The retina contains rods and cones which sense light
intensity and colour, respectively, and transmit electrical signals via
the optic nerve to the brain which finally processes this information.
The shape (curvature) and therefore the focal length of the lens can
be modified somewhat by the ciliary muscles. For example, when the
muscle is relaxed, the focal length is about 2.5 cm and objects at
infinity are in sharp focus on the retina. When the object is brought
closer to the eye, in order to maintain the same image-lens distance
(≅ 2.5 cm), the focal length of the eye lens becomes shorter by the action
of the ciliary muscles. This property of the eye is called
accommodation. If the object is too close to the eye, the lens cannot
curve enough to focus the image on to the retina, and the image is
blurred. The closest distance for which the lens can focus light on the
retina is called the least distance of distinct vision, or the near point.
The standard value for normal vision is taken as 25 cm. (Often the
near point is given the symbol D.) This distance increases with age,
because of the decreasing effectiveness of the ciliary muscle and the
loss of flexibility of the lens. The near point may be as close as about
7 to 8 cm in a child ten years of age, and may increase to as much as
200 cm at 60 years of age. Thus, if an elderly person tries to read a
book at about 25 cm from the eye, the image appears blurred. This
condition (defect of the eye) is called presbyopia. It is corrected by
using a converging lens for reading.
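Accommodation can be illustrated with the thin lens formula 1/v - 1/u = 1/f. The sketch below is a simplification, not part of the original text: it treats the eye as a single thin lens at a fixed distance of 2.5 cm from the retina and asks what focal length is needed for objects at different distances. The function name is hypothetical.

```python
import math


def eye_focal_length_cm(object_distance_cm, image_distance_cm=2.5):
    """Focal length (cm) needed to focus an object on the retina.

    Uses 1/v - 1/u = 1/f with u = -object_distance_cm (Cartesian sign
    convention) and v = +image_distance_cm.
    """
    return 1.0 / (1.0 / image_distance_cm + 1.0 / object_distance_cm)


if __name__ == "__main__":
    print("object at infinity:", eye_focal_length_cm(math.inf))
    print("object at 25 cm   :", round(eye_focal_length_cm(25.0), 2))
```

In this simplified model the focal length shortens from 2.5 cm for a very distant object to about 2.27 cm for an object at the near point of 25 cm.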
Thus, our eyes are marvellous organs that have the capability to
interpret incoming electromagnetic waves as images through a
complex process. These are our greatest assets and we must take
proper care to protect them. Imagine the world without a pair of
functional eyes. Yet many amongst us bravely face this challenge by
effectively overcoming their limitations to lead a normal life. They
deserve our appreciation for their courage and conviction.
In spite of all precautions and proactive action, our eyes may develop
some defects due to various reasons. We shall restrict our discussion
to some common optical defects of the eye. For example, the light
from a distant object arriving at the eye-lens may get converged at a
point in front of the retina. This type of defect is called
nearsightedness or myopia. This means that the eye is producing too
much convergence in the incident beam. To compensate for this, we
interpose a concave lens between the eye and the object, with the
diverging effect desired to get the image focussed on the retina [Fig.
9.29(b)].
Figure 9.29 (a) The structure of the eye; (b) shortsighted or myopic eye and its
correction;
(c) farsighted or hypermetropic eye and its correction; and (d) astigmatic eye and
its correction.

Similarly, if the eye-lens focusses the incoming light at a point behind
the retina, a convergent lens is needed to compensate for the defect
in vision. This defect is called farsightedness or hypermetropia [Fig.
9.29(c)].
Another common defect of vision is called astigmatism. This occurs
when the cornea is not spherical in shape. For example, the cornea
could have a larger curvature in the vertical plane than in the
horizontal plane or vice-versa. If a person with such a defect in eye-
lens looks at a wire mesh or a grid of lines, focussing in either the
vertical or the horizontal plane may not be as sharp as in the other
plane. Astigmatism results in lines in one direction being well focussed
while those in a perpendicular direction may appear distorted [Fig.
9.29(d)]. Astigmatism can be corrected by using a cylindrical lens of
desired radius of curvature with an appropriately directed axis. This
defect can occur along with myopia or hypermetropia.

Example 9.10 What focal length should the reading spectacles have for a
person for whom the least distance of distinct vision is 50 cm?
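Solution The distance of normal vision is 25 cm. So if a book is at
u = -25 cm, its image should be formed at v = -50 cm. Therefore, the desired
focal length is given by

$$\frac{1}{f} = \frac{1}{v} - \frac{1}{u}$$

or

$$\frac{1}{f} = \frac{1}{-50} - \frac{1}{-25} = \frac{1}{50}$$

or f = + 50 cm (convex lens).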

Example 9.11
(a) The far point of a myopic person is 80 cm in front of the eye. What is the
power of the lens required to enable him to see very distant objects clearly?
(b) In what way does the corrective lens help the above person? Does the lens
magnify very distant objects? Explain carefully.
(c) The above person prefers to remove his spectacles while reading a book.
Explain why?
Solution
(a) Solving as in the previous example, we find that the person should use a
concave lens of focal length = -80 cm, i.e., of power = -1.25 dioptres.
(b) No. The concave lens, in fact, reduces the size of the object, but the angle
subtended by the distant object at the eye is the same as the angle subtended
by the image (at the far point) at the eye. The eye is able to see distant objects
not because the corrective lens magnifies the object, but because it brings the
object (i.e., it produces virtual image of the object) at the far point of the eye
which then can be focussed by the eye-lens on the retina.
(c) The myopic person may have a normal near point, i.e., about
25 cm (or even less). In order to read a book with the spectacles, such a
person must keep the book at a distance greater than
25 cm so that the image of the book by the concave lens is produced not
closer than 25 cm. The angular size of the book (or its image) at the greater
distance is evidently less than the angular size when the book is placed at 25
cm and no spectacles are needed. Hence, the person prefers to remove the
spectacles while reading.
Example 9.12 (a) The near point of a hypermetropic person is 75 cm from the
eye. What is the power of the lens required to enable the person to read
clearly a book held at 25 cm from the eye? (b) In what way does the corrective
lens help the above person? Does the lens magnify objects held near the eye?
(c) The above person prefers to remove the spectacles while looking at the
sky. Explain why?
Solution
(a) u = -25 cm, v = -75 cm
1/f = 1/25 - 1/75, i.e., f = 37.5 cm.
The corrective lens needs to have a converging power of +2.67 dioptres.
(b) The corrective lens produces a virtual image (at 75 cm) of an object at 25
cm. The angular size of this image is the same as that of the object. In this
sense the lens does not magnify the object but merely brings the object to the
near point of the hypermetropic eye, which then gets focussed on the retina.
However, the angular size is greater than that of the same object at the near
point (75 cm) viewed without the spectacles.
(c) A hypermetropic eye may have normal far point i.e., it may have enough
converging power to focus parallel rays from infinity on the retina of the
shortened eyeball. Wearing spectacles of converging lenses (used for near
vision) will amount to more converging power than needed for parallel rays.
Hence the person prefers not to use the spectacles for far objects.
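The three spectacle examples above all use the thin lens formula 1/v - 1/u = 1/f with the Cartesian sign convention. As a minimal sketch (the function name is hypothetical and not part of the original text), the required power can be computed as follows.

```python
import math


def lens_power_dioptre(u_cm, v_cm):
    """Power (dioptres) of a thin lens that images an object at u_cm to v_cm.

    Both distances follow the Cartesian sign convention, so an object or a
    virtual image in front of the lens has a negative distance.
    """
    inv_f_per_cm = 1.0 / v_cm - 1.0 / u_cm   # thin lens formula, 1/f in cm^-1
    return 100.0 * inv_f_per_cm              # 1 dioptre = 1 m^-1 = 0.01 cm^-1


if __name__ == "__main__":
    print("Example 9.10:", lens_power_dioptre(-25.0, -50.0))       # f = +50 cm
    print("Example 9.11:", lens_power_dioptre(-math.inf, -80.0))   # f = -80 cm
    print("Example 9.12:", round(lens_power_dioptre(-25.0, -75.0), 2))
```

The sketch reproduces the powers obtained above: +2 dioptres, -1.25 dioptres and about +2.67 dioptres, respectively.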

9.9.2 The microscope

A simple magnifier or microscope is a converging lens of small focal
length (Fig. 9.30). In order to use such a lens as a microscope, the
lens is held near the object, one focal length away or less, and the eye
is positioned close to the lens on the other side. The idea is to get an
erect, magnified and virtual image of the object at a distance so that it
can be viewed comfortably, i.e., at 25 cm or more. If the object is at a
distance f, the image is at infinity. However, if the object is at a
distance slightly less than the focal length of the lens, the image is
virtual and closer than infinity. Although the closest comfortable
distance for viewing the image is when it is at the near point (distance
D ≅ 25 cm), it causes some strain on the eye. Therefore, the image
formed at infinity is often considered most suitable for viewing by the
relaxed eye. We show both cases, the first in Fig. 9.30(a), and the
second in Fig. 9.30(b) and (c).
The linear magnification m, for the image formed at the near point D,
by a simple microscope can be obtained by using the relation

$$m = \frac{v}{u} = v\left(\frac{1}{v} - \frac{1}{f}\right) = 1 - \frac{v}{f}$$

Now according to our sign convention, v is negative, and is equal in
magnitude to D. Thus, the magnification is

$$m = 1 + \frac{D}{f} \qquad (9.39)$$
Since D is about 25 cm, to have a magnification of six, one needs a
convex lens of focal length, f = 5 cm.
Note that m = h′/h where h is the size of the object and h′ the size of
the image. This is also the ratio of the angle subtended by the image
to that subtended by the object, if placed at D for comfortable viewing.
(Note that this is not the angle actually subtended by the object at the
eye, which is h/u.) What a single-lens simple magnifier achieves is
that it allows the object to be brought closer to the eye than D.
Figure 9.30 A simple microscope; (a) the magnifying lens is located such that the
image is at the near point, (b) the angle subtended by the object, is the same as
that at the near point, and (c) the object near the focal point of the lens; the image
is far off but closer than infinity.

We will now find the magnification when the image is at infinity. In this
case we will have to obtain the angular magnification. Suppose the
object has a height h. The maximum angle it can subtend, and be
clearly visible (without a lens), is when it is at the near point, i.e., a
distance D. The angle subtended is then given by

$$\tan\theta_o = \frac{h}{D} \approx \theta_o \qquad (9.40)$$

We now find the angle subtended at the eye by the image when the
object is at u. From the relations

$$\frac{h'}{h} = m = \frac{v}{u}$$

we have the angle subtended by the image

$$\tan\theta_i = \frac{h'}{-v} = \frac{h}{-v}\cdot\frac{v}{u} = \frac{h}{-u} \approx \theta_i$$

The angle subtended by the object, when it is at u = -f, is

$$\theta_i = \frac{h}{f} \qquad (9.41)$$

as is clear from Fig. 9.30(c). The angular magnification is, therefore,

$$m = \frac{\theta_i}{\theta_o} = \frac{D}{f} \qquad (9.42)$$
This is one less than the magnification when the image is at the near
point, Eq. (9.39), but the viewing is more comfortable and the
difference in magnification is usually small. In subsequent discussions
of optical instruments (microscope and telescope) we shall assume
the image to be at infinity.
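The two results for a simple microscope, m = 1 + D/f for the image at the near point and m = D/f for the image at infinity, can be compared with a short sketch. The function names are hypothetical; D = 25 cm and f = 5 cm are the values used in the text.

```python
D_CM = 25.0  # least distance of distinct vision, as taken in the text


def magnification_near_point(f_cm, d_cm=D_CM):
    """Magnification of a simple microscope with the image at the near point."""
    return 1.0 + d_cm / f_cm


def magnification_at_infinity(f_cm, d_cm=D_CM):
    """Angular magnification of a simple microscope with the image at infinity."""
    return d_cm / f_cm


if __name__ == "__main__":
    f = 5.0  # focal length in cm
    print("image at near point:", magnification_near_point(f))
    print("image at infinity  :", magnification_at_infinity(f))
```

For f = 5 cm this gives 6 and 5, which shows the difference of one between the two viewing conditions.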
A simple microscope has a limited maximum magnification (≤ 9) for
realistic focal lengths. For much larger magnifications, one uses two
lenses, one compounding the effect of the other. This is known as a
compound microscope. A schematic diagram of a compound
microscope is shown in Fig. 9.31. The lens nearest the object, called
the objective, forms a real, inverted, magnified image of the object.
This serves as the object for the second lens, the eyepiece, which
functions essentially like a simple microscope or magnifier and produces
the final image, which is enlarged and virtual. The first inverted image
is thus near (at or within) the focal plane of the eyepiece, at a distance
appropriate for final image formation at infinity, or a little closer for
image formation at the near point. Clearly, the final image is inverted
with respect to the original object.
We now obtain the magnification due to a compound microscope. The
ray diagram of Fig. 9.31 shows that the (linear) magnification due to the
objective, namely h′/h, equals

$$m_o = \frac{h'}{h} = \frac{L}{f_o} \qquad (9.43)$$

where we have used the result

$$\tan\beta = \frac{h}{f_o} = \frac{h'}{L}$$

Here h′ is the size of the first image, the object size being h and fo
being the focal length of the objective. The first image is formed near
the focal point of the eyepiece. The distance L, i.e., the distance
between the second focal point of the objective and the first focal point
of the eyepiece (focal length fe) is called the tube length of the
compound microscope.

Figure 9.31 Ray diagram for the formation of image by a compound microscope.

As the first inverted image is near the focal point of the eyepiece, we
use the result from the discussion above for the simple microscope to
obtain the (angular) magnification me due to it [Eq. (9.39)], when the
final image is formed at the near point:

$$m_e = 1 + \frac{D}{f_e} \qquad [9.44(a)]$$

When the final image is formed at infinity, the angular magnification
due to the eyepiece [Eq. (9.42)] is

$$m_e = \frac{D}{f_e} \qquad [9.44(b)]$$

Thus, the total magnification [according to Eq. (9.33)], when the
image is formed at infinity, is

$$m = m_o m_e = \frac{L}{f_o}\cdot\frac{D}{f_e} \qquad (9.45)$$
Clearly, to achieve a large magnification of a small object (hence the
name microscope), the objective and eyepiece should have small
focal lengths. In practice, it is difficult to make the focal length much
smaller than 1 cm. Also large lenses are required to make L large.
For example, with an objective with fo = 1.0 cm, and an eyepiece with
focal length fe = 2.0 cm, and a tube length of 20 cm, the magnification
is

$$m = m_o m_e = \frac{L}{f_o}\cdot\frac{D}{f_e} = \frac{20}{1}\times\frac{25}{2} = 250$$

Various other factors, such as illumination of the object, contribute to
the quality and visibility of the image. In modern microscopes, multi-component
lenses are used for both the objective and the eyepiece to
improve image quality by minimising various optical aberrations
(defects) in lenses.
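The magnification of a compound microscope, the product of the objective magnification L/fo and the eyepiece magnification, can be sketched as below. The function name and the `image_at_infinity` flag are hypothetical; the numbers are those of the worked example above (fo = 1.0 cm, fe = 2.0 cm, L = 20 cm, D = 25 cm).

```python
def compound_magnification(tube_length_cm, f_objective_cm, f_eyepiece_cm,
                           d_cm=25.0, image_at_infinity=True):
    """Approximate magnification m = m_o * m_e of a compound microscope."""
    m_o = tube_length_cm / f_objective_cm          # objective: L / fo
    if image_at_infinity:
        m_e = d_cm / f_eyepiece_cm                 # eyepiece: D / fe
    else:
        m_e = 1.0 + d_cm / f_eyepiece_cm           # eyepiece: 1 + D / fe
    return m_o * m_e


if __name__ == "__main__":
    print("final image at infinity  :", compound_magnification(20.0, 1.0, 2.0))
    print("final image at near point:",
          compound_magnification(20.0, 1.0, 2.0, image_at_infinity=False))
```

With the final image at infinity this gives 250, as in the example; with the final image at the near point the same lenses give 270.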

9.9.3 Telescope

The telescope is used to provide angular magnification of distant
objects (Fig. 9.32). It also has an objective and an eyepiece. But here,
the objective has a large focal length and a much larger aperture than
the eyepiece. Light from a distant object enters the objective and a
real image is formed in the tube at its second focal point. The
eyepiece magnifies this image producing a final inverted image. The
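magnifying power m is the ratio of the angle β subtended at the eye by
the final image to the angle α which the object subtends at the lens or
the eye. Hence

$$m \approx \frac{\beta}{\alpha} \approx \frac{h}{f_e}\cdot\frac{f_o}{h} = \frac{f_o}{f_e} \qquad (9.46)$$

where h is the height of the real image formed by the objective.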

In this case, the length of the telescope tube is fo + fe.

Terrestrial telescopes have, in addition, a pair of inverting lenses to
make the final image erect. Refracting telescopes can be used both
for terrestrial and astronomical observations. For example, consider a
telescope whose objective has a focal length of 100 cm and the
eyepiece a focal length of 1 cm. The magnifying power of this
telescope is m = 100/1 = 100.
Let us consider a pair of stars of actual separation 1′ (one minute of
arc). The stars appear as though they are separated by an angle of
100 × 1′ = 100′ = 1.67°.
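The telescope numbers above follow directly from m = fo/fe and the tube length fo + fe. A minimal sketch, with hypothetical function names, is given below for the telescope with fo = 100 cm and fe = 1 cm.

```python
def telescope_magnifying_power(f_objective_cm, f_eyepiece_cm):
    """Magnifying power m = fo / fe of a refracting telescope."""
    return f_objective_cm / f_eyepiece_cm


def tube_length_cm(f_objective_cm, f_eyepiece_cm):
    """Length of the telescope tube, fo + fe."""
    return f_objective_cm + f_eyepiece_cm


def apparent_separation_deg(true_separation_arcmin, magnifying_power):
    """Apparent angular separation in degrees (60 arc minutes = 1 degree)."""
    return true_separation_arcmin * magnifying_power / 60.0


if __name__ == "__main__":
    m = telescope_magnifying_power(100.0, 1.0)
    print("magnifying power:", m)
    print("tube length (cm):", tube_length_cm(100.0, 1.0))
    print("apparent separation (deg):", round(apparent_separation_deg(1.0, m), 2))
```

Two stars one minute of arc apart thus appear about 1.67 degrees apart through this telescope, whose tube is 101 cm long.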

The main considerations with an astronomical telescope are its light
gathering power and its resolution or resolving power. The former
clearly depends on the area of the objective. With larger diameters,
fainter objects can be observed. The resolving power, or the ability to
observe two objects distinctly, which are in very nearly the same
direction, also depends on the diameter of the objective. So, the
desirable aim in optical telescopes is to make them with an objective of
large diameter. The largest lens objective in use has a diameter of 40
inch (~1.02 m). It is at the Yerkes Observatory in Wisconsin, USA.
Such big lenses tend to be very heavy and therefore, difficult to make
and support by their edges. Further, it is rather difficult and expensive
to make such large sized lenses which form images that are free from
any kind of chromatic aberration and distortions.
Figure 9.32 A refracting telescope.

For these reasons, modern telescopes use a concave mirror rather
than a lens for the objective. Telescopes with mirror objectives are
called reflecting telescopes. They have several advantages. First,
there is no chromatic aberration in a mirror. Second, if a parabolic
reflecting surface is chosen, spherical aberration is also removed.
Mechanical support is much less of a problem since a mirror weighs
much less than a lens of equivalent optical quality, and can be
supported over its entire back surface, not just over its rim. One
obvious problem with a reflecting telescope is that the objective mirror
focusses light inside the telescope tube. One must have an eyepiece
and the observer right there, obstructing some light (depending on the
size of the observer cage). This is what is done in the very large 200
inch (~5.08 m) diameter Mt. Palomar telescope, California. The
viewer sits near the focal point of the mirror, in a small cage. Another
solution to the problem is to deflect the light being focussed by
another mirror. One such arrangement using a convex secondary
mirror to focus the incident light, which now passes through a hole in
the objective primary mirror, is shown in Fig. 9.33. This is known as a
Cassegrain telescope, after its inventor. It has the advantages of a
large focal length in a short telescope. The largest telescope in India is
in Kavalur, Tamil Nadu. It is a 2.34 m diameter reflecting telescope
(Cassegrain). It was ground, polished, set up, and is being used by
the Indian Institute of Astrophysics, Bangalore. The largest reflecting
telescopes in the world are the pair of Keck telescopes in Hawaii,
USA, each with a reflector 10 metres in diameter.
Figure 9.33 Schematic diagram of a reflecting telescope (Cassegrain).
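Since the light gathering power depends on the area of the objective, the telescopes mentioned in this section can be compared by the square of their diameters. The sketch below assumes that light gathering power is simply proportional to the area of a circular objective and ignores any obstruction by a secondary mirror or observer cage. The function name is hypothetical; the diameters are those quoted in the text.

```python
# Objective diameters in metres, as quoted in the text
DIAMETERS_M = {
    "Yerkes refractor": 1.02,
    "Kavalur reflector": 2.34,
    "Mt. Palomar reflector": 5.08,
    "Keck reflector": 10.0,
}


def light_gathering_ratio(d1_m, d2_m):
    """Ratio of collecting areas of two circular objectives, (d1 / d2)**2."""
    return (d1_m / d2_m) ** 2


if __name__ == "__main__":
    reference = DIAMETERS_M["Yerkes refractor"]
    for name, diameter in DIAMETERS_M.items():
        ratio = light_gathering_ratio(diameter, reference)
        print(f"{name}: {ratio:.1f} times the Yerkes collecting area")
```

Under this assumption a 10 m mirror collects roughly 96 times as much light as the 1.02 m Yerkes lens.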

Summary
1. Reflection is governed by the equation ∠i = ∠r′ and refraction by Snell's
law, sin i/sin r = n, where the incident ray, reflected ray, refracted ray and
normal lie in the same plane. Angles of incidence, reflection and refraction are
i, r′ and r, respectively.
2. The critical angle of incidence ic for a ray incident from a denser to rarer
medium, is that angle for which the angle of refraction is 90°. For
i > ic, total internal reflection occurs. Multiple internal reflections in diamond (ic
≅ 24.4°), totally reflecting prisms and mirage, are some examples of total
internal reflection. Optical fibres consist of glass fibres coated with a thin layer
of material of lower refractive index. Light incident at an angle at one end
comes out at the other, after multiple internal reflections, even if the fibre is
bent.
3. Cartesian sign convention: Distances measured in the same direction as the
incident light are positive; those measured in the opposite direction are
negative. All distances are measured from the pole/optic centre of the
mirror/lens on the principal axis. The heights measured upwards above x-axis
and normal to the principal axis of the mirror/lens are taken as positive. The
heights measured downwards are taken as negative.
4. Mirror equation:

$$\frac{1}{v} + \frac{1}{u} = \frac{1}{f}$$

where u and v are object and image distances, respectively and f is the focal
length of the mirror. f is (approximately) half the radius of curvature R. f is
negative for concave mirror; f is positive for a convex mirror.
5. For a prism of the angle A, of refractive index n2 placed in a medium of
refractive index n1,

$$n_{21} = \frac{n_2}{n_1} = \frac{\sin\left[(A + D_m)/2\right]}{\sin(A/2)}$$

where Dm is the angle of minimum deviation.


6. For refraction through a spherical interface (from medium 1 to 2 of refractive
index n1 and n2, respectively)

$$\frac{n_2}{v} - \frac{n_1}{u} = \frac{n_2 - n_1}{R}$$

Thin lens formula

$$\frac{1}{v} - \frac{1}{u} = \frac{1}{f}$$

Lens maker's formula

$$\frac{1}{f} = \frac{n_2 - n_1}{n_1}\left(\frac{1}{R_1} - \frac{1}{R_2}\right)$$

R1 and R2 are the radii of curvature of the lens surfaces. f is positive for a
converging lens; f is negative for a diverging lens. The power of a lens P = 1/f.

The SI unit for power of a lens is dioptre (D): 1 D = 1 m⁻¹.


If several thin lenses of focal length f1, f2, f3, ... are in contact, the effective
focal length of their combination is given by

$$\frac{1}{f} = \frac{1}{f_1} + \frac{1}{f_2} + \frac{1}{f_3} + \cdots$$

The total power of a combination of several lenses is
P = P1 + P2 + P3 + ...
7. Dispersion is the splitting of light into its constituent colours.
8. The Eye: The eye has a convex lens of focal length about 2.5 cm. This focal
length can be varied somewhat so that the image is always formed on the
retina. This ability of the eye is called accommodation. In a defective eye, if the
image is focussed before the retina (myopia), a diverging corrective lens is
needed; if the image is focussed beyond the retina (hypermetropia), a
converging corrective lens is needed. Astigmatism is corrected by using
cylindrical lenses.
9. Magnifying power m of a simple microscope is given by m = 1 + (D/f), where
D = 25 cm is the least distance of distinct vision and f is the focal length of the
convex lens. If the image is at infinity, m = D/f. For a compound microscope,
the magnifying power is given by
m = me × mo, where me = 1 + (D/fe) is the magnification due to the eyepiece
and mo is the magnification produced by the objective. Approximately,

$$m = \frac{L}{f_o} \times \frac{D}{f_e}$$

where fo and fe are the focal lengths of the objective and eyepiece,
respectively, and L is the distance between their focal points.
10. Magnifying power m of a telescope is the ratio of the angle β subtended at
the eye by the image to the angle α subtended at the eye by the object.

$$m = \frac{\beta}{\alpha} = \frac{f_o}{f_e}$$

where fo and fe are the focal lengths of the objective and eyepiece,
respectively.
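As an illustration of the rule in item 6 for thin lenses in contact, the sketch below adds the powers of the individual lenses and converts the total back to a focal length. The function names and the focal lengths used (a +20 cm converging lens and a -40 cm diverging lens) are hypothetical and chosen only for illustration.

```python
def combined_power_dioptre(focal_lengths_m):
    """Total power P = P1 + P2 + ... of thin lenses in contact (f in metres)."""
    return sum(1.0 / f for f in focal_lengths_m)


def combined_focal_length_m(focal_lengths_m):
    """Effective focal length (metres) of thin lenses in contact.

    Assumes the total power is not zero.
    """
    return 1.0 / combined_power_dioptre(focal_lengths_m)


if __name__ == "__main__":
    lenses = [0.20, -0.40]  # hypothetical focal lengths in metres
    print("total power (D):", combined_power_dioptre(lenses))
    print("effective focal length (m):", combined_focal_length_m(lenses))
```

Here the powers +5 D and -2.5 D add to +2.5 D, so the combination behaves as a converging lens of focal length 40 cm.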

Points to Ponder
1. The laws of reflection and refraction are true for all surfaces and pairs of
media at the point of the incidence.
2. The real image of an object placed between f and 2f from a convex lens can
be seen on a screen placed at the image location. If the screen is removed, is
the image still there? This question puzzles many, because it is difficult to
reconcile ourselves with an image suspended in air without a screen. But the
image does exist. Rays from a given point on the object are converging to an
image point in space and diverging away. The screen simply diffuses these
rays, some of which reach our eye and we see the image. This can be seen by
the images formed in air during a laser show.
3. Image formation needs regular reflection/refraction. In principle, all rays from
a given point should reach the same image point. This is why you do not see
your image by an irregular reflecting object, say the page of a book.
4. Thick lenses give coloured images due to dispersion. The variety in colour
of objects we see around us is due to the constituent colours of the light
incident on them. A monochromatic light may produce an entirely different
perception about the colours on an object as seen in white light.
5. For a simple microscope, the angular size of the object equals the angular
size of the image. Yet it offers magnification because we can keep the small
object much closer to the eye than 25 cm and hence have it subtend a large
angle. The image is at 25 cm which we can see. Without the microscope, you
would need to keep the small object at 25 cm which would subtend a very
small angle.

Exercises

9.1 A small candle, 2.5 cm in size is placed at 27 cm in front of a
concave mirror of radius of curvature 36 cm. At what distance from
the mirror should a screen be placed in order to obtain a sharp
image? Describe the nature and size of the image. If the candle is
moved closer to the mirror, how would the screen have to be
moved?
9.2 A 4.5 cm needle is placed 12 cm away from a convex mirror of
focal length 15 cm. Give the location of the image and the
magnification. Describe what happens as the needle is moved
farther from the mirror.

9.3 A tank is filled with water to a height of 12.5 cm. The apparent
depth of a needle lying at the bottom of the tank is measured by a
microscope to be 9.4 cm. What is the refractive index of water? If
water is replaced by a liquid of refractive index 1.63 up to the
same height, by what distance would the microscope have to be
moved to focus on the needle again?

9.4 Figures 9.34(a) and (b) show refraction of a ray in air incident
at 60° with the normal to a glass-air and water-air interface,
respectively. Predict the angle of refraction in glass when the
angle of incidence in water is 45° with the normal to a water-glass
interface [Fig. 9.34(c)].

Figure 9.34

9.5 A small bulb is placed at the bottom of a tank containing water
to a depth of 80 cm. What is the area of the surface of water
through which light from the bulb can emerge out? Refractive
index of water is 1.33. (Consider the bulb to be a point source.)
9.6 A prism is made of glass of unknown refractive index. A
parallel beam of light is incident on a face of the prism. The angle
of minimum deviation is measured to be 40°. What is the refractive
index of the material of the prism? The refracting angle of the
prism is 60°. If the prism is placed in water (refractive index 1.33),
predict the new angle of minimum deviation of a parallel beam of
light.
9.7 Double-convex lenses are to be manufactured from a glass of
refractive index 1.55, with both faces of the same radius of
curvature. What is the radius of curvature required if the focal
length is to be 20 cm?
9.8 A beam of light converges at a point P. Now a lens is placed in
the path of the convergent beam 12 cm from P. At what point does
the beam converge if the lens is (a) a convex lens of focal length
20 cm, and (b) a concave lens of focal length 16 cm?

9.9 An object of size 3.0 cm is placed 14 cm in front of a concave
lens of focal length 21 cm. Describe the image produced by the
lens. What happens if the object is moved further away from the
lens?
9.10 What is the focal length of a convex lens of focal length 30 cm
in contact with a concave lens of focal length 20 cm? Is the system
a converging or a diverging lens? Ignore thickness of the lenses.
9.11 A compound microscope consists of an objective lens of focal
length 2.0 cm and an eyepiece of focal length 6.25 cm separated by
a distance of 15 cm. How far from the objective should an object be
placed in order to obtain the final image at (a) the least distance of
distinct vision (25 cm), and (b) at infinity? What is the magnifying
power of the microscope in each case?
9.12 A person with a normal near point (25 cm) using a compound
microscope with objective of focal length 8.0 mm and an eyepiece
of focal length 2.5 cm can bring an object placed at 9.0 mm from
the objective in sharp focus. What is the separation between the
two lenses? Calculate the magnifying power of the microscope.
9.13 A small telescope has an objective lens of focal length 144 cm
and an eyepiece of focal length 6.0 cm. What is the magnifying
power of the telescope? What is the separation between the
objective and the eyepiece?
9.14 (a) A giant refracting telescope at an observatory has an
objective lens of focal length 15 m. If an eyepiece of focal length
1.0 cm is used, what is the angular magnification of the telescope?
(b) If this telescope is used to view the moon, what is the diameter
of the image of the moon formed by the objective lens? The
diameter of the moon is 3.48 × 10^6 m, and the radius of lunar orbit
is 3.8 × 10^8 m.

9.15 Use the mirror equation to deduce that:
(a) an object placed between f and 2f of a concave mirror
produces a real image beyond 2f.
(b) a convex mirror always produces a virtual image independent
of the location of the object.
(c) the virtual image produced by a convex mirror is always
diminished in size and is located between the focus and
the pole.
(d) an object placed between the pole and focus of a concave
mirror produces a virtual and enlarged image.
[Note: This exercise helps you deduce algebraically properties of
images that one obtains from explicit ray diagrams.]

9.16 A small pin fixed on a table top is viewed from above from a
distance of 50cm. By what distance would the pin appear to be
raised if it is viewed from the same point through a 15cm thick
glass slab held parallel to the table? Refractive index of glass =
1.5. Does the answer depend on the location of the slab?
9.17 (a) Figure 9.35 shows a cross-section of a light pipe made of
a glass fibre of refractive index 1.68. The outer covering of the
pipe is made of a material of refractive index 1.44. What is the
range of the angles of the incident rays with the axis of the pipe for
which total reflections inside the pipe take place, as shown in the
figure?

(b) What is the answer if there is no outer covering of the pipe?

Figure 9.35
9.18 Answer the following questions:

(a) You have learnt that plane and convex mirrors produce virtual
images of objects. Can they produce real images under some
circumstances? Explain.
(b) A virtual image, we always say, cannot be caught on a screen.
Yet when we see a virtual image, we are obviously bringing it on
to the screen (i.e., the retina) of our eye. Is there a contradiction?
(c) A diver under water, looks obliquely at a fisherman standing on
the bank of a lake. Would the fisherman look taller or shorter to the
diver than what he actually is?
(d) Does the apparent depth of a tank of water change if viewed
obliquely? If so, does the apparent depth increase or decrease?
(e) The refractive index of diamond is much greater than that of
ordinary glass. Is this fact of some use to a diamond cutter?
9.19 The image of a small electric bulb fixed on the wall of a room
is to be obtained on the opposite wall 3m away by means of a
large convex lens. What is the maximum possible focal length of
the lens required for the purpose?
9.20 A screen is placed 90cm from an object. The image of the
object on the screen is formed by a convex lens at two different
locations separated by 20cm. Determine the focal length of the
lens.
9.21 (a) Determine the effective focal length of the combination of
the two lenses in Exercise 9.10, if they are placed 8.0cm apart
with their principal axes coincident. Does the answer depend on
which side of the combination a beam of parallel light is incident?
Is the notion of effective focal length of this system useful at all?
(b) An object 1.5 cm in size is placed on the side of the convex
lens in the arrangement (a) above. The distance between the
object and the convex lens is 40cm. Determine the magnification
produced by the two-lens system, and the size of the image.

9.22 At what angle should a ray of light be incident on the face of a
prism of refracting angle 60° so that it just suffers total internal
reflection at the other face? The refractive index of the material of
the prism is 1.524.
9.23 You are given prisms made of crown glass and flint glass with
a wide variety of angles. Suggest a combination of prisms which
will
(a) deviate a pencil of white light without much dispersion,
(b) disperse (and displace) a pencil of white light without much
deviation.

9.24 For a normal eye, the far point is at infinity and the near point
of distinct vision is about 25cm in front of the eye. The cornea of
the eye provides a converging power of about 40 dioptres, and the
least converging power of the eye-lens behind the cornea is about
20 dioptres. From this rough data estimate the range of
accommodation (i.e., the range of converging power of the eye-lens)
of a normal eye.
9.25 Does short-sightedness (myopia) or long-sightedness
(hypermetropia) imply necessarily that the eye has partially lost its ability
of accommodation? If not, what might cause these defects of
vision?

9.26 A myopic person has been using spectacles of power −1.0
dioptre for distant vision. During old age he also needs to use
separate reading glass of power + 2.0 dioptres. Explain what may
have happened.
9.27 A person looking at a person wearing a shirt with a pattern
comprising vertical and horizontal lines is able to see the vertical
lines more distinctly than the horizontal ones. What is this defect
due to? How is such a defect of vision corrected?
9.28 A man with normal near point (25 cm) reads a book with
small print using a magnifying glass: a thin convex lens of focal
length 5 cm.
(a) What is the closest and the farthest distance at which he
should keep the lens from the page so that he can read the book
when viewing through the magnifying glass?
(b) What is the maximum and the minimum angular magnification
(magnifying power) possible using the above simple microscope?

9.29 A card sheet divided into squares each of size 1 mm² is
being viewed at a distance of 9 cm through a magnifying glass (a
converging lens of focal length 9 cm) held close to the eye.
(a) What is the magnification produced by the lens? How much is
the area of each square in the virtual image?
(b) What is the angular magnification (magnifying power) of the
lens?
(c) Is the magnification in (a) equal to the magnifying power in (b)?
Explain.

9.30 (a) At what distance should the lens be held from the figure in
Exercise 9.29 in order to view the squares distinctly with the
maximum possible magnifying power?
(b) What is the magnification in this case?
(c) Is the magnification equal to the magnifying power in this case?
Explain.
9.31 What should be the distance between the object in Exercise
9.30 and the magnifying glass if the virtual image of each square
in the figure is to have an area of 6.25 mm²? Would you be able to
see the squares distinctly with your eyes very close to the
magnifier?
[Note: Exercises 9.29 to 9.31 will help you clearly understand the
difference between magnification in absolute size and the angular
magnification (or magnifying power) of an instrument.]
9.32 Answer the following questions:

(a) The angle subtended at the eye by an object is equal to the


angle subtended at the eye by the virtual image produced by a
magnifying glass. In what sense then does a magnifying glass
provide angular magnification?
(b) In viewing through a magnifying glass, one usually positions
one's eyes very close to the lens. Does angular magnification
change if the eye is moved back?
(c) Magnifying power of a simple microscope is inversely
proportional to the focal length of the lens. What then stops us
from using a convex lens of smaller and smaller focal length and
achieving greater and greater magnifying power?
(d) Why must both the objective and the eyepiece of a compound
microscope have short focal lengths?
(e) When viewing through a compound microscope, our eyes
should be positioned not on the eyepiece but a short distance
away from it for best viewing. Why? How much should be that
short distance between the eye and eyepiece?
9.33 An angular magnification (magnifying power) of 30X is
desired using an objective of focal length 1.25cm and an eyepiece
of focal length 5cm. How will you set up the compound microscope?
9.34 A small telescope has an objective lens of focal length 140cm
and an eyepiece of focal length 5.0cm. What is the magnifying
power of the telescope for viewing distant objects when
(a) the telescope is in normal adjustment (i.e., when the final
image is at infinity)?

(b) the final image is formed at the least distance of distinct vision
(25cm)?
9.35 (a) For the telescope described in Exercise 9.34 (a), what is
the separation between the objective lens and the eyepiece?
(b) If this telescope is used to view a 100 m tall tower 3 km away,
what is the height of the image of the tower formed by the
objective lens?
(c) What is the height of the final image of the tower if it is formed
at 25cm?

9.36 A Cassegrain telescope uses two mirrors as shown in Fig.
9.33. Such a telescope is built with the mirrors 20mm apart. If the
radius of curvature of the large mirror is 220mm and the small
mirror is 140mm, where will the final image of an object at infinity be?

9.37 Light incident normally on a plane mirror attached to a
galvanometer coil retraces backwards as shown in Fig. 9.36. A
current in the coil produces a deflection of 3.5° of the mirror. What
is the displacement of the reflected spot of light on a screen
placed 1.5 m away?

Figure 9.36

9.38 Figure 9.37 shows an equiconvex lens (of refractive index
1.50) in contact with a liquid layer on top of a plane mirror. A small
needle with its tip on the principal axis is moved along the axis
until its inverted image is found at the position of the needle. The
distance of the needle from the lens is measured to be 45.0cm.
The liquid is removed and the experiment is repeated. The new
distance is measured to be 30.0cm. What is the refractive index of
the liquid?

Figure 9.37
Chapter Ten

Wave Optics

10.1 Introduction

In 1637 Descartes gave the corpuscular model of light and derived
Snell's law. It explained the laws of reflection and refraction of light at
an interface. The corpuscular model predicted that if the ray of light
(on refraction) bends towards the normal then the speed of light would
be greater in the second medium. This corpuscular model of light was
further developed by Isaac Newton in his famous book entitled
OPTICKS and because of the tremendous popularity of this book, the
corpuscular model is very often attributed to Newton.
In 1678, the Dutch physicist Christiaan Huygens put forward the wave
theory of light; it is this wave model of light that we will discuss in this
chapter. As we will see, the wave model could satisfactorily explain
the phenomena of reflection and refraction; however, it predicted that
on refraction if the wave bends towards the normal then the speed of
light would be less in the second medium. This is in contradiction to
the prediction made by using the corpuscular model of light. Much
later, experiments showed that the speed of light in water is less
than the speed in air, confirming the prediction of the wave model;
Foucault carried out this experiment in 1850.
The wave theory was not readily accepted primarily because of
Newton's authority and also because light could travel through
vacuum and it was felt that a wave would always require a medium to
propagate from one point to the other. However, when Thomas Young
performed his famous interference experiment in 1801, it was firmly
established that light is indeed a wave phenomenon. The wavelength
of visible light was measured and found to be extremely small; for
example, the wavelength of yellow light is about 0.5 μm. Because of
the smallness of the wavelength of visible light (in comparison to the
dimensions of typical mirrors and lenses), light can be assumed to
approximately travel in straight lines. This is the field of geometrical
optics, which we had discussed in the previous chapter. Indeed, the
branch of optics in which one completely neglects the finiteness of the
wavelength is called geometrical optics and a ray is defined as the
path of energy propagation in the limit of wavelength tending to zero.
After the interference experiment of Young in 1801, for the next 40
years or so, many experiments were carried out involving the
interference and diffraction of light waves; these experiments could
only be satisfactorily explained by assuming a wave model of light.
Thus, around the middle of the nineteenth century, the wave theory
seemed to be very well established. The only major difficulty was that
since it was thought that a wave required a medium for its
propagation, how could light waves propagate through vacuum? This
was explained when Maxwell put forward his famous electromagnetic
theory of light. Maxwell had developed a set of equations describing
the laws of electricity and magnetism and using these equations he
derived what is known as the wave equation from which he predicted
the existence of electromagnetic waves*. From the wave equation,
Maxwell could calculate the speed of electromagnetic waves in free
space and he found that the theoretical value was very close to the
measured value of speed of light. From this, he propounded that light
must be an electromagnetic wave. Thus, according to Maxwell, light
waves are associated with changing electric and magnetic fields; a
changing electric field produces a time- and space-varying magnetic
field, and a changing magnetic field produces a time- and space-varying
electric field. The changing electric and magnetic fields result in the
propagation of electromagnetic waves (or light waves) even in
vacuum.
In this chapter we will first discuss the original formulation of the
Huygens principle and derive the laws of reflection and refraction. In
Sections 10.4 and 10.5, we will discuss the phenomenon of
interference which is based on the principle of superposition. In
Section 10.6 we will discuss the phenomenon of diffraction which is
based on Huygens-Fresnel principle. Finally in Section 10.7 we will
discuss the phenomenon of polarisation which is based on the fact
that the light waves are transverse electromagnetic waves.
* Maxwell had predicted the existence of electromagnetic waves
around 1855; it was much later (around 1890) that Heinrich Hertz
produced radio waves in the laboratory. J.C. Bose and G. Marconi
made practical applications of the Hertzian waves.

Does light travel in a straight line?

Light travels in a straight line in Class VI; it does not do so in Class XII and
beyond! Surprised, aren't you?
In school, you are shown an experiment in which you take three cardboards
with pinholes in them, place a candle on one side and look from the other side.
If the flame of the candle and the three pinholes are in a straight line, you can
see the candle. Even if one of them is displaced a little, you cannot see the
candle. This proves, so your teacher says, that light travels in a straight line.
In the present book, there are two consecutive chapters, one on ray optics and
the other on wave optics. Ray optics is based on rectilinear propagation of
light, and deals with mirrors, lenses, reflection, refraction, etc. Then you come
to the chapter on wave optics, and you are told that light travels as a wave,
that it can bend around objects, it can diffract and interfere, etc.
In optical region, light has a wavelength of about half a micrometre. If it
encounters an obstacle of about this size, it can bend around it and can be
seen on the other side. Thus a micrometre size obstacle will not be able to
stop a light ray. If the obstacle is much larger, however, light will not be able to
bend to that extent, and will not be seen on the other side.
This is a property of a wave in general, and can be seen in sound waves too.
The sound wave of our speech has a wavelength of about 50cm to 1 m. If it
meets an obstacle of the size of a few metres, it bends around it and reaches
points behind the obstacle. But when it comes across a larger obstacle of a
few hundred metres, such as a hillock, most of it is reflected and is heard as
an echo.
Then what about the primary school experiment? What happens there is that
when we move any cardboard, the displacement is of the order of a few
millimetres, which is much larger than the wavelength of light. Hence the
candle cannot be seen. If we are able to move one of the cardboards by a
micrometre or less, light will be able to diffract, and the candle will still be seen.
One could add to the first sentence in this box: It learns how to bend as it
grows up!

10.2 Huygens Principle


Figure 10.1 (a) A diverging spherical wave emanating from a point source. The
wavefronts are spherical.

We would first define a wavefront: when we drop a small stone on a


calm pool of water, waves spread out from the point of impact. Every
point on the surface starts oscillating with time. At any instant, a
photograph of the surface would show circular rings on which the
disturbance is maximum. Clearly, all points on such a circle are
oscillating in phase because they are at the same distance from the
source. Such a locus of points which oscillate in phase is called a
wavefront; thus a wavefront is defined as a surface of constant phase.
The speed with which the wavefront moves outwards from the source
is called the speed of the wave. The energy of the wave travels in a
direction perpendicular to the wavefront.
If we have a point source emitting waves uniformly in all directions,
then the locus of points which have the same amplitude and vibrate in
the same phase are spheres and we have what is known as a
spherical wave as shown in Fig. 10.1(a). At a large distance from the
source, a small portion of the sphere can be considered as a plane
and we have what is known as a plane wave [Fig. 10.1(b)].
Figure 10.1 (b) At a large distance from the source, a small portion of the spherical
wave can be approximated by a plane wave.
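
To get a feel for how good the plane-wave approximation is, the short sketch below computes how far a spherical wavefront of radius R bulges away from a flat plane across a portion of width w (the geometric "sag" of the spherical cap). The function name and the sample numbers are illustrative assumptions, not part of the text.

```python
import math

def wavefront_sag(radius_m, aperture_m):
    """Maximum departure of a spherical cap from its chord plane (the sagitta)."""
    half = aperture_m / 2.0
    if half > radius_m:
        raise ValueError("aperture cannot exceed the sphere's diameter")
    return radius_m - math.sqrt(radius_m**2 - half**2)

# A 1 cm wide portion of the wavefront at increasing distances from the source.
for R in (0.1, 1.0, 10.0, 100.0):
    print(f"R = {R:6.1f} m  sag = {wavefront_sag(R, 0.01):.3e} m")
```

For w much smaller than R the sag is close to w²/(8R), so moving ten times farther from the source makes the same portion of the wavefront about ten times flatter.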

Now, if we know the shape of the wavefront at t = 0, then Huygens
principle allows us to determine the shape of the wavefront at a
later time τ. Thus, Huygens principle is essentially a geometrical
construction which, given the shape of the wavefront at any time,
allows us to determine the shape of the wavefront at a later time. Let
us consider a diverging wave and let F1F2 represent a portion of the
spherical wavefront at t = 0 (Fig. 10.2). Now, according to Huygens
principle, each point of the
wavefront is the source of a secondary disturbance and the wavelets
emanating from these points spread out in all directions with the
speed of the wave. These wavelets emanating from the wavefront are
usually referred to as secondary wavelets and if we draw a common
tangent to all these spheres, we obtain the new position of the
wavefront at a later time.
Figure 10.2 F1F2 represents the spherical wavefront (with O as centre) at t = 0.
The envelope of the secondary wavelets emanating from F1F2 produces the
forward moving wavefront G1G2. The backwave D1D2 does not exist.

Thus, if we wish to determine the shape of the wavefront at t = τ, we
draw spheres of radius vτ from each point on the spherical wavefront,
where v represents the speed of the waves in the medium. If we now
draw a common tangent to all these spheres, we obtain the new
position of the wavefront at t = τ. The new wavefront shown as G1G2
in Fig. 10.2 is again spherical with point O as the centre.
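
The construction can be checked numerically. The sketch below works in a two-dimensional cross-section: it places sources on a circular wavefront of radius r0 about O, draws a secondary wavelet of radius vτ around each, and finds the largest and smallest distances from O reached by any wavelet. The function name, sampling counts and sample values are assumptions chosen for illustration.

```python
import math

def wavelet_envelope(r0, v, tau, n_sources=90, n_points=360):
    """Return (outer, inner) radii, measured from O, of the envelope of
    secondary wavelets launched from a circular wavefront of radius r0."""
    radius = v * tau                      # radius of each secondary wavelet
    distances = []
    for s in range(n_sources):
        a = 2 * math.pi * s / n_sources   # position of the source on the wavefront
        sx, sy = r0 * math.cos(a), r0 * math.sin(a)
        for p in range(n_points):
            b = 2 * math.pi * p / n_points
            x = sx + radius * math.cos(b)
            y = sy + radius * math.sin(b)
            distances.append(math.hypot(x, y))
    return max(distances), min(distances)

outer, inner = wavelet_envelope(r0=2.0, v=1.0, tau=0.5)
print(outer, inner)   # close to r0 + v*tau and r0 - v*tau
```

The outer envelope is the forward wavefront of radius r0 + vτ centred on O. The inner envelope of radius r0 − vτ is the purely geometrical backwave, which the construction alone cannot rule out.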
Figure 10.3 Huygens geometrical construction for a plane wave propagating to the
right. F1F2 is the plane wavefront at t = 0 and G1G2 is the wavefront at a later
time τ. The lines A1A2, B1B2, etc., are normal to both F1F2 and G1G2 and
represent rays.

The above model has one shortcoming: we also have a backwave
which is shown as D1D2 in Fig. 10.2. Huygens argued that the
amplitude of the secondary wavelets is maximum in the forward
direction and zero in the backward direction; by making this ad hoc
assumption, Huygens could explain the absence of the backwave.
However, this ad hoc assumption is not satisfactory and the absence
of the backwave is really justified from more rigorous wave theory.
In a similar manner, we can use Huygens principle to determine the
shape of the wavefront for a plane wave propagating through a
medium (Fig. 10.3).

10.3 Refraction and Reflection of Plane Waves using Huygens Principle

10.3.1 Refraction of a plane wave

We will now use Huygens principle to derive the laws of refraction. Let
PP′ represent the surface separating medium 1 and medium 2, as
shown in Fig. 10.4.

Figure 10.4 A plane wave AB is incident at an angle i on the surface PP′ separating
medium 1 and medium 2. The plane wave undergoes refraction and CE represents
the refracted wavefront. The figure corresponds to v2 < v1 so that the refracted
wave bends towards the normal.

Let v1 and v2 represent the speed of light in medium 1 and medium 2,
respectively. We assume a plane wavefront AB propagating in the
direction A′A incident on the interface at an angle i as shown in the
figure. Let τ be the time taken by the wavefront to travel the distance
BC. Thus,
BC = v1 τ

Christiaan Huygens (1629–1695) Dutch physicist, astronomer, mathematician
and the founder of the wave theory of light. His book, Treatise on light, makes
fascinating reading even today. He brilliantly explained the double refraction
shown by the mineral calcite in this work in addition to reflection and refraction.
He was the first to analyse circular and simple harmonic motion and designed
and built improved clocks and telescopes. He discovered the true geometry of
Saturn's rings.
In order to determine the shape of the refracted wavefront, we draw a
sphere of radius v2 τ from the point A in the second medium (the speed
of the wave in the second medium is v2). Let CE represent a tangent
plane drawn from the point C on to the sphere. Then, AE = v2 τ and
CE would represent the refracted wavefront. If we now consider the
triangles ABC and AEC, we readily obtain

$\sin i = \dfrac{BC}{AC} = \dfrac{v_1 \tau}{AC}$ (10.1)

and

$\sin r = \dfrac{AE}{AC} = \dfrac{v_2 \tau}{AC}$ (10.2)
where i and r are the angles of incidence and refraction, respectively.
Thus we obtain

$\dfrac{\sin i}{\sin r} = \dfrac{v_1}{v_2}$ (10.3)
From the above equation, we get the important result that if r < i (i.e., if
the ray bends toward the normal), the speed of the light wave in the
second medium (v2) will be less than the speed of the light wave in the
first medium (v1). This prediction is opposite to the prediction from the
corpuscular model of light and as later experiments showed, the
prediction of the wave theory is correct. Now, if c represents the speed
of light in vacuum, then,

$n_1 = \dfrac{c}{v_1}$ (10.4)
and

$n_2 = \dfrac{c}{v_2}$ (10.5)
are known as the refractive indices of medium 1 and medium 2,
respectively. In terms of the refractive indices, Eq. (10.3) can
be written as
n1 sin i = n2 sin r (10.6)
This is Snell's law of refraction. Further, if λ1 and λ2 denote the
wavelengths of light in medium 1 and medium 2, respectively and if
the distance BC is equal to λ1 then the distance AE will be equal to
λ2 (because if the crest from B has reached C in time τ, then the crest
from A should have also reached E in time τ); thus,

$\dfrac{\lambda_1}{\lambda_2} = \dfrac{BC}{AE} = \dfrac{v_1}{v_2}$

or

$\dfrac{v_1}{\lambda_1} = \dfrac{v_2}{\lambda_2}$ (10.7)

The above equation implies that when a wave gets refracted into a
denser medium (v1 > v2) the wavelength and the speed of propagation
decrease but the frequency ν (= v/λ) remains the same.
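
As a numerical illustration of these results, the sketch below applies sin i / sin r = v1/v2 and λ1/λ2 = v1/v2 to light passing from air into glass, and confirms that the frequency v/λ is the same on both sides. The function name, the rounded value of c and the refractive indices 1.0 and 1.5 are assumptions chosen for the example.

```python
import math

def refract(i_deg, v1, v2, wavelength1):
    """Apply sin i / sin r = v1 / v2 and lambda1 / lambda2 = v1 / v2.

    Returns (r_deg, wavelength2, frequency). Raises ValueError if no
    refracted wave exists (sin r would exceed 1)."""
    sin_r = math.sin(math.radians(i_deg)) * v2 / v1
    if abs(sin_r) > 1.0:
        raise ValueError("no refracted wave for this angle of incidence")
    r_deg = math.degrees(math.asin(sin_r))
    wavelength2 = wavelength1 * v2 / v1
    frequency = v1 / wavelength1          # unchanged across the interface
    return r_deg, wavelength2, frequency

c = 3.0e8                                  # speed of light in vacuum, m/s (rounded)
v_air, v_glass = c / 1.0, c / 1.5          # assumed refractive indices 1.0 and 1.5
r, lam2, nu = refract(30.0, v_air, v_glass, 600e-9)
print(r, lam2, nu)
```

With these inputs the refracted ray makes about 19.5° with the normal and the wavelength shortens from 600 nm to 400 nm, while the frequency stays at 5 × 10¹⁴ Hz.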

10.3.2 Refraction at a rarer medium


We now consider refraction of a plane wave at a rarer medium, i.e.,
v2 > v1. Proceeding in an exactly similar manner we can construct a
refracted wavefront as shown in Fig. 10.5. The angle of refraction
will now be greater than angle of incidence; however, we will still have
n1 sin i = n2 sin r. We define an angle ic by the following equation

$\sin i_c = \dfrac{n_2}{n_1}$ (10.8)
Thus, if i = ic then sin r = 1 and r = 90°. Obviously, for i > ic, there
cannot be any refracted wave. The angle ic is known as the critical angle
and for all angles of incidence greater than the critical angle, we will
not have any refracted wave and the wave will undergo what is known
as total internal reflection. The phenomenon of total internal reflection
and its applications were discussed in Section 9.4.
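
A minimal sketch of the critical-angle relation sin ic = n2/n1 follows. The function name and the refractive indices used for glass (1.5) and water (1.33) are assumed illustrative values.

```python
import math

def critical_angle_deg(n1, n2):
    """Critical angle for light going from medium 1 (index n1) into a rarer
    medium 2 (index n2 < n1), from sin(i_c) = n2 / n1."""
    if n2 >= n1:
        raise ValueError("total internal reflection needs n2 < n1")
    return math.degrees(math.asin(n2 / n1))

print(critical_angle_deg(1.5, 1.0))    # glass to air, assumed n = 1.5
print(critical_angle_deg(1.33, 1.0))   # water to air, assumed n = 1.33
```

For these values the critical angles come out near 41.8° and 48.8°; any angle of incidence larger than that gives no refracted wave.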

10.3.3 Reflection of a plane wave by a plane surface
We next consider a plane wave AB incident at an angle i on a
reflecting surface MN. If v represents the speed of the wave in the
medium and if τ represents the time taken by the wavefront to
advance from the point B to C then the distance

BC = vτ

Figure 10.5 Refraction of a plane wave incident on a rarer medium for
which v2 > v1. The plane wave bends away from the normal.

In order to construct the reflected wavefront we draw a sphere of
radius vτ from the point A as shown in Fig. 10.6. Let CE represent the
tangent plane drawn from the point C to this sphere. Obviously
AE = BC = vτ
If we now consider the triangles EAC and BAC we will find that they
are congruent and therefore, the angles i and r (as shown in Fig. 10.6)
would be equal. This is the law of reflection.
Figure 10.6 Reflection of a plane wave AB by the reflecting surface MN. AB and
CE represent incident and reflected wavefronts.

Once we have the laws of reflection and refraction, the behaviour of


prisms, lenses, and mirrors can be understood. These phenomena
were discussed in detail in Chapter 9 on the basis of rectilinear
propagation of light. Here we just describe the behaviour of the
wavefronts as they undergo reflection or refraction. In Fig. 10.7(a) we
consider a plane wave passing through a thin prism. Clearly, since the
speed of light waves is less in glass, the lower portion of the incoming
wavefront (which travels through the greatest thickness of glass) will
get delayed resulting in a tilt in the emerging wavefront as shown in
the figure. In Fig. 10.7(b) we consider a plane wave incident on a thin
convex lens; the central part of the incident plane wave traverses the
thickest portion of the lens and is delayed the most. The emerging
wavefront has a depression at the centre and therefore the wavefront
becomes spherical and converges to the point F which is known as
the focus. In Fig. 10.7(c) a plane wave is incident on a concave mirror
and on reflection we have a spherical wave converging to the focal
point F. In a similar manner, we can understand refraction and
reflection by concave lenses and convex mirrors.
From the above discussion it follows that the total time taken from a
point on the object to the corresponding point on the image is the
same measured along any ray. For example, when a convex lens
focusses light to form a real image, the ray going through the centre
traverses a shorter path, but because of the slower speed in glass,
the time taken is the same as for rays travelling near the edge
of the lens.

10.3.4 The Doppler effect

Figure 10.7 Refraction of a plane wave by (a) a thin prism, (b) a convex lens. (c)
Reflection of a plane wave by a concave mirror.

We should mention here that one should be careful in constructing the


wavefronts if the source (or the observer) is moving. For example, if
there is no medium and the source moves away from the observer,
then later wavefronts have to travel a greater distance to reach the
observer and hence take a longer time. The time taken between the
arrival of two successive wavefronts is hence longer at the observer
than it is at the source. Thus, when the source moves away from the
observer the frequency as measured by the observer will be smaller.
This is known as the Doppler effect. Astronomers call the increase in
wavelength due to the Doppler effect red shift, since a wavelength in the
middle of the visible region of the spectrum moves towards the red
end of the spectrum. When waves are received from a source moving
towards the observer, there is an apparent decrease in wavelength;
this is referred to as blue shift.
You have already encountered Doppler effect for sound waves in
Chapter 15 of Class XI textbook. For velocities small compared to the
speed of light, we can use the same formulae which we use for sound
waves. The fractional change in frequency Δν/ν is given by −vradial/c,
where vradial is the component of the source velocity along the line
joining the observer to the source relative to the observer; vradial is
considered positive when the source moves away from the observer.
Thus, the Doppler shift can be expressed as:

$\dfrac{\Delta\nu}{\nu} = -\dfrac{v_{\text{radial}}}{c}$ (10.9)

The formula given above is valid only when the speed of the source is
small compared to that of light. A more accurate formula for the
Doppler effect which is valid even when the speeds are close to that of
light, requires the use of Einstein's special theory of relativity. The
Doppler effect for light is very important in astronomy. It is the basis
for the measurements of the radial velocities of distant galaxies.

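Example 10.1 At what speed should a galaxy move with respect
to us so that the sodium line at 589.0 nm is observed
at 589.6 nm?
Solution Since νλ = c, Δν/ν = −Δλ/λ (for small changes in ν and λ). For

Δλ = 589.6 − 589.0 = +0.6 nm

we get [using Eq. (10.9)]

$\dfrac{\Delta\nu}{\nu} = -\dfrac{\Delta\lambda}{\lambda} = -\dfrac{v_{\text{radial}}}{c}$

or, $v_{\text{radial}} \cong +c \left( \dfrac{0.6}{589.0} \right) = +3.06 \times 10^{5}$ m s⁻¹
= 306 km/s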
Therefore, the galaxy is moving away from us.
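
The arithmetic of this example can be reproduced with a short sketch. It uses the low-speed relation vradial = c Δλ/λ with c rounded to 3.0 × 10⁸ m/s; the function name is an assumption made for illustration.

```python
def radial_velocity(rest_nm, observed_nm, c=3.0e8):
    """Low-speed Doppler estimate: v_radial = c * (delta lambda) / lambda.

    A positive result means the source is receding (red shift)."""
    return c * (observed_nm - rest_nm) / rest_nm

v = radial_velocity(589.0, 589.6)
print(f"{v / 1000:.0f} km/s")   # prints "306 km/s"
```

Swapping the two wavelengths gives a negative velocity, which corresponds to a source approaching the observer (blue shift).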

Example 10.2
(a) When monochromatic light is incident on a surface separating two media,
the reflected and refracted light both have the same frequency as the incident
frequency. Explain why.
(b) When light travels from a rarer to a denser medium, the speed decreases.
Does the reduction in speed imply a reduction in the energy carried by the light
wave?
(c) In the wave picture of light, intensity of light is determined by the square of
the amplitude of the wave. What determines the intensity of light in the photon
picture of light?
Solution
(a) Reflection and refraction arise through interaction of incident light with the
atomic constituents of matter. Atoms may be viewed as oscillators, which take
up the frequency of the external agency (light) causing forced oscillations.
The frequency of light emitted by a charged oscillator equals its frequency of
oscillation. Thus, the frequency of scattered light equals the frequency of
incident light.
(b) No. Energy carried by a wave depends on the amplitude of the wave, not
on the speed of wave propagation.
(c) For a given frequency, intensity of light in the photon picture is determined
by the number of photons crossing a unit area per unit time.
10.4 Coherent and Incoherent Addition of Waves

In this section we will discuss the interference pattern produced by the


superposition of two waves. You may recall that we had discussed the
superposition principle in Chapter 15 of your Class XI textbook.
Indeed the entire field of interference is based on the superposition
principle according to which at a particular point in the medium, the
resultant displacement produced by a number of waves is the vector
sum of the displacements produced by each of the waves.


Figure 10.8 (a) Two needles oscillating in phase in water represent two coherent
sources. (b) The pattern of displacement of water molecules at an instant on the
surface of water showing nodal N (no displacement) and antinodal A (maximum
displacement) lines.

Consider two needles S1 and S2 moving periodically up and down in


an identical fashion in a trough of water [Fig. 10.8(a)]. They produce
two water waves, and at a particular point, the phase difference
between the displacements produced by each of the waves does not
change with time; when this happens the two sources are said to be
coherent. Figure 10.8(b) shows the position of crests (solid circles)
and troughs (dashed circles) at a given instant of time. Consider a
point P for which
S1 P = S2 P
Since the distances S1 P and S2 P are equal, waves from S1 and S2
will take the same time to travel to the point P and waves that
emanate from S1 and S2 in phase will also arrive, at the point P, in
phase.

Thus, if the displacement produced by the source S1 at the point P is
given by
y1 = a cos ωt
then, the displacement produced by the source S2 (at the point P) will
also be given by
y2 = a cos ωt
Thus, the resultant displacement at P would be given by
y = y1 + y2 = 2 a cos ωt
Since the intensity is proportional to the square of the amplitude,
the resultant intensity will be given by
I = 4 I0
where I0 represents the intensity produced by each one of the
individual sources; I0 is proportional to a². In fact at any point on the
perpendicular bisector of S1S2, the intensity will be 4I0. The two
sources are said to interfere constructively and we have what is
referred to as constructive interference. We next consider a point Q
[Fig. 10.9(a)] for which
S2Q − S1Q = 2λ
The waves emanating from S1 will arrive exactly two cycles earlier
than the waves from S2 and will again be in phase [Fig. 10.9(a)]. Thus,
if the displacement produced by S1 is given by
y1 = a cos ωt
then the displacement produced by S2 will be given by
y2 = a cos (ωt − 4π) = a cos ωt
where we have used the fact that a path difference of 2λ corresponds
to a phase difference of 4π. The two displacements are once again in
phase and the intensity will again be 4 I0 giving rise to constructive
interference. In the above analysis we have assumed that the
distances S1Q and S2Q are much greater than d (which represents
the distance between S1 and S2) so that although S1Q and S2Q are
not equal, the amplitudes of the displacement produced by each wave
are very nearly the same.
We next consider a point R [Fig. 10.9(b)] for which
S2R - S1R = -2.5λ
The waves emanating from S1 will arrive exactly two and a half cycles
later than the waves from S2 [Fig. 10.9(b)]. Thus if the displacement
produced by S1 is given by
y1 = a cos ωt
then the displacement produced by S2 will be given by
y2 = a cos (ωt + 5π) = - a cos ωt
where we have used the fact that a path difference of 2.5λ
corresponds to a phase difference of 5π. The two displacements are
now out of phase and the two displacements will cancel out to give
zero intensity. This is referred to as destructive interference.
Figure 10.9 (a) Constructive interference at a point Q for which the path difference
is 2λ. (b) Destructive interference at a point R for which the path difference is 2.5λ.

To summarise: If we have two coherent sources S1 and S2 vibrating in
phase, then for an arbitrary point P whenever the path difference,
S1P ~ S2P = nλ (n = 0, 1, 2, 3,...) (10.10)
we will have constructive interference and the resultant intensity will
be 4I0; the sign ~ between S1P and S2 P represents the difference
between S1P and S2 P. On the other hand, if the point P is such that
the path difference,

Figure 10.10 Locus of points for which S1P - S2P is equal to zero, ±λ, ±2λ, ±3λ.

S1P ~ S2P = (n + 1/2) λ (n = 0, 1, 2, 3, ...) (10.11)
we will have destructive interference and the resultant intensity will be
zero. Now, for any other arbitrary point G (Fig. 10.10) let the phase
difference between the two displacements be φ. Thus, if the
displacement produced by S1 is given by
y1 = a cos ωt
then, the displacement produced by S2 would be
y2 = a cos (ωt + φ)
and the resultant displacement will be given by
y = y1 + y2
= a [cos ωt + cos (ωt + φ)]
= 2 a cos (φ/2) cos (ωt + φ/2)
The amplitude of the resultant displacement is 2a cos (φ/2) and
therefore the intensity at that point will be
I = 4 I0 cos² (φ/2) (10.12)
If φ = 0, ±2π, ±4π, ..., which corresponds to the condition given by
Eq. (10.10) we will have constructive interference leading to maximum
intensity. On the other hand, if φ = ±π, ±3π, ±5π, ... [which
corresponds to the condition given by Eq. (10.11)] we will have
destructive interference leading to zero intensity. Now if the two
sources are coherent (i.e., if the two needles are going up and down
regularly) then the phase difference at any point will not change with
time and we will have a stable interference pattern; i.e., the positions
of maxima and minima will not change with time. However, if the two
needles do not maintain a constant phase difference, then the
interference pattern will also change with time and, if the phase
difference changes very rapidly with time, the positions of maxima and
minima will also vary rapidly with time and we will see a time-
averaged intensity distribution. When this happens, we will observe
an average intensity that will be given by
<I> = 4 I0 <cos² (φ/2)> (10.13)
where angular brackets represent time averaging. Indeed it is shown
in Section 7.2 that if φ(t) varies randomly with time, the time-averaged
quantity <cos² (φ/2)> will be 1/2. This is also intuitively obvious
because the function cos² (φ/2) will randomly vary between 0 and 1
and the average value will be 1/2. The resultant intensity will be given
by
I = 2 I0 (10.14)
at all points.

When the phase difference between the two vibrating sources
changes rapidly with time, we say that the two sources are incoherent
and when this happens the intensities just add up. This is indeed what
happens when two separate light sources illuminate a wall.
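
The results of this section can be checked numerically. The following is a minimal Python sketch and not part of the original derivation; the function names are illustrative, and sampling the phase uniformly over one full cycle is an assumption used here as a stand-in for a phase difference that fluctuates rapidly and randomly with time.

```python
import math

def phase_from_path_difference(path_diff, wavelength):
    """Phase difference (radians) for a given path difference: 2*pi*delta/lambda."""
    return 2.0 * math.pi * path_diff / wavelength

def resultant_intensity(i0, phi):
    """Eq. (10.12): I = 4 I0 cos^2(phi/2) for two coherent sources of intensity I0."""
    return 4.0 * i0 * math.cos(phi / 2.0) ** 2

def averaged_intensity(i0, samples=3600):
    """Average of Eq. (10.12) over phases spread uniformly over one cycle.

    Assumption: a uniform spread of phases stands in for a phase difference
    that fluctuates rapidly and randomly in time (incoherent sources).
    """
    phases = (2.0 * math.pi * k / samples for k in range(samples))
    return sum(resultant_intensity(i0, phi) for phi in phases) / samples

wavelength = 1.0  # arbitrary unit; only the ratio path_diff / wavelength matters
for path_diff in (0.0, 2.0, 2.5):  # the points P, Q and R discussed above
    phi = phase_from_path_difference(path_diff, wavelength)
    print(path_diff, round(resultant_intensity(1.0, phi), 6))
print("averaged:", round(averaged_intensity(1.0), 6))
```

With I0 = 1, path differences of 0 and 2λ give 4 I0, a path difference of 2.5λ gives zero, and the average over all phases gives 2 I0, in agreement with Eqs. (10.12) and (10.14).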

10.5 INTERFERENCE OF LIGHT WAVES AND YOUNG'S EXPERIMENT

We will now discuss interference using light waves. If we use two
sodium lamps illuminating two pinholes (Fig. 10.11) we will not
observe any interference fringes. This is because the
light wave emitted from an ordinary source (like a sodium lamp)
undergoes abrupt phase changes in times of the order of 10⁻¹⁰
seconds. Thus the light waves coming out from two independent
sources of light will not have any fixed phase relationship and would
be incoherent; when this happens, as discussed in the previous
section, the intensities on the screen will add up.
The British physicist Thomas Young used an ingenious technique to
lock the phases of the waves emanating from S1 and S2. He made
two pinholes S1 and S2 (very close to each other) on an opaque
screen [Fig. 10.12(a)]. These were illuminated by another pinhole S
that was, in turn, lit by a bright source. Light waves spread out from S
and fall on both S1 and S2. S1 and S2 then behave like two coherent
sources because light waves coming out from S1 and S2 are derived
from the same original source and any abrupt phase change in S will
manifest in exactly similar phase changes in the light coming out from
S1 and S2. Thus, the two sources S1 and S2 will be locked in phase;
i.e., they will be coherent like the two vibrating needles in our water
wave example [Fig. 10.8(a)].

Figure 10.11 If two sodium lamps illuminate two pinholes S1 and S2, the intensities
will add up and no interference fringes will be observed on the screen.

Thus spherical waves emanating from S1 and S2 will produce
interference fringes on the screen GG′, as shown in Fig. 10.12(b). The
positions of maximum and minimum intensities can be calculated by
using the analysis given in Section 10.4 where we had shown that for
an arbitrary point P on the line GG′ [Fig. 10.12(b)] to correspond to a
maximum, we must have
S2P - S1P = nλ; n = 0, 1, 2 ... (10.15)
Now,
(S2P)² - (S1P)² = [D² + (x + d/2)²] - [D² + (x - d/2)²] = 2xd
where S1S2 = d, OP = x and D is the distance of the screen from the
plane of the two pinholes. Thus
S2P - S1P = 2xd / (S2P + S1P) (10.16)


Figure 10.12 Young's arrangement to produce interference pattern.

If x, d << D then negligible error will be introduced if S2P + S1P (in the
denominator) is replaced by 2D. For example, for d = 0.1 cm, D = 100
cm, OP = 1 cm (which correspond to typical values for an interference
experiment using light waves), we have
S2P + S1P = [(100)² + (1.05)²]^(1/2) + [(100)² + (0.95)²]^(1/2) ≈ 200.01 cm
Thus if we replace S2P + S1P by 2D, the error involved is about
0.005%. In this approximation, Eq. (10.16) becomes
S2P - S1P ≈ xd/D (10.17)
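
The size of the error quoted above can be checked directly. The sketch below is illustrative only; it assumes the geometry of Fig. 10.12, with S2 and S1 placed at distances x + d/2 and x - d/2 from P measured along the screen direction, and the function names are not from the text.

```python
import math

def exact_path_difference(x, d, D):
    """S2P - S1P from the geometry of Fig. 10.12 (all lengths in the same unit)."""
    s2p = math.hypot(D, x + d / 2.0)
    s1p = math.hypot(D, x - d / 2.0)
    return s2p - s1p

def approx_path_difference(x, d, D):
    """Eq. (10.17): S2P - S1P is approximately x d / D when x, d << D."""
    return x * d / D

d, D, x = 0.1, 100.0, 1.0  # cm, the values used in the text
exact = exact_path_difference(x, d, D)
approx = approx_path_difference(x, d, D)
relative_error = abs(approx - exact) / exact
print(f"exact = {exact:.8f} cm, approx = {approx:.8f} cm")
print(f"relative error = {100 * relative_error:.4f} %")
```

For these values the relative error comes out at about 0.005%, the same size as the error in replacing S2P + S1P by 2D.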


Hence we will have constructive interference resulting in a bright
region when

x = xn = nλD/d; n = 0, ±1, ±2, ... (10.18)

On the other hand, we will have a dark region near

x = xn = (n + 1/2) λD/d; n = 0, ±1, ±2, ... (10.19)
Thus dark and bright bands appear on the screen, as shown in Fig.
10.13. Such bands are called fringes. Equations (10.18) and (10.19)
show that dark and bright fringes are equally spaced and the distance
between two consecutive bright (or two consecutive dark) fringes is
given by
β = xn+1 - xn

or β = λD/d (10.20)
which is the expression for the fringe width. Obviously, the central
point O (in Fig. 10.12) will be bright because S1O = S2O and it will
correspond to n = 0. If we consider the line perpendicular to the plane
of the paper and passing through O [i.e., along the y-axis] then all
points on this line will be equidistant from S1 and S2 and we will have
a bright central fringe which is a straight line as shown in Fig. 10.13. In
order to determine the shape of the interference pattern on the screen
we note that a particular fringe would correspond to the locus of points
with a constant value of S2P - S1P. Whenever this constant is an
integral multiple of λ, the fringe will be bright and whenever it is an odd
integral multiple of λ/2 it will be a dark fringe. Now, the locus of the
point P lying in the x-y plane such that S2P - S1P (= Δ) is a constant,
is a hyperbola. Thus the fringe pattern will strictly be a hyperbola;
however, if the distance D is very large compared to the fringe width,
the fringes will be very nearly straight lines as shown in Fig. 10.13.

In the double-slit experiment shown in Fig. 10.12, we have taken the
source hole S on the perpendicular bisector of the two slits, which is
shown as the line SO. What happens if the source S is slightly away
from the perpendicular bisector? Consider that the source is moved to
some new point S′ and suppose that Q is the mid-point of S1 and S2. If
the angle S′QS is φ, then the central bright fringe occurs at an angle
-φ, on the other side. Thus, if the source S is on the perpendicular
bisector, then the central fringe occurs at O, also on the perpendicular
bisector. If S is shifted by an angle φ to point S′, then the central fringe
appears at a point O′ at an angle -φ, which means that it is shifted by
the same angle on the other side of the bisector. This also means that
the source S′, the mid-point Q and the point O′ of the central fringe are
in a straight line.
Thomas Young (1773-1829) English physicist, physician and Egyptologist.
Young worked on a wide variety of scientific problems, ranging from the
structure of the eye and the mechanism of vision to the decipherment of the
Rosetta stone. He revived the wave theory of light and recognised that
interference phenomena provide proof of the wave properties of light.

We end this section by quoting from the Nobel lecture of Dennis
Gabor*:

The wave nature of light was demonstrated convincingly for the first
time in 1801 by Thomas Young by a wonderfully simple experiment.
He let a ray of sunlight into a dark room, placed a dark screen in front
of it, pierced with two small pinholes, and beyond this, at some
distance, a white screen. He then saw two darkish lines at both sides
of a bright line, which gave him sufficient encouragement to repeat the
experiment, this time with spirit flame as light source, with a little salt in
it to produce the bright yellow sodium light. This time he saw a number
of dark lines, regularly spaced; the first clear proof that light added to
light can produce darkness. This phenomenon is called interference.
Thomas Young had expected it because he believed in the wave
theory of light.

Figure 10.13 Computer generated fringe pattern produced by two point sources
S1 and S2 on the screen GG′ (Fig. 10.12); (a) and (b) correspond to d = 0.005 mm
and 0.025 mm, respectively (both figures correspond to D = 5 cm and λ = 5 × 10⁻⁵ cm).
(Adapted from OPTICS by A. Ghatak, Tata McGraw Hill Publishing Co. Ltd.,
New Delhi, 2000.)

We should mention here that the fringes are straight lines although S1
and S2 are point sources. If we had slits instead of the point sources
(Fig. 10.14), each pair of points would have produced straight line
fringes resulting in straight line fringes with increased intensities.
* Dennis Gabor received the 1971 Nobel Prize in Physics for
discovering the principles of holography.

Example 10.3 Two slits are made one millimetre apart and the screen is
placed one metre away. What is the fringe separation when blue-green light of
wavelength 500 nm is used?

Solution Fringe spacing = Dλ/d = (1 × 5 × 10⁻⁷) / (1 × 10⁻³) m
= 5 × 10⁻⁴ m = 0.5 mm
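
The same arithmetic can be sketched in a few lines of Python. The function names below are illustrative and not from the text; the formulas are Eqs. (10.18) and (10.20), with all lengths in metres.

```python
def fringe_width(wavelength, screen_distance, slit_separation):
    """Eq. (10.20): beta = lambda * D / d (all lengths in metres)."""
    return wavelength * screen_distance / slit_separation

def bright_fringe_position(n, wavelength, screen_distance, slit_separation):
    """Eq. (10.18): x_n = n * lambda * D / d."""
    return n * fringe_width(wavelength, screen_distance, slit_separation)

# Example 10.3: lambda = 500 nm, D = 1 m, d = 1 mm
beta = fringe_width(500e-9, 1.0, 1e-3)
print(f"fringe width = {beta * 1e3:.3f} mm")
for n in range(-2, 3):
    x_n = bright_fringe_position(n, 500e-9, 1.0, 1e-3)
    print(n, f"{x_n * 1e3:+.3f} mm")
```

Because β is proportional to λ and D and inversely proportional to d, changing any one of the three arguments shows the scaling of the fringe separation directly.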

Figure 10.14 Photograph and the graph of the intensity distribution in Young's double-slit
experiment.

Example 10.4 What is the effect on the interference fringes in a Young's
double-slit experiment due to each of the following operations:
(a) the screen is moved away from the plane of the slits;
(b) the (monochromatic) source is replaced by another (monochromatic)
source of shorter wavelength;
(c) the separation between the two slits is increased;
(d) the source slit is moved closer to the double-slit plane;
(e) the width of the source slit is increased;
(f) the monochromatic source is replaced by a source of white
light?
(In each operation, take all parameters, other than the one specified, to remain
unchanged.)
Solution
(a) Angular separation of the fringes remains constant
(= λ/d). The actual separation of the fringes increases in proportion to the
distance of the screen from the plane of the two slits.
(b) The separation of the fringes (and also angular separation) decreases.
See, however, the condition mentioned in (d) below.
(c) The separation of the fringes (and also angular separation) decreases.
See, however, the condition mentioned in (d) below.
(d) Let s be the size of the source and S its distance from the plane of the two
slits. For interference fringes to be seen, the condition
s/S < λ/d should be satisfied; otherwise, interference patterns produced by
different parts of the source overlap and no fringes are seen. Thus, as S
decreases (i.e., the source slit is brought closer), the interference pattern gets
less and less sharp, and when the source is brought too close for this condition
to be valid, the fringes disappear. Till this happens, the fringe separation
remains fixed.
(e) Same as in (d). As the source slit width increases, fringe pattern gets less
and less sharp. When the source slit is so wide that the condition s/S ≤ λ/d is
not satisfied, the interference pattern disappears.
(f) The interference patterns due to different component colours of white light
overlap (incoherently). The central bright fringes for different colours are at the
same position. Therefore, the central fringe is white. For a point P for which
S2P - S1P = λb/2, where λb (≈ 4000 Å) represents the wavelength for the blue
colour, the blue component will be absent and the fringe will appear red in
colour. Slightly farther away where S2Q - S1Q = λb = λr/2, where λr (≈ 8000 Å)
is the wavelength for the red colour, the fringe will be predominantly blue.
Thus, the fringe closest on either side of the central white fringe is red and the
farthest will appear blue. After a few fringes, no clear fringe pattern is seen.

10.6 DIFFRACTION
If we look clearly at the shadow cast by an opaque object, close to the
region of geometrical shadow, there are alternate dark and bright
regions just like in interference. This happens due to the phenomenon
of diffraction. Diffraction is a general characteristic exhibited by all
types of waves, be it sound waves, light waves, water waves or matter
waves. Since the wavelength of light is much smaller than the
dimensions of most obstacles, we do not encounter diffraction effects
of light in everyday observations. However, the finite resolution of our
eye or of optical instruments such as telescopes or microscopes is
due to the phenomenon of diffraction. Indeed the colours that
you see when a CD is viewed are due to diffraction effects. We will now
discuss the phenomenon of diffraction.

10.6.1 The single slit


In the discussion of Young's experiment, we stated that a single
narrow slit acts as a new source from which light spreads out. Even
before Young, early experimenters including Newton had noticed
that light spreads out from narrow holes and slits. It seems to turn
around corners and enter regions where we would expect a shadow.
These effects, known as diffraction, can only be properly understood
using wave ideas. After all, you are hardly surprised to hear sound
waves from someone talking around a corner!
When the double slit in Young's experiment is replaced by a single
narrow slit (illuminated by a monochromatic source), a broad pattern
with a central bright region is seen. On both sides, there are alternate
dark and bright regions, the intensity becoming weaker away from the
centre (Fig. 10.16). To understand this, go to Fig. 10.15, which shows
a parallel beam of light falling normally on a single slit LN of width a.
The diffracted light goes on to meet a screen. The midpoint of the slit
is M.
A straight line through M perpendicular to the slit plane meets the
screen at C. We want the intensity at any point P on the screen. As
before, straight lines joining P to the different points L, M, N, etc., can
be treated as parallel, making an angle θ with the normal MC.
The basic idea is to divide the slit into much smaller parts, and add
their contributions at P with the proper phase differences. We are
treating different parts of the wavefront at the slit as secondary
sources. Because the incoming wavefront is parallel to the plane of
the slit, these sources are in phase.

The path difference NP - LP between the two edges of the slit can be
calculated exactly as for Young's experiment. From Fig. 10.15,
NP - LP = NQ
= a sin θ
≈ a θ (10.21)
Similarly, if two points M1 and M2 in the slit plane are separated by y,
the path difference M2P - M1P ≈ yθ. We now have to sum up equal,
coherent contributions from a large number of sources, each with a
different phase. This calculation was made by Fresnel using integral
calculus, so we omit it here. The main features of the diffraction
pattern can be understood by simple arguments.
At the central point C on the screen, the angle θ is zero. All path
differences are zero and hence all the parts of the slit contribute in
phase. This gives maximum intensity at C. Experimental observation
shown in Fig. 10.16 indicates that the intensity has a central maximum
at θ = 0 and other secondary maxima at θ ≈ (n + 1/2) λ/a, and has
minima (zero intensity) at θ ≈ nλ/a,
n = ±1, ±2, ±3, .... It is easy to see why it has minima at these values
of angle. Consider first the angle θ where the path difference aθ is λ.
Then,
θ ≈ λ/a. (10.22)
Now, divide the slit into two equal halves LM and MN each of size a/2.
For every point M1 in LM, there is a point M2 in MN such that M1M2 =
a/2. The path difference between M1 and M2 at P = M2P - M1P = θa/2
= λ/2 for the angle chosen. This means that the contributions from M1
and M2 are 180° out of phase and cancel in the direction θ = λ/a.
Contributions from the two halves of the slit LM and MN, therefore,
cancel each other. Equation (10.22) gives the angle at which the
intensity falls to zero. One can similarly show that the intensity is zero
for θ = nλ/a, with n being any integer (except zero!). Notice that the
angular size of the central maximum increases when the slit width a
decreases.
Figure 10.15 The geometry of path differences for diffraction by a single slit.

It is also easy to see why there are maxima at θ = (n + 1/2) λ/a and
why they go on becoming weaker and weaker with increasing n.
Consider an angle θ = 3λ/2a which is midway between two of the dark
fringes. Divide the slit into three equal parts. If we take the first two
thirds of the slit, the path difference between the two ends would be
(2/3) a × θ = (2a/3) × (3λ/2a) = λ (10.23)
The first two-thirds of the slit can therefore be divided into two halves
which have a λ/2 path difference. The contributions of these two
halves cancel in the same manner as described earlier. Only the
remaining one-third of the slit contributes to the intensity at a point
between the two minima. Clearly, this will be much weaker than the
central maximum (where the entire slit contributes in phase). One can
similarly show that there are maxima at (n + 1/2) λ/a with n = 2, 3, etc.
These become weaker with increasing n, since only one-fifth, one-
seventh, etc., of the slit contributes in these cases. The photograph
and intensity pattern corresponding to it is shown in Fig. 10.16.
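
The idea of dividing the slit into many small parts and adding their contributions with the proper phases can be tried out numerically. The following is a rough Python sketch, not Fresnel's integral calculation: it replaces the continuous slit by a finite number of equal strips, and the slit width and wavelength used are assumed values chosen only for illustration.

```python
import cmath
import math

def relative_intensity(sin_theta, slit_width, wavelength, n_sources=2000):
    """Intensity relative to the central maximum for a single slit.

    The slit is split into n_sources equal strips. Each strip is treated as a
    secondary source, in phase at the slit, and its contribution is added with
    the phase 2*pi*y*sin(theta)/lambda set by its position y across the slit.
    """
    total = 0j
    for k in range(n_sources):
        y = (k + 0.5) * slit_width / n_sources  # centre of the k-th strip
        total += cmath.exp(2j * math.pi * y * sin_theta / wavelength)
    return abs(total / n_sources) ** 2

a = 1.0e-4           # slit width in metres (assumed value)
wavelength = 5.0e-7  # wavelength in metres (assumed value)
for multiple in (0.0, 1.0, 1.5, 2.0, 2.5):
    s = multiple * wavelength / a
    ratio = relative_intensity(s, a, wavelength)
    print(f"sin(theta) = {multiple} lambda/a -> I/I(0) = {ratio:.4f}")
```

Under these assumptions the sum vanishes at sin θ = λ/a and 2λ/a, and gives roughly 4.5% and 1.6% of the central intensity at 3λ/2a and 5λ/2a, which illustrates how quickly the secondary maxima weaken.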

There has been prolonged discussion about the difference between
interference and diffraction among scientists since the discovery of
these phenomena. In this context, it is interesting to note what Richard
Feynman* has said in his famous Feynman Lectures on Physics:

No one has ever been able to define the difference between
interference and diffraction satisfactorily. It is just a question of usage,
and there is no specific, important physical difference between them.
The best we can do, roughly speaking, is to say that when there are
only a few sources, say two interfering sources, then the result is
usually called interference, but if there is a large number of them, it
seems that the word diffraction is more often used.
Figure 10.16 Intensity distribution and photograph of fringes due to diffraction at
single slit.

In the double-slit experiment, we must note that the pattern on the
screen is actually a superposition of single-slit diffraction from each slit
or hole, and the double-slit interference pattern. This is shown in Fig.
10.17. It shows a broader diffraction peak in which there appear
several fringes of smaller width due to double-slit interference. The
number of interference fringes occurring in the broad diffraction peak
depends on the ratio d/a, that is the ratio of the distance between the
two slits to the width of a slit. In the limit of a becoming very small, the
diffraction pattern will become very flat and we will observe the two-slit
interference pattern [see Fig. 10.13(b)].
Example 10.5 In Example 10.3, what should the width of each slit be
to obtain 10 maxima of the double slit pattern within the central
maximum of the single slit pattern?

Solution We want aθ = λ, i.e., θ = λ/a, and
10 λ/d = 2 λ/a, so that a = d/5 = 0.2 mm

Notice that the wavelength of light and distance of the screen do not
enter in the calculation of a.
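
A short sketch of this estimate follows. It assumes, as in the discussion above, that the central diffraction maximum has an angular width of 2λ/a and that the double-slit fringes are spaced by λ/d in angle; the function name is illustrative.

```python
def slit_width_for_fringes(slit_separation, n_fringes):
    """Slit width a for which n_fringes double-slit fringes fit inside the
    central single-slit maximum: n * (lambda/d) = 2 * (lambda/a), so a = 2 d / n.
    The wavelength cancels and therefore is not an argument."""
    return 2.0 * slit_separation / n_fringes

a = slit_width_for_fringes(1e-3, 10)  # d = 1 mm, as in Example 10.3
print(f"a = {a * 1e3:.2f} mm")
```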

Figure 10.17 The actual double-slit interference pattern. The envelope shows the
single slit diffraction.

In the double-slit interference experiment of Fig. 10.12, what happens
if we close one slit? You will see that it now amounts to a single slit.
But you will have to take care of some shift in the pattern. We now
have a source at S, and only one hole (or slit) S1 or S2. This will
produce a single-slit diffraction pattern on the screen. The centre of
the central bright fringe will appear at a point which lies on the straight
line SS1 or SS2, as the case may be.
We now compare and contrast the interference pattern with that seen
for a coherently illuminated single slit (usually called the single slit
diffraction pattern).

(i) The interference pattern has a number of equally spaced bright and
dark bands. The diffraction pattern has a central bright maximum
which is twice as wide as the other maxima. The intensity falls as we
go to successive maxima away from the centre, on either side.
(ii) We calculate the interference pattern by superposing two waves
originating from the two narrow slits. The diffraction pattern is a
superposition of a continuous family of waves originating from each
point on a single slit.
(iii) For a single slit of width a, the first null of the interference pattern
occurs at an angle of λ/a. At the same angle of λ/a, we get a maximum
(not a null) for two narrow slits separated by a distance a.

One must understand that both d and a have to be quite small, to be
able to observe good interference and diffraction patterns. For
example, the separation d between the two slits must be of the order
of a millimetre or so. The width a of each slit must be even smaller, of
the order of 0.1 or 0.2 mm.
In our discussion of Young's experiment and the single-slit diffraction,
we have assumed that the screen on which the fringes are formed is
at a large distance. The two or more paths from the slits to the screen
were treated as parallel. This situation also occurs when we place a
converging lens after the slits and place the screen at the focus.
Parallel paths from the slit are combined at a single point on the
screen. Note that the lens does not introduce any extra path
differences in a parallel beam. This arrangement is often used since it
gives more intensity than placing the screen far away. If f is the focal
length of the lens, then we can easily work out the size of the central
bright maximum. In terms of angles, the separation of the central
maximum from the first null of the diffraction pattern is λ/a. Hence, the
size on the screen will be f λ/a.

10.6.2 Seeing the single slit diffraction pattern


It is surprisingly easy to see the single-slit diffraction pattern for
oneself. The equipment needed can be found in most homes: two
razor blades and one clear glass electric bulb, preferably with a
straight filament. One has to hold the two blades so that the edges are
parallel and have a narrow slit in between. This is easily done with the
thumb and forefingers (Fig. 10.18).

Keep the slit parallel to the filament, right in front of the eye. Use
spectacles if you normally do. With slight adjustment of the width of
the slit and the parallelism of the edges, the pattern should be seen
with its bright and dark bands. Since the position of all the bands
(except the central one) depends on wavelength, they will show some
colours. Using a filter for red or blue will make the fringes clearer. With
both filters available, the wider fringes for red compared to blue can be
seen.
In this experiment, the filament plays the role of the first slit S in
Fig. 10.16. The lens of the eye focuses the pattern on the screen (the
retina of the eye).

Figure 10.18 Holding two blades to form a single slit. A bulb filament viewed
through this shows clear diffraction bands.

With some effort, one can cut a double slit in an aluminium foil with a
blade. The bulb filament can be viewed as before to repeat Young's
experiment. In daytime, there is another suitable bright source
subtending a small angle at the eye. This is the reflection of the Sun in
any shiny convex surface (e.g., a cycle bell). Do not try direct sunlight:
it can damage the eye and will not give fringes anyway as the Sun
subtends an angle of (1/2)°.
In interference and diffraction, light energy is redistributed. If it reduces
in one region, producing a dark fringe, it increases in another region,
producing a bright fringe. There is no gain or loss of energy, which is
consistent with the principle of conservation of energy.
10.6.3 Resolving power of optical instruments
In Chapter 9 we discussed telescopes. The angular
resolution of the telescope is determined by the objective of the
telescope. The stars which are not resolved in the image produced by
the objective cannot be resolved by any further magnification
produced by the eyepiece. The primary purpose of the eyepiece is to
provide magnification of the image produced by the objective.

Consider a parallel beam of light falling on a convex lens. If the lens is
well corrected for aberrations, then geometrical optics tells us that the
beam will get focused to a point. However, because of diffraction, the
beam instead of getting focused to a point gets focused to a spot of
finite area. In this case the effects due to diffraction can be taken into
account by considering a plane wave incident on a circular aperture
followed by a convex lens (Fig. 10.19). The analysis of the
corresponding diffraction pattern is quite involved; however, in
principle, it is similar to the analysis carried out to obtain the single-slit
diffraction pattern. Taking into account the effects due to diffraction,
the pattern on the focal plane would consist of a central bright region
surrounded by concentric dark and bright rings (Fig. 10.19). A detailed
analysis shows that the radius of the central bright region is
approximately given by
r0 ≈ 1.22 λf / 2a = 0.61 λf / a (10.24)
where f is the focal length of the lens and 2a is the diameter of the
circular aperture or the diameter of the lens, whichever is smaller.
Typically if
λ ≈ 0.5 µm, f ≈ 20 cm and a ≈ 5 cm
we have
r0 ≈ 1.2 µm
Although the size of the spot is very small, it plays an important role in
determining the limit of resolution of optical instruments like a
telescope or a microscope. For the two stars to be just resolved
f Δθ ≈ r0 ≈ 0.61 λf / a
implying
Δθ ≈ 0.61 λ / a (10.25)
Thus Δθ will be small if the diameter of the objective is large. This
implies that the telescope will have better resolving power if a is large.
It is for this reason that for better resolution, a telescope must have a
large diameter objective.

Figure 10.19 A parallel beam of light is incident on a convex lens. Because of
diffraction effects, the beam gets focused to a spot of radius ≈ 0.61 λf/a.
Example 10.6 Assume that light of wavelength 6000 Å is coming from a star.
What is the limit of resolution of a telescope whose objective has a diameter of
100 inch?
Solution A 100 inch telescope implies that 2a = 100 inch
= 254 cm. Thus if,
λ ≈ 6000 Å = 6 × 10⁻⁵ cm
then
Δθ ≈ 0.61 × 6 × 10⁻⁵ / 127 ≈ 2.9 × 10⁻⁷ radians
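
The two diffraction estimates of this section can be evaluated with a short Python sketch. The function names and the inch-to-metre constant are illustrative additions; the formulas are Eqs. (10.24) and (10.25), with a taken as the radius of the aperture.

```python
INCH_IN_M = 0.0254  # one inch in metres

def angular_resolution(wavelength, aperture_diameter):
    """Eq. (10.25): delta_theta = 0.61 lambda / a, with a the aperture radius."""
    return 0.61 * wavelength / (aperture_diameter / 2.0)

def spot_radius(wavelength, focal_length, aperture_diameter):
    """Eq. (10.24): r0 = 0.61 lambda f / a."""
    return focal_length * angular_resolution(wavelength, aperture_diameter)

# Example 10.6: lambda = 6000 angstrom, objective diameter = 100 inch
dtheta = angular_resolution(6000e-10, 100 * INCH_IN_M)
# Values quoted with Eq. (10.24): lambda = 0.5 micrometre, f = 20 cm, a = 5 cm
r0 = spot_radius(0.5e-6, 0.20, 0.10)
print(f"delta_theta = {dtheta:.2e} rad")
print(f"r0 = {r0 * 1e6:.2f} micrometres")
```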

We can apply a similar argument to the objective lens of a
microscope. In this case, the object is placed slightly beyond f, so that
a real image is formed at a distance v [Fig. 10.20]. The magnification,
the ratio of image size to object size, is given by m ≃ v/f. It can be seen
from Fig. 10.20 that
D/f ≃ 2 tan β (10.26)
where 2β is the angle subtended by the diameter of the objective lens
at the focus of the microscope.
Figure 10.20 Real image formed by the objective lens of the microscope.

When the separation between two points in a microscopic specimen is
comparable to the wavelength λ of the light, the diffraction effects
become important. The image of a point object will again be a
diffraction pattern whose size in the image plane will be
vθ = v (1.22 λ / D) (10.27)
Two objects whose images are closer than this distance will not be
resolved; they will be seen as one. The corresponding minimum
separation, dmin, in the object plane is given by
dmin = [v (1.22 λ / D)] / m
= (1.22 λ / D) (v / m)
= 1.22 f λ / D (10.28)
Now, combining Eqs. (10.26) and (10.28), we get
dmin = 1.22 λ / (2 tan β) ≃ 1.22 λ / (2 sin β) (10.29)
If the medium between the object and the objective lens is not air but
a medium of refractive index n, Eq. (10.29) gets modified to
dmin = 1.22 λ / (2 n sin β) (10.30)
The product n sin β is called the numerical aperture and is sometimes
marked on the objective.
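
To get a feel for the numbers, Eq. (10.30) can be evaluated for some sample values. In the sketch below the wavelength, the half-angle β and the refractive index of the immersion medium are assumed values chosen only for illustration; they are not taken from the text.

```python
import math

def min_resolvable_separation(wavelength, half_angle, refractive_index=1.0):
    """Eq. (10.30): d_min = 1.22 lambda / (2 n sin(beta))."""
    numerical_aperture = refractive_index * math.sin(half_angle)
    return 1.22 * wavelength / (2.0 * numerical_aperture)

wavelength = 550e-9        # metres (assumed: green light)
beta = math.radians(60.0)  # assumed half-angle subtended by the objective
for n in (1.0, 1.5):       # air, and an assumed immersion medium
    d_min = min_resolvable_separation(wavelength, beta, n)
    print(f"n = {n}: d_min = {d_min * 1e9:.0f} nm")
```

Raising n from 1.0 to 1.5 reduces dmin by the same factor, which is the reason for filling the space between object and objective with a medium of higher refractive index.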

Determine the resolving power of your eye


You can estimate the resolving power of your eye with a simple experiment.
Make black stripes of equal width separated by white stripes; see figure here.
All the black stripes should be of equal width, while the width of the
intermediate white stripes should increase as you go from the left to the right.
For example, let all black stripes have a width of 5 mm. Let the width of the first
two white stripes be 0.5 mm each, the next two white stripes be 1 mm each,
the next two 1.5 mm each, etc. Paste this pattern on a wall in a room or
laboratory, at the height of your eye.

Now watch the pattern, preferably with one eye. By moving away or closer to
the wall, find the position where you can just see some two black stripes as
separate stripes. All the black stripes to the left of this stripe would merge into
one another and would not be distinguishable. On the other hand, the black
stripes to the right of this would be more and more clearly visible. Note the
width d of the white stripe which separates the two regions, and measure the
distance D of the wall from your eye. Then d/D is the resolution of your eye.
You have watched specks of dust floating in air in a sunbeam entering through
your window. Find the distance (of a speck) which you can clearly see and
distinguish from a neighbouring speck. Knowing the resolution of your eye and
the distance of the speck, estimate the size of the speck of dust.

The resolving power of the microscope is given by the reciprocal of
the minimum separation of two points seen as distinct. It can be seen
from Eq. (10.30) that the resolving power can be increased by
choosing a medium of higher refractive index. Usually an oil having a
refractive index close to that of the objective glass is used. Such an
arrangement is called an oil immersion objective. Notice that it is not
possible to make sin β larger than unity. Thus, we see that the
resolving power of a microscope is basically determined by the
wavelength of the light used.
There is a likelihood of confusion between resolution and
magnification, and similarly between the role of a telescope and a
microscope to deal with these parameters. A telescope produces
images of far objects nearer to our eye. Therefore objects which are
not resolved at far distance, can be resolved by looking at them
through a telescope. A microscope, on the other hand, magnifies
objects (which are near to us) and produces their larger image. We
may be looking at two stars or two satellites of a far-away planet, or
we may be looking at different regions of a living cell. In this context, it
is good to remember that a telescope resolves whereas a microscope
magnifies.

10.6.4 The validity of ray optics


An aperture (i.e., slit or hole) of size a illuminated by a parallel beam
sends diffracted light into an angle of approximately λ/a. This is the
angular size of the bright central maximum. In travelling a distance z,
the diffracted beam therefore acquires a width zλ/a due to diffraction.
It is interesting to ask at what value of z the spreading due to
diffraction becomes comparable to the size a of the aperture. We thus
approximately equate zλ/a with a. This gives the distance beyond
which divergence of the beam of width a becomes significant.
Therefore,
z ≃ a²/λ (10.31)
We define a quantity zF called the Fresnel distance by the following
equation
zF ≃ a²/λ
Equation (10.31) shows that for distances much smaller than zF , the
spreading due to diffraction is smaller compared to the size of the
beam. It becomes comparable when the distance is approximately zF.
For distances much greater than zF, the spreading due to diffraction
dominates over that due to ray optics (i.e., the size a of the aperture).
Equation (10.31) also shows that ray optics is valid in the limit of
wavelength tending to zero.
Example 10.7 For what distance is ray optics a good approximation when the
aperture is 3 mm wide and the wavelength is 500 nm?

Solution zF = a²/λ = (3 × 10⁻³ m)² / (5 × 10⁻⁷ m) = 18 m.
This example shows that even with a small aperture, diffraction spreading can
be neglected for rays many metres in length. Thus, ray optics is valid in many
common situations.
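The estimate in Example 10.7 can be checked numerically. The sketch below assumes only the relation zF ≈ a²/λ and the spreading estimate zλ/a discussed above; the function names are illustrative.

```python
def fresnel_distance(aperture_m, wavelength_m):
    """Fresnel distance z_F = a^2 / lambda, in metres."""
    if aperture_m <= 0 or wavelength_m <= 0:
        raise ValueError("aperture and wavelength must be positive")
    return aperture_m ** 2 / wavelength_m

def diffraction_spread(distance_m, aperture_m, wavelength_m):
    """Approximate widening z * lambda / a of the beam after a distance z."""
    return distance_m * wavelength_m / aperture_m

a = 3e-3        # aperture width from Example 10.7, in metres
lam = 500e-9    # wavelength from Example 10.7, in metres
z_f = fresnel_distance(a, lam)
print(f"Fresnel distance: {z_f:.1f} m")

# At z = z_F the diffraction spread equals the aperture size.
print(f"spread at z_F: {diffraction_spread(z_f, a, lam) * 1e3:.1f} mm")
```

For distances much smaller than the value printed, the spread zλ/a stays well below the aperture size, which is the regime where ray optics is a good approximation.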

10.7 POLARISATION
Consider holding one end of a long horizontal string, the other end
of which is assumed to be fixed. If we move the end of the string up
and down in a periodic manner, we will generate a wave propagating
in the +x direction (Fig. 10.21). Such a wave could be described by the
following equation
y (x,t) = a sin (kx − ωt) (10.32)
where a and ω (= 2πν) represent the amplitude and the angular
frequency of the wave, respectively; further,

λ = 2π/k (10.33)
represents the wavelength associated with the wave. We had
discussed propagation of such waves in Chapter 15 of Class XI
textbook. Since the displacement (which is along the y direction) is at
right angles to the direction of propagation of the wave, we have what
is known as a transverse wave. Also, since the displacement is in the
y direction, it is often referred to as a y-polarised wave. Since each
point on the string moves on a straight line, the wave is also referred
to as a linearly polarised wave. Further, the string always remains
confined to the x-y plane and therefore it is also referred to as a plane
polarised wave.

Figure 10.21 (a) The curves represent the displacement of a string at t = 0 and
at t = Δt, respectively when a sinusoidal wave is propagating in the +x-direction. (b)
The curve represents the time variation of the displacement at x = 0 when a
sinusoidal wave is propagating in the +x-direction. At x = Δx, the time variation of
the displacement will be slightly displaced to the right.

In a similar manner we can consider the vibration of the string in the
x-z plane generating a z-polarised wave whose displacement will be
given by
z (x,t) = a sin (kx − ωt) (10.34)
It should be mentioned that the linearly polarised waves [described by
Eqs. (10.32) and (10.34)] are all transverse waves; i.e., the
displacement of each point of the string is always at right angles to the
direction of propagation of the wave. Finally, if the plane of vibration of
the string is changed randomly in very short intervals of time, then we
have what is known as an unpolarised wave. Thus, for an unpolarised
wave the displacement will be randomly changing with time though it
will always be perpendicular to the direction of propagation.
Light waves are transverse in nature; i.e., the electric field associated
with a propagating light wave is always at right angles to the direction
of propagation of the wave. This can be easily demonstrated using a
simple polaroid. You must have seen thin plastic-like sheets, which
are called polaroids. A polaroid consists of long chain molecules
aligned in a particular direction. The electric vectors (associated with
the propagating light wave) along the direction of the aligned
molecules get absorbed. Thus, if an unpolarised light wave is incident
on such a polaroid then the light wave will get linearly polarised with
the electric vector oscillating along a direction perpendicular to the
aligned molecules; this direction is known as the pass-axis of the
polaroid.
Thus, if the light from an ordinary source (like a sodium lamp) passes
through a polaroid sheet P1, it is observed that its intensity is reduced
by half. Rotating P1 has no effect on the transmitted beam and
transmitted intensity remains constant. Now, let an identical piece of
polaroid P2 be placed before P1. As expected, the light from the lamp
is reduced in intensity on passing through P2 alone. But now rotating
P1 has a dramatic effect on the light coming from P2. In one position,
the intensity transmitted by P2 followed by P1 is nearly zero. When
turned by 90° from this position, P1 transmits nearly the full intensity
emerging from P2 (Fig. 10.22).
The above experiment can be easily understood by assuming that
light passing through the polaroid P2 gets polarised along the
pass-axis of P2. If the pass-axis of P2 makes an angle θ with the pass-axis
of P1, then when the polarised beam passes through the polaroid P1,
the component E cos θ (along the pass-axis of P1) will pass through
P1. Thus, as we rotate the polaroid P1 (or P2), the intensity will vary
as:
I = I0 cos²θ (10.35)
where I0 is the intensity of the polarised light after passing through P2.
This is known as Malus' law. The above discussion shows that the
intensity coming out of a single polaroid is half of the incident intensity.
By putting a second polaroid, the intensity can be further controlled
from 50% to zero of the incident intensity by adjusting the angle
between the pass-axes of two polaroids.
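The variation of intensity with angle can be tabulated with a few lines of code. This is a minimal sketch assuming ideal polaroids: the first passes exactly half of the unpolarised intensity, and the second follows the cos²θ dependence described above. The function names are illustrative.

```python
import math

def malus_intensity(i0, theta_deg):
    """Intensity of linearly polarised light of intensity i0 after a polaroid
    whose pass-axis makes an angle theta with the polarisation direction."""
    return i0 * math.cos(math.radians(theta_deg)) ** 2

def through_two_polaroids(unpolarised_intensity, theta_deg):
    """Unpolarised light through two ideal polaroids with pass-axes at theta."""
    after_first = 0.5 * unpolarised_intensity   # an ideal polaroid passes half
    return malus_intensity(after_first, theta_deg)

for angle in (0, 30, 45, 60, 90):
    fraction = through_two_polaroids(1.0, angle)
    print(f"theta = {angle:2d} deg -> transmitted fraction = {fraction:.3f}")
```

The transmitted fraction runs from 0.5 at θ = 0° down to zero at θ = 90°, which is the 50% to zero range mentioned in the text.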
Figure 10.22 (a) Passage of light through two polaroids P2 and P1. The
transmitted fraction falls from 1 to 0 as the angle between them varies from 0° to
90°. Notice that the light seen through a single polaroid P1 does not vary with
angle. (b) Behaviour of the electric vector when light passes through two polaroids.
The transmitted polarisation is the component parallel to the polaroid axis. The
double arrows show the oscillations of the electric vector.

Polaroids can be used to control the intensity, in sunglasses,
windowpanes, etc. Polaroids are also used in photographic cameras
and 3D movie cameras.

Example 10.8 Discuss the intensity of transmitted light when a polaroid sheet
is rotated between two crossed polaroids.
Solution Let I0 be the intensity of polarised light after passing through the first
polariser P1. Then the intensity of light after passing through second polariser
P2 will be

I = I0 cos²θ,
where θ is the angle between pass axes of P1 and P2. Since P1 and P3 are
crossed the angle between the pass axes of P2 and P3 will be (π/2 − θ). Hence
the intensity of light emerging from P3 will be

I = I0 cos²θ cos²(π/2 − θ) = I0 cos²θ sin²θ = (I0/4) sin²2θ

Therefore, the transmitted intensity will be maximum when θ = π/4.
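The result of Example 10.8 can be verified by applying the cos² rule twice and scanning the angle of the middle polaroid. The sketch assumes ideal polaroids, and the function names are illustrative.

```python
import math

def crossed_with_middle(i0, theta_deg):
    """Intensity after P2 and P3, where P3 is crossed with P1 and the
    pass-axis of P2 makes an angle theta with that of P1."""
    theta = math.radians(theta_deg)
    after_p2 = i0 * math.cos(theta) ** 2
    # The angle between the pass-axes of P2 and P3 is (pi/2 - theta).
    return after_p2 * math.cos(math.pi / 2 - theta) ** 2

def closed_form(i0, theta_deg):
    """Equivalent closed form (I0/4) sin^2(2 theta)."""
    return (i0 / 4.0) * math.sin(2.0 * math.radians(theta_deg)) ** 2

# Scan whole-degree angles and pick the one that transmits the most light.
best_angle = max(range(0, 91), key=lambda d: crossed_with_middle(1.0, d))
print(f"maximum at theta = {best_angle} deg")
print(f"transmitted fraction of I0 there = {crossed_with_middle(1.0, best_angle):.3f}")
```

With the middle polaroid removed, the crossed pair transmits nothing; inserting it at 45° lets a quarter of I0 through.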

10.7.1 Polarisation by scattering


The light from a clear blue portion of the sky shows a rise and fall of
intensity when viewed through a polaroid which is rotated. This is
nothing but sunlight, which has changed its direction (having been
scattered) on encountering the molecules of the earth's atmosphere.
As Fig. 10.23(a) shows, the incident sunlight is unpolarised. The dots
stand for polarisation perpendicular to the plane of the figure. The
double arrows show polarisation in the plane of the figure. (There is no
phase relation between these two in unpolarised light). Under the
influence of the electric field of the incident wave the electrons in the
molecules acquire components of motion in both these directions. We
have drawn an observer looking at 90° to the direction of the sun.
Clearly, charges accelerating parallel to the double arrows do not
radiate energy towards this observer since their acceleration has no
transverse component. The radiation scattered by the molecule is
therefore represented by dots. It is polarised perpendicular to the
plane of the figure. This explains the polarisation of scattered light
from the sky.

The scattering of light by molecules was intensively investigated by
C.V. Raman and his collaborators in Kolkata in the 1920s. Raman was
awarded the Nobel Prize for Physics in 1930 for this work.

10.7.2 Polarisation by reflection

Figure 10.23 (a) Polarisation of the blue scattered light from the sky. The incident
sunlight is unpolarised (dots and arrows). A typical molecule is shown. It scatters
light by 90° polarised normal to the plane of the paper (dots only). (b) Polarisation
of light reflected from a transparent medium at the Brewster angle (reflected ray
perpendicular to refracted ray).

Figure 10.23(b) shows light reflected from a transparent medium, say,
water. As before, the dots and arrows indicate that both polarisations
are present in the incident and refracted waves. We have drawn a
situation in which the reflected wave travels at right angles to the
refracted wave. The oscillating electrons in the water produce the
reflected wave. These move in the two directions transverse to the
radiation from wave in the medium, i.e., the refracted wave. The
arrows are parallel to the direction of the reflected wave. Motion in this
direction does not contribute to the reflected wave. As the figure
shows, the reflected light is therefore linearly polarised perpendicular
to the plane of the figure (represented by dots). This can be checked
by looking at the reflected light through an analyser. The transmitted
intensity will be zero when the axis of the analyser is in the plane of
the figure, i.e., the plane of incidence.
When unpolarised light is incident on the boundary between two
transparent media, the reflected light is polarised with its electric
vector perpendicular to the plane of incidence when the refracted and
reflected rays make a right angle with each other. Thus we have seen
that when reflected wave is perpendicular to the refracted wave, the
reflected wave is a totally polarised wave.

A special case of total transmission


When light is incident on an interface of two media, it is observed that some
part of it gets reflected and some part gets transmitted. Consider a related
question: Is it possible that under some conditions a monochromatic beam of
light incident on a surface (which is normally reflective) gets completely
transmitted with no reflection? To your surprise, the answer is yes.
Let us try a simple experiment and check what happens. Arrange a laser, a
good polariser, a prism and screen as shown in the figure here.
Let the light emitted by the laser source pass through the polariser and be
incident on the surface of the prism at the Brewster's angle of incidence iB.
Now rotate the polariser carefully and you will observe that for a specific
alignment of the polariser, the light incident on the prism is completely
transmitted and no light is reflected from the surface of the prism. The reflected
spot will completely vanish.

The angle of incidence in this case is called Brewster's angle and is
denoted by iB. We can see that iB is related to the refractive index of
the denser medium. Since we have iB + r = π/2, we get from Snell's law

μ = sin iB / sin r = sin iB / sin (π/2 − iB) = sin iB / cos iB = tan iB (10.36)
This is known as Brewster's law.

Example 10.9 Unpolarised light is incident on a plane glass surface. What
should be the angle of incidence so that the reflected and refracted rays are
perpendicular to each other?
Solution For i + r to be equal to π/2, we should have tan iB = μ = 1.5. This
gives iB = tan⁻¹(1.5) ≈ 56.3°, or roughly 57°. This is the Brewster's angle for air to glass interface.
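A quick numerical check of Brewster's law, tan iB = μ, is sketched below. It assumes light going from air (refractive index taken as 1) into glass with μ = 1.5, and uses Snell's law to confirm that the reflected and refracted rays are then at right angles. The function names are illustrative.

```python
import math

def brewster_angle_deg(n_second, n_first=1.0):
    """Brewster angle from tan(i_B) = n_second / n_first, in degrees."""
    return math.degrees(math.atan(n_second / n_first))

def refraction_angle_deg(incidence_deg, n_second, n_first=1.0):
    """Angle of refraction from Snell's law, in degrees."""
    sin_r = n_first * math.sin(math.radians(incidence_deg)) / n_second
    return math.degrees(math.asin(sin_r))

i_b = brewster_angle_deg(1.5)
r = refraction_angle_deg(i_b, 1.5)
print(f"Brewster angle i_B = {i_b:.2f} deg")
print(f"refraction angle r = {r:.2f} deg")
print(f"i_B + r = {i_b + r:.2f} deg")
```

For μ = 1.5 the arctangent evaluates to about 56.3°, and the sum iB + r comes out as 90°, as the condition iB + r = π/2 requires.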

For simplicity, we have discussed scattering of light by 90°, and
reflection at the Brewster angle. In this special situation, one of the
two perpendicular components of the electric field is zero. At other
angles, both components are present but one is stronger than the
other. There is no stable phase relationship between the two
perpendicular components since these are derived from two
perpendicular components of an unpolarised beam. When such light is
viewed through a rotating analyser, one sees a maximum and a
minimum of intensity but not complete darkness. This kind of light is
called partially polarised.
Let us try to understand the situation. When an unpolarised beam of
light is incident at the Brewster's angle on an interface of two media,
only part of light with electric field vector perpendicular to the plane of
incidence will be reflected. Now by using a good polariser, if we
completely remove all the light with its electric vector perpendicular to
the plane of incidence and let this light be incident on the surface of
the prism at Brewster's angle, you will then observe no reflection and
there will be total transmission of light.
We began this chapter by pointing out that there are some
phenomena which can be explained only by the wave theory. In order
to develop a proper understanding, we first described how some
phenomena like reflection and refraction, which were studied on the
basis of Ray Optics in Chapter 9, can also be understood on the basis
of Wave Optics. Then we described Young's double slit experiment
which was a turning point in the study of optics. Finally, we described
some associated points such as diffraction, resolution, polarisation,
and validity of ray optics. In the next chapter, you will see how new
experiments led to new theories at the turn of the century around 1900
A.D.

Summary

1. Huygens' principle tells us that each point on a wavefront is a source of
secondary waves, which add up to give the wavefront at a later time.
2. Huygens' construction tells us that the new wavefront is the forward
envelope of the secondary waves. When the speed of light is independent of
direction, the secondary waves are spherical. The rays are then perpendicular
to both the wavefronts and the time of travel is the same measured along any
ray. This principle leads to the well known laws of reflection and refraction.
3. The principle of superposition of waves applies whenever two or more
sources of light illuminate the same point. When we consider the intensity of
light due to these sources at the given point, there is an interference term in
addition to the sum of the individual intensities. But this term is important only if
it has a non-zero average, which occurs only if the sources have the same
frequency and a stable phase difference.
4. Young's double slit of separation d gives equally spaced fringes of angular
separation λ/d. The source, mid-point of the slits, and central bright fringe lie in
a straight line. An extended source will destroy the fringes if it subtends angle
more than λ/d at the slits.
5. A single slit of width a gives a diffraction pattern with a central maximum.
The intensity falls to zero at angles of ±λ/a, ±2λ/a, etc., with successively
weaker secondary maxima in between. Diffraction limits the angular resolution
of a telescope to λ/D where D is the diameter. Two stars closer than this give
strongly overlapping images. Similarly, a microscope objective subtending
angle 2β at the focus, in a medium of refractive index n, will just separate two
objects spaced at a distance λ/(2n sin β), which is the resolution limit of a
microscope. Diffraction determines the limitations of the concept of light rays. A
beam of width a travels a distance a²/λ, called the Fresnel distance, before it
starts to spread out due to diffraction.
6. Natural light, e.g., from the sun is unpolarised. This means the electric
vector takes all possible directions in the transverse plane, rapidly and
randomly, during a measurement. A polaroid transmits only one component
(parallel to a special axis). The resulting light is called linearly polarised or
plane polarised. When this kind of light is viewed through a second polaroid
whose axis turns through 2π, two maxima and minima of intensity are seen.
Polarised light can also be produced by reflection at a special angle (called the
Brewster angle) and by scattering through π/2 in the earth's atmosphere.

Points to Ponder
1. Waves from a point source spread out in all directions, while light was seen
to travel along narrow rays. It required the insight and experiment of Huygens,
Young and Fresnel to understand how a wave theory could explain all aspects
of the behaviour of light.
2. The crucial new feature of waves is interference of amplitudes from different
sources which can be both constructive and destructive, as shown in Young's
experiment.
3. Even a wave falling on single slit should be regarded as a large number of
sources which interfere constructively in the forward direction (θ = 0), and
destructively in other directions.
4. Diffraction phenomena define the limits of ray optics. The limit of the ability
of microscopes and telescopes to distinguish very close objects is set by the
wavelength of light.
5. Most interference and diffraction effects exist even for longitudinal waves
like sound in air. But polarisation phenomena are special to transverse waves
like light waves.

* Richard Feynman was one of the recipients of the 1965 Nobel Prize
in Physics for his fundamental work in quantum electrodynamics.
Exercises
10.1 Monochromatic light of wavelength 589 nm is incident from
air on a water surface. What are the wavelength, frequency and
speed of
(a) reflected, and (b) refracted light? Refractive index of water is
1.33.
10.2 What is the shape of the wavefront in each of the following
cases:
(a) Light diverging from a point source.
(b) Light emerging out of a convex lens when a point source is
placed at its focus.
(c) The portion of the wavefront of light from a distant star
intercepted by the Earth.
10.3 (a) The refractive index of glass is 1.5. What is the speed of
light in glass? (Speed of light in vacuum is 3.0 × 10⁸ m s⁻¹)
(b) Is the speed of light in glass independent of the colour of light?
If not, which of the two colours red and violet travels slower in a
glass prism?
10.4 In a Young's double-slit experiment, the slits are separated
by 0.28 mm and the screen is placed 1.4 m away. The distance
between the central bright fringe and the fourth bright fringe is
measured to be 1.2 cm. Determine the wavelength of light used in the
experiment.

10.5 In Young's double-slit experiment using monochromatic light
of wavelength λ, the intensity of light at a point on the screen
where path difference is λ, is K units. What is the intensity of light
at a point where path difference is λ/3?

10.6 A beam of light consisting of two wavelengths, 650 nm and
520 nm, is used to obtain interference fringes in a Young's
double-slit experiment.
(a) Find the distance of the third bright fringe on the screen from
the central maximum for wavelength 650 nm.
(b) What is the least distance from the central maximum where the
bright fringes due to both the wavelengths coincide?
10.7 In a double-slit experiment the angular width of a fringe is
found to be 0.2° on a screen placed 1 m away. The wavelength of
light used is 600 nm. What will be the angular width of the fringe if
the entire experimental apparatus is immersed in water? Take
refractive index of water to be 4/3.
10.8 What is the Brewster angle for air to glass transition?
(Refractive index of glass = 1.5.)
10.9 Light of wavelength 5000 Å falls on a plane reflecting surface.
What are the wavelength and frequency of the reflected light? For
what angle of incidence is the reflected ray normal to the incident
ray?
10.10 Estimate the distance for which ray optics is good
approximation for an aperture of 4 mm and wavelength 400 nm.
Additional Exercises

10.11 The 6563 Å Hα line emitted by hydrogen in a star is found to
be red-shifted by 15 Å. Estimate the speed with which the star is
receding from the Earth.
10.12 Explain how Corpuscular theory predicts the speed of light
in a medium, say, water, to be greater than the speed of light in
vacuum. Is the prediction confirmed by experimental determination
of the speed of light in water? If not, which alternative picture of
light is consistent with experiment?
10.13 You have learnt in the text how Huygens' principle leads to
the laws of reflection and refraction. Use the same principle to
deduce directly that a point object placed in front of a plane mirror
produces a virtual image whose distance from the mirror is equal
to the object distance from the mirror.
10.14 Let us list some of the factors, which could possibly
influence the speed of wave propagation:
(i) nature of the source.
(ii) direction of propagation.
(iii) motion of the source and/or observer.
(iv) wavelength.
(v) intensity of the wave.
On which of these factors, if any, does
(a) the speed of light in vacuum,
(b) the speed of light in a medium (say, glass or water),
depend?

10.15 For sound waves, the Doppler formula for frequency shift
differs slightly between the two situations: (i) source at rest;
observer moving, and (ii) source moving; observer at rest. The
exact Doppler formulas for the case of light waves in vacuum are,
however, strictly identical for these situations. Explain why this
should be so. Would you expect the formulas to be strictly identical
for the two situations in case of light travelling in a medium?
10.16 In double-slit experiment using light of wavelength 600 nm,
the angular width of a fringe formed on a distant screen is 0.1°.
What is the spacing between the two slits?
10.17 Answer the following questions:
(a) In a single slit diffraction experiment, the width of the slit is
made double the original width. How does this affect the size and
intensity of the central diffraction band?
(b) In what way is diffraction from each slit related to the
interference pattern in a double-slit experiment?
(c) When a tiny circular obstacle is placed in the path of light from
a distant source, a bright spot is seen at the centre of the shadow
of the obstacle. Explain why.
(d) Two students are separated by a 7 m partition wall in a room
10 m high. If both light and sound waves can bend around
obstacles, how is it that the students are unable to see each other
even though they can converse easily?
(e) Ray optics is based on the assumption that light travels in a
straight line. Diffraction effects (observed when light propagates
through small apertures/slits or around small obstacles) disprove
this assumption. Yet the ray optics assumption is so commonly
used in understanding location and several other properties of
images in optical instruments. What is the justification?
10.18 Two towers on top of two hills are 40 km apart. The line
joining them passes 50 m above a hill halfway between the
towers. What is the longest wavelength of radio waves, which can
be sent between the towers without appreciable diffraction effects?
10.19 A parallel beam of light of wavelength 500 nm falls on a
narrow slit and the resulting diffraction pattern is observed on a
screen 1 m away. It is observed that the first minimum is at a
distance of 2.5 mm from the centre of the screen. Find the width of
the slit.
10.20 Answer the following questions:
(a) When a low flying aircraft passes overhead, we sometimes
notice a slight shaking of the picture on our TV screen. Suggest a
possible explanation.
(b) As you have learnt in the text, the principle of linear
superposition of wave displacement is basic to understanding
intensity distributions in diffraction and interference patterns. What
is the justification of this principle?
10.21 In deriving the single slit diffraction pattern, it was stated that
the intensity is zero at angles of nλ/a. Justify this by suitably
dividing the slit to bring out the cancellation.
Chapter Eleven

Thermal Properties of Matter

11.1 Introduction

11.2 Temperature and heat

11.3 Measurement of temperature

11.4 Ideal-gas equation and absolute temperature

11.5 Thermal expansion

11.6 Specific heat capacity

11.7 Calorimetry

11.8 Change of state

11.9 Heat transfer

11.10 Newton's law of cooling

Summary

Points to ponder

Exercises

11.1 INTRODUCTION
We all have common-sense notions of heat and temperature.
Temperature is a measure of hotness of a body. A kettle with boiling
water is hotter than a box containing ice. In physics, we need to define
the notion of heat, temperature, etc., more carefully. In this chapter,
you will learn what heat is and how it is measured, and study the
various processes by which heat flows from one body to another. Along
the way, you will find out why blacksmiths heat the iron ring before
fitting on the rim of a wooden wheel of a bullock cart and why the wind
at the beach often reverses direction after the sun goes down. You will
also learn what happens when water boils or freezes, and its
temperature does not change during these processes even though a
great deal of heat is flowing into or out of it.

11.2 TEMPERATURE AND HEAT


We can begin studying thermal properties of matter with definitions of
temperature and heat. Temperature is a relative measure, or
indication of hotness or coldness. A hot utensil is said to have a high
temperature, and an ice cube to have a low temperature. An object that
has a higher temperature than another object is said to be hotter. Note
that hot and cold are relative terms, like tall and short. We can
perceive temperature by touch. However, this temperature sense is
somewhat unreliable and its range is too limited to be useful for
scientific purposes.
We know from experience that a glass of ice-cold water left on a table
on a hot summer day eventually warms up whereas a cup of hot tea
on the same table cools down. It means that when the temperature of
a body, ice-cold water or hot tea in this case, and its surrounding
medium are different, heat transfer takes place between the system
and the surrounding medium, until the body and the surrounding
medium are at the same temperature. We also know that in the case
of a glass tumbler of ice-cold water, heat flows from the environment to
the glass tumbler, whereas in the case of hot tea, it flows from the cup
of hot tea to the environment. So, we can say that heat is the form of
energy transferred between two (or more) systems or a system
and its surroundings by virtue of temperature difference. The SI
unit of heat energy transferred is the joule (J), while the SI unit of
temperature is the kelvin (K); °C is a commonly used unit of
temperature. When an object is heated, many changes may take
place. Its temperature may rise, it may expand or change state. We
will study the effect of heat on different bodies in later sections.

11.3 MEASUREMENT OF TEMPERATURE


A measure of temperature is obtained using a thermometer. Many
physical properties of materials change sufficiently with temperature to
be used as the basis for constructing thermometers. The commonly
used property is variation of the volume of a liquid with temperature,
as in the common liquid-in-glass thermometer with which you are
familiar. Mercury and alcohol are the liquids used in
most liquid-in-glass thermometers.

Thermometers are calibrated so that a numerical value may be
assigned to a given temperature. For the definition of any standard
scale, two fixed reference points are needed. Since all substances
change dimensions with temperature, an absolute reference for
expansion is not available. However, the necessary fixed points may
be correlated to physical phenomena that always occur at the same
temperature. The ice point and the steam point of water are two
convenient fixed points and are known as the freezing and boiling
points. These two points are the temperatures at which pure water
freezes and boils under standard pressure. The two familiar
temperature scales are the Fahrenheit temperature scale and the
Celsius temperature scale. The ice and steam point have values 32 °F
and 212 °F respectively, on the Fahrenheit scale and 0 °C and 100 °C
on the Celsius scale. On the Fahrenheit scale, there are 180 equal
intervals between two reference points, and on the Celsius scale, there
are 100.

Fig. 11.1 A plot of Fahrenheit temperature (tF) versus Celsius temperature (tC).

A relationship for converting between the two scales may be obtained
from a graph of Fahrenheit temperature (tF) versus Celsius
temperature (tC) in a straight line (Fig. 11.1), whose equation is

(tF − 32)/180 = tC/100 (11.1)
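The straight line through the two fixed points (0 °C, 32 °F) and (100 °C, 212 °F) can be written as a pair of conversion functions. This is a minimal sketch of that linear relation; the function names are illustrative.

```python
def celsius_to_fahrenheit(t_c):
    """Convert Celsius to Fahrenheit: 180 Fahrenheit intervals span 100 Celsius ones."""
    return 32.0 + 180.0 * t_c / 100.0

def fahrenheit_to_celsius(t_f):
    """Convert Fahrenheit to Celsius by inverting the same straight line."""
    return 100.0 * (t_f - 32.0) / 180.0

# The two fixed points of water used to define the scales.
print(celsius_to_fahrenheit(0.0))     # ice point
print(celsius_to_fahrenheit(100.0))   # steam point
print(fahrenheit_to_celsius(98.6))    # an everyday temperature
```

The two scales agree numerically at only one temperature, −40, which can be confirmed by calling either function with −40.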

11.4 IDEAL-GAS EQUATION AND ABSOLUTE TEMPERATURE
Liquid-in-glass thermometers show different readings for temperatures
other than the fixed points because of differing expansion properties.
A thermometer that uses a gas, however, gives the same readings
regardless of which gas is used. Experiments show that all gases at
low densities exhibit the same expansion behaviour. The variables that
describe the behaviour of a given quantity (mass) of gas are pressure,
volume, and temperature (P, V, and T) (where T = t + 273.15; t is the
temperature in °C). When temperature is held constant, the pressure
and volume of a quantity of gas are related as PV = constant. This
relationship is known as Boyle's law, after Robert Boyle (1627-1691)
the English Chemist who discovered it. When the pressure is held
constant, the volume of a quantity of the gas is related to the
temperature as V/T = constant. This relationship is known as Charles'
law, after the French scientist Jacques Charles (1747-1823). Low
density gases obey these laws, which may be combined into a single
relationship. Notice that since PV = constant and V/T = constant for a
given quantity of gas, then PV/T should also be a constant.

Fig. 11.2 Pressure versus temperature of a low density gas kept at constant
volume.
This relationship is known as ideal gas law. It can be written in a more
general form that applies not just to a given quantity of a single gas
but to any quantity of any dilute gas and is known as IDEAL-GAS
EQUATION:

PV/T = μR
or PV = μRT (11.2)
where, μ is the number of moles in the sample of gas and R is called
universal gas constant:
R = 8.31 J mol⁻¹ K⁻¹
Eq. (11.2) shows that the product of pressure and volume is directly
proportional to temperature: PV ∝ T. This relationship allows a gas to
be used to measure temperature in a constant volume gas
thermometer. Holding the volume of a gas constant, it gives P ∝ T.
Thus, with a constant-volume gas thermometer, temperature is read in
terms of pressure. A plot of pressure versus temperature gives a
straight line in this case, as shown in Fig. 11.2.
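The way a constant-volume gas thermometer reads temperature from pressure can be sketched as follows. The code assumes an ideal gas obeying PV = μRT with R = 8.31 J mol⁻¹ K⁻¹, and calibrates against one reference point. The amount of gas, the volume and the function names are illustrative assumptions.

```python
R = 8.31  # universal gas constant in J mol^-1 K^-1

def pressure(moles, volume_m3, temperature_k):
    """Pressure of an ideal gas from PV = (number of moles) R T, in pascal."""
    return moles * R * temperature_k / volume_m3

def temperature_from_pressure(p, p_ref, t_ref_k):
    """At constant volume P is proportional to T, so T = T_ref * P / P_ref."""
    return t_ref_k * p / p_ref

# Illustrative (assumed) thermometer bulb: 1 mol of gas in 0.0224 m^3.
moles, volume = 1.0, 0.0224
p_ice = pressure(moles, volume, 273.15)     # calibration at the ice point
p_steam = pressure(moles, volume, 373.15)   # reading at the steam point

print(f"pressure at ice point:   {p_ice:.0f} Pa")
print(f"pressure at steam point: {p_steam:.0f} Pa")
print(f"temperature inferred from steam-point pressure: "
      f"{temperature_from_pressure(p_steam, p_ice, 273.15):.2f} K")
```

Only the ratio of pressures enters the reading, so the amount of gas and the bulb volume cancel out as long as they stay fixed.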
However, measurements on real gases deviate from the values
predicted by the ideal gas law at low temperature. But the relationship
is linear over a large temperature range, and it looks as though the
pressure might reach zero with decreasing temperature if the gas
continued to be a gas. The absolute minimum temperature for an ideal
gas is, therefore, inferred by extrapolating the straight line to the axis, as
in Fig. 11.3. This temperature is found to be −273.15 °C and is
designated as ABSOLUTE ZERO. Absolute zero is the foundation of
the Kelvin temperature scale or absolute scale temperature named
after the British scientist Lord Kelvin. On this scale, −273.15 °C is
taken as the zero point, that is 0 K (Fig. 11.4).
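The extrapolation procedure can be illustrated with a straight-line fit. The sketch below does not use measured data: the pressures are synthetic, generated from the ideal-gas relation for three assumed amounts of gas, so it only demonstrates the method of fitting P against Celsius temperature and finding where the line meets P = 0.

```python
R = 8.31  # universal gas constant in J mol^-1 K^-1

def fit_line(xs, ys):
    """Least-squares straight line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def synthetic_pressures(temps_c, moles, volume_m3):
    """Synthetic ideal-gas pressures (not measurements) at Celsius temperatures."""
    return [moles * R * (t + 273.15) / volume_m3 for t in temps_c]

temps_c = [0.0, 25.0, 50.0, 75.0, 100.0]
zero_crossings = []
for moles in (0.5, 1.0, 2.0):   # three different assumed gas samples
    pressures = synthetic_pressures(temps_c, moles, 0.025)
    slope, intercept = fit_line(temps_c, pressures)
    zero_crossings.append(-intercept / slope)   # temperature at which P = 0
    print(f"{moles} mol: line reaches P = 0 at {zero_crossings[-1]:.2f} deg C")
```

The lines have different slopes but meet the temperature axis at the same point, which is the behaviour shown for low density gases in Fig. 11.3.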

Fig. 11.3 A plot of pressure versus temperature and extrapolation of lines for low
density gases indicates the same absolute zero temperature.

Fig. 11.4 Comparison of the Kelvin, Celsius and Fahrenheit temperature scales.

The size of the unit for Kelvin temperature is the same as the Celsius degree,
so temperatures on these scales are related by
T = tC + 273.15 (11.3)
11.5 THERMAL EXPANSION
You may have observed that sometimes sealed bottles with metallic
lids are so tightly screwed that one has to put the lid in hot water for
some time to open the lid. This would allow the metallic cover to
expand, thereby loosening it to unscrew easily. In case of liquids, you
may have observed that mercury in a thermometer rises, when the
thermometer is put in slightly warm water. If we take out the
thermometer from the warm water the level of mercury falls again.
Similarly, in the case of gases, a balloon partially inflated in a cool
room may expand to full size when placed in warm water. On the
other hand, a fully inflated balloon when immersed in cold water would
start shrinking due to contraction of the air inside.
It is our common experience that most substances expand on heating
and contract on cooling. A change in the temperature of a body
causes change in its dimensions. The increase in the dimensions of a
body due to the increase in its temperature is called thermal
expansion. The expansion in length is called LINEAR EXPANSION.
The expansion in area is called AREA EXPANSION. The expansion in
volume is called VOLUME EXPANSION (Fig. 11.5).

Fig. 11.5 Thermal expansion: (a) linear expansion, (b) area expansion, (c) volume expansion.

If the substance is in the form of a long rod, then for a small change in
temperature, $\Delta T$, the fractional change in length, $\Delta l/l$, is directly
proportional to $\Delta T$.

$$\frac{\Delta l}{l} = \alpha_l\,\Delta T \qquad (11.4)$$
where $\alpha_l$ is known as the COEFFICIENT OF LINEAR EXPANSION
and is characteristic of the material of the rod. Table 11.1 gives
typical average values of the coefficient of linear expansion for some
materials in the temperature range 0 °C to 100 °C. From this Table,
compare the values of $\alpha_l$ for glass and copper. We find that copper
expands about five times more than glass for the same rise in
temperature. Normally, metals expand more and have relatively high
values of $\alpha_l$.
Table 11.1 Values of coefficient of linear expansion for some materials

Similarly, we consider the fractional change in volume, $\Delta V/V$, of a
substance for temperature change $\Delta T$ and define the COEFFICIENT
OF VOLUME EXPANSION, $\alpha_V$, as

$$\alpha_V = \left(\frac{\Delta V}{V}\right)\frac{1}{\Delta T} \qquad (11.5)$$

Here $\alpha_V$ is also a characteristic of the substance but is not strictly a
constant. It depends in general on temperature (Fig. 11.6). It is seen
that $\alpha_V$ becomes constant only at a high temperature.

Fig. 11.6 Coefficient of volume expansion of copper as a function of temperature.

Table 11.2 gives the values of the coefficient of volume expansion of
some common substances in the temperature range 0-100 °C. You
can see that the thermal expansion of these substances (solids and
liquids) is rather small, with materials like pyrex glass and invar (a
special iron-nickel alloy) having particularly low values of $\alpha_V$. From this
Table we find that the value of $\alpha_V$ for alcohol (ethyl) is greater than
that for mercury, so alcohol expands more than mercury for the same
rise in temperature.
Table 11.2 Values of coefficient of volume expansion for some
substances
Water exhibits an anomalous behaviour; it contracts on heating
between 0 °C and 4 °C. The volume of a given amount of water
decreases as it is cooled from room temperature, until its temperature
reaches 4 °C [Fig. 11.7(a)]. Below 4 °C, the volume increases, and
therefore the density decreases [Fig. 11.7(b)].
This means that water has a maximum density at 4 °C. This property
has an important environmental effect: bodies of water, such as lakes
and ponds, freeze at the top first. As a lake cools toward 4 °C, water
near the surface loses energy to the atmosphere, becomes denser,
and sinks; the warmer, less dense water near the bottom rises.
However, once the colder water on top reaches a temperature below
4 °C, it becomes less dense and remains at the surface, where it
freezes. If water did not have this property, lakes and ponds would
freeze from the bottom up, which would destroy much of their animal
and plant life.
Gases at ordinary temperature expand more than solids and liquids.
For liquids, the coefficient of volume expansion is relatively
independent of the temperature. However, for gases it is dependent
on temperature. For an ideal gas, the coefficient of volume expansion
at constant pressure can be found from the ideal gas equation :

$$PV = \mu RT$$
At constant pressure,
$$P\,\Delta V = \mu R\,\Delta T$$
$$\frac{\Delta V}{V} = \frac{\Delta T}{T}$$
i.e., $$\alpha_V = \frac{1}{T} \quad \text{for an ideal gas} \qquad (11.6)$$

At 0 °C, $\alpha_V = 3.7 \times 10^{-3}\ \mathrm{K^{-1}}$, which is much larger than that for solids
and liquids. Equation (11.6) shows the temperature dependence of $\alpha_V$;
it decreases with increasing temperature. For a gas at room
temperature and constant pressure, $\alpha_V$ is about $3300 \times 10^{-6}\ \mathrm{K^{-1}}$,
an order of magnitude or more larger than the coefficient of volume
expansion of typical liquids.
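
The two numerical values quoted for Equation (11.6) can be checked with a short Python sketch. The function name is illustrative, and the room temperature of 27 °C is an assumption made here, since the text does not state which value it uses.

```python
# Sketch of Eq. (11.6): alpha_V = 1/T for an ideal gas at constant pressure.
# The function name and the 27 degC "room temperature" are assumptions for illustration.

def alpha_v_ideal_gas(temp_celsius: float) -> float:
    """Return the coefficient of volume expansion (K^-1) at the given Celsius temperature."""
    temp_kelvin = temp_celsius + 273.15  # Eq. (11.3): T = t_C + 273.15
    if temp_kelvin <= 0:
        raise ValueError("absolute temperature must be positive")
    return 1.0 / temp_kelvin

alpha_at_0c = alpha_v_ideal_gas(0.0)     # roughly 3.7e-3 K^-1
alpha_at_room = alpha_v_ideal_gas(27.0)  # roughly 3.3e-3 K^-1, i.e. about 3300e-6 K^-1

print(f"alpha_V at 0 degC : {alpha_at_0c:.2e} K^-1")
print(f"alpha_V at 27 degC: {alpha_at_room:.2e} K^-1")
```

The second value is smaller than the first, which is the decrease with increasing temperature that Equation (11.6) predicts.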

Fig. 11.7 Thermal expansion of water. Panels (a) and (b) are plotted against temperature (°C).

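There is a simple relation between the coefficient of volume expansion
($\alpha_V$) and the coefficient of linear expansion ($\alpha_l$). Imagine a cube of length
$l$ that expands equally in all directions when its temperature
increases by $\Delta T$. We have
$$\Delta l = \alpha_l\, l\, \Delta T$$
so, $$\Delta V = (l + \Delta l)^3 - l^3 \simeq 3 l^2\, \Delta l \qquad (11.7)$$

In equation (11.7), terms in $(\Delta l)^2$ and $(\Delta l)^3$ have been neglected since
$\Delta l$ is small compared to $l$. So

$$\Delta V = \frac{3V\,\Delta l}{l} = 3V \alpha_l\, \Delta T \qquad (11.8)$$
which gives
$$\alpha_V = 3\alpha_l \qquad (11.9)$$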
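
A brief numerical check shows how little is lost by dropping the higher-order terms. This is only a sketch: the coefficient is the value this section quotes for steel, and the 10 K temperature rise is an illustrative assumption.

```python
# Sketch comparing the exact fractional volume change of a cube with Eq. (11.9).
# alpha_l is the value this section quotes for steel; delta_t is an assumed rise.

def exact_volume_coefficient(alpha_l: float, delta_t: float) -> float:
    """(Delta V / V) / Delta T, keeping every term of (l + Delta l)^3 - l^3."""
    stretch = 1.0 + alpha_l * delta_t  # (l + Delta l) / l, from Delta l = alpha_l * l * Delta T
    return (stretch ** 3 - 1.0) / delta_t

alpha_l = 1.2e-5  # K^-1
delta_t = 10.0    # K (assumption)

exact = exact_volume_coefficient(alpha_l, delta_t)
approx = 3.0 * alpha_l  # Eq. (11.9)
relative_error = (exact - approx) / approx

print(f"exact  = {exact:.6e} K^-1")
print(f"approx = {approx:.6e} K^-1")
print(f"relative error = {relative_error:.1e}")
```

The relative error is of the order of $\alpha_l \Delta T$, which is why the neglected terms do not matter for ordinary temperature changes.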
What happens if the thermal expansion of a rod is prevented by fixing
its ends rigidly? Clearly, the rod acquires a compressive strain due to
the external forces provided by the rigid support at the ends. The
corresponding stress set up in the rod is called THERMAL STRESS.
For example, consider a steel rail of length 5 m and area of cross
section 40 cm$^2$ that is prevented from expanding while the
temperature rises by 10 °C. The coefficient of linear expansion of steel
is $\alpha_{l(\text{steel})} = 1.2 \times 10^{-5}\ \mathrm{K^{-1}}$. Thus, the compressive strain is
$$\frac{\Delta l}{l} = \alpha_{l(\text{steel})}\, \Delta T = 1.2 \times 10^{-5} \times 10 = 1.2 \times 10^{-4}.$$
Young's modulus of steel is $Y_{\text{steel}} = 2 \times 10^{11}\ \mathrm{N\,m^{-2}}$. Therefore, the thermal stress developed is
$$\frac{\Delta F}{A} = Y_{\text{steel}} \left(\frac{\Delta l}{l}\right) = 2.4 \times 10^{7}\ \mathrm{N\,m^{-2}},$$
which corresponds to an external force of
$$\Delta F = A\, Y_{\text{steel}} \left(\frac{\Delta l}{l}\right) = 2.4 \times 10^{7} \times 40 \times 10^{-4} \simeq 10^{5}\ \mathrm{N}.$$
If two such steel
rails, fixed at their outer ends, are in contact at their inner ends, a
force of this magnitude can easily bend the rails.
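
The steel rail calculation above can be reproduced with a few lines of Python. This is a minimal sketch using only the numbers quoted in the text; the function and variable names are illustrative.

```python
# Sketch of the thermal stress calculation for the steel rail described above.
# All numbers come from the text; the names are illustrative.

def thermal_stress(youngs_modulus: float, alpha_l: float, delta_t: float) -> float:
    """Stress (N m^-2) in a rod whose thermal expansion is completely prevented."""
    strain = alpha_l * delta_t  # compressive strain, Delta l / l
    return youngs_modulus * strain

Y_STEEL = 2e11       # N m^-2
ALPHA_STEEL = 1.2e-5 # K^-1
DELTA_T = 10.0       # K
AREA = 40e-4         # m^2 (40 cm^2)

stress = thermal_stress(Y_STEEL, ALPHA_STEEL, DELTA_T)
force = stress * AREA

print(f"stress = {stress:.2e} N m^-2")
print(f"force  = {force:.2e} N")
```

The force comes out slightly below $10^5$ N, consistent with the order-of-magnitude estimate in the text. Note that the length of the rail does not enter the result.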

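Example 11.1 Show that the coefficient of area expansion, $(\Delta A/A)/\Delta T$, of a
rectangular sheet of the solid is twice its linear expansivity, $\alpha_l$.

Answer

Fig. 11.8

Consider a rectangular sheet of the solid material of length $a$ and
breadth $b$ (Fig. 11.8). When the temperature increases by $\Delta T$, $a$
increases by $\Delta a = \alpha_l\, a\, \Delta T$ and $b$ increases by $\Delta b = \alpha_l\, b\, \Delta T$. From Fig.
11.8, the increase in area
$$\Delta A = \Delta A_1 + \Delta A_2 + \Delta A_3$$
$$\Delta A = a\,\Delta b + b\,\Delta a + (\Delta a)(\Delta b)$$
$$= a\,\alpha_l\, b\, \Delta T + b\,\alpha_l\, a\, \Delta T + (\alpha_l)^2\, ab\, (\Delta T)^2$$
$$= \alpha_l\, ab\, \Delta T\, (2 + \alpha_l\, \Delta T) = \alpha_l\, A\, \Delta T\, (2 + \alpha_l\, \Delta T)$$
Since $\alpha_l \simeq 10^{-5}\ \mathrm{K^{-1}}$ from Table 11.1, the product $\alpha_l\, \Delta T$ is small
in comparison with 2 for ordinary temperature changes and may be neglected.

Hence,
$$\left(\frac{\Delta A}{A}\right)\frac{1}{\Delta T} \simeq 2\alpha_l$$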

Example 11.2 A blacksmith fixes an iron ring on the rim of the wooden wheel of a
bullock cart. The diameters of the rim and the iron ring are 5.243 m and 5.231
m respectively at 27 °C. To what temperature should the ring be heated so as
to fit the rim of the wheel?

Answer
Given, $T_1 = 27\ ^\circ\mathrm{C}$
$L_{T1} = 5.231$ m
$L_{T2} = 5.243$ m
So,

$$L_{T2} = L_{T1}\,[1 + \alpha_l\,(T_2 - T_1)]$$

$$5.243\ \mathrm{m} = 5.231\ \mathrm{m}\,[1 + 1.20 \times 10^{-5}\ \mathrm{K^{-1}}\,(T_2 - 27\ ^\circ\mathrm{C})]$$
or $T_2 = 218\ ^\circ\mathrm{C}$.
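
The arithmetic of Example 11.2 can be verified with a small Python sketch that solves the linear expansion relation for the final temperature. The function name is an illustrative choice; the numbers are those of the example.

```python
# Sketch for Example 11.2: solve L_T2 = L_T1 * [1 + alpha_l * (T2 - T1)] for T2.
# The function name is illustrative; the data are from the example.

def temperature_to_reach_length(l_initial: float, l_final: float,
                                alpha_l: float, t_initial: float) -> float:
    """Temperature (degC) at which a length l_initial at t_initial grows to l_final."""
    return t_initial + (l_final / l_initial - 1.0) / alpha_l

t2 = temperature_to_reach_length(
    l_initial=5.231,   # m, diameter of the iron ring at 27 degC
    l_final=5.243,     # m, diameter of the rim
    alpha_l=1.20e-5,   # K^-1, iron
    t_initial=27.0,    # degC
)
print(f"T2 = {t2:.0f} degC")
```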

11.6 SPECIFIC HEAT CAPACITY


Take some water in a vessel and start heating it on a burner. Soon
you will notice that bubbles begin to move upward. As the temperature
is raised, the motion of water particles increases till it becomes
turbulent as the water starts boiling. What are the factors on which the
quantity of heat required to raise the temperature of a substance
depends? To answer this question, in the first step, heat a given
quantity of water to raise its temperature by, say, 20 °C and note the
time taken. Again take the same amount of water and raise its
temperature by 40 °C using the same source of heat. Note the time
taken using a stopwatch. You will find that it takes about twice the time
and, therefore, double the quantity of heat is required to raise the
temperature of the same amount of water by twice as much.

In the second step, suppose you take double the amount of water
and heat it, using the same heating arrangement, to raise the
temperature by 20 °C. You will find that the time taken is again twice that
required in the first step.
In the third step, in place of water, heat the same quantity of
some oil, say mustard oil, and raise the temperature again by 20 °C.
Note the time with the same stopwatch. You will find that the time taken
is shorter and, therefore, the quantity of heat required is
less than that required by the same amount of water for the same rise
in temperature.

The above observations show that the quantity of heat required to
warm a given substance depends on its mass, $m$, the change in
temperature, $\Delta T$, and the nature of the substance. The change in
temperature of a substance, when a given quantity of heat is absorbed
or rejected by it, is characterised by a quantity called the HEAT
CAPACITY of that substance. We define the heat capacity, $S$, of a
substance as

$$S = \frac{\Delta Q}{\Delta T} \qquad (11.10)$$
where $\Delta Q$ is the amount of heat supplied to the substance to change
its temperature from $T$ to $T + \Delta T$.

You have observed that if equal amount of heat is added to equal


masses of different substances, the resulting temperature changes will
not be the same. It implies that every substance has a unique value
for the amount of heat absorbed or rejected to change the
temperature of unit mass of it by one unit. This quantity is referred to
as the SPECIFIC HEAT CAPACITY of the substance.

If $\Delta Q$ stands for the amount of heat absorbed or rejected by a
substance of mass $m$ when it undergoes a temperature change $\Delta T$,
then the specific heat capacity, $s$, of that substance is given by

$$s = \frac{S}{m} = \frac{1}{m}\frac{\Delta Q}{\Delta T} \qquad (11.11)$$
The SPECIFIC HEAT CAPACITY is the property of the substance
which determines the change in the temperature of the substance
(undergoing no phase change) when a given quantity of heat is
absorbed (or rejected) by it. It is defined as the amount of heat per unit
mass absorbed or rejected by the substance to change its
temperature by one unit. It depends on the nature of the substance
and its temperature. The SI unit of specific heat capacity is $\mathrm{J\,kg^{-1}\,K^{-1}}$.
If the amount of substance is specified in terms of moles $\mu$, instead of
mass $m$ in kg, we can define the heat capacity per mole of the substance
by

$$C = \frac{S}{\mu} = \frac{1}{\mu}\frac{\Delta Q}{\Delta T} \qquad (11.12)$$
where $C$ is known as the MOLAR SPECIFIC HEAT CAPACITY of the
substance. Like $S$, $C$ also depends on the nature of the substance and
its temperature. The SI unit of molar specific heat capacity is
$\mathrm{J\,mol^{-1}\,K^{-1}}$.

However, in connection with the specific heat capacity of gases, additional
conditions may be needed to define $C$. In this case, heat transfer can
be achieved by keeping either pressure or volume constant. If the gas
is held under constant pressure during the heat transfer, then the
corresponding molar specific heat capacity is called the MOLAR
SPECIFIC HEAT CAPACITY AT CONSTANT
PRESSURE and is denoted by $C_p$. On the other hand, if the volume of
the gas is kept constant during the heat transfer, then the corresponding
molar specific heat capacity is called the MOLAR SPECIFIC HEAT
CAPACITY AT CONSTANT VOLUME and is denoted by $C_v$. For
details see Chapter 12. Table 11.3 lists measured specific heat
capacity of some substances at atmospheric pressure and ordinary
temperature while Table 11.4 lists molar specific heat capacities of
some gases. From Table 11.3 you can note that water has the highest
specific heat capacity compared to other substances. For this reason
water is used as a coolant in automobile radiators as well as a heater
in hot water bags. Owing to its high specific heat capacity, the water
warms up much more slowly than the land during summer and
consequently wind from the sea has a cooling effect. Now, you can tell
why in desert areas, the earth surface warms up quickly during the
day and cools quickly at night.
Table 11.3 Specific heat capacity of some substances at room
temperature and atmospheric pressure
Table 11.4 Molar specific heat capacities of some gases

11.7 CALORIMETRY
A system is said to be isolated if no exchange or transfer of heat
occurs between the system and its surroundings. When different parts
of an isolated system are at different temperature, a quantity of heat
transfers from the part at higher temperature to the part at lower
temperature. The heat lost by the part at higher temperature is equal
to the heat gained by the part at lower temperature.
Calorimetry means measurement of heat. When a body at higher
temperature is brought in contact with another body at lower
temperature, the heat lost by the hot body is equal to the heat gained
by the colder body, provided no heat is allowed to escape to the
surroundings. A device in which heat measurement can be made is
called a CALORIMETER. It consists of a metallic vessel and a stirrer of the
same material, such as copper or aluminium. The vessel is kept inside a
wooden jacket which contains heat insulating materials like glass wool.
The outer jacket acts as a heat shield and reduces the heat loss
from the inner vessel. There is an opening in the outer jacket through
which a mercury thermometer can be inserted into the calorimeter.
The following example provides a method by which the specific heat
capacity of a given solid can be determined by using the principle that
heat gained is equal to heat lost.

Example 11.3 A sphere of aluminium of mass 0.047 kg is placed for sufficient time in a
vessel containing boiling water, so that the sphere is at 100 °C. It is then
immediately transferred to a 0.14 kg copper calorimeter containing 0.25 kg of
water at 20 °C. The temperature of water rises and attains a steady state at
23 °C. Calculate the specific heat capacity of aluminium.

Answer In solving this example we shall use the fact that at a
steady state, the heat given out by the aluminium sphere is equal to the
heat absorbed by the water and the calorimeter.

Mass of aluminium sphere ($m_1$) = 0.047 kg
Initial temp. of aluminium sphere = 100 °C
Final temp. = 23 °C
Change in temp. ($\Delta T$) = (100 °C - 23 °C) = 77 °C
Let the specific heat capacity of aluminium be $s_{Al}$.
The amount of heat lost by the aluminium sphere
$= m_1 s_{Al}\, \Delta T = 0.047\ \mathrm{kg} \times s_{Al} \times 77\ ^\circ\mathrm{C}$
Mass of water ($m_2$) = 0.25 kg
Mass of calorimeter ($m_3$) = 0.14 kg
Initial temp. of water and calorimeter = 20 °C
Final temp. of the mixture = 23 °C
Change in temp. ($\Delta T_2$) = 23 °C - 20 °C = 3 °C
Specific heat capacity of water ($s_w$)
$= 4.18 \times 10^{3}\ \mathrm{J\,kg^{-1}\,K^{-1}}$
Specific heat capacity of copper calorimeter ($s_{cu}$)
$= 0.386 \times 10^{3}\ \mathrm{J\,kg^{-1}\,K^{-1}}$
The amount of heat gained by water and calorimeter
$= m_2 s_w\, \Delta T_2 + m_3 s_{cu}\, \Delta T_2$
$= (m_2 s_w + m_3 s_{cu})\, \Delta T_2$
$= (0.25\ \mathrm{kg} \times 4.18 \times 10^{3}\ \mathrm{J\,kg^{-1}\,K^{-1}} + 0.14\ \mathrm{kg} \times 0.386 \times 10^{3}\ \mathrm{J\,kg^{-1}\,K^{-1}})\,(23\ ^\circ\mathrm{C} - 20\ ^\circ\mathrm{C})$
In the steady state, heat lost by the aluminium sphere = heat gained by
water + heat gained by calorimeter.
So, $0.047\ \mathrm{kg} \times s_{Al} \times 77\ ^\circ\mathrm{C}$
$= (0.25\ \mathrm{kg} \times 4.18 \times 10^{3}\ \mathrm{J\,kg^{-1}\,K^{-1}} + 0.14\ \mathrm{kg} \times 0.386 \times 10^{3}\ \mathrm{J\,kg^{-1}\,K^{-1}})\,(3\ ^\circ\mathrm{C})$
$s_{Al} = 0.911\ \mathrm{kJ\,kg^{-1}\,K^{-1}}$
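
The heat balance of Example 11.3 can be written as a short Python sketch. The function name and argument names are illustrative; the sketch assumes, as the example does, that no heat escapes to the surroundings.

```python
# Sketch for Example 11.3: heat lost by the hot solid = heat gained by water + calorimeter.
# Names are illustrative; an ideally insulated calorimeter is assumed, as in the example.

def specific_heat_of_solid(m_solid: float, t_solid: float,
                           m_water: float, s_water: float,
                           m_cal: float, s_cal: float,
                           t_initial: float, t_final: float) -> float:
    """Specific heat capacity (J kg^-1 K^-1) of the solid dropped into the calorimeter."""
    heat_gained = (m_water * s_water + m_cal * s_cal) * (t_final - t_initial)
    return heat_gained / (m_solid * (t_solid - t_final))

s_al = specific_heat_of_solid(
    m_solid=0.047, t_solid=100.0,   # aluminium sphere, kg and degC
    m_water=0.25, s_water=4.18e3,   # water, kg and J kg^-1 K^-1
    m_cal=0.14, s_cal=0.386e3,      # copper calorimeter, kg and J kg^-1 K^-1
    t_initial=20.0, t_final=23.0,   # degC
)
print(f"s_Al = {s_al / 1000:.3f} kJ kg^-1 K^-1")
```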

11.8 CHANGE OF STATE


Matter normally exists in three states: solid, liquid, and gas. A
transition from one of these states to another is called a change of
state. Two common changes of states are solid to liquid and liquid to
gas (and vice versa). These changes can occur when the exchange of
heat takes place between the substance and its surroundings. To
study the change of state on heating or cooling, let us perform the
following activity.
Take some cubes of ice in a beaker. Note the temperature of the ice
(0 °C). Start heating it slowly on a constant heat source. Note the
temperature after every minute. Continuously stir the mixture of water
and ice. Draw a graph between temperature and time (Fig. 11.9). You
will observe no change in the temperature so long as there is ice in
the beaker. In the above process, the temperature of the system does
not change even though heat is being continuously supplied. The heat
supplied is being utilised in changing the state from solid (ice) to liquid
(water).

Fig. 11.9 A plot of temperature versus time showing the changes in the state of ice
on heating (not to scale).
The change of state from solid to liquid is called MELTING (or FUSION) and from
liquid to solid is called FREEZING. It is observed that the temperature
remains constant until the entire amount of the solid substance melts.
That is, BOTH THE SOLID AND LIQUID STATES OF THE
SUBSTANCE COEXIST IN THERMAL EQUILIBRIUM DURING THE
CHANGE OF STATE FROM SOLID TO LIQUID. The temperature at
which the solid and the liquid states of the substance are in thermal
equilibrium with each other is called its MELTING POINT. It is
characteristic of the substance. It also depends on pressure. The
melting point of a substance at standard atmospheric pressure is
called its NORMAL MELTING POINT. Let us do the following activity
to understand the process of melting of ice.
Take a slab of ice. Take a metallic wire and fix two blocks, say 5 kg
each, at its ends. Put the wire over the slab as shown in Fig. 11.10.
You will observe that the wire passes through the ice slab. This
happens due to the fact that just below the wire, ice melts at lower
temperature due to increase in pressure. When the wire has passed,
water above the wire freezes again. Thus the wire passes through the
slab and the slab does not split. This phenomenon of refreezing is
called REGELATION. Skating is possible on snow due to the
formation of water below the skates. Water is formed due to the
increase of pressure and it acts as a lubricant.
Fig. 11.10

After the whole of ice gets converted into water and as we continue
further heating, we shall see that temperature begins to rise. The
temperature keeps on rising till it reaches nearly 100 °C when it again
becomes steady. The heat supplied is now being utilised to change
water from liquid state to vapour or gaseous state.
The change of state from liquid to vapour (or gas) is called
VAPORISATION. It is observed that the temperature remains constant
until the entire amount of the liquid is converted into vapour. That is,
both the liquid and vapour states of the substance coexist in thermal
equilibrium, during the change of state from liquid to vapour. The
temperature at which the liquid and the vapour states of the substance
coexist is called its BOILING POINT. Let us do the following activity to
understand the process of boiling of water.

Triple Point

The temperature of a substance remains constant during its change of state
(phase change). A graph between the temperature T and the pressure P of
the substance is called a phase diagram or P-T diagram. The following figure
shows the phase diagrams of water and CO2. Such a phase diagram divides
the P-T plane into a solid region, a vapour region and a liquid region.
The regions are separated by curves: the sublimation curve (BO), the
fusion curve (AO) and the vaporisation curve (CO). Points on the
sublimation curve BO represent states in which the solid and vapour
phases coexist. Points on the fusion curve AO represent states in which the solid
and liquid phases coexist. Points on the vaporisation curve CO represent
states in which the liquid and vapour phases coexist. The temperature and
pressure at which the fusion curve, the vaporisation curve and the sublimation
curve meet and all three phases of a substance coexist is called the triple
point of the substance. For example, the triple point of water is represented by
the temperature 273.16 K and a pressure of about 611 Pa ($6.11 \times 10^{2}$ Pa).

Pressure-temperature phase diagrams for (a) water and (b) CO2 (not to scale).

Take a round-bottom flask, more than half filled with water. Keep it
over a burner and fix a thermometer and steam outlet through the cork
of the flask (Fig. 11.11). As water gets heated in the flask, note first
that the air, which was dissolved in the water, will come out as small
bubbles. Later, bubbles of steam will form at the bottom but as they
rise to the cooler water near the top, they condense and disappear.
Finally, as the temperature of the entire mass of the water reaches
100 °C, bubbles of steam reach the surface and boiling is said to
occur. The steam in the flask may not be visible but as it comes out of
the flask, it condenses as tiny droplets of water, giving a foggy
appearance.

Fig. 11.11 Boiling process.

If now the steam outlet is closed for a few seconds to increase the
pressure in the flask, you will notice that boiling stops. More heat
would be required to raise the temperature (depending on the
increase in pressure) before boiling begins again. Thus boiling point
increases with increase in pressure.
Let us now remove the burner. Allow the water to cool to about 80 °C.
Remove the thermometer and steam outlet. Close the flask with the
airtight cork. Keep the flask turned upside down on the stand. Pour
ice-cold water on the flask. Water vapours in the flask condense
reducing the pressure on the water surface inside the flask. Water
begins to boil again, now at a lower temperature. Thus boiling point
decreases with decrease in pressure.
This explains why cooking is difficult on hills. At high altitudes,
atmospheric pressure is lower, reducing the boiling point of water as
compared to that at sea level. On the other hand, boiling point is
increased inside a pressure cooker by increasing the pressure. Hence
cooking is faster. The boiling point of a substance at standard
atmospheric pressure is called its NORMAL BOILING POINT.
However, not all substances pass through the three states: solid-liquid-gas.
There are certain substances which normally pass from the
solid to the vapour state directly and vice versa. The change from
solid state to vapour state without passing through the liquid state is
called SUBLIMATION, and the substance is said to sublime. Dry ice
(solid CO2) sublimes, and so does iodine. During the sublimation process
both the solid and vapour states of a substance coexist in thermal
equilibrium.

11.8.1 Latent Heat


In Section 11.8, we have learnt that certain amount of heat energy is
transferred between a substance and its surroundings when it
undergoes a change of state. The amount of heat per unit mass
transferred during change of state of the substance is called latent
heat of the substance for the process. For example, if heat is added to
a given quantity of ice at -10 °C, the temperature of the ice increases until
it reaches its melting point (0 °C). At this temperature, the addition of
more heat does not increase the temperature but causes the ice to
melt, that is, to change its state. Once all the ice has melted, adding more heat
will cause the temperature of the water to rise. A similar situation
occurs during the liquid-gas change of state at the boiling point. Adding
more heat to boiling water causes vaporisation, without increase in
temperature.
The heat required during a change of state depends upon the heat of
transformation and the mass of the substance undergoing a change of
state. Thus, if mass m of a substance undergoes a change from one
state to the other, then the quantity of heat required is given by
$$Q = mL$$
or $$L = Q/m \qquad (11.13)$$

where $L$ is known as latent heat and is a characteristic of the
substance. Its SI unit is $\mathrm{J\,kg^{-1}}$. The value of $L$ also depends on the
pressure. Its value is usually quoted at standard atmospheric
pressure. The latent heat for a solid-liquid state change is called the
LATENT HEAT OF FUSION ($L_f$), and that for a liquid-gas state change
is called the LATENT HEAT OF VAPORISATION ($L_v$). These are often
referred to as the heat of fusion and the heat of vaporisation. A plot of
temperature versus heat energy for a quantity of water is shown in
Fig. 11.12. The latent heats of some substances, together with their
freezing and boiling points, are given in Table 11.5.
Table 11.5 Temperatures of the change of state and latent heats for
various substances at 1 atm pressure

Fig. 11.12 Temperature versus heat for water at 1 atm pressure (not to scale).

Note that when heat is added (or removed) during a change of state,
the temperature remains constant. Note in Fig. 11.12 that the slopes
of the phase lines are not all the same, which indicates that the specific
heats of the various states are not equal. For water, the latent heats of
fusion and vaporisation are $L_f = 3.33 \times 10^{5}\ \mathrm{J\,kg^{-1}}$ and $L_v = 22.6 \times 10^{5}\ \mathrm{J\,kg^{-1}}$
respectively. That is, $3.33 \times 10^{5}$ J of heat is needed to melt 1
kg of ice at 0 °C, and $22.6 \times 10^{5}$ J of heat is needed to convert 1 kg
of water to steam at 100 °C. So, steam at 100 °C carries $22.6 \times 10^{5}\ \mathrm{J\,kg^{-1}}$
more heat than water at 100 °C. This is why burns from steam
are usually more serious than those from boiling water.

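Example 11.4 When 0.15 kg of ice at 0 °C is mixed with 0.30 kg of water at
50 °C in a container, the resulting temperature is 6.7 °C. Calculate the heat of
fusion of ice. ($s_{water} = 4186\ \mathrm{J\,kg^{-1}\,K^{-1}}$)

Answer
Heat lost by water $= m\, s_w\, (\theta_i - \theta_f)_w$
$= (0.30\ \mathrm{kg})\,(4186\ \mathrm{J\,kg^{-1}\,K^{-1}})\,(50.0\ ^\circ\mathrm{C} - 6.7\ ^\circ\mathrm{C})$
= 54376.14 J
Heat required to melt ice $= m_I L_f = (0.15\ \mathrm{kg})\, L_f$
Heat required to raise the temperature of the ice water to the final temperature
$= m_I\, s_w\, (\theta_f - \theta_i)_I$
$= (0.15\ \mathrm{kg})\,(4186\ \mathrm{J\,kg^{-1}\,K^{-1}})\,(6.7\ ^\circ\mathrm{C} - 0\ ^\circ\mathrm{C})$
= 4206.93 J
Heat lost = heat gained

$54376.14\ \mathrm{J} = (0.15\ \mathrm{kg})\, L_f + 4206.93\ \mathrm{J}$

$L_f = 3.34 \times 10^{5}\ \mathrm{J\,kg^{-1}}$.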
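
The same heat balance can be checked numerically. The sketch below uses the data of Example 11.4 and assumes, as the example does, that the container itself exchanges no heat; the function name is illustrative.

```python
# Sketch for Example 11.4: heat lost by warm water = heat to melt the ice
# + heat to warm the melt water. The container's heat capacity is ignored, as in the example.

def latent_heat_of_fusion(m_ice: float, m_water: float, s_water: float,
                          t_water: float, t_final: float,
                          t_melt: float = 0.0) -> float:
    """Latent heat of fusion (J kg^-1) inferred from the final mixture temperature."""
    heat_lost = m_water * s_water * (t_water - t_final)
    heat_to_warm_melt = m_ice * s_water * (t_final - t_melt)
    return (heat_lost - heat_to_warm_melt) / m_ice

l_f = latent_heat_of_fusion(m_ice=0.15, m_water=0.30, s_water=4186.0,
                            t_water=50.0, t_final=6.7)
print(f"L_f = {l_f:.3g} J kg^-1")
```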

Example 11.5 Calculate the heat required to convert 3 kg of ice at -12 °C kept
in a calorimeter to steam at 100 °C at atmospheric pressure. Given specific
heat capacity of ice = $2100\ \mathrm{J\,kg^{-1}\,K^{-1}}$, specific heat capacity of water =
$4186\ \mathrm{J\,kg^{-1}\,K^{-1}}$, latent heat of fusion of ice = $3.35 \times 10^{5}\ \mathrm{J\,kg^{-1}}$ and latent heat of
steam = $2.256 \times 10^{6}\ \mathrm{J\,kg^{-1}}$.
Answer We have
Mass of the ice, $m$ = 3 kg
specific heat capacity of ice, $s_{ice} = 2100\ \mathrm{J\,kg^{-1}\,K^{-1}}$
specific heat capacity of water, $s_{water} = 4186\ \mathrm{J\,kg^{-1}\,K^{-1}}$
latent heat of fusion of ice, $L_{f\,ice} = 3.35 \times 10^{5}\ \mathrm{J\,kg^{-1}}$
latent heat of steam, $L_{steam} = 2.256 \times 10^{6}\ \mathrm{J\,kg^{-1}}$
Now, $Q$ = heat required to convert 3 kg of ice at -12 °C to steam at
100 °C,
$Q_1$ = heat required to convert ice at -12 °C to ice at 0 °C
$= m\, s_{ice}\, \Delta T_1 = (3\ \mathrm{kg})\,(2100\ \mathrm{J\,kg^{-1}\,K^{-1}})\,[0 - (-12)]\ ^\circ\mathrm{C} = 75600\ \mathrm{J}$
$Q_2$ = heat required to melt ice at 0 °C to water at 0 °C
$= m\, L_{f\,ice} = (3\ \mathrm{kg})\,(3.35 \times 10^{5}\ \mathrm{J\,kg^{-1}})$
= 1005000 J
$Q_3$ = heat required to convert water at 0 °C to water at 100 °C
$= m\, s_w\, \Delta T_2 = (3\ \mathrm{kg})\,(4186\ \mathrm{J\,kg^{-1}\,K^{-1}})\,(100\ ^\circ\mathrm{C})$
= 1255800 J
$Q_4$ = heat required to convert water at 100 °C to steam at 100 °C
$= m\, L_{steam} = (3\ \mathrm{kg})\,(2.256 \times 10^{6}\ \mathrm{J\,kg^{-1}})$
= 6768000 J
So, $Q = Q_1 + Q_2 + Q_3 + Q_4$
= 75600 J + 1005000 J + 1255800 J + 6768000 J
$= 9.1 \times 10^{6}$ J
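
The four stages of Example 11.5 can be summed in a short Python sketch. The function name and the tuple it returns are illustrative choices; the data are those given in the example.

```python
# Sketch for Example 11.5: heat needed to take ice below 0 degC all the way to steam at 100 degC.
# The function name and return layout are illustrative; the data are from the example.

def heat_ice_to_steam(mass: float, t_ice: float,
                      s_ice: float, s_water: float,
                      l_fusion: float, l_steam: float) -> tuple:
    """Return (Q1, Q2, Q3, Q4, total) in joules for ice at t_ice (degC) -> steam at 100 degC."""
    q1 = mass * s_ice * (0.0 - t_ice)      # warm the ice to 0 degC
    q2 = mass * l_fusion                   # melt the ice at 0 degC
    q3 = mass * s_water * (100.0 - 0.0)    # warm the water to 100 degC
    q4 = mass * l_steam                    # vaporise the water at 100 degC
    return q1, q2, q3, q4, q1 + q2 + q3 + q4

q1, q2, q3, q4, q_total = heat_ice_to_steam(
    mass=3.0, t_ice=-12.0,
    s_ice=2100.0, s_water=4186.0,
    l_fusion=3.35e5, l_steam=2.256e6,
)
print(f"Q1={q1:.0f} J, Q2={q2:.0f} J, Q3={q3:.0f} J, Q4={q4:.0f} J")
print(f"Q = {q_total:.2e} J")
```

Vaporisation accounts for most of the total, which reflects how much larger the latent heat of steam is than the other terms.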

11.9 HEAT TRANSFER


We have seen that heat is energy transfer from one system to another
or from one part of a system to another part, arising due to
temperature difference. What are the different ways by which this
energy transfer takes place? There are three distinct modes of heat
transfer : conduction, convection and radiation (Fig. 11.13).

Fig. 11.13 Heating by conduction, convection and radiation.

11.9.1 Conduction
Conduction is the mechanism of transfer of heat between two adjacent
parts of a body because of their temperature difference. Suppose one
end of a metallic rod is put in a flame, the other end of the rod will
soon be so hot that you cannot hold it by your bare hands. Here heat
transfer takes place by conduction from the hot end of the rod through
its different parts to the other end. Gases are poor thermal conductors
while liquids have conductivities intermediate between solids and
gases.
Heat conduction may be described quantitatively as the time rate of
heat flow in a material for a given temperature difference. Consider a
metallic bar of length L and uniform cross section A with its two ends
maintained at different temperatures. This can be done, for example,
by putting the ends in thermal contact with large reservoirs at
temperatures, say, $T_C$ and $T_D$ respectively (Fig. 11.14). Let us assume
the ideal condition that the sides of the bar are fully insulated so that
no heat is exchanged between the sides and the surroundings.
After some time, a steady state is reached; the temperature of the bar
decreases uniformly with distance from $T_C$ to $T_D$ ($T_C > T_D$). The
reservoir at C supplies heat at a constant rate, which transfers through
the bar and is given out at the same rate to the reservoir at D. It is
found experimentally that in this steady state, the rate of flow of heat
(or heat current) $H$ is proportional to the temperature difference
($T_C - T_D$) and the area of cross section $A$ and is inversely proportional to the
length $L$:

$$H = KA\,\frac{T_C - T_D}{L} \qquad (11.14)$$
The constant of proportionality $K$ is called the THERMAL
CONDUCTIVITY of the material. The greater the value of $K$ for a
material, the more rapidly will it conduct heat. The SI unit of $K$ is
$\mathrm{J\,s^{-1}\,m^{-1}\,K^{-1}}$ or $\mathrm{W\,m^{-1}\,K^{-1}}$. The thermal conductivities of various
substances are listed in Table 11.6. These values vary slightly with
temperature, but can be considered to be constant over a normal
temperature range.

Fig. 11.14 Steady state heat flow by conduction in a bar with its two ends
maintained at temperatures $T_C$ and $T_D$ ($T_C > T_D$).

Compare the relatively large thermal conductivities of the good
thermal conductors, the metals, with the relatively small thermal
conductivities of some good thermal insulators, such as wood and
glass wool. You may have noticed that some cooking pots have a
copper coating on the bottom. Being a good conductor of heat, copper
promotes the distribution of heat over the bottom of a pot for uniform
cooking. Plastic foams, on the other hand, are good insulators, mainly
because they contain pockets of air. Recall that gases are poor
conductors, and note the low thermal conductivity of air in Table
11.6. Heat retention and transfer are important in many other
applications. Houses with concrete roofs get very hot during
summer days, because the thermal conductivity of concrete (though much
smaller than that of a metal) is still not small enough. Therefore,
people usually prefer to put a layer of earth or foam insulation on the
ceiling so that heat transfer is reduced and the room stays cooler. In
some situations, heat transfer is critical. In a nuclear reactor, for
example, elaborate heat transfer systems need to be installed so that
the enormous energy produced by nuclear fission in the core is carried
out sufficiently fast, thus preventing the core from overheating.
Table 11.6 Thermal conductivities of some materials
Example 11.6 What is the temperature of the steel-copper junction in the
steady state of the system shown in Fig. 11.15? Length of the steel rod = 15.0
cm, length of the copper rod = 10.0 cm, temperature of the furnace = 300 °C,
temperature of the other end = 0 °C. The area of cross section of the steel rod
is twice that of the copper rod. (Thermal conductivity of steel =
$50.2\ \mathrm{J\,s^{-1}\,m^{-1}\,K^{-1}}$; and of copper = $385\ \mathrm{J\,s^{-1}\,m^{-1}\,K^{-1}}$.)

Fig. 11.15

Answer The insulating material around the rods reduces heat loss
from the sides of the rods. Therefore, heat flows only along the length
of the rods. Consider any cross section of the rod. In the steady state,
heat flowing into the element must equal the heat flowing out of it;
otherwise there would be a net gain or loss of heat by the element and
its temperature would not be steady. Thus in the steady state, the rate of
heat flowing across a cross section of the rod is the same at every
point along the length of the combined steel-copper rod. Let $T$ be the
temperature of the steel-copper junction in the steady state. Then,

$$\frac{K_1 A_1 (300 - T)}{L_1} = \frac{K_2 A_2 (T - 0)}{L_2}$$
where 1 and 2 refer to the steel and copper rod respectively. For $A_1 =
2 A_2$, $L_1 = 15.0$ cm, $L_2 = 10.0$ cm, $K_1 = 50.2\ \mathrm{J\,s^{-1}\,m^{-1}\,K^{-1}}$, $K_2 = 385\ \mathrm{J\,s^{-1}\,m^{-1}\,K^{-1}}$, we have

$$\frac{50.2 \times 2\,(300 - T)}{15} = \frac{385\,T}{10}$$
which gives $T = 44.4\ ^\circ\mathrm{C}$
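
The junction temperature can also be obtained by solving the steady-state condition in general form. In the sketch below the function name is illustrative, and the absolute areas are an assumption: only the ratio $A_1 = 2A_2$ is given, so the copper area is set to an arbitrary 1.0 and cancels out of the result.

```python
# Sketch for Example 11.6: equal heat current through two rods joined in series.
# Only the area ratio A1 = 2*A2 is given; the absolute areas below are arbitrary (assumption).

def junction_temperature(k1: float, a1: float, l1: float, t_hot: float,
                         k2: float, a2: float, l2: float, t_cold: float) -> float:
    """Steady-state temperature at the junction of rod 1 (hot side) and rod 2 (cold side)."""
    g1 = k1 * a1 / l1  # K*A/L of rod 1
    g2 = k2 * a2 / l2  # K*A/L of rod 2
    # From g1 * (t_hot - T) = g2 * (T - t_cold), solved for T:
    return (g1 * t_hot + g2 * t_cold) / (g1 + g2)

t_junction = junction_temperature(
    k1=50.2, a1=2.0, l1=0.150, t_hot=300.0,   # steel rod, furnace end
    k2=385.0, a2=1.0, l2=0.100, t_cold=0.0,   # copper rod, cold end
)
print(f"T = {t_junction:.1f} degC")
```

The junction sits much closer to the cold end because the copper rod conducts far better than the steel rod, so most of the temperature drop occurs across the steel.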

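Example 11.7 An iron bar ($L_1 = 0.1$ m, $A_1 = 0.02\ \mathrm{m^2}$, $K_1 = 79\ \mathrm{W\,m^{-1}\,K^{-1}}$) and
a brass bar ($L_2 = 0.1$ m, $A_2 = 0.02\ \mathrm{m^2}$, $K_2 = 109\ \mathrm{W\,m^{-1}\,K^{-1}}$) are soldered end
to end as shown in Fig. 11.16. The free ends of the iron bar and brass bar are
maintained at 373 K and 273 K respectively. Obtain expressions for and hence
compute (i) the temperature of the junction of the two bars, (ii) the equivalent
thermal conductivity of the compound bar, and (iii) the heat current through the
compound bar.

Answer

Fig. 11.16

Given, $L_1 = L_2 = L = 0.1$ m, $A_1 = A_2 = A = 0.02\ \mathrm{m^2}$,
$K_1 = 79\ \mathrm{W\,m^{-1}\,K^{-1}}$, $K_2 = 109\ \mathrm{W\,m^{-1}\,K^{-1}}$,
$T_1 = 373$ K, and $T_2 = 273$ K.
Under steady state condition, the heat current ($H_1$) through the iron bar is
equal to the heat current ($H_2$) through the brass bar.
So,
$$H = H_1 = H_2 = \frac{K_1 A_1 (T_1 - T_0)}{L_1} = \frac{K_2 A_2 (T_0 - T_2)}{L_2}$$
For $A_1 = A_2 = A$ and $L_1 = L_2 = L$, this equation leads to
$$K_1 (T_1 - T_0) = K_2 (T_0 - T_2)$$
Thus the junction temperature $T_0$ of the two bars is

$$T_0 = \frac{K_1 T_1 + K_2 T_2}{K_1 + K_2}$$
Using this equation, the heat current $H$ through either bar is

$$H = \frac{K_1 A (T_1 - T_0)}{L} = \frac{K_2 A (T_0 - T_2)}{L}$$

Using these equations, the heat current $H'$ through the compound bar
of length $L_1 + L_2 = 2L$ and the equivalent thermal conductivity $K'$ of the
compound bar are given by

$$H' = \frac{K' A (T_1 - T_2)}{2L} = H, \qquad K' = \frac{2 K_1 K_2}{K_1 + K_2}$$

(i) $$T_0 = \frac{K_1 T_1 + K_2 T_2}{K_1 + K_2} = \frac{(79 \times 373) + (109 \times 273)}{79 + 109}\ \mathrm{K} = 315\ \mathrm{K}$$

(ii) $$K' = \frac{2 K_1 K_2}{K_1 + K_2} = \frac{2 \times 79 \times 109}{79 + 109}\ \mathrm{W\,m^{-1}\,K^{-1}} = 91.6\ \mathrm{W\,m^{-1}\,K^{-1}}$$

(iii) $$H' = H = \frac{K' A (T_1 - T_2)}{2L} = \frac{91.6 \times 0.02 \times (373 - 273)}{2 \times 0.1}\ \mathrm{W} \approx 916.1\ \mathrm{W}$$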
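
The three results of Example 11.7 can be reproduced with a short Python sketch. It applies only to the special case treated in the example, two bars of equal length and equal cross section joined in series; the function name is illustrative.

```python
# Sketch for Example 11.7: two bars of equal length L and equal area A joined end to end.
# Valid only for this equal-L, equal-A case; the function name is illustrative.

def compound_bar(k1: float, k2: float, area: float, length: float,
                 t1: float, t2: float) -> tuple:
    """Return (junction temperature in K, equivalent conductivity, heat current in W)."""
    t0 = (k1 * t1 + k2 * t2) / (k1 + k2)             # junction temperature
    k_eq = 2.0 * k1 * k2 / (k1 + k2)                 # equivalent thermal conductivity
    h = k_eq * area * (t1 - t2) / (2.0 * length)     # heat current through the 2L bar
    return t0, k_eq, h

t0, k_eq, h = compound_bar(k1=79.0, k2=109.0, area=0.02, length=0.1,
                           t1=373.0, t2=273.0)
h_iron = 79.0 * 0.02 * (373.0 - t0) / 0.1  # heat current through the iron bar alone

print(f"T0 = {t0:.0f} K, K' = {k_eq:.1f} W m^-1 K^-1, H' = {h:.1f} W")
```

The heat current computed for the iron bar alone agrees with that of the compound bar, as the steady-state condition requires.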

11.9.2 Convection
Convection is a mode of heat transfer by actual motion of matter. It is
possible only in fluids. Convection can be natural or forced. In natural
convection, gravity plays an important part. When a fluid is heated
from below, the hot part expands and, therefore, becomes less dense.
Because of buoyancy, it rises and the upper colder part replaces it.
This again gets heated, rises up and is replaced by the colder part of
the fluid. The process goes on. This mode of heat transfer is evidently
different from conduction. Convection involves bulk transport of
different parts of the fluid. In forced convection, material is forced to
move by a pump or by some other physical means. Common
examples of forced convection systems are forced-air heating systems
in homes, the human circulatory system, and the cooling system of an
automobile engine. In the human body, the heart acts as the pump
that circulates blood through different parts of the body, transferring
heat by forced convection and maintaining it at a uniform temperature.
Natural convection is responsible for many familiar phenomena.
During the day, the ground heats up more quickly than large bodies of
water do. This occurs both because the water has a greater specific
heat and because mixing currents disperse the absorbed heat
throughout the great volume of water. The air in contact with the warm
ground is heated by conduction. It expands, becoming less dense
than the surrounding cooler air. As a result, the warm air rises (air
currents) and other air moves (winds) to fill the space, creating a sea
breeze near a large body of water. Cooler air descends, and a thermal
convection cycle is set up, which transfers heat away from the land. At
night, the ground loses its heat more quickly, and the water surface is
warmer than the land. As a result, the cycle is reversed (Fig. 11.17).

Fig. 11.17 Convection cycles.

Another example of natural convection is the steady surface wind on
the earth blowing in from the north-east towards the equator, the so-called
trade wind. A reasonable explanation is as follows: the equatorial and
polar regions of the earth receive unequal solar heat. Air at the earth's
surface near the equator is hot, while the air in the upper atmosphere
of the poles is cool. In the absence of any other factor, a convection
current would be set up, with the air at the equatorial surface rising
and moving out towards the poles, descending and streaming in
towards the equator. The rotation of the earth, however, modifies this
convection current. Because of this rotation, air close to the equator has an
eastward speed of 1600 km/h, while it is zero close to the poles. As a
result, the air descends not at the poles but at 30° N (North) latitude
and returns to the equator. This is called TRADE WIND.
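
The eastward speed quoted above can be estimated from the earth's rotation. The sketch below assumes a mean radius of 6371 km and a rotation period of 24 h, neither of which is given in the text. It returns a value of the same order as the 1600 km/h quoted for the equator, and essentially zero at the poles.

```python
import math

EARTH_RADIUS_KM = 6371.0   # assumed mean radius of the earth
DAY_HOURS = 24.0           # assumed rotation period

def eastward_speed_kmh(latitude_deg):
    """Speed of a point on the surface due to the earth's rotation."""
    # A point at a given latitude moves on a circle of radius R cos(latitude).
    circle_radius = EARTH_RADIUS_KM * math.cos(math.radians(latitude_deg))
    return 2 * math.pi * circle_radius / DAY_HOURS

for lat in (0, 30, 60, 90):
    print(f"latitude {lat:2d} deg: {eastward_speed_kmh(lat):7.0f} km/h")
```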
11.9.3 Radiation
Conduction and convection require some material as a transport
medium. These modes of heat transfer cannot operate between
bodies separated by a distance in vacuum. But the earth does receive
heat from the sun across a huge distance and we quickly feel the
warmth of the fire nearby even though air conducts poorly and before
convection can set in. The third mechanism for heat transfer needs no
medium; it is called radiation and the energy so radiated by
electromagnetic waves is called radiant energy. In an electromagnetic
wave, electric and magnetic fields oscillate in space and time. Like any
wave, electromagnetic waves can have different wavelengths and can
travel in vacuum with the same speed, namely the speed of light, i.e.,
3 × 10⁸ m s⁻¹. You will learn about these matters in more detail later, but you
now know why heat transfer by radiation does not need any medium
and why it is so fast. This is how heat is transferred to the earth from
the sun through empty space. All bodies emit radiant energy, whether
they are solid, liquid or gas. The electromagnetic radiation emitted
by a body by virtue of its temperature like the radiation by a red hot
iron or light from a filament lamp is called thermal radiation.

When this thermal radiation falls on other bodies, it is partly reflected
and partly absorbed. The amount of heat that a body can absorb by
radiation depends on the colour of the body.
We find that black bodies absorb and emit radiant energy better than
bodies of lighter colours. This fact finds many applications in our daily
life. We wear white or light coloured clothes in summer so that they
absorb the least heat from the sun. However, during winter, we use
dark coloured clothes which absorb heat from the sun and keep our
body warm. The bottoms of the utensils for cooking food are
blackened so that they absorb maximum heat from the fire and give it
to the vegetables to be cooked.
Similarly, a Dewar flask or thermos bottle is a device to minimise heat
transfer between the contents of the bottle and outside. It consists of a
double-walled glass vessel with the inner and outer walls coated with
silver. Radiation from the inner wall is reflected back into the contents
of the bottle. The outer wall similarly reflects back any incoming
radiation. The space between the walls is evacuated to reduce
conduction and convection losses and the flask is supported on an
insulator like cork. The device is, therefore, useful for preventing hot
contents (like milk) from getting cold, or alternatively to store cold
contents (like ice).

11.10 NEWTON'S LAW OF COOLING


We all know that hot water or milk when left on a table begins to cool
gradually. Ultimately it attains the temperature of the surroundings. To
study how a given body can cool on exchanging heat with its
surroundings, let us perform the following activity.

Take some water, say 300 ml, in a calorimeter with a stirrer and cover
it with a two-holed lid. Fix a thermometer through a hole in the lid and
make sure that the bulb of the thermometer is immersed in the water.
Note the reading of the thermometer. This reading T1 is the
temperature of the surroundings. Heat the water kept in the
calorimeter till it attains a temperature, say, 40 °C above room
temperature (i.e., temperature of the surroundings). Then stop heating
the water by removing the heat source. Start the stop-watch and note
the reading of the thermometer after a fixed interval of time, say after
every one minute of stirring gently with the stirrer. Continue to note the
temperature (T2) of water till it attains a temperature about 5 °C above
that of the surroundings. Then plot a graph by taking each value of
temperature ΔT = T2 − T1 along the y-axis and the corresponding value of t
along the x-axis (Fig. 11.18).

Fig. 11.18 Curve showing cooling of hot water with time.

From the graph you will infer how the cooling of hot water depends on
the difference of its temperature from that of the surroundings. You will
also notice that initially the rate of cooling is higher and decreases as
the temperature of the body falls.

The above activity shows that a hot body loses heat to its
surroundings in the form of heat radiation. The rate of loss of heat
depends on the difference in temperature between the body and its
surroundings. Newton was the first to study, in a systematic manner,
the relation between the heat lost by a body in a given enclosure and
its temperature.
According to Newton's law of cooling, the rate of loss of heat, −dQ/dt,
of the body is directly proportional to the difference of temperature
ΔT = (T2 − T1) of the body and the surroundings. The law holds good only
for small difference of temperature. Also, the loss of heat by radiation
depends upon the nature of the surface of the body and the area of
the exposed surface. We can write

$$-\frac{dQ}{dt} = k\,(T_2 - T_1) \qquad (11.15)$$

where k is a positive constant depending upon the area and nature of
the surface of the body. Suppose a body of mass m and specific heat
capacity s is at temperature T2. Let T1 be the temperature of the
surroundings. If the temperature falls by a small amount dT2 in time dt,
then the amount of heat lost is
dQ = ms dT2
Rate of loss of heat is given by

$$\frac{dQ}{dt} = ms\,\frac{dT_2}{dt} \qquad (11.16)$$

From Eqs. (11.15) and (11.16) we have

$$-ms\,\frac{dT_2}{dt} = k\,(T_2 - T_1)$$

$$\frac{dT_2}{T_2 - T_1} = -\frac{k}{ms}\,dt = -K\,dt \qquad (11.17)$$

where K = k/(ms)
On integrating,

$$\log_e (T_2 - T_1) = -Kt + c \qquad (11.18)$$

or

$$T_2 = T_1 + C' e^{-Kt}; \quad \text{where } C' = e^{c} \qquad (11.19)$$
Equation (11.19) enables you to calculate the time of cooling of a body
through a particular range of temperature.
For small temperature differences, the rate of cooling, due to
conduction, convection, and radiation combined, is proportional to the
difference in temperature. It is a valid approximation in the transfer of
heat from a radiator to a room, the loss of heat through the wall of a
room, or the cooling of a cup of tea on the table.
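
The exponential cooling law of Eq. (11.19) can be turned into a small calculation. The sketch below assumes the constant K and the surrounding temperature are known; the function names and the numerical values (K = 0.05 per minute, surroundings at 25 °C) are illustrative assumptions, not data from the text.

```python
import math

def temperature(t, T_surr, T_start, K):
    """Temperature of the body after time t, from T2 = T1 + C' exp(-K t)."""
    C = T_start - T_surr                  # C' is fixed by the condition at t = 0
    return T_surr + C * math.exp(-K * t)

def cooling_time(T_from, T_to, T_surr, K):
    """Time to cool from T_from to T_to (both above T_surr)."""
    # Follows from Eq. (11.18) applied at the two temperatures.
    return math.log((T_from - T_surr) / (T_to - T_surr)) / K

# Assumed illustrative values: K in min^-1, temperatures in deg C.
K = 0.05
print(temperature(10.0, T_surr=25.0, T_start=65.0, K=K))
print(cooling_time(65.0, 45.0, T_surr=25.0, K=K))
```

Note that the time needed to halve the excess temperature, log_e(2)/K, does not depend on the starting temperature, which is a characteristic feature of this law.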

Fig. 11.19 Verification of Newton's Law of cooling.

Newton's law of cooling can be verified with the help of the
experimental set-up shown in Fig. 11.19(a). The set-up consists of a
double walled vessel (V) containing water in between the two walls. A
copper calorimeter (C) containing hot water is placed inside the
double walled vessel. Two thermometers through the corks are used
to note the temperatures T2 of water in the calorimeter and T1 of the water
in between the double walls respectively. Temperature of hot water in
the calorimeter is noted after equal intervals of time. A graph is plotted
between loge (T2 − T1) and time (t). The nature of the graph is observed
to be a straight line having a negative slope as shown in Fig. 11.19(b).
This is in support of Eq. (11.18).
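
The straight-line test described above can be sketched numerically. The readings below are synthetic, generated from Eq. (11.19) with assumed values (T1 = 25 °C, K = 0.05 per minute); they are not measurements. The helper `fit_line` is a plain least-squares fit written for this illustration.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope and intercept of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic readings (assumed, not measured): T1 = 25 deg C, K = 0.05 per minute.
T1, K_true, C = 25.0, 0.05, 40.0
times = [float(t) for t in range(0, 21, 2)]                  # minutes
T2_readings = [T1 + C * math.exp(-K_true * t) for t in times]

# Plot variable of Fig. 11.19(b): log_e of the excess temperature.
log_excess = [math.log(T2 - T1) for T2 in T2_readings]
slope, intercept = fit_line(times, log_excess)
K_estimate = -slope          # Eq. (11.18): the slope of the line is -K
print(f"K estimated from the slope: {K_estimate:.4f} per minute")
```

With real readings the points scatter about the line, and the quality of the fit indicates how well the law holds over the temperature range used.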
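Example 11.8 A pan filled with hot food cools from 94 °C to 86 °C in 2 minutes
when the room temperature is at 20 °C. How long will it take to cool from 71 °C
to 69 °C?

Answer The average temperature of 94 °C and 86 °C is 90 °C,
which is 70 °C above the room temperature. Under these conditions
the pan cools 8 °C in 2 minutes.

Using Eq. (11.17), we have

$$\frac{\text{Change in temperature}}{\text{Time}} = K\,\Delta T$$

$$\frac{8\ ^\circ\text{C}}{2\ \text{min}} = K\,(70\ ^\circ\text{C})$$

The average of 69 °C and 71 °C is 70 °C, which is 50 °C above room
temperature. K is the same for this situation as for the original.

$$\frac{2\ ^\circ\text{C}}{\text{Time}} = K\,(50\ ^\circ\text{C})$$

When we divide the above two equations, we have

$$\frac{8\ ^\circ\text{C} / 2\ \text{min}}{2\ ^\circ\text{C} / \text{Time}} = \frac{K\,(70\ ^\circ\text{C})}{K\,(50\ ^\circ\text{C})}$$

Time = 0.7 min
= 42 s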
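
Example 11.8 uses average temperatures as an approximation. As a cross-check, the same problem can be worked with the exponential form of Eq. (11.19) without averaging. The sketch below assumes the law holds exactly over both temperature ranges; the variable names are illustrative.

```python
import math

T_room = 20.0                       # deg C

# Step 1: find K from the first observation (94 -> 86 deg C in 2 min).
K = math.log((94.0 - T_room) / (86.0 - T_room)) / 2.0      # per minute

# Step 2: time to cool from 71 to 69 deg C with the same K.
t_exact_min = math.log((71.0 - T_room) / (69.0 - T_room)) / K

print(f"K = {K:.4f} per minute")
print(f"time = {t_exact_min * 60:.1f} s")
```

The result comes out very close to the 42 s obtained above, which suggests that the averaging approximation is adequate when each temperature drop is small compared with the excess temperature.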

SUMMARY

1. Heat is a form of energy that flows between a body and its surrounding
medium by virtue of temperature difference between them. The degree of
hotness of the body is quantitatively represented by temperature.
2. A temperature-measuring device (thermometer) makes use of some
measurable property (called thermometric property) that changes with
temperature. Different thermometers lead to different temperature scales. To
construct a temperature scale, two fixed points are chosen and assigned some
arbitrary values of temperature. The two numbers fix the origin of the scale
and the size of its unit.

3. The Celsius temperature (tC) and the Fahrenheit temperature (tF) are related by

tF = (9/5) tC + 32

4. The ideal gas equation connecting pressure (P), volume (V) and absolute
temperature (T) is :

PV = μRT

where μ is the number of moles and R is the universal gas constant.

5. In the absolute temperature scale, the zero of the scale is the absolute zero
of temperature, the temperature where every substance in nature has the
least possible molecular activity. The Kelvin absolute temperature scale (T)
has the same unit size as the Celsius scale (TC), but differs in the origin :

TC = T − 273.15

6. The coefficient of linear expansion (αl) and volume expansion (αv) are
defined by the relations :

$$\frac{\Delta l}{l} = \alpha_l\,\Delta T, \qquad \frac{\Delta V}{V} = \alpha_v\,\Delta T$$

where Δl and ΔV denote the change in length l and volume V for a change of
temperature ΔT. The relation between them is :

αv = 3 αl

7. The specific heat capacity of a substance is defined by

$$s = \frac{1}{m}\,\frac{\Delta Q}{\Delta T}$$

where m is the mass of the substance and ΔQ is the heat required to change
its temperature by ΔT. The molar specific heat capacity of a substance is
defined by

$$C = \frac{1}{\mu}\,\frac{\Delta Q}{\Delta T}$$

where μ is the number of moles of the substance.

8. The latent heat of fusion (Lf) is the heat per unit mass required to change a
substance from solid into liquid at the same temperature and pressure. The
latent heat of vaporisation (Lv) is the heat per unit mass required to change a
substance from liquid to the vapour state without change in the temperature
and pressure.

9. The three modes of heat transfer are conduction, convection and radiation.

10. In conduction, heat is transferred between neighbouring parts of a body
through molecular collisions, without any flow of matter. For a bar of length L
and uniform cross section A with its ends maintained at temperatures TC and
TD, the rate of flow of heat H is :

$$H = K A\,\frac{T_C - T_D}{L}$$

where K is the thermal conductivity of the material of the bar.

11. Newton's Law of Cooling says that the rate of cooling of a body is
proportional to the excess temperature of the body over the surroundings :

$$\frac{dQ}{dt} = -k\,(T_2 - T_1)$$

where T1 is the temperature of the surrounding medium and T2 is the
temperature of the body.

POINTS TO PONDER

1. The relation connecting Kelvin temperature (T) and the Celsius
temperature tc

T = tc + 273.15

and the assignment T = 273.16 K for the triple point of water are exact
relations (by choice). With this choice, the Celsius temperature of the melting
point of water and boiling point of water (both at 1 atm pressure) are very
close to, but not exactly equal to 0 °C and 100 °C respectively. In the original
Celsius scale, these latter fixed points were exactly at 0 °C and 100 °C (by
choice), but now the triple point of water is the preferred choice for the fixed
point, because it has a unique temperature.

2. A liquid in equilibrium with vapour has the same pressure and temperature
throughout the system; the two phases in equilibrium differ in their molar
volume (i.e. density). This is true for a system with any number of phases in
equilibrium.
3. Heat transfer always involves temperature difference between two systems
or two parts of the same system. Any energy transfer that does not involve
temperature difference in some way is not heat.

4. Convection involves flow of matter within a fluid due to unequal
temperatures of its parts. A hot bar placed under a running tap loses heat by
conduction between the surface of the bar and water and not by convection
within water.

EXERCISES
11.1 The triple points of neon and carbon dioxide are 24.57 K and
216.55 K respectively. Express these temperatures on the Celsius
and Fahrenheit scales.
11.2 Two absolute scales A and B have triple points of water
defined to be 200 A and 350 B. What is the relation between TA
and TB ?

11.3 The electrical resistance in ohms of a certain thermometer
varies with temperature according to the approximate law :
R = Ro [1 + α (T − To)]
The resistance is 101.6 Ω at the triple-point of water 273.16 K, and
165.5 Ω at the normal melting point of lead (600.5 K). What is the
temperature when the resistance is 123.4 Ω ?
11.4 Answer the following :
(a) The triple-point of water is a standard fixed point in modern
thermometry. Why ? What is wrong in taking the melting point of
ice and the boiling point of water as standard fixed points (as was
originally done in the Celsius scale) ?
(b) There were two fixed points in the original Celsius scale as
mentioned above which were assigned the number 0 °C and 100
°C respectively. On the absolute scale, one of the fixed points is
the triple-point of water, which on the Kelvin absolute scale is
assigned the number 273.16 K. What is the other fixed point on
this (Kelvin) scale ?
(c) The absolute temperature (Kelvin scale) T is related to the
temperature tc on the Celsius scale by
tc = T − 273.15
Why do we have 273.15 in this relation, and not 273.16 ?
(d) What is the temperature of the triple-point of water on an
absolute scale whose unit interval size is equal to that of the
Fahrenheit scale ?
11.5 Two ideal gas thermometers A and B use oxygen and
hydrogen respectively. The following observations are made :

Temperature                        Pressure thermometer A    Pressure thermometer B
Triple-point of water              1.250 × 10⁵ Pa            0.200 × 10⁵ Pa
Normal melting point of sulphur    1.797 × 10⁵ Pa            0.287 × 10⁵ Pa

(a) What is the absolute temperature of normal melting point of
sulphur as read by thermometers A and B ?
(b) What do you think is the reason behind the slight difference in
answers of thermometers A and B ? (The thermometers are not
faulty). What further procedure is needed in the experiment to
reduce the discrepancy between the two readings ?
11.6 A steel tape 1 m long is correctly calibrated for a temperature
of 27.0 °C. The length of a steel rod measured by this tape is
found to be 63.0 cm on a hot day when the temperature is 45.0 °C.
What is the actual length of the steel rod on that day ? What is the
length of the same steel rod on a day when the temperature is
27.0 °C ? Coefficient of linear expansion of steel = 1.20 × 10⁻⁵ K⁻¹.
11.7 A large steel wheel is to be fitted on to a shaft of the same
material. At 27 °C, the outer diameter of the shaft is 8.70 cm and
the diameter of the central hole in the wheel is 8.69 cm. The shaft
is cooled using dry ice. At what temperature of the shaft does the
wheel slip on the shaft? Assume coefficient of linear expansion of
the steel to be constant over the required temperature range :
αsteel = 1.20 × 10⁻⁵ K⁻¹.
11.8 A hole is drilled in a copper sheet. The diameter of the hole is
4.24 cm at 27.0 °C. What is the change in the diameter of the hole
when the sheet is heated to 227 °C? Coefficient of linear
expansion of copper = 1.70 × 10⁻⁵ K⁻¹.
11.9 A brass wire 1.8 m long at 27 °C is held taut with little tension
between two rigid supports. If the wire is cooled to a temperature
of −39 °C, what is the tension developed in the wire, if its diameter
is 2.0 mm ? Co-efficient of linear expansion of brass = 2.0 × 10⁻⁵
K⁻¹; Young's modulus of brass = 0.91 × 10¹¹ Pa.
11.10 A brass rod of length 50 cm and diameter 3.0 mm is joined
to a steel rod of the same length and diameter. What is the change
in length of the combined rod at 250 °C, if the original lengths are
at 40.0 °C? Is there a thermal stress developed at the junction ?
The ends of the rod are free to expand (Co-efficient of linear
expansion of brass = 2.0 × 10⁻⁵ K⁻¹, steel = 1.2 × 10⁻⁵ K⁻¹).
11.11 The coefficient of volume expansion of glycerin is 49 × 10⁻⁵
K⁻¹. What is the fractional change in its density for a 30 °C rise in
temperature ?
11.12 A 10 kW drilling machine is used to drill a bore in a small
aluminium block of mass 8.0 kg. How much is the rise in
temperature of the block in 2.5 minutes, assuming 50% of power is
used up in heating the machine itself or lost to the surroundings ?
Specific heat of aluminium = 0.91 J g⁻¹ K⁻¹.
11.13 A copper block of mass 2.5 kg is heated in a furnace to a
temperature of 500 °C and then placed on a large ice block. What
is the maximum amount of ice that can melt? (Specific heat of
copper = 0.39 J g⁻¹ K⁻¹; heat of fusion of water = 335 J g⁻¹).
11.14 In an experiment on the specific heat of a metal, a 0.20 kg
block of the metal at 150 °C is dropped in a copper calorimeter (of
water equivalent 0.025 kg) containing 150 cm³ of water at 27 °C. The final
temperature is 40 °C. Compute the specific heat of the metal. If
heat losses to the surroundings are not negligible, is your answer
greater or smaller than the actual value for specific heat of the
metal ?
11.15 Given below are observations on molar specific heats at
room temperature of some common gases.

The measured molar specific heats of these gases are markedly
different from those for monatomic gases. Typically, molar specific
heat of a monatomic gas is 2.92 cal/mol K. Explain this difference.
What can you infer from the somewhat larger (than the rest) value
for chlorine ?
11.16 Answer the following questions based on the P-T phase
diagram of carbon dioxide:
(a) At what temperature and pressure can the solid, liquid and
vapour phases of CO2 co-exist in equilibrium ?
(b) What is the effect of decrease of pressure on the fusion and
boiling point of CO2 ?
(c) What are the critical temperature and pressure for CO2 ? What
is their significance ?
(d) Is CO2 solid, liquid or gas at (a) −70 °C under 1 atm, (b) −60 °C
under 10 atm, (c) 15 °C under 56 atm ?
11.17 Answer the following questions based on the P-T phase
diagram of CO2:
(a) CO2 at 1 atm pressure and temperature −60 °C is compressed
isothermally. Does it go through a liquid phase ?
(b) What happens when CO2 at 4 atm pressure is cooled from
room temperature at constant pressure ?
(c) Describe qualitatively the changes in a given mass of solid CO2
at 10 atm pressure and temperature −65 °C as it is heated up to
room temperature at constant pressure.
(d) CO2 is heated to a temperature 70 °C and compressed
isothermally. What changes in its properties do you expect to
observe ?
11.18 A child running a temperature of 101 °F is given an antipyrin
(i.e. a medicine that lowers fever) which causes an increase in the
rate of evaporation of sweat from his body. If the fever is brought
down to 98 °F in 20 min, what is the average rate of extra
evaporation caused by the drug ? Assume the evaporation
mechanism to be the only way by which heat is lost. The mass of
the child is 30 kg. The specific heat of human body is
approximately the same as that of water, and latent heat of
evaporation of water at that temperature is about 580 cal g⁻¹.
11.19 A thermacole icebox is a cheap and efficient method for
storing small quantities of cooked food in summer in particular. A
cubical icebox of side 30 cm has a thickness of 5.0 cm. If 4.0 kg of
ice is put in the box, estimate the amount of ice remaining after 6
h. The outside temperature is 45 °C, and co-efficient of thermal
conductivity of thermacole is 0.01 J s⁻¹ m⁻¹ K⁻¹. [Heat of fusion of
water = 335 × 10³ J kg⁻¹]
11.20 A brass boiler has a base area of 0.15 m² and thickness 1.0
cm. It boils water at the rate of 6.0 kg/min when placed on a gas
stove. Estimate the temperature of the part of the flame in contact
with the boiler. Thermal conductivity of brass = 109 J s⁻¹ m⁻¹ K⁻¹ ;
Heat of vaporisation of water = 2256 × 10³ J kg⁻¹.
11.21 Explain why :
(a) a body with large reflectivity is a poor emitter
(b) a brass tumbler feels much colder than a wooden tray on a
chilly day
(c) an optical pyrometer (for measuring high temperatures)
calibrated for an ideal black body radiation gives too low a value
for the temperature of a red hot iron piece in the open, but gives a
correct value for the temperature when the same piece is in the
furnace
(d) the earth without its atmosphere would be inhospitably cold
(e) heating systems based on circulation of steam are more
efficient in warming a building than those based on circulation of
hot water
11.22 A body cools from 80 °C to 50 °C in 5 minutes. Calculate
the time it takes to cool from 60 °C to 30 °C. The temperature of
the surroundings is 20 °C.
Chapter Twelve

Thermodynamics

12.1 Introduction

12.2 Thermal equilibrium

12.3 Zeroth law of Thermodynamics

12.4 Heat, internal energy and work

12.5 First law of thermodynamics

12.6 Specific heat capacity

12.7 Thermodynamic state variables and equation of state

12.8 Thermodynamic processes

12.9 Heat engines

12.10 Refrigerators and heat pumps

12.11 Second law of thermodynamics

12.12 Reversible and irreversible processes

12.13 Carnot engine

Summary

Points to ponder

Exercises
12.1 INTRODUCTION
In the previous chapter we studied the thermal properties of matter. In
this chapter we shall study laws that govern thermal energy. We shall
study the processes where work is converted into heat and vice versa.
In winter, when we rub our palms together, we feel warmer; here work
done in rubbing produces the heat. Conversely, in a steam engine,
the heat of the steam is used to do useful work in moving the pistons,
which in turn rotate the wheels of the train.

In physics, we need to define the notions of heat, temperature, work,
etc. more carefully. Historically, it took a long time to arrive at the
proper concept of heat. Before the modern picture, heat was
regarded as a fine invisible fluid filling in the pores of a substance. On
contact between a hot body and a cold body, the fluid (called caloric)
flowed from the hotter to the colder body. This is similar to what
happens when a horizontal pipe connects two tanks containing water
up to different heights. The flow continues until the levels of water in
the two tanks are the same. Likewise, in the caloric picture of heat,
heat flows until the caloric levels (i.e., the temperatures) equalise.

In time, the picture of heat as a fluid was discarded in favour of the
modern concept of heat as a form of energy. An important experiment
in this connection was due to Benjamin Thompson (also known as
Count Rumford) in 1798. He observed that boring of a brass cannon
generated a lot of heat, indeed enough to boil water. More
significantly, the amount of heat produced depended on the work done
(by the horses employed for turning the drill) but not on the sharpness
of the drill. In the caloric picture, a sharper drill would scoop out more
heat fluid from the pores; but this was not observed. A most natural
explanation of the observations was that heat was a form of energy
and the experiment demonstrated conversion of energy from one form
to another, from work to heat.
Thermodynamics is the branch of physics that deals with the concepts
of heat and temperature and the inter-conversion of heat and other
forms of energy. Thermodynamics is a macroscopic science. It deals
with bulk systems and does not go into the molecular constitution of
matter. In fact, its concepts and laws were formulated in the
nineteenth century before the molecular picture of matter was firmly
established. Thermodynamic description involves relatively few
macroscopic variables of the system, which are suggested by
common sense and can be usually measured directly. A microscopic
description of a gas, for example, would involve specifying the
co-ordinates and velocities of the huge number of molecules constituting
the gas. The description in kinetic theory of gases is not so detailed
but it does involve molecular distribution of velocities. Thermodynamic
description of a gas, on the other hand, avoids the molecular
description altogether. Instead, the state of a gas in thermodynamics
is specified by macroscopic variables such as pressure, volume,
temperature, mass and composition that are felt by our sense
perceptions and are measurable*.
The distinction between mechanics and thermodynamics is worth
bearing in mind. In mechanics, our interest is in the motion of particles
or bodies under the action of forces and torques. Thermodynamics is
not concerned with the motion of the system as a whole. It is
concerned with the internal macroscopic state of the body. When a
bullet is fired from a gun, what changes is the mechanical state of the
bullet (its kinetic energy, in particular), not its temperature. When the
bullet pierces a block of wood and stops, the kinetic energy of the bullet gets
converted into heat, changing the temperature of the bullet and the
surrounding layers of wood. Temperature is related to the energy of
the internal (disordered) motion of the bullet, not to the motion of the
bullet as a whole.
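
The bullet example can be made quantitative with a rough estimate. The sketch below rests entirely on assumed values not given in the text: a lead bullet moving at 200 m/s, a specific heat of about 128 J kg⁻¹ K⁻¹ for lead, and half of the kinetic energy remaining in the bullet (the rest heating the wood). The function name is illustrative.

```python
def temperature_rise(speed, specific_heat, fraction_to_bullet):
    """Rise in bullet temperature (K) when the bullet is brought to rest.

    Energy balance: fraction * (1/2) m v^2 = m * s * dT, so the mass cancels.
    """
    return fraction_to_bullet * 0.5 * speed ** 2 / specific_heat

# Assumed values: 200 m/s, s = 128 J kg^-1 K^-1, half the energy stays in the bullet.
dT = temperature_rise(speed=200.0, specific_heat=128.0, fraction_to_bullet=0.5)
print(f"temperature rise of the bullet: {dT:.0f} K")
```

The ordered kinetic energy of the bullet as a whole disappears, and the same energy reappears as disordered internal motion, which is what the temperature measures.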

12.2 THERMAL EQUILIBRIUM


Equilibrium in mechanics means that the net external force and torque
on a system are zero. The term equilibrium in thermodynamics
appears in a different context : we say the state of a system is an
equilibrium state if the macroscopic variables that characterise the
system do not change in time. For example, a gas inside a closed rigid
container, completely insulated from its surroundings, with fixed
values of pressure, volume, temperature, mass and composition that
do not change with time, is in a state of thermodynamic equilibrium.
In general, whether or not a system is in a state of equilibrium
depends on the surroundings and the nature of the wall that separates
the system from the surroundings. Consider two gases A and B
occupying two different containers. We know experimentally that
pressure and volume of a given mass of gas can be chosen to be its
two independent variables. Let the pressure and volume of the gases
be (PA, VA) and (PB, VB) respectively. Suppose first that the two
systems are put in proximity but are separated by an ADIABATIC
WALL, an insulating wall (can be movable) that does not allow flow
of energy (heat) from one to another. The systems are insulated from
the rest of the surroundings also by similar adiabatic walls. The
situation is shown schematically in Fig. 12.1 (a). In this case, it is
found that any possible pair of values (PA, VA) will be in equilibrium
with any possible pair of values (PB, VB). Next, suppose that the
adiabatic wall is replaced by a DIATHERMIC WALL, a conducting
wall that allows energy flow (heat) from one to another. It is then found
that the macroscopic variables of the systems A and B change
spontaneously until both the systems attain equilibrium states. After
that there is no change in their states. The situation is shown in Fig.
12.1(b). The pressure and volume variables of the two gases change
to (PB′, VB′) and (PA′, VA′) such that the new states of A and B are in
equilibrium with each other**. There is no more energy flow from one
to another. We then say that the system A is in thermal equilibrium
with the system B.
* Thermodynamics may also involve other variables that are not so
obvious to our senses e.g. entropy, enthalpy, etc., and they are all
macroscopic variables.
** Both the variables need not change. It depends on the constraints.
For instance, if the gases are in containers of fixed volume, only the
pressures of the gases would change to achieve thermal equilibrium.
What characterises the situation of thermal equilibrium between two
systems ? You can guess the answer from your experience. In
thermal equilibrium, the temperatures of the two systems are equal.
How does one arrive at the concept of temperature in
thermodynamics? The Zeroth law of thermodynamics provides the
clue.
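
The approach to thermal equilibrium through a diathermic wall can be illustrated with a simple model. The sketch below goes beyond what this section states: it assumes both gases are ideal, have equal molar heat capacities, sit in rigid containers of fixed volume, and exchange energy only with each other. All numerical values and the function name `equilibrate` are illustrative.

```python
R = 8.314  # universal gas constant, J mol^-1 K^-1

def equilibrate(n_A, T_A, V_A, n_B, T_B, V_B):
    """Final temperature and pressures of two ideal gases in rigid containers
    joined by a diathermic wall, assuming equal molar heat capacities."""
    # Energy lost by the hotter gas equals energy gained by the colder gas.
    T_final = (n_A * T_A + n_B * T_B) / (n_A + n_B)
    # Volumes are fixed, so only the pressures change.
    P_A = n_A * R * T_final / V_A
    P_B = n_B * R * T_final / V_B
    return T_final, P_A, P_B

# Assumed illustrative values: moles, kelvin, cubic metres.
T_final, P_A, P_B = equilibrate(n_A=1.0, T_A=400.0, V_A=0.02,
                                n_B=2.0, T_B=280.0, V_B=0.03)
print(f"T = {T_final:.1f} K, P_A' = {P_A:.0f} Pa, P_B' = {P_B:.0f} Pa")
```

In this model the two gases end with the same temperature but generally different pressures, in line with the footnote that only the pressures change when the volumes are fixed.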

12.3 ZEROTH LAW OF THERMODYNAMICS


Imagine two systems A and B, separated by an adiabatic wall, while
each is in contact with a third system C, via a conducting wall [Fig.
12.2(a)]. The states of the systems (i.e., their macroscopic variables)
will change until both A and B come to thermal equilibrium with C.
After this is achieved, suppose that the adiabatic wall between A and
B is replaced by a conducting wall and C is insulated from A and B by
an adiabatic wall [Fig.12.2(b)]. It is found that the states of A and B
change no further i.e. they are found TO BE IN THERMAL
EQUILIBRIUM WITH EACH OTHER. This observation forms the basis
of the ZEROTH LAW OF THERMODYNAMICS, which states that
TWO SYSTEMS IN THERMAL EQUILIBRIUM WITH A THIRD
SYSTEM SEPARATELY ARE IN THERMAL EQUILIBRIUM WITH
EACH OTHER. R.H. Fowler formulated this law in 1931 long after the
first and second Laws of thermodynamics were stated and so
numbered.

Fig. 12.1 (a) Systems A and B (two gases) separated by an adiabatic wall, an
insulating wall that does not allow flow of heat. (b) The same systems A and B
separated by a diathermic wall, a conducting wall that allows heat to flow from
one to another. In this case, thermal equilibrium is attained in due course.

The Zeroth Law clearly suggests that when two systems A and B, are
in thermal equilibrium, there must be a physical quantity that has the
same value for both. This thermodynamic variable whose value is
equal for two systems in thermal equilibrium is called temperature (T ).
Thus, if A and B are separately in equilibrium with C, TA = TC and TB =
TC. This implies that TA = TB i.e. the systems A and B are also in
thermal equilibrium.
We have arrived at the concept of temperature formally via the Zeroth
Law. The next question is : how to assign numerical values to
temperatures of different bodies ? In other words, how do we
construct a scale of temperature ? Thermometry deals with this basic
question to which we turn in the next section.
Fig. 12.2 (a) Systems A and B are separated by an adiabatic wall, while each is in
contact with a third system C via a conducting wall. (b) The adiabatic wall between
A and B is replaced by a conducting wall, while C is insulated from A and B by an
adiabatic wall.

12.4 HEAT, INTERNAL ENERGY AND WORK


The Zeroth Law of Thermodynamics led us to the concept of
temperature that agrees with our commonsense notion. Temperature
is a marker of the hotness of a body. It determines the direction of
flow of heat when two bodies are placed in thermal contact. Heat flows
from the body at a higher temperature to the one at lower
temperature. The flow stops when the temperatures equalise; the two
bodies are then in thermal equilibrium. We saw in some detail how to
construct temperature scales to assign temperatures to different
bodies. We now describe the concepts of heat and other relevant
quantities like internal energy and work.

The concept of internal energy of a system is not difficult to
understand. We know that every bulk system consists of a large
number of molecules. Internal energy is simply the sum of the kinetic
energies and potential energies of these molecules. We remarked
earlier that in thermodynamics, the kinetic energy of the system, as a
whole, is not relevant. Internal energy is thus, the sum of molecular
kinetic and potential energies in the frame of reference relative to
which the centre of mass of the system is at rest. Thus, it includes
only the (disordered) energy associated with the random motion of
molecules of the system. We denote the internal energy of a system
by U.
Though we have invoked the molecular picture to understand the
meaning of internal energy, as far as thermodynamics is concerned, U
is simply a macroscopic variable of the system. The important thing
about internal energy is that it depends only on the state of the
system, not on how that state was achieved. Internal energy U of a
system is an example of a thermodynamic state variable: its value
depends only on the given state of the system, not on history, i.e. not
on the path taken to arrive at that state. Thus, the internal energy of a
given mass of gas depends on its state described by specific values of
pressure, volume and temperature. It does not depend on how this
state of the gas came about. Pressure, volume, temperature, and
internal energy are thermodynamic state variables of the system (gas)
(see section 12.7). If we neglect the small intermolecular forces in a
gas, the internal energy of a gas is just the sum of kinetic energies
associated with various random motions of its molecules. We will see
in the next chapter that in a gas this motion is not only translational
(i.e. motion from one point to another in the volume of the container); it
also includes rotational and vibrational motion of the molecules (Fig.
12.3).
What are the ways of changing internal energy of a system ? Consider
again, for simplicity, the system to be a certain mass of gas contained
in a cylinder with a movable piston as shown in Fig. 12.4. Experience
shows there are two ways of changing the state of the gas (and hence
its internal energy). One way is to put the cylinder in contact with a
body at a higher temperature than that of the gas. The temperature
difference will cause a flow of energy (heat) from the hotter body to
the gas, thus increasing the internal energy of the gas. The other way
is to push the piston down i.e. to do work on the system, which again
results in increasing the internal energy of the gas. Of course, both
these things could happen in the reverse direction. With surroundings
at a lower temperature, heat would flow from the gas to the
surroundings. Likewise, the gas could push the piston up and do work
on the surroundings. In short, heat and work are two different modes
of altering the state of a thermodynamic system and changing its
internal energy.
The notion of heat should be carefully distinguished from the notion of
internal energy. Heat is certainly energy, but it is the energy in transit.
This is not just a play of words. The distinction is of basic significance.
The state of a thermodynamic system is characterised by its internal
energy, not heat. A statement like 'a gas in a given state has a
certain amount of heat' is as meaningless as the statement
that 'a gas in a given state has a certain amount of
work'. In contrast, 'a gas in a given state has a certain
amount of internal energy' is a perfectly meaningful
statement. Similarly, the statements 'a certain amount of heat
is supplied to the system' or 'a certain amount of
work was done by the system' are perfectly meaningful.
To summarise, heat and work in thermodynamics are not state
variables. They are modes of energy transfer to a system resulting in
change in its internal energy, which, as already mentioned, is a state
variable.

In ordinary language, we often confuse heat with internal energy. The
distinction between them is sometimes ignored in elementary physics
books. For proper understanding of thermodynamics, however, the
distinction is crucial.

Fig. 12.3 (a) Internal energy U of a gas is the sum of the kinetic and potential
energies of its molecules when the box is at rest. Kinetic energy due to various
types of motion (translational, rotational, vibrational) is to be included in U. (b) If
the same box is moving as a whole with some velocity, the kinetic energy of the
box is not to be included in U.
Fig. 12.4 Heat and work are two distinct modes of energy transfer to a system that
results in change in its internal energy. (a) Heat is energy transfer due to
temperature difference between the system and the surroundings. (b) Work is
energy transfer brought about by means (e.g. moving the piston by raising or
lowering some weight connected to it) that do not involve such a temperature
difference.

12.5 FIRST LAW OF THERMODYNAMICS


We have seen that the internal energy U of a system can change
through two modes of energy transfer: heat and work. Let
$\Delta Q$ = Heat supplied to the system by the surroundings
$\Delta W$ = Work done by the system on the surroundings
$\Delta U$ = Change in internal energy of the system
The general principle of conservation of energy then implies that

$\Delta Q = \Delta U + \Delta W$ (12.1)
i.e. the energy ($\Delta Q$) supplied to the system goes partly to increase
the internal energy of the system ($\Delta U$) and the rest into work on the
environment ($\Delta W$). Equation (12.1) is known as the First Law of
Thermodynamics. It is simply the general law of conservation of
energy applied to any system in which the energy transfer from or to
the surroundings is taken into account.
Let us put Eq. (12.1) in the alternative form
$\Delta Q - \Delta W = \Delta U$ (12.2)
Now, the system may go from an initial state to the final state in a
number of ways. For example, to change the state of a gas from ($P_1$,
$V_1$) to ($P_2$, $V_2$), we can first change the volume of the gas from $V_1$ to
$V_2$, keeping its pressure constant, i.e. we can first go to the state ($P_1$, $V_2$)
and then change the pressure of the gas from $P_1$ to $P_2$, keeping
volume constant, to take the gas to ($P_2$, $V_2$). Alternatively, we can first
keep the volume constant and then keep the pressure constant. Since
U is a state variable, $\Delta U$ depends only on the initial and final states
and not on the path taken by the gas to go from one to the other.
However, $\Delta Q$ and $\Delta W$ will, in general, depend on the path taken to go
from the initial to final states. From the First Law of Thermodynamics,
Eq. (12.2), it is clear that the combination $\Delta Q - \Delta W$ is, however, path
independent. This shows that if a system is taken through a process in
which $\Delta U = 0$ (for example, isothermal expansion of an ideal gas, see
section 12.8),
$\Delta Q = \Delta W$
i.e., heat supplied to the system is used up entirely by the system in
doing work on the environment.
If the system is a gas in a cylinder with a movable piston, the gas in
moving the piston does work. Since force is pressure times area, and
area times displacement is volume, work done by the system against
a constant pressure P is

$\Delta W = P \Delta V$
where $\Delta V$ is the change in volume of the gas. Thus, for this case, Eq.
(12.1) gives
$\Delta Q = \Delta U + P \Delta V$ (12.3)
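The path dependence of heat and work described above can be checked numerically. The sketch below is only an illustration: it assumes one mole of a monatomic ideal gas (so that U = (3/2)PV, an assumption not made in the text) and uses hypothetical end states. It takes the gas between the same two states along the two constant-pressure/constant-volume routes and compares the results.

```python
# Illustrative sketch: two paths between the same end states.
# Assumption (not from the text): monatomic ideal gas, so U = (3/2) P V.

def internal_energy(P, V):
    """Internal energy in joules for the assumed monatomic ideal gas."""
    return 1.5 * P * V

# Hypothetical end states (pressure in Pa, volume in m^3)
P1, V1 = 2.0e5, 1.0e-2
P2, V2 = 1.0e5, 3.0e-2

# Change in internal energy depends only on the end states
delta_U = internal_energy(P2, V2) - internal_energy(P1, V1)

# Path A: expand at constant pressure P1, then change pressure at constant V2
W_A = P1 * (V2 - V1)
# Path B: change pressure at constant V1, then expand at constant pressure P2
W_B = P2 * (V2 - V1)

# First Law: heat supplied = change in internal energy + work done by the gas
Q_A = delta_U + W_A
Q_B = delta_U + W_B

print("Path A: W =", W_A, "J, Q =", Q_A, "J")
print("Path B: W =", W_B, "J, Q =", Q_B, "J")
print("Q - W on both paths:", Q_A - W_A, Q_B - W_B)
```

The work and the heat differ between the two paths, while their difference is the same on both, as the First Law requires.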
As an application of Eq. (12.3), consider the change in internal energy
for 1 g of water when we go from its liquid to vapour phase. The
measured latent heat of water is 2256 J/g, i.e., for 1 g of water $\Delta Q$ =
2256 J. At atmospheric pressure, 1 g of water has a volume 1 cm$^3$ in
liquid phase and 1671 cm$^3$ in vapour phase.
Therefore,

$\Delta W = P (V_g - V_l) = 1.013 \times 10^{5} \times (1670) \times 10^{-6} = 169.2$ J

Equation (12.3) then gives
$\Delta U = 2256 - 169.2 = 2086.8$ J
We see that most of the heat goes to increase the internal energy of
water in transition from the liquid to the vapour phase.
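The arithmetic of this example can be reproduced with a few lines of Python. The numbers are those quoted above; the variable names are chosen here only for illustration.

```python
# Vaporisation of 1 g of water at atmospheric pressure (values from the text)
P_atm = 1.013e5        # atmospheric pressure, Pa
latent_heat = 2256.0   # heat supplied to 1 g of water, J
V_liquid = 1.0e-6      # 1 cm^3 expressed in m^3
V_vapour = 1671.0e-6   # 1671 cm^3 expressed in m^3

# Work done by the water against constant pressure
work = P_atm * (V_vapour - V_liquid)

# First Law: change in internal energy = heat supplied - work done
delta_U = latent_heat - work
fraction = delta_U / latent_heat

print(f"W = {work:.1f} J, dU = {delta_U:.1f} J, fraction = {fraction:.3f}")
# prints: W = 169.2 J, dU = 2086.8 J, fraction = 0.925
```

About 92% of the latent heat goes into internal energy; only the remainder is work done in pushing back the atmosphere.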

12.6 SPECIFIC HEAT CAPACITY


Suppose an amount of heat $\Delta Q$ supplied to a substance changes its
temperature from $T$ to $T + \Delta T$. We define heat capacity of a substance
(see Chapter 11) to be

$S = \dfrac{\Delta Q}{\Delta T}$ (12.4)
We expect $\Delta Q$ and, therefore, heat capacity S to be proportional to the
mass of the substance. Further, it could also depend on the
temperature, i.e., a different amount of heat may be needed for a unit
rise in temperature at different temperatures. To define a constant
characteristic of the substance and independent of its amount, we
divide S by the mass of the substance m in kg:

$s = \dfrac{S}{m} = \dfrac{1}{m} \dfrac{\Delta Q}{\Delta T}$ (12.5)
s is known as the specific heat capacity of the substance. It depends
on the nature of the substance and its temperature. The unit of
specific heat capacity is J kg$^{-1}$ K$^{-1}$.
If the amount of substance is specified in terms of moles $\mu$ (instead of
mass m in kg), we can define heat capacity per mole of the substance
by

$C = \dfrac{S}{\mu} = \dfrac{1}{\mu} \dfrac{\Delta Q}{\Delta T}$ (12.6)
C is known as molar specific heat capacity of the substance. Like s, C
is independent of the amount of substance. C depends on the nature
of the substance, its temperature and the conditions under which heat
is supplied. The unit of C is J mol$^{-1}$ K$^{-1}$. As we shall see later (in
connection with specific heat capacity of gases), additional conditions
may be needed to define C or s. The idea in defining C is that simple
predictions can be made in regard to molar specific heat capacities.
Table 12.1 lists measured specific and molar heat capacities of solids
at atmospheric pressure and ordinary room temperature.

We will see in Chapter 13 that predictions of specific heats of gases
generally agree with experiment. We can use the same law of
equipartition of energy that we use there to predict molar specific heat
capacities of solids. Consider a solid of N atoms, each vibrating about
its mean position. An oscillator in one dimension has average energy
of $2 \times \tfrac{1}{2} k_B T = k_B T$. In three dimensions, the average energy is $3 k_B T$. For a mole of
a solid, the total energy is
$U = 3 k_B T \times N_A = 3RT$
Now, at constant pressure, $\Delta Q = \Delta U + P \Delta V \cong \Delta U$, since for a solid $\Delta V$
is negligible. Therefore,

$C = \dfrac{\Delta Q}{\Delta T} = \dfrac{\Delta U}{\Delta T} = 3R$ (12.7)

Table 12.1 Specific and molar heat capacities of some solids at room temperature and atmospheric pressure

As Table 12.1 shows, the experimentally measured values
generally agree with the predicted value 3R at ordinary temperatures.
(Carbon is an exception.) The agreement is known to break down at
low temperatures.
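The predicted value 3R is easy to evaluate. The short sketch below computes it from the Boltzmann constant and the Avogadro number (standard SI values, not quoted in the text) by taking the energy difference for a 1 K rise, mirroring the definition of C above.

```python
# Molar heat capacity of a solid predicted by equipartition: C = 3R
K_B = 1.380649e-23    # Boltzmann constant, J/K (SI defined value)
N_A = 6.02214076e23   # Avogadro number, 1/mol (SI defined value)
R = K_B * N_A         # universal gas constant, J mol^-1 K^-1

def molar_energy_solid(T):
    """Total vibrational energy of one mole of solid, U = 3 R T."""
    return 3.0 * R * T

# Heat capacity as energy change per unit temperature rise
delta_T = 1.0
C = (molar_energy_solid(300.0 + delta_T) - molar_energy_solid(300.0)) / delta_T

print(f"R = {R:.3f} J/(mol K), predicted C = 3R = {C:.2f} J/(mol K)")
```

The result, about 24.9 J mol$^{-1}$ K$^{-1}$, is the value against which measured molar heat capacities of solids are compared.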

Specific heat capacity of water


The old unit of heat was calorie. One calorie was earlier defined to be
the amount of heat required to raise the temperature of 1 g of water by
1 °C. With more precise measurements, it was found that the specific
heat of water varies slightly with temperature. Figure 12.5 shows this
variation in the temperature range 0 to 100 °C.

Fig. 12.5 Variation of specific heat capacity of water with temperature.

For a precise definition of calorie, it was, therefore, necessary to
specify the unit temperature interval. One calorie is defined to be the
amount of heat required to raise the temperature of 1 g of water from
14.5 °C to 15.5 °C. Since heat is just a form of energy, it is preferable
to use the unit joule, J. In SI units, the specific heat capacity of water
is 4186 J kg$^{-1}$ K$^{-1}$, i.e. 4.186 J g$^{-1}$ K$^{-1}$. The so-called mechanical
equivalent of heat, defined as the amount of work needed to produce 1
cal of heat, is in fact just a conversion factor between two different
units of energy: calorie to joule. Since in SI units, we use the unit
joule for heat, work or any other form of energy, the term mechanical
equivalent is now superfluous and need not be used.
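As a small illustration of these units, the sketch below computes the heat needed to warm a sample of water and expresses it in both joules and calories. The mass and temperature rise are hypothetical, and the specific heat is treated as constant, which ignores the slight variation shown in Fig. 12.5.

```python
# Heat needed to warm water, in joules and in calories
SPECIFIC_HEAT_WATER = 4186.0   # J kg^-1 K^-1 (value quoted in the text)
JOULES_PER_CALORIE = 4.186     # follows from 4.186 J g^-1 K^-1

def heat_required(mass_kg, delta_T, s=SPECIFIC_HEAT_WATER):
    """Heat Q = m s (delta T), assuming s is constant over the range."""
    return mass_kg * s * delta_T

# Hypothetical example: 250 g of water warmed by 60 K
Q_joule = heat_required(0.250, 60.0)
Q_cal = Q_joule / JOULES_PER_CALORIE

print(f"Q = {Q_joule:.0f} J = {Q_cal:.0f} cal")
```

Since 1 cal raises 1 g of water by 1 K (to this approximation), 250 g warmed by 60 K needs 250 × 60 = 15000 cal, which the conversion reproduces.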

As already remarked, the specific heat capacity depends on the
process or the conditions under which heat transfer takes
place. For gases, for example, we can define two specific heats:
specific heat capacity at constant volume and
specific heat capacity at constant pressure. For an
ideal gas, we have a simple relation.

$C_p - C_v = R$ (12.8)
where $C_p$ and $C_v$ are molar specific heat capacities of an ideal gas at
constant pressure and volume respectively and R is the universal gas
constant. To prove the relation, we begin with Eq. (12.3) for 1 mole of
the gas:
$\Delta Q = \Delta U + P \Delta V$
If $\Delta Q$ is absorbed at constant volume, $\Delta V = 0$

$C_v = \left(\dfrac{\Delta Q}{\Delta T}\right)_v = \left(\dfrac{\Delta U}{\Delta T}\right)_v = \left(\dfrac{\Delta U}{\Delta T}\right)$ (12.9)

where the subscript v is dropped in the last step, since U of an ideal
gas depends only on temperature. (The subscript denotes the quantity
kept fixed.) If, on the other hand, $\Delta Q$ is absorbed at constant pressure,

$C_p = \left(\dfrac{\Delta Q}{\Delta T}\right)_p = \left(\dfrac{\Delta U}{\Delta T}\right)_p + P \left(\dfrac{\Delta V}{\Delta T}\right)_p$ (12.10)
The subscript p can be dropped from the first term since U of an ideal
gas depends only on T. Now, for a mole of an ideal gas
$PV = RT$
which gives

$P \left(\dfrac{\Delta V}{\Delta T}\right)_p = R$ (12.11)
Equations (12.9) to (12.11) give the desired relation, Eq. (12.8).
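The steps of this derivation can be followed numerically. The sketch below is an illustration only: it assumes one mole of a monatomic ideal gas, for which U = (3/2)RT (this particular form of U is an assumption, not part of the derivation), and evaluates the heat needed for a 1 K rise at constant volume and at constant pressure.

```python
# Numerical check of C_p - C_v = R for one mole of an ideal gas.
# Assumption (not from the text): monatomic gas, U = (3/2) R T.
R = 8.314  # universal gas constant, J mol^-1 K^-1

def internal_energy(T):
    """U depends only on temperature for an ideal gas."""
    return 1.5 * R * T

def volume(P, T):
    """Equation of state for one mole: P V = R T."""
    return R * T / P

P, T, delta_T = 1.013e5, 300.0, 1.0

delta_U = internal_energy(T + delta_T) - internal_energy(T)

# Constant volume: no work, all heat goes into internal energy
C_v = delta_U / delta_T

# Constant pressure: heat = change in U plus work P * (change in V)
delta_V = volume(P, T + delta_T) - volume(P, T)
C_p = (delta_U + P * delta_V) / delta_T

print(f"C_v = {C_v:.3f}, C_p = {C_p:.3f}, C_p - C_v = {C_p - C_v:.3f}")
```

The difference equals R whatever form U(T) takes, because the extra heat at constant pressure is exactly the work $P \Delta V = R \Delta T$.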

12.7 THERMODYNAMIC STATE VARIABLES AND EQUATION OF STATE
Every equilibrium state of a thermodynamic system is
completely described by specific values of some macroscopic
variables, also called state variables. For example, an equilibrium
state of a gas is completely specified by the values of pressure,
volume, temperature, and mass (and composition if there is a mixture
of gases). A thermodynamic system is not always in equilibrium. For
example, a gas allowed to expand freely against vacuum is not an
equilibrium state [Fig. 12.6(a)]. During the rapid expansion, pressure
of the gas may not be uniform throughout. Similarly, a mixture of
gases undergoing an explosive chemical reaction (e.g. a mixture of
petrol vapour and air when ignited by a spark) is not an equilibrium
state; again its temperature and pressure are not uniform [Fig.
12.6(b)]. Eventually, the gas attains a uniform temperature and
pressure and comes to thermal and mechanical equilibrium with its
surroundings.
Fig. 12.6 (a) The partition in the box is suddenly removed leading to free
expansion of the gas. (b) A mixture of gases undergoing an explosive chemical
reaction. In both situations, the gas is not in equilibrium and cannot be described
by state variables.

In short, thermodynamic state variables describe equilibrium states of
systems. The various state variables are not necessarily independent.
The connection between the state variables is called the equation of
state. For example, for an ideal gas, the equation of state is the ideal
gas relation
$PV = \mu RT$
For a fixed amount of the gas, i.e. given $\mu$, there are thus only two
independent variables, say P and V or T and V. The pressure-volume
curve for a fixed temperature is called an isotherm. Real gases
may have more complicated equations of state.
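A minimal sketch of how the equation of state ties the variables together is given below. It tabulates a few points on an isotherm of an ideal gas; the amount of gas, the temperature and the volumes are hypothetical values chosen for illustration.

```python
# Points on an isotherm of an ideal gas, using P V = mu R T
R = 8.314  # universal gas constant, J mol^-1 K^-1

def pressure(mu, T, V):
    """Pressure (Pa) of mu moles of ideal gas at temperature T (K), volume V (m^3)."""
    return mu * R * T / V

mu, T = 1.0, 300.0                 # hypothetical amount and temperature
volumes = [0.01, 0.02, 0.04]       # m^3

# Once mu and T are fixed, choosing V determines P
isotherm = [(V, pressure(mu, T, V)) for V in volumes]
products = [P * V for V, P in isotherm]

for V, P in isotherm:
    print(f"V = {V:.2f} m^3, P = {P:.0f} Pa, PV = {P * V:.1f} J")
```

Along the isotherm the product PV stays fixed at $\mu RT$, so doubling the volume halves the pressure.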
The thermodynamic state variables are of two kinds: extensive and
intensive. Extensive variables indicate the size of the system.
Intensive variables such as pressure and temperature do not. To
decide which variable is extensive and which intensive, think of a
relevant system in equilibrium, and imagine that it is divided into two
equal parts. The variables that remain unchanged for each part are
intensive. The variables whose values get halved in each part are
extensive. It is easily seen, for example, that internal energy U,
volume V, total mass M are extensive variables. Pressure P,
temperature T, and density $\rho$ are intensive variables. It is a good
practice to check the consistency of thermodynamic equations using
this classification of variables. For example, in the equation
$\Delta Q = \Delta U + P \Delta V$
quantities on both sides are extensive*. (The product of an intensive
variable like P and an extensive quantity $\Delta V$ is extensive.)
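The halving test described above can be written out as a short program. The sketch is illustrative: it models the system as a monatomic ideal gas with a helium-like molar mass (both assumptions made here, not in the text), divides it into two equal parts, and checks which variables are unchanged and which are halved.

```python
from dataclasses import dataclass

R = 8.314  # universal gas constant, J mol^-1 K^-1

@dataclass
class GasState:
    """Equilibrium state of an ideal gas (hypothetical helper for illustration)."""
    mu: float                  # amount of gas, mol
    T: float                   # temperature, K
    V: float                   # volume, m^3
    molar_mass: float = 0.004  # kg/mol, helium-like value assumed

    @property
    def P(self):
        return self.mu * R * self.T / self.V   # equation of state

    @property
    def U(self):
        return 1.5 * self.mu * R * self.T      # monatomic gas assumed

    @property
    def M(self):
        return self.mu * self.molar_mass

    @property
    def density(self):
        return self.M / self.V

def half(state):
    """One of the two equal parts: half the gas in half the volume, same T."""
    return GasState(state.mu / 2, state.T, state.V / 2, state.molar_mass)

def classify(state, name):
    whole = getattr(state, name)
    part = getattr(half(state), name)
    if abs(part - whole) <= 1e-9 * abs(whole):
        return "intensive"
    if abs(part - whole / 2) <= 1e-9 * abs(whole):
        return "extensive"
    return "neither"

gas = GasState(mu=2.0, T=300.0, V=0.05)
kinds = {name: classify(gas, name) for name in ("U", "V", "M", "P", "T", "density")}
print(kinds)
```

For this model, U, V and M come out extensive while P, T and density come out intensive, in agreement with the classification in the text.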

12.8 THERMODYNAMIC PROCESSES


12.8.1 Quasi-static process
Consider a gas in thermal and mechanical equilibrium with its
surroundings. The pressure of the gas in that case equals the external
pressure and its temperature is the same as that of its surroundings.
Suppose that the external pressure is suddenly reduced (say by lifting
the weight on the movable piston in the container). The piston will
accelerate outward. During the process, the gas passes through
states that are not equilibrium states. The non-equilibrium states do
not have well-defined pressure and temperature. In the same way, if a
finite temperature difference exists between the gas and its
surroundings, there will be a rapid exchange of heat during which the
gas will pass through non-equilibrium states. In due course, the gas
will settle to an equilibrium state with well-defined temperature and
pressure equal to those of the surroundings. The free expansion of a
gas in vacuum and a mixture of gases undergoing an explosive
chemical reaction, mentioned in section 12.7 are also examples where
the system goes through non-equilibrium states.
* As emphasised earlier, Q is not a state variable. However, $\Delta Q$ is
clearly proportional to the total mass of system and hence is
extensive.

Non-equilibrium states of a system are difficult to deal with. It is,
therefore, convenient to imagine an idealised process in which at
every stage the system is in an equilibrium state. Such a process is, in
principle, infinitely slow, hence the name quasi-static (meaning nearly
static). The system changes its variables ($P$, $T$, $V$) so slowly that it
remains in thermal and mechanical equilibrium with its surroundings
throughout. In a quasi-static process, at every stage, the difference in
the pressure of the system and the external pressure is infinitesimally
small. The same is true of the temperature difference between the
system and its surroundings. To take a gas from the state ($P$, $T$) to
another state ($P'$, $T'$) via a quasi-static process, we change the
external pressure by a very small amount, allow the system to
equalise its pressure with that of the surroundings and continue the
process infinitely slowly until the system achieves the pressure $P'$.
Similarly, to change the temperature, we introduce an infinitesimal
temperature difference between the system and the surrounding
reservoirs and by choosing reservoirs of progressively different
temperatures $T$ to $T'$, the system achieves the temperature $T'$.
Fig. 12.7 In a quasi-static process, the temperature of the surrounding reservoir
and the external pressure differ only infinitesimally from the temperature and
pressure of the system.

A quasi-static process is obviously a hypothetical construct. In
practice, processes that are sufficiently slow and do not involve
accelerated motion of the piston, large temperature gradient, etc. are
a reasonable approximation to an ideal quasi-static process. We shall
from now on deal with quasi-static processes only, except when stated
otherwise.
A process in which the temperature of the system is kept fixed
throughout is called an isothermal process. The expansion of a
gas in a metallic cylinder placed in a large reservoir of fixed
temperature is an example of an isothermal process. (Heat transferred
from the reservoir to the system does not materially affect the
temperature of the reservoir, because of its very large heat capacity.)
In isobaric processes the pressure is constant while in
isochoric processes the volume is constant. Finally, if the
system is insulated from the surroundings and no heat flows between
the system and the surroundings, the process is adiabatic. The
definitions of these special processes are summarised in Table 12.2.
Table 12.2 Some special thermodynamic processes

We now consider these processes in some detail:

Isothermal process
For an isothermal process (T fixed), the ideal gas equation gives
$PV$ = constant
i.e., pressure of a given mass of gas varies inversely as its volume.
This is nothing but Boyle's Law.
Suppose an ideal gas goes isothermally (at temperature T) from its
initial state ($P_1$, $V_1$) to the final state ($P_2$, $V_2$). At any intermediate
stage with pressure P and volume change from V to
$V + \Delta V$ ($\Delta V$ small)
$\Delta W = P \Delta V$
Taking ($\Delta V \to 0$) and summing the quantity $\Delta W$ over the entire
process,
$W = \int_{V_1}^{V_2} P \, dV = \mu RT \int_{V_1}^{V_2} \dfrac{dV}{V} = \mu RT \ln \dfrac{V_2}{V_1}$ (12.12)
where in the second step we have made use of the ideal gas equation
$PV = \mu RT$ and taken the constants out of the integral. For an ideal
gas, internal energy depends only on temperature. Thus, there is no
change in the internal energy of an ideal gas in an isothermal process.
The First Law of Thermodynamics then implies that heat supplied to
the gas equals the work done by the gas: Q = W. Note from Eq.
(12.12) that for $V_2 > V_1$, W > 0; and for $V_2 < V_1$, W < 0. That is, in an
isothermal expansion, the gas absorbs heat and does work while in an
isothermal compression, work is done on the gas by the environment
and heat is released.
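The summation that leads to Eq. (12.12) can be imitated directly. The sketch below adds up $P \Delta V$ over many small volume steps along an isotherm and compares the total with the closed-form result. The amount of gas, temperature and volumes are hypothetical, and the midpoint rule used for the sum is a choice made here for accuracy.

```python
import math

R = 8.314  # universal gas constant, J mol^-1 K^-1

def isothermal_work(mu, T, V1, V2):
    """Closed-form work done by the gas, W = mu R T ln(V2 / V1)."""
    return mu * R * T * math.log(V2 / V1)

def summed_work(mu, T, V1, V2, steps=100_000):
    """Sum of P * dV over small steps, with P evaluated at each step's midpoint."""
    dV = (V2 - V1) / steps
    total = 0.0
    for i in range(steps):
        V_mid = V1 + (i + 0.5) * dV
        total += (mu * R * T / V_mid) * dV
    return total

# Hypothetical example: 1 mol at 300 K doubling its volume
W_exact = isothermal_work(1.0, 300.0, 1.0e-2, 2.0e-2)
W_sum = summed_work(1.0, 300.0, 1.0e-2, 2.0e-2)

print(f"closed form: {W_exact:.2f} J, summed: {W_sum:.2f} J")
```

Swapping the two volumes in either function gives a negative result, which corresponds to work done on the gas during compression.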

Adiabatic process
In an adiabatic process, the system is insulated from the surroundings
and heat absorbed or released is zero. From Eq. (12.1), we see that
work done by the gas results in decrease in its internal energy (and
hence its temperature for an ideal gas). We quote without proof (the
result that you will learn in higher courses) that for an adiabatic
process of an ideal gas,
$P V^{\gamma}$ = const (12.13)
where $\gamma$ is the ratio of specific heats (ordinary or molar) at constant
pressure and at constant volume:

$\gamma = \dfrac{C_p}{C_v}$

Thus if an ideal gas undergoes a change in its state adiabatically from
($P_1$, $V_1$) to ($P_2$, $V_2$):
$P_1 V_1^{\gamma} = P_2 V_2^{\gamma}$ (12.14)

Figure 12.8 shows the P-V curves of an ideal gas for two adiabatic
processes connecting two isotherms.

Fig. 12.8 P-V curves for isothermal and adiabatic processes of an ideal gas.

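We can calculate, as before, the work done in an adiabatic change of
an ideal gas from the state ($P_1$, $V_1$, $T_1$) to the state ($P_2$, $V_2$, $T_2$).

$W = \int_{V_1}^{V_2} P \, dV = \text{constant} \times \int_{V_1}^{V_2} \dfrac{dV}{V^{\gamma}} = \text{constant} \times \left[ \dfrac{V^{-\gamma+1}}{1-\gamma} \right]_{V_1}^{V_2} = \dfrac{\text{constant}}{(1-\gamma)} \times \left[ \dfrac{1}{V_2^{\gamma-1}} - \dfrac{1}{V_1^{\gamma-1}} \right]$ (12.15)
From Eq. (12.14), the constant is $P_1 V_1^{\gamma}$ or $P_2 V_2^{\gamma}$

$W = \dfrac{1}{1-\gamma} \left[ \dfrac{P_2 V_2^{\gamma}}{V_2^{\gamma-1}} - \dfrac{P_1 V_1^{\gamma}}{V_1^{\gamma-1}} \right] = \dfrac{1}{1-\gamma} \left[ P_2 V_2 - P_1 V_1 \right] = \dfrac{\mu R (T_1 - T_2)}{\gamma - 1}$ (12.16)
As expected, if work is done by the gas in an adiabatic process (W >
0), from Eq. (12.16), $T_2 < T_1$. On the other hand, if work is done on the
gas (W < 0), we get $T_2 > T_1$ i.e., the temperature of the gas rises.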
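The adiabatic relations can be tried out on a concrete case. The sketch below is illustrative: it assumes a monatomic ideal gas with $\gamma = 5/3$ (a value not given in the text) and hypothetical initial conditions, finds the final pressure from the adiabatic condition, and evaluates the work in the two equivalent forms of Eq. (12.16).

```python
# Adiabatic expansion of an ideal gas.
# Assumption (not from the text): monatomic gas, gamma = 5/3.
R = 8.314            # universal gas constant, J mol^-1 K^-1
gamma = 5.0 / 3.0
mu = 1.0             # amount of gas, mol

def adiabatic_final_pressure(P1, V1, V2, gamma):
    """Final pressure from P1 V1^gamma = P2 V2^gamma."""
    return P1 * (V1 / V2) ** gamma

def adiabatic_work(P1, V1, P2, V2, gamma):
    """Work done by the gas, W = (P2 V2 - P1 V1) / (1 - gamma)."""
    return (P2 * V2 - P1 * V1) / (1.0 - gamma)

# Hypothetical initial state and final volume
P1, V1, V2 = 1.0e5, 1.0e-2, 2.0e-2
P2 = adiabatic_final_pressure(P1, V1, V2, gamma)

# Temperatures from the equation of state P V = mu R T
T1 = P1 * V1 / (mu * R)
T2 = P2 * V2 / (mu * R)

W = adiabatic_work(P1, V1, P2, V2, gamma)
W_from_T = mu * R * (T1 - T2) / (gamma - 1.0)

print(f"P2 = {P2:.0f} Pa, T1 = {T1:.1f} K, T2 = {T2:.1f} K")
print(f"W = {W:.1f} J, from temperatures: {W_from_T:.1f} J")
```

The gas does positive work in expanding and its temperature falls, since with no heat exchanged the work can only come from its internal energy.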

Isochoric process
In an isochoric process, V is constant. No work is done on or by the
gas. From Eq. (12.1), the heat absorbed by the gas goes entirely to
change its internal energy and its temperature. The change in
temperature for a given amount of heat is determined by the specific
heat of the gas at constant volume.

Isobaric process
In an isobaric process, P is fixed. Work done by the gas is
$W = P (V_2 - V_1) = \mu R (T_2 - T_1)$ (12.17)
Since temperature changes, so does internal energy. The heat
absorbed goes partly to increase internal energy and partly to do
work. The change in temperature for a given amount of heat is
determined by the specific heat of the gas at constant pressure.

Cyclic process
In a cyclic process, the system returns to its initial state. Since internal
energy is a state variable, $\Delta U = 0$ for a cyclic process. From Eq.
(12.1), the total heat absorbed equals the work done by the system.
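A simple cycle built from isobaric and isochoric steps shows this bookkeeping. The sketch below is an illustration under stated assumptions: the working substance is taken to be a monatomic ideal gas with U = (3/2)PV, and the four corner states of the rectangular cycle are hypothetical.

```python
# Rectangular cycle in the P-V plane made of isobaric and isochoric legs.
# Assumption (not from the text): monatomic ideal gas, U = (3/2) P V.

def internal_energy(P, V):
    return 1.5 * P * V

# Corner states (P in Pa, V in m^3), traversed clockwise
states = [
    (2.0e5, 1.0e-2),
    (2.0e5, 3.0e-2),
    (1.0e5, 3.0e-2),
    (1.0e5, 1.0e-2),
]

total_W = 0.0   # net work done by the gas
total_Q = 0.0   # net heat absorbed by the gas
total_dU = 0.0  # net change in internal energy

for i in range(len(states)):
    Pa, Va = states[i]
    Pb, Vb = states[(i + 1) % len(states)]
    # Each leg is isobaric (Pa == Pb) or isochoric (Va == Vb);
    # on isochoric legs Vb - Va = 0, so the work term vanishes.
    W = Pa * (Vb - Va)
    dU = internal_energy(Pb, Vb) - internal_energy(Pa, Va)
    total_W += W
    total_dU += dU
    total_Q += dU + W   # First Law applied leg by leg

print(f"net dU = {total_dU:.1f} J, net W = {total_W:.1f} J, net Q = {total_Q:.1f} J")
```

The net work equals the area enclosed by the rectangle in the P-V plane, and it is matched by the net heat absorbed because the internal energy returns to its starting value.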

12.9 HEAT ENGINES


A heat engine is a device by which a system is made to undergo a
cyclic process that results in conversion of heat to work.
(1) It consists of a working substance, the system. For
example, a mixture of fuel vapour and air in a gasoline or diesel
engine, or steam in a steam engine, is the working substance.

(2) The working substance goes through a cycle consisting of several
processes. In some of these processes, it absorbs a total amount of
heat $Q_1$ from an external reservoir at some high temperature $T_1$.
(3) In some other processes of the cycle, the working substance
releases a total amount of heat $Q_2$ to an external reservoir at some
lower temperature $T_2$.
(4) The work done (W ) by the system in a cycle is transferred to the
environment via some arrangement (e.g. the working substance may
be in a cylinder with a moving piston that transfers mechanical energy
to the wheels of a vehicle via a shaft).
The basic features of a heat engine are schematically represented in
Fig. 12.9.

Fig. 12.9 Schematic representation of a heat engine. The engine takes heat Q1
from a hot reservoir at temperature T1, releases heat Q2 to a cold reservoir at
temperature T2 and delivers work W to the surroundings.

The cycle is repeated again and again to get useful work for some
purpose. The discipline of thermodynamics has its roots in the study of
heat engines. A basic question relates to the efficiency of a heat
engine. The efficiency ($\eta$) of a heat engine is defined by
$\eta = \dfrac{W}{Q_1}$ (12.18)

where $Q_1$ is the heat input i.e., the heat absorbed by the system in one
complete cycle and W is the work done on the environment in a cycle.
In a cycle, a certain amount of heat ($Q_2$) may also be rejected to the
environment. Then, according to the First Law of Thermodynamics,
over one complete cycle,
$W = Q_1 - Q_2$ (12.19)

i.e.,

$\eta = 1 - \dfrac{Q_2}{Q_1}$ (12.20)
For $Q_2 = 0$, $\eta = 1$, i.e., the engine will have 100% efficiency in
converting heat into work. Note that the First Law of Thermodynamics
i.e., the energy conservation law does not rule out such an engine. But
experience shows that such an ideal engine with $\eta = 1$ is never
possible, even if we can eliminate various kinds of losses associated
with actual heat engines. It turns out that there is a fundamental limit
on the efficiency of a heat engine set by an independent principle of
nature, called the Second Law of Thermodynamics (section 12.11).
The mechanism of conversion of heat into work varies for different
heat engines. Basically, there are two ways : the system (say a gas or
a mixture of gases) is heated by an external furnace, as in a steam
engine; or it is heated internally by an exothermic chemical reaction as
in an internal combustion engine. The various steps involved in a
cycle also differ from one engine to another.
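The definition of efficiency can be put into a small function. The sketch below follows Eqs. (12.18) to (12.20); the function name and the heat values are hypothetical and serve only to illustrate the calculation.

```python
def engine_efficiency(Q1, Q2):
    """Efficiency of a heat engine absorbing Q1 and rejecting Q2 per cycle (J)."""
    if Q1 <= 0 or Q2 < 0 or Q2 > Q1:
        raise ValueError("need Q1 > 0 and 0 <= Q2 <= Q1")
    W = Q1 - Q2          # First Law over one complete cycle
    return W / Q1        # same as 1 - Q2 / Q1

# Hypothetical engine: absorbs 1000 J and rejects 600 J in each cycle
eta = engine_efficiency(1000.0, 600.0)
print(f"efficiency = {eta:.2f}")   # prints: efficiency = 0.40
```

Setting the rejected heat to zero would return an efficiency of 1, which energy conservation alone permits but which is never realised in practice.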

12.10 REFRIGERATORS AND HEAT PUMPS


A refrigerator is the reverse of a heat engine. Here the working
substance extracts heat Q2 from the cold reservoir at temperature T2,
some external work W is done on it and heat Q1 is released to the hot
reservoir at temperature T1 (Fig. 12.10).

Fig. 12.10 Schematic representation of a refrigerator or a heat pump, the reverse
of a heat engine.

A heat pump is the same as a refrigerator. What term we use depends
on the purpose of the device. If the purpose is to cool a portion of
space, like the inside of a chamber, and the higher temperature reservoir
is the surroundings, we call the device a refrigerator; if the idea is to pump
heat into a portion of space (the room in a building when the outside
environment is cold), the device is called a heat pump.

Pioneers of Thermodynamics
Lord Kelvin (William Thomson) (1824-1907), born in Belfast, Ireland,
is among the foremost British scientists of the nineteenth century.
Thomson played a key role in the development of the law of conservation of
energy suggested by the work of James Joule (1818-1889), Julius Mayer
(1814-1878) and Hermann Helmholtz (1821-1894). He collaborated with Joule
on the so-called Joule-Thomson effect: cooling of a gas when it expands through
a porous plug or valve to a lower pressure. He introduced the notion of the absolute zero of temperature and
proposed the absolute temperature scale, now called the Kelvin scale in his
honour. From the work of Sadi Carnot (1796-1832), Thomson arrived at a form of the Second Law of
Thermodynamics. Thomson was a versatile physicist, with notable
contributions to electromagnetic theory and hydrodynamics.

Rudolf Clausius (1822-1888), born in Poland, is generally regarded
as the discoverer of the Second Law of Thermodynamics. Based on
the work of Carnot and Thomson, Clausius arrived at the important
notion of entropy that led him to a fundamental version of the Second Law of
Thermodynamics that states that the entropy of an isolated system can never
decrease. Clausius also worked on the kinetic theory of gases and obtained
the first reliable estimates of molecular size, speed, mean free path, etc.

In a refrigerator the working substance (usually, in gaseous form)
goes through the following steps: (a) sudden expansion of the gas
from high to low pressure which cools it and converts it into a vapour-
liquid mixture, (b) absorption by the cold fluid of heat from the region
to be cooled converting it into vapour, (c) heating up of the vapour due
to external work done on the system, and (d) release of heat by the
vapour to the surroundings, bringing it to the initial state and
completing the cycle.
The coefficient of performance ($\alpha$) of a refrigerator is given by
$\alpha = \dfrac{Q_2}{W}$ (12.21)

where $Q_2$ is the heat extracted from the cold reservoir and W is the
work done on the system, the refrigerant. ($\alpha$ for heat pump is defined
as $Q_1/W$.) Note that while $\eta$ by definition can never exceed 1, $\alpha$ can be
greater than 1. By energy conservation, the heat released to the hot
reservoir is
$Q_1 = W + Q_2$

i.e., $\alpha = \dfrac{Q_2}{Q_1 - Q_2}$ (12.22)
In a heat engine, heat cannot be fully converted to work; likewise a
refrigerator cannot work without some external work done on the
system, i.e., the coefficient of performance in Eq. (12.21) cannot be
infinite.
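The two coefficients of performance can be compared on one set of numbers. The sketch below uses the definitions above; the function names and the heat and work values are hypothetical.

```python
def refrigerator_cop(Q2, W):
    """Coefficient of performance of a refrigerator: heat extracted / work done."""
    if W <= 0:
        raise ValueError("external work W must be positive")
    return Q2 / W

def heat_pump_cop(Q2, W):
    """Coefficient of performance of a heat pump: Q1 / W, with Q1 = W + Q2."""
    if W <= 0:
        raise ValueError("external work W must be positive")
    Q1 = W + Q2          # energy conservation
    return Q1 / W

# Hypothetical device: 100 J of work extracts 300 J from the cold reservoir
Q2, W = 300.0, 100.0
Q1 = W + Q2

alpha = refrigerator_cop(Q2, W)
alpha_from_heats = Q2 / (Q1 - Q2)   # Eq. (12.22)
alpha_pump = heat_pump_cop(Q2, W)

print(f"refrigerator: {alpha:.1f}, heat pump: {alpha_pump:.1f}")
# prints: refrigerator: 3.0, heat pump: 4.0
```

With these definitions the heat pump value always exceeds the refrigerator value by exactly 1, and both grow without bound as W approaches zero, which is why W = 0 would correspond to an infinite coefficient of performance.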

12.11 SECOND LAW OF THERMODYNAMICS


The First Law of Thermodynamics is the principle of conservation of
energy. Common experience shows that there are many conceivable
processes that are perfectly allowed by the First Law and yet are
never observed. For example, nobody has ever seen a book lying on
a table jumping to a height by itself. But such a thing would be
possible if the principle of conservation of energy were the only
restriction. The table could cool spontaneously, converting some of its
internal energy into an equal amount of mechanical energy of the
book, which would then hop to a height with potential energy equal to
the mechanical energy it acquired. But this never happens. Clearly,
some additional basic principle of nature forbids the above, even
though it satisfies the energy conservation principle. This principle,
which disallows many phenomena consistent with the First Law of
Thermodynamics is known as the Second Law of Thermodynamics.

The Second Law of Thermodynamics gives a fundamental limitation to
the efficiency of a heat engine and the co-efficient of performance of a
refrigerator. In simple terms, it says that efficiency of a heat engine
can never be unity. According to Eq. (12.20), this implies that heat
released to the cold reservoir can never be made zero. For a
refrigerator, the Second Law says that the co-efficient of performance
can never be infinite. According to Eq. (12.21), this implies that
external work (W) can never be zero. The following two statements,
one due to Kelvin and Planck denying the possibility of a perfect heat
engine, and another due to Clausius denying the possibility of a
perfect refrigerator or heat pump, are a concise summary of these
observations.
Second Law of Thermodynamics
KELVIN-PLANCK STATEMENT
No process is possible whose sole result is the absorption of heat
from a reservoir and the complete conversion of the heat into work.

CLAUSIUS STATEMENT
No process is possible whose sole result is the transfer of heat from a
colder object to a hotter object.
It can be proved that the two statements above are completely
equivalent.

12.12 REVERSIBLE AND IRREVERSIBLE PROCESSES
Imagine some process in which a thermodynamic system goes from
an initial state i to a final state f. During the process the system
absorbs heat Q from the surroundings and performs work W on it. Can
we reverse this process and bring both the system and surroundings
to their initial states with no other effect anywhere ? Experience
suggests that for most processes in nature this is not possible. The
spontaneous processes of nature are irreversible. Several examples
can be cited. The base of a vessel on an oven is hotter than its other
parts. When the vessel is removed, heat is transferred from the base
to the other parts, bringing the vessel to a uniform temperature (which
in due course cools to the temperature of the surroundings). The
process cannot be reversed; a part of the vessel will not get cooler
spontaneously and warm up the base. It would violate the Second Law of
Thermodynamics if it did. The free expansion of a gas is irreversible.
The combustion reaction of a mixture of petrol and air ignited by a
spark cannot be reversed. Cooking gas leaking from a gas cylinder in
the kitchen diffuses to the entire room. The diffusion process will not
spontaneously reverse and bring the gas back to the cylinder. The
stirring of a liquid in thermal contact with a reservoir will convert the
work done into heat, increasing the internal energy of the reservoir.
The process cannot be reversed exactly; otherwise it would amount to
conversion of heat entirely into work, violating the Second Law of
Thermodynamics. Irreversibility is the rule rather than an exception in nature.
Irreversibility arises mainly from two causes: one, many processes
(like a free expansion, or an explosive chemical reaction) take the
system to non-equilibrium states; two, most processes involve friction,
viscosity and other dissipative effects (e.g., a moving body coming to
a stop and losing its mechanical energy as heat to the floor and the
body; a rotating blade in a liquid coming to a stop due to viscosity and
losing its mechanical energy with corresponding gain in the internal
energy of the liquid). Since dissipative effects are present everywhere
and can be minimised but not fully eliminated, most processes that we
deal with are irreversible.
A thermodynamic process (state i → state f) is reversible if the
process can be turned back such that both the system and the
surroundings return to their original states, with no other change
anywhere else in the universe. From the preceding discussion, a
reversible process is an idealised notion. A process is reversible only
if it is quasi-static (system in equilibrium with the surroundings at every
stage) and there are no dissipative effects. For example, a quasi-static
isothermal expansion of an ideal gas in a cylinder fitted with a
frictionless movable piston is a reversible process.
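The idea that a quasi-static process is the limit of many small steps can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the gas expands isothermally in a finite number of equal volume steps, the external pressure in each step is set equal to the gas pressure at the end of that step, and the gas is assumed to return to temperature T after every step. The function names and the numerical values are hypothetical choices, not taken from the text.

```python
import math

R = 8.314  # universal gas constant, J mol^-1 K^-1


def stepwise_isothermal_work(mu, T, V1, V2, steps):
    """Work done by an ideal gas expanding from V1 to V2 in equal volume steps.

    Assumption: in each step the external pressure equals the gas pressure
    at the end of that step, and the gas re-equilibrates at temperature T.
    """
    dV = (V2 - V1) / steps
    work = 0.0
    for k in range(1, steps + 1):
        V_end = V1 + k * dV
        P_ext = mu * R * T / V_end  # constant external pressure during this step
        work += P_ext * dV
    return work


def reversible_isothermal_work(mu, T, V1, V2):
    """Quasi-static (reversible) limit: W = mu R T ln(V2/V1)."""
    return mu * R * T * math.log(V2 / V1)


exact = reversible_isothermal_work(1.0, 300.0, 1.0e-3, 2.0e-3)
for n in (1, 10, 100, 10000):
    w = stepwise_isothermal_work(1.0, 300.0, 1.0e-3, 2.0e-3, n)
    print(n, w, exact)
```

With more and smaller steps, the work obtained from the gas rises towards the reversible value and never exceeds it, which is consistent with the reversible process being the idealised limit.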
Why is reversibility such a basic concept in thermodynamics ? As we
have seen, one of the concerns of thermodynamics is the efficiency
with which heat can be converted into work. The Second Law of
Thermodynamics rules out the possibility of a perfect heat engine with
100% efficiency. But what is the highest efficiency possible for a heat
engine working between two reservoirs at temperatures T1 and T2 ? It
turns out that a heat engine based on idealised reversible processes
achieves the highest efficiency possible. All other engines involving
irreversibility in any way (as would be the case for practical engines)
have an efficiency lower than this limiting value.

12.13 CARNOT ENGINE


Suppose we have a hot reservoir at temperature T1 and a cold
reservoir at temperature T2. What is the maximum efficiency possible
for a heat engine operating between the two reservoirs and what cycle
of processes should be adopted to achieve the maximum efficiency ?
Sadi Carnot, a French engineer, first considered this question in 1824.
Interestingly, Carnot arrived at the correct answer, even though the
basic concepts of heat and thermodynamics had yet to be firmly
established.
We expect the ideal engine operating between two temperatures to be
a reversible engine. Irreversibility is associated with dissipative effects,
as remarked in the preceding section, and lowers efficiency. A
process is reversible if it is quasi-static and non-dissipative. We have
seen that a process is not quasi-static if it involves finite temperature
difference between the system and the reservoir. This implies that in a
reversible heat engine operating between two temperatures, heat
should be absorbed (from the hot reservoir) isothermally and released
(to the cold reservoir) isothermally. We thus have identified two steps
of the reversible heat engine : isothermal process at temperature T1
absorbing heat Q1 from the hot reservoir, and another isothermal
process at temperature T2 releasing heat Q2 to the cold reservoir. To
complete a cycle, we need to take the system from temperature T1 to
T2 and then back from temperature T2 to T1. Which processes should
we employ for this purpose that are reversible? A little reflection
shows that we can only adopt reversible adiabatic processes for these
purposes, which involve no heat flow from any reservoir. If we employ
any other process that is not adiabatic, say an isochoric process, to
take the system from one temperature to another, we shall need a
series of reservoirs in the temperature range T2 to T1 to ensure that at
each stage the process is quasi-static. (Remember again that for a
process to be quasi-static and reversible, there should be no finite
temperature difference between the system and the reservoir.) But we
are considering a reversible engine that operates between only two
temperatures. Thus adiabatic processes must bring about the
temperature change in the system from T1 to T2 and T2 to T1 in this
engine.

Fig. 12.11 Carnot cycle for a heat engine with an ideal gas as the working
substance.

A reversible heat engine operating between two temperatures is called
a Carnot engine. We have just argued that such an engine must have
the following sequence of steps constituting one cycle, called the
Carnot cycle, shown in Fig. 12.11. We have taken the working
substance of the Carnot engine to be an ideal gas.
(a) Step 1 → 2: Isothermal expansion of the gas taking its state from
(P1, V1, T1) to (P2, V2, T1).
The heat absorbed by the gas (Q1) from the reservoir at temperature
T1 is given by Eq. (12.12). This is also the work done ($W_{1 \to 2}$) by the
gas on the environment.
$W_{1 \to 2} = Q_1 = \mu R T_1 \ln (V_2/V_1)$ (12.23)

(b) Step 2 → 3: Adiabatic expansion of the gas from (P2, V2, T1) to (P3,
V3, T2). Work done by the gas, using Eq. (12.16), is

$W_{2 \to 3} = \frac{\mu R (T_1 - T_2)}{\gamma - 1}$ (12.24)
(c) Step 3 → 4: Isothermal compression of the gas from (P3, V3, T2) to
(P4, V4, T2).

Heat released (Q2) by the gas to the reservoir at temperature T2 is
given by Eq. (12.12). This is also the work done ($W_{3 \to 4}$) on the gas
by the environment.

$W_{3 \to 4} = Q_2 = \mu R T_2 \ln (V_3/V_4)$ (12.25)
(d) Step 4 → 1: Adiabatic compression of the gas from (P4, V4, T2) to
(P1, V1, T1).
Work done on the gas, using Eq. (12.16), is

$W_{4 \to 1} = \frac{\mu R (T_1 - T_2)}{\gamma - 1}$ (12.26)

From Eqs. (12.23) to (12.26), the total work done by the gas in one
complete cycle is
$W = W_{1 \to 2} + W_{2 \to 3} - W_{3 \to 4} - W_{4 \to 1}$

$= \mu R T_1 \ln (V_2/V_1) - \mu R T_2 \ln (V_3/V_4)$ (12.27)


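The efficiency $\eta$ of the Carnot engine is
$\eta = \frac{W}{Q_1} = 1 - \frac{Q_2}{Q_1} = 1 - \frac{T_2}{T_1} \, \frac{\ln (V_3/V_4)}{\ln (V_2/V_1)}$ (12.28)
Now since step 2 → 3 is an adiabatic process,
$T_1 V_2^{\gamma - 1} = T_2 V_3^{\gamma - 1}$
i.e. $\frac{V_2}{V_3} = \left( \frac{T_2}{T_1} \right)^{1/(\gamma - 1)}$ (12.29)
Similarly, since step 4 → 1 is an adiabatic process,
$T_2 V_4^{\gamma - 1} = T_1 V_1^{\gamma - 1}$
i.e. $\frac{V_1}{V_4} = \left( \frac{T_2}{T_1} \right)^{1/(\gamma - 1)}$ (12.30)
From Eqs. (12.29) and (12.30),
$\frac{V_3}{V_4} = \frac{V_2}{V_1}$ (12.31)
Using Eq. (12.31) in Eq. (12.28), we get
$\eta = 1 - \frac{T_2}{T_1}$ (Carnot engine) (12.32)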
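The four steps of the cycle can be checked numerically. The following is a minimal sketch, assuming an ideal monatomic gas and hypothetical values for the temperatures and volumes. It evaluates the work in each step from Eqs. (12.23) to (12.26) and compares the resulting efficiency with $1 - T_2/T_1$. The function name `carnot_cycle` and all numbers are illustrative choices, not part of the text.

```python
import math

R = 8.314  # universal gas constant, J mol^-1 K^-1


def carnot_cycle(mu, gamma, T1, T2, V1, V2):
    """Work and heat for one Carnot cycle of an ideal gas.

    mu: number of moles; gamma: Cp/Cv; T1 > T2: reservoir temperatures (K);
    V1 -> V2: volumes at the start and end of the isothermal expansion.
    """
    # The adiabats fix V3 and V4 through T V^(gamma - 1) = constant.
    ratio = (T1 / T2) ** (1.0 / (gamma - 1.0))
    V3 = V2 * ratio
    V4 = V1 * ratio

    W12 = mu * R * T1 * math.log(V2 / V1)     # done by the gas, Eq. (12.23)
    W23 = mu * R * (T1 - T2) / (gamma - 1.0)  # done by the gas, Eq. (12.24)
    W34 = mu * R * T2 * math.log(V3 / V4)     # done on the gas, Eq. (12.25)
    W41 = mu * R * (T1 - T2) / (gamma - 1.0)  # done on the gas, Eq. (12.26)

    W = W12 + W23 - W34 - W41                 # net work, Eq. (12.27)
    Q1 = W12                                  # heat absorbed at T1
    Q2 = W34                                  # heat released at T2
    return {"W": W, "Q1": Q1, "Q2": Q2, "efficiency": W / Q1}


result = carnot_cycle(mu=1.0, gamma=5.0 / 3.0, T1=500.0, T2=300.0,
                      V1=1.0e-3, V2=2.0e-3)
print(result["efficiency"])   # efficiency from the step-by-step work
print(1.0 - 300.0 / 500.0)    # efficiency from Eq. (12.32)
```

The two adiabatic work terms cancel, so the efficiency depends only on the two isothermal steps, and both printed values come out close to 0.4 for these temperatures.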


We have already seen that a Carnot engine is a reversible engine.
Indeed it is the only reversible engine possible that works between two
reservoirs at different temperatures. Each step of the Carnot cycle
given in Fig. 12.11 can be reversed. This will amount to taking heat Q2
from the cold reservoir at T2, doing work W on the system, and
transferring heat Q1 to the hot reservoir. This will be a reversible
refrigerator.
We next establish the important result (sometimes called Carnot's
theorem) that (a) working between two given temperatures T1 and T2
of the hot and cold reservoirs respectively, no engine can have
efficiency more than that of the Carnot engine and (b) the efficiency of
the Carnot engine is independent of the nature of the working
substance.
To prove the result (a), imagine a reversible (Carnot) engine R and an
irreversible engine I working between the same source (hot reservoir)
and sink (cold reservoir). Let us couple the engines, I and R, in such a
way so that I acts like a heat engine and R acts as a refrigerator. Let I
absorb heat Q1 from the source, deliver work $W'$ and release the heat
$Q_1 - W'$ to the sink. We arrange so that R returns the same heat Q1 to
the source, taking heat Q2 from the sink and requiring work $W = Q_1 - Q_2$
to be done on it. Now suppose $\eta_R < \eta_I$, i.e. if R were to act as an
engine it would give less work output than that of I, i.e. $W < W'$ for a
given Q1. With R acting like a refrigerator, this would mean
$Q_2 = Q_1 - W > Q_1 - W'$. Thus on the whole, the coupled I-R system extracts
heat $(Q_1 - W) - (Q_1 - W') = (W' - W)$ from the cold reservoir and
delivers the same amount of work in one cycle, without any change in
the source or anywhere else. This is clearly against the Kelvin-Planck
statement of the Second Law of Thermodynamics. Hence the
assertion $\eta_I > \eta_R$ is wrong. No engine can have efficiency greater than
that of the Carnot engine. A similar argument can be constructed to
show that a reversible engine with one particular substance cannot be
more efficient than the one using another substance. The maximum
efficiency of a Carnot engine given by Eq. (12.32) is independent of
the nature of the system performing the Carnot cycle of operations.
Thus we are justified in using an ideal gas as a system in the
calculation of efficiency of a Carnot engine. The ideal gas has a
simple equation of state, which allows us to readily calculate $\eta$, but the
final result for $\eta$ [Eq. (12.32)] is true for any Carnot engine.

Fig. 12.12 An irreversible engine (I) coupled to a reversible refrigerator (R). If
$W' > W$, this would amount to extraction of heat $W' - W$ from the sink and its full
conversion to work, in contradiction with the Second Law of Thermodynamics.
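The energy bookkeeping of the coupled arrangement can be made concrete with a short sketch. It assumes that engine I delivers work $W'$ and refrigerator R requires work $W$ per cycle, both exchanging the same heat Q1 with the source. The numerical values are hypothetical and chosen only to show what would happen if $W' > W$ were possible.

```python
def coupled_engine_refrigerator(Q1, W_engine, W_fridge):
    """Per-cycle energy bookkeeping for engine I driving refrigerator R.

    Q1: heat that I takes from the source and that R returns to it.
    W_engine: work W' delivered by I; W_fridge: work W required by R.
    """
    heat_to_sink = Q1 - W_engine      # heat rejected to the sink by I
    heat_from_sink = Q1 - W_fridge    # heat extracted from the sink by R
    net_heat_from_sink = heat_from_sink - heat_to_sink  # equals W' - W
    net_work_output = W_engine - W_fridge
    net_heat_from_source = Q1 - Q1    # the source is left unchanged
    return net_heat_from_sink, net_work_output, net_heat_from_source


# Hypothetical case in which the irreversible engine out-performs R.
print(coupled_engine_refrigerator(Q1=1000.0, W_engine=450.0, W_fridge=400.0))
```

In this hypothetical case the combined device would draw heat from the sink alone and convert all of it into work, which is exactly what the Kelvin-Planck statement forbids.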

This final remark shows that in a Carnot cycle,

$\frac{Q_1}{Q_2} = \frac{T_1}{T_2}$ (12.33)

is a universal relation independent of the nature of the system. Here
Q1 and Q2 are, respectively, the heat absorbed and released
isothermally (from the hot and to the cold reservoirs) in a Carnot
engine. Equation (12.33) can, therefore, be used as a relation to
define a truly universal thermodynamic temperature scale that is
independent of any particular properties of the system used in the
Carnot cycle. Of course, for an ideal gas as a working substance, this
universal temperature is the same as the ideal gas temperature
introduced in section 12.11.
SUMMARY

1. The zeroth law of thermodynamics states that two systems in thermal
equilibrium with a third system are in thermal equilibrium with each other. The
Zeroth Law leads to the concept of temperature.

2. Internal energy of a system is the sum of kinetic energies and potential
energies of the molecular constituents of the system. It does not include the
over-all kinetic energy of the system. Heat and work are two modes of energy
transfer to the system. Heat is the energy transfer arising due to temperature
difference between the system and the surroundings. Work is energy transfer
brought about by other means, such as moving the piston of a cylinder
containing the gas, by raising or lowering some weight connected to it.

3. The first law of thermodynamics is the general law of conservation of energy
applied to any system in which energy transfer from or to the surroundings
(through heat and work) is taken into account. It states that

$\Delta Q = \Delta U + \Delta W$

where $\Delta Q$ is the heat supplied to the system, $\Delta W$ is the work done by the
system and $\Delta U$ is the change in internal energy of the system.

4. The specific heat capacity of a substance is defined by

$s = \frac{1}{m} \frac{\Delta Q}{\Delta T}$

where m is the mass of the substance and $\Delta Q$ is the heat required to change
its temperature by $\Delta T$. The molar specific heat capacity of a substance is
defined by

$C = \frac{1}{\mu} \frac{\Delta Q}{\Delta T}$

where $\mu$ is the number of moles of the substance. For a solid, the law of
equipartition of energy gives

C = 3R

which generally agrees with experiment at ordinary temperatures.

Calorie is the old unit of heat. 1 calorie is the amount of heat required to raise
the temperature of 1 g of water from 14.5 °C to 15.5 °C. 1 cal = 4.186 J.

5. For an ideal gas, the molar specific heat capacities at constant pressure and
volume satisfy the relation

$C_p - C_v = R$

where R is the universal gas constant.

6. Equilibrium states of a thermodynamic system are described by state
variables. The value of a state variable depends only on the particular state,
not on the path used to arrive at that state. Examples of state variables are
pressure (P), volume (V), temperature (T), and mass (m). Heat and work are
not state variables. An Equation of State (like the ideal gas equation
$PV = \mu RT$) is a relation connecting different state variables.

7. A quasi-static process is an infinitely slow process such that the system
remains in thermal and mechanical equilibrium with the surroundings
throughout. In a quasi-static process, the pressure and temperature of the
environment can differ from those of the system only infinitesimally.

8. In an isothermal expansion of an ideal gas from volume V1 to V2 at
temperature T the heat absorbed (Q) equals the work done (W) by the gas,
each given by

$Q = W = \mu R T \ln (V_2/V_1)$

9. In an adiabatic process of an ideal gas

$PV^{\gamma}$ = constant

where $\gamma = C_p/C_v$

Work done by an ideal gas in an adiabatic change of state from (P1, V1, T1) to
(P2, V2, T2) is

$W = \frac{\mu R (T_1 - T_2)}{\gamma - 1}$

10. Heat engine is a device in which a system undergoes a cyclic process
resulting in conversion of heat into work. If Q1 is the heat absorbed from the
source, Q2 is the heat released to the sink, and the work output in one cycle is
W, the efficiency $\eta$ of the engine is:

$\eta = \frac{W}{Q_1} = 1 - \frac{Q_2}{Q_1}$

11. In a refrigerator or a heat pump, the system extracts heat Q2 from the cold
reservoir and releases Q1 amount of heat to the hot reservoir, with work W
done on the system. The co-efficient of performance of a refrigerator is given
by

$\alpha = \frac{Q_2}{W} = \frac{Q_2}{Q_1 - Q_2}$

12. The second law of thermodynamics disallows some processes consistent
with the First Law of Thermodynamics. It states

Kelvin-Planck statement

No process is possible whose sole result is the absorption of heat from a
reservoir and complete conversion of the heat into work.

Clausius statement

No process is possible whose sole result is the transfer of heat from a colder
object to a hotter object.

Put simply, the Second Law implies that no heat engine can have efficiency
equal to 1 or no refrigerator can have co-efficient of performance equal to
infinity.

13. A process is reversible if it can be reversed such that both the system and
the surroundings return to their original states, with no other change anywhere
else in the universe. Spontaneous processes of nature are irreversible. The
idealised reversible process is a quasi-static process with no dissipative
factors such as friction, viscosity, etc.

14. Carnot engine is a reversible engine operating between two temperatures
T1 (source) and T2 (sink). The Carnot cycle consists of two isothermal
processes connected by two adiabatic processes. The efficiency of a Carnot
engine is given by

$\eta = 1 - \frac{T_2}{T_1}$ (Carnot engine)

No engine operating between two temperatures can have efficiency greater
than that of the Carnot engine.

15. If Q > 0, heat is added to the system.

If Q < 0, heat is removed from the system.

If W > 0, work is done by the system.

If W < 0, work is done on the system.


POINTS TO PONDER

1. Temperature of a body is related to its average internal energy, not to the
kinetic energy of motion of its centre of mass. A bullet fired from a gun is not
at a higher temperature because of its high speed.

2. Equilibrium in thermodynamics refers to the situation when macroscopic
variables describing the thermodynamic state of a system do not depend on
time. Equilibrium of a system in mechanics means the net external force and
torque on the system are zero.

3. In a state of thermodynamic equilibrium, the microscopic constituents of a
system are not in equilibrium (in the sense of mechanics).

4. Heat capacity, in general, depends on the process the system goes through
when heat is supplied.

5. In isothermal quasi-static processes, heat is absorbed or given out by the
system even though at every stage the gas has the same temperature as that
of the surrounding reservoir. This is possible because of the infinitesimal
difference in temperature between the system and the reservoir.
EXERCISES

12.1 A geyser heats water flowing at the rate of 3.0 litres per
minute from 27 °C to 77 °C. If the geyser operates on a gas
burner, what is the rate of consumption of the fuel if its heat of
combustion is $4.0 \times 10^4$ J/g?
12.2 What amount of heat must be supplied to $2.0 \times 10^{-2}$ kg of
nitrogen (at room temperature) to raise its temperature by 45 °C at
constant pressure? (Molecular mass of N2 = 28; R = 8.3 J mol$^{-1}$
K$^{-1}$.)
12.3 Explain why
(a) Two bodies at different temperatures T1 and T2 if brought in
thermal contact do not necessarily settle to the mean temperature
(T1 + T2 )/2.
(b) The coolant in a chemical or a nuclear plant (i.e., the liquid
used to prevent the different parts of a plant from getting too hot)
should have high specific heat.
(c) Air pressure in a car tyre increases during driving.

(d) The climate of a harbour town is more temperate than that of a
town in a desert at the same latitude.
12.4 A cylinder with a movable piston contains 3 moles of
hydrogen at standard temperature and pressure. The walls of the
cylinder are made of a heat insulator, and the piston is insulated
by having a pile of sand on it. By what factor does the pressure of
the gas increase if the gas is compressed to half its original
volume ?
12.5 In changing the state of a gas adiabatically from an
equilibrium state A to another equilibrium state B, an amount of
work equal to 22.3 J is done on the system. If the gas is taken
from state A to B via a process in which the net heat absorbed by
the system is 9.35 cal, how much is the net work done by the
system in the latter case ? (Take 1 cal = 4.19 J)
12.6 Two cylinders A and B of equal capacity are connected to
each other via a stopcock. A contains a gas at standard
temperature and pressure. B is completely evacuated. The entire
system is thermally insulated. The stopcock is suddenly opened.
Answer the following :
(a) What is the final pressure of the gas in A and B ?
(b) What is the change in internal energy of the gas ?
(c) What is the change in the temperature of the gas ?
(d) Do the intermediate states of the system (before settling to the
final equilibrium state) lie on its P-V-T surface?

12.7 A steam engine delivers $5.4 \times 10^8$ J of work per minute and
receives $3.6 \times 10^9$ J of heat per minute from its boiler. What is the
efficiency of the engine? How much heat is wasted per minute?
12.8 An electric heater supplies heat to a system at a rate of
100 W. If the system performs work at a rate of 75 joules per second,
at what rate is the internal energy increasing?
12.9 A thermodynamic system is taken from an original state D to an
intermediate state E by the linear process shown in Fig. (12.13).
Fig. 12.13

Its volume is then reduced to the original value from E to F by an
isobaric process. Calculate the total work done by the gas from D
to E to F.
12.10 A refrigerator is to maintain eatables kept inside at 9 °C. If
room temperature is 36 °C, calculate the coefficient of
performance.
Chapter Thirteen

Kinetic Theory

13.1 Introduction

13.2 Molecular nature of matter

13.3 Behaviour of gases

13.4 Kinetic theory of an ideal gas

13.5 Law of equipartition of energy

13.6 Specific heat capacity

13.7 Mean free path

Summary

Points to ponder

Exercises

ADDITIONAL EXERCISES

13.1 INTRODUCTION
Boyle discovered the law named after him in 1661. Boyle, Newton and
several others tried to explain the behaviour of gases by considering
that gases are made up of tiny atomic particles. The actual atomic
theory got established more than 150 years later. Kinetic theory
explains the behaviour of gases based on the idea that the gas
consists of rapidly moving atoms or molecules. This is possible as the
inter-atomic forces, which are short range forces that are important for
solids and liquids, can be neglected for gases. The kinetic theory was
developed in the nineteenth century by Maxwell, Boltzmann and
others. It has been remarkably successful. It gives a molecular
interpretation of pressure and temperature of a gas, and is consistent
with gas laws and Avogadro's hypothesis. It correctly explains specific
heat capacities of many gases. It also relates measurable properties
of gases such as viscosity, conduction and diffusion with molecular
parameters, yielding estimates of molecular sizes and masses. This
chapter gives an introduction to kinetic theory.

13.2 MOLECULAR NATURE OF MATTER


Richard Feynman, one of the great physicists of the 20th century,
considered the discovery that "Matter is made up of atoms" to be a very
significant one. Humanity may suffer annihilation (due to nuclear
catastrophe) or extinction (due to environmental disasters) if we do not
act wisely. If that happens, and all of scientific knowledge were to be
destroyed then Feynman would like the Atomic Hypothesis to be
communicated to the next generation of creatures in the universe.
Atomic Hypothesis: All things are made of atoms - little particles that
move around in perpetual motion, attracting each other when they are
a little distance apart, but repelling upon being squeezed into one
another.
Speculation that matter may not be continuous existed in many
places and cultures. Kanada in India and Democritus in Greece had
suggested that matter may consist of indivisible constituents. The
scientific Atomic Theory is usually credited to John Dalton. He
proposed the atomic theory to explain the laws of definite and multiple
proportions obeyed by elements when they combine into compounds.
The first law says that any given compound has a fixed proportion by
mass of its constituents. The second law says that when two elements
form more than one compound, for a fixed mass of one element, the
masses of the other element are in the ratio of small integers.

Atomic Hypothesis in Ancient India and Greece

Though John Dalton is credited with the introduction of atomic viewpoint in
modern science, scholars in ancient India and Greece conjectured long before
the existence of atoms and molecules. In the Vaiseshika school of thought in
India founded by Kanada (Sixth century B.C.) the atomic picture was
developed in considerable detail. Atoms were thought to be eternal, indivisible,
infinitesimal and ultimate parts of matter. It was argued that if matter could be
subdivided without an end, there would be no difference between a mustard
seed and the Meru mountain. Four kinds of atoms (Paramanu, the Sanskrit
word for the smallest particle) were postulated: Bhoomi (Earth), Ap (water),
Tejas (fire) and Vayu (air), each with characteristic mass and other attributes.
Akasa (space) was thought to have no atomic structure and
was continuous and inert. Atoms combine to form different molecules (e.g. two
atoms combine to form a diatomic molecule, dvyanuka; three atoms form a
tryanuka or a triatomic molecule), their properties depending upon the nature
and ratio of the constituent atoms. The size of the atoms was also estimated,
by conjecture or by methods that are not known to us. The estimates vary. In
Lalitavistara, a famous biography of the Buddha written mainly in the second
century B.C., the estimate is close to the modern estimate of atomic size, of
the order of $10^{-10}$ m.

In ancient Greece, Democritus (Fourth century B.C.) is best known for his
atomic hypothesis. The word 'atom' means 'indivisible' in Greek. According to
him, atoms differ from each other physically, in shape, size and other
properties and this resulted in the different properties of the substances
formed by their combination. The atoms of water were smooth and round and
unable to hook on to each other, which is why water flows easily. The
atoms of earth were rough and jagged, so they held together to form hard
substances. The atoms of fire were thorny, which is why it caused painful
burns. These fascinating ideas, despite their ingenuity, could not evolve much
further, perhaps because they were intuitive conjectures and speculations not
tested and modified by quantitative experiments, the hallmark of modern
science.

To explain the laws, Dalton suggested, about 200 years ago, that the
smallest constituents of an element are atoms. Atoms of one element
are identical but differ from those of other elements. A small number of
atoms of each element combine to form a molecule of the compound.
Gay-Lussac's law, also given in early 19th century, states: When
gases combine chemically to yield another gas, their volumes are in
the ratios of small integers. Avogadro's law (or hypothesis) says:
Equal volumes of all gases at equal temperature and pressure have
the same number of molecules. Avogadro's law, when combined with
Dalton's theory, explains Gay-Lussac's law. Since the elements are
often in the form of molecules, Dalton's atomic theory can also be
referred to as the molecular theory of matter. The theory is now well
accepted by scientists. However, even at the end of the nineteenth
century there were famous scientists who did not believe in atomic
theory!
From many observations, in recent times we now know that molecules
(made up of one or more atoms) constitute matter. Electron
microscopes and scanning tunnelling microscopes enable us to even
see them. The size of an atom is about an angstrom ($10^{-10}$ m). In
solids, which are tightly packed, atoms are spaced about a few
angstroms (2 Å) apart. In liquids the separation between atoms is also
about the same. In liquids the atoms are not as rigidly fixed as in
solids, and can move around. This enables a liquid to flow. In gases
the interatomic distances are in tens of angstroms. The average
distance a molecule can travel without colliding is called the mean free
path. The mean free path, in gases, is of the order of thousands of
angstroms. The atoms are much freer in gases and can travel long
distances without colliding. If they are not enclosed, gases disperse
away. In solids and liquids the closeness makes the interatomic force
important. The force has a long range attraction and a short range
repulsion. The atoms attract when they are at a few angstroms but
repel when they come closer. The static appearance of a gas is
misleading. The gas is full of activity and the equilibrium is a dynamic
one. In dynamic equilibrium, molecules collide and change their
speeds during the collision. Only the average properties are constant.
Atomic theory is not the end of our quest, but the beginning. We now
know that atoms are not indivisible or elementary. They consist of a
nucleus and electrons. The nucleus itself is made up of protons and
neutrons. The protons and neutrons are again made up of quarks.
Even quarks may not be the end of the story. There may be string like
elementary entities. Nature always has surprises for us, but the search
for truth is often enjoyable and the discoveries beautiful. In this
chapter, we shall limit ourselves to understanding the behaviour of
gases (and a little bit of solids), as a collection of moving molecules in
incessant motion.
13.3 BEHAVIOUR OF GASES
Properties of gases are easier to understand than those of solids and
liquids. This is mainly because in a gas, molecules are far from each
other and their mutual interactions are negligible except when two
molecules collide. Gases at low pressures and high temperatures
much above that at which they liquefy (or solidify) approximately
satisfy a simple relation between their pressure, temperature and
volume given by (see Ch. 11)

PV = KT (13.1)
for a given sample of the gas. Here T is the temperature on the kelvin
(absolute) scale. K is a constant for the given sample but varies with
the amount of gas in the sample. If we now bring in the idea of atoms or
molecules, then K is proportional to the number of molecules, (say) N
in the sample. We can write K = N k. Observation tells us that this k is
same for all gases. It is called Boltzmann constant and is denoted by
$k_B$.

As $\frac{P_1 V_1}{N_1 T_1} = \frac{P_2 V_2}{N_2 T_2}$ = constant = $k_B$ (13.2)

if P, V and T are same, then N is also same for all gases. This is
Avogadro's hypothesis, that the number of molecules per unit volume
is same for all gases at a fixed temperature and pressure. The number
in 22.4 litres of any gas is $6.02 \times 10^{23}$. This is known as Avogadro
number and is denoted by $N_A$. The mass of 22.4 litres of any gas is
equal to its molecular weight in grams at S.T.P (standard temperature
273 K and pressure 1 atm). This amount of substance is called a mole
(see Chapter 2 for a more precise definition). Avogadro had guessed
the equality of numbers in equal volumes of gas at a fixed temperature
and pressure from chemical reactions. Kinetic theory justifies this
hypothesis.
The perfect gas equation can be written as
$PV = \mu RT$ (13.3)
where $\mu$ is the number of moles and $R = N_A k_B$ is a universal constant.
The temperature T is absolute temperature. Choosing kelvin scale for
absolute temperature, R = 8.314 J mol$^{-1}$ K$^{-1}$.
Here

$\mu = \frac{M}{M_0} = \frac{N}{N_A}$ (13.4)

where M is the mass of the gas containing N molecules, M0 is the
molar mass and NA the Avogadro's number. Using Eq. (13.4), Eq.
(13.3) can also be written as
$PV = k_B N T$ or
$P = k_B n T$

John Dalton (1766-1844)

He was an English chemist. When different types of atoms combine, they obey
certain simple laws. Dalton's atomic theory explains these laws in a simple
way. He also gave a theory of colour blindness.

Amedeo Avogadro (1776-1856)

He made a brilliant guess that equal volumes of gases have equal number of
molecules at the same temperature and pressure. This helped in
understanding the combination of different gases in a very simple way. It is
now called Avogadro's hypothesis (or law). He also suggested that the
smallest constituents of gases like hydrogen, oxygen and nitrogen are not
atoms but diatomic molecules.

Fig.13.1 Real gases approach ideal gas behaviour at low pressures and high
temperatures.
where n is the number density, i.e. number of molecules per unit
volume. kB is the Boltzmann constant introduced above. Its value in SI
units is $1.38 \times 10^{-23}$ J K$^{-1}$.
Another useful form of Eq. (13.3) is

$P = \frac{\rho R T}{M_0}$ (13.5)
where $\rho$ is the mass density of the gas.
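The relations between the molecular and molar forms of the gas equation can be checked with a few lines of arithmetic. The sketch below is an illustration only. It assumes 1 atm = $1.013 \times 10^5$ Pa (a standard conversion not stated in the text) and uses the rounded constants quoted above, so the results agree with the quoted values only to about three significant figures.

```python
K_B = 1.38e-23   # Boltzmann constant, J K^-1 (value quoted in the text)
N_A = 6.02e23    # Avogadro number (value quoted in the text)
ATM = 1.013e5    # assumed conversion: 1 atm in pascal

P = 1.0 * ATM    # pressure at S.T.P., Pa
T = 273.0        # temperature at S.T.P., K
V = 22.4e-3      # 22.4 litres expressed in m^3

# Number of molecules from PV = k_B N T
N = P * V / (K_B * T)

# Number density from P = k_B n T
n = P / (K_B * T)

# Universal gas constant from R = N_A k_B
R = N_A * K_B

print(N)  # close to the Avogadro number
print(n)  # molecules per cubic metre at S.T.P.
print(R)  # close to 8.31 J mol^-1 K^-1
```

The computed N is within about one per cent of the Avogadro number, which is what the statement about 22.4 litres of any gas at S.T.P. leads one to expect.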

A gas that satisfies Eq. (13.3) exactly at all pressures and
temperatures is defined to be an ideal gas. An ideal gas is a simple
theoretical model of a gas. No real gas is truly ideal. Fig. 13.1 shows
departures from ideal gas behaviour for a real gas at three different
temperatures. Notice that all curves approach the ideal gas behaviour
for low pressures and high temperatures.
At low pressures or high temperatures the molecules are far apart and
molecular interactions are negligible. Without interactions
the gas behaves like an ideal one.
If we fix $\mu$ and T in Eq. (13.3), we get

PV = constant (13.6)
i.e., keeping temperature constant, pressure of a given mass of gas
varies inversely with volume. This is the famous Boyle's law. Fig. 13.2
shows comparison between experimental P-V curves and the
theoretical curves predicted by Boyle's law. Once again you see that
the agreement is good at high temperatures and low pressures. Next,
if you fix P, Eq. (13.1) shows that $V \propto T$, i.e., for a fixed pressure, the
volume of a gas is proportional to its absolute temperature T (Charles'
law). See Fig. 13.3.
Fig. 13.2 Experimental P-V curves (solid lines) for steam at three temperatures
compared with Boyle's law (dotted lines). P is in units of 22 atm and V in units of
0.09 litres.

Finally, consider a mixture of non-interacting ideal gases: $\mu_1$ moles of
gas 1, $\mu_2$ moles of gas 2, etc. in a vessel of volume V at temperature T
and pressure P. It is then found that the equation
of state of the mixture is:
$PV = (\mu_1 + \mu_2 + \dots) RT$ (13.7)

i.e. $P = \mu_1 \frac{RT}{V} + \mu_2 \frac{RT}{V} + \dots$ (13.8)
$= P_1 + P_2 + \dots$ (13.9)
Clearly $P_1 = \mu_1 RT/V$ is the pressure gas 1 would exert at the same
conditions of volume and temperature if no other gases were present.
This is called the partial pressure of the gas. Thus, the total pressure
of a mixture of ideal gases is the sum of partial pressures. This is
Dalton's law of partial pressures.
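Dalton's law can be illustrated with a short calculation. The sketch below assumes a hypothetical mixture of two ideal gases in a vessel of given volume and temperature. The amounts, volume and temperature are illustrative values, and the function name `partial_pressures` is not from the text.

```python
R = 8.314  # universal gas constant, J mol^-1 K^-1


def partial_pressures(moles, V, T):
    """Partial pressure of each ideal gas, P_i = mu_i R T / V, in pascal."""
    return [mu * R * T / V for mu in moles]


moles = [3.0, 2.0]  # hypothetical amounts of gas 1 and gas 2, in moles
V = 0.05            # vessel volume, m^3
T = 300.0           # temperature, K

parts = partial_pressures(moles, V, T)
total_from_mixture = sum(moles) * R * T / V  # Eq. (13.7) solved for P

print(parts)               # partial pressures P1, P2
print(sum(parts))          # sum of partial pressures
print(total_from_mixture)  # total pressure of the mixture
```

The sum of the partial pressures matches the total pressure obtained from the equation of state of the mixture, and the ratio of partial pressures equals the ratio of the numbers of moles.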

Fig. 13.3 Experimental T-V curves (solid lines) for CO2 at three pressures
compared with Charles' law (dotted lines). T is in units of 300 K and V in units of
0.13 litres.

We next consider some examples which give us information about the
volume occupied by the molecules and the volume of a single
molecule.

Example 13.1 The density of water is 1000 kg m$^{-3}$. The density of water
vapour at 100 °C and 1 atm pressure is 0.6 kg m$^{-3}$. The volume of a molecule
multiplied by the total number gives what is called molecular volume.
Estimate the ratio (or fraction) of the molecular volume to the total volume
occupied by the water vapour under the above conditions of temperature and
pressure.

Answer For a given mass of water molecules, the density is less if
volume is large. So the volume of the vapour is 1000/0.6 = $1/(6 \times 10^{-4})$
times larger. If densities of bulk water and water molecules are same,
then the fraction of molecular volume to the total volume in liquid state
is 1. As volume in vapour state has increased, the fractional volume is
less by the same amount, i.e. $6 \times 10^{-4}$.

Example 13.2 Estimate the volume of a water molecule using the data in
Example 13.1.

Answer In the liquid (or solid) phase, the molecules of water are quite
closely packed. The density of water molecule may therefore, be
regarded as roughly equal to the density of bulk water = 1000 kg m$^{-3}$.
To estimate the volume of a water molecule, we need to know the
mass of a single water molecule. We know that 1 mole of water has a
mass approximately equal to (2 + 16) g = 18 g = 0.018 kg. Since 1
mole contains about $6 \times 10^{23}$ molecules (Avogadro's number), the
mass of a molecule of water is (0.018)/($6 \times 10^{23}$) kg = $3 \times 10^{-26}$ kg.
Therefore, a rough estimate of the volume of a water molecule is as
follows: Volume of a water molecule = ($3 \times 10^{-26}$ kg)/(1000 kg m$^{-3}$) =
$3 \times 10^{-29}$ m$^3$ = (4/3) $\pi$ (Radius)$^3$. Hence, Radius $\approx 2 \times 10^{-10}$ m = 2 Å.
Example 13.3 What is the average distance between atoms (interatomic
distance) in water? Use the data given in Examples 13.1 and 13.2.

Answer A given mass of water in vapour state has $1.67 \times 10^3$ times
the volume of the same mass of water in liquid state (Ex. 13.1). This is
also the increase in the amount of volume available for each molecule
of water. When the volume increases by $10^3$ times, the radius increases by
$V^{1/3}$ or 10 times, i.e., 10 × 2 Å = 20 Å. So the average distance is
2 × 20 Å = 40 Å.
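
The chain of estimates in Examples 13.1 to 13.3 can be retraced numerically. The Python sketch below uses only the data quoted in those examples; the variable names are illustrative, and the molecule is treated as a sphere, as in the text.

```python
import math

RHO_LIQUID = 1000.0  # density of water, kg m^-3 (Example 13.1)
RHO_VAPOUR = 0.6     # density of water vapour at 100 C and 1 atm, kg m^-3
MOLAR_MASS = 0.018   # mass of one mole of water, kg
N_A = 6e23           # Avogadro's number, rounded as in the text

# Example 13.1: fraction of the vapour volume occupied by the molecules
fraction = RHO_VAPOUR / RHO_LIQUID

# Example 13.2: volume and radius of one molecule, treated as a sphere
mass = MOLAR_MASS / N_A
volume = mass / RHO_LIQUID
radius = (3 * volume / (4 * math.pi)) ** (1 / 3)

# Example 13.3: linear spacing grows as the cube root of the volume ratio
expansion = RHO_LIQUID / RHO_VAPOUR
spacing = 2 * radius * expansion ** (1 / 3)

print(fraction, volume, radius, spacing)
```

Without the intermediate rounding used in the text (1670 taken as 1000), the spacing comes out at about 4.6 × 10⁻⁹ m, the same order of magnitude as the 40 Å quoted in Example 13.3.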

Example 13.4 A vessel contains two non-reactive gases: neon (monatomic)
and oxygen (diatomic). The ratio of their partial pressures is 3:2. Estimate the
ratio of (i) number of molecules and (ii) mass density of neon and oxygen in
the vessel. Atomic mass of Ne = 20.2 u, molecular mass of O₂ = 32.0 u.

Answer Partial pressure of a gas in a mixture is the pressure it would
have for the same volume and temperature if it alone occupied the
vessel. (The total pressure of a mixture of non-reactive gases is the
sum of partial pressures due to its constituent gases.) Each gas
(assumed ideal) obeys the gas law. Since V and T are common to the
two gases, we have $P_1 V = \mu_1 RT$ and $P_2 V = \mu_2 RT$, i.e.
$(P_1/P_2) = (\mu_1/\mu_2)$. Here 1 and 2 refer to neon and oxygen
respectively. Since $(P_1/P_2) = (3/2)$ (given), $(\mu_1/\mu_2) = 3/2$.
(i) By definition $\mu_1 = (N_1/N_A)$ and $\mu_2 = (N_2/N_A)$ where $N_1$
and $N_2$ are the number of molecules of 1 and 2, and $N_A$ is
Avogadro's number. Therefore, $(N_1/N_2) = (\mu_1/\mu_2) = 3/2$.
(ii) We can also write $\mu_1 = (m_1/M_1)$ and $\mu_2 = (m_2/M_2)$ where
$m_1$ and $m_2$ are the masses of 1 and 2; and $M_1$ and $M_2$ are their
molecular masses. (Both $m_1$ and $M_1$, as well as $m_2$ and $M_2$, should
be expressed in the same units.) If $\rho_1$ and $\rho_2$ are the mass
densities of 1 and 2 respectively, we have

$\frac{\rho_1}{\rho_2} = \frac{m_1/V}{m_2/V} = \frac{m_1}{m_2} = \frac{\mu_1}{\mu_2} \times \frac{M_1}{M_2} = \frac{3}{2} \times \frac{20.2}{32.0} = 0.947$

13.4 KINETIC THEORY OF AN IDEAL GAS


Kinetic theory of gases is based on the molecular picture of matter. A
given amount of gas is a collection of a large number of molecules
(typically of the order of Avogadro's number) that are in incessant
random motion. At ordinary pressure and temperature, the average
distance between molecules is a factor of 10 or more greater than the
typical size of a molecule (2 Å). Thus the interaction between the molecules
is negligible and we can assume that they move freely in straight lines
according to Newton's first law. However, occasionally, they come
close to each other, experience intermolecular forces and their
velocities change. These interactions are called collisions. The
molecules collide incessantly against each other or with the walls and
change their velocities. The collisions are considered to be elastic. We
can derive an expression for the pressure of a gas based on the
kinetic theory. We begin with the idea that molecules of a gas are in
incessant random motion, colliding against one another and with the
walls of the container. All collisions between molecules among
themselves or between molecules and the walls are elastic. This
implies that total kinetic energy is conserved. The total momentum is
conserved as usual.

13.4.1 Pressure of an Ideal Gas


Consider a gas enclosed in a cube of side l. Take the axes to be
parallel to the sides of the cube, as shown in Fig. 13.4. A molecule
with velocity $(v_x, v_y, v_z)$ hits the planar wall parallel to the
yz-plane of area $A\ (= l^2)$. Since the collision is elastic, the molecule
rebounds with the same speed; its y and z components of velocity do not
change in the collision but the x-component reverses sign. That is, the
velocity after collision is $(-v_x, v_y, v_z)$. The change in momentum of
the molecule is: $-mv_x - (mv_x) = -2mv_x$. By the principle of
conservation of momentum, the momentum imparted to the wall in the
collision $= 2mv_x$.

Fig. 13.4 Elastic collision of a gas molecule with the wall of the container.

To calculate the force (and pressure) on the wall, we need to calculate
momentum imparted to the wall per unit time. In a small time interval
$\Delta t$, a molecule with x-component of velocity $v_x$ will hit the wall
if it is within the distance $v_x \Delta t$ from the wall. That is, only the
molecules within the volume $A v_x \Delta t$ can hit the wall in time
$\Delta t$. But, on the average, half of these are moving towards the wall
and the other half away from the wall. Thus the number of molecules with
velocity $(v_x, v_y, v_z)$ hitting the wall in time $\Delta t$ is
$\frac{1}{2} A v_x \Delta t\, n$ where n is the number of molecules
per unit volume. The total momentum transferred to the wall by these
molecules in time $\Delta t$ is:
$Q = (2mv_x)\left(\frac{1}{2} n A v_x \Delta t\right)$ (13.10)
The force on the wall is the rate of momentum transfer $Q/\Delta t$ and
pressure is force per unit area:
$P = Q/(A \Delta t) = n m v_x^2$ (13.11)
Actually, not all molecules in a gas have the same velocity; there is
a distribution in velocities. The above equation, therefore, stands for
pressure due to the group of molecules with speed $v_x$ in the x-direction
and n stands for the number density of that group of molecules. The
total pressure is obtained by summing over the contribution due to all
groups:

$P = n m \overline{v_x^2}$ (13.12)

where $\overline{v_x^2}$ is the average of $v_x^2$. Now the gas is
isotropic, i.e. there is no preferred direction of velocity of the
molecules in the vessel. Therefore, by symmetry,

$\overline{v_x^2} = \overline{v_y^2} = \overline{v_z^2} = (1/3)\left[\overline{v_x^2} + \overline{v_y^2} + \overline{v_z^2}\right] = (1/3)\,\overline{v^2}$ (13.13)

where v is the speed and $\overline{v^2}$ denotes the mean of the squared
speed. Thus

$P = (1/3)\, n m \overline{v^2}$ (13.14)
Some remarks on this derivation. First, though we choose the
container to be a cube, the shape of the vessel really is immaterial.
For a vessel of arbitrary shape, we can always choose a small
infinitesimal (planar) area and carry through the steps above. Notice
that neither A nor $\Delta t$ appears in the final result. By Pascal's law,
given in Ch. 10, pressure in one portion of the gas in equilibrium is the
same as anywhere else. Second, we have ignored any collisions in
the derivation. Though this assumption is difficult to justify rigorously,
we can qualitatively see that it will not lead to erroneous results. The
number of molecules hitting the wall in time $\Delta t$ was found to be
$\frac{1}{2} n A v_x \Delta t$. Now the collisions are random and the gas is in a steady state.
Thus, if a molecule with velocity (vx , vy , vz ) acquires a different
velocity due to collision with some molecule, there will always be
some other molecule with a different initial velocity which after a
collision acquires the velocity (vx , vy , vz). If this were not so, the
distribution of velocities would not remain steady. In any case we are
finding $\overline{v_x^2}$. Thus, on the whole, molecular collisions (if
they are not too frequent and the time spent in a collision is negligible
compared to time between collisions) will not affect the calculation above.
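
The isotropy argument behind Eqs. (13.12) to (13.14) can be checked with a small numerical experiment. The Python sketch below draws random molecular velocities and compares the pressure computed from the x-component alone with the pressure computed from the full squared speed. It assumes that each velocity component follows a Gaussian distribution of variance $k_B T/m$ (the standard equilibrium form, not derived in this section); the temperature, number density and sample size are illustrative choices.

```python
import random

K_B = 1.38e-23       # Boltzmann constant, J K^-1
T = 300.0            # temperature, K (assumed for illustration)
M = 4.65e-26         # mass of one nitrogen molecule, kg
N_DENSITY = 2.45e25  # molecules per m^3 (assumed, roughly 1 atm at 300 K)
SAMPLES = 50000

random.seed(0)                # fixed seed so the run is repeatable
sigma = (K_B * T / M) ** 0.5  # spread of each velocity component, m/s

sum_vx2 = 0.0
sum_v2 = 0.0
for _ in range(SAMPLES):
    vx = random.gauss(0.0, sigma)
    vy = random.gauss(0.0, sigma)
    vz = random.gauss(0.0, sigma)
    sum_vx2 += vx * vx
    sum_v2 += vx * vx + vy * vy + vz * vz

mean_vx2 = sum_vx2 / SAMPLES
mean_v2 = sum_v2 / SAMPLES

p_from_x = N_DENSITY * M * mean_vx2        # Eq. (13.12)
p_isotropic = N_DENSITY * M * mean_v2 / 3  # Eq. (13.14)
print(p_from_x, p_isotropic)
```

The two estimates agree to within the statistical scatter of the sample, and both come out close to atmospheric pressure for the assumed number density.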

13.4.2 Kinetic Interpretation of Temperature


Equation (13.14) can be written as

$PV = (1/3)\, nV\, m \overline{v^2}$ (13.15a)

$PV = (2/3)\, N \left[\frac{1}{2} m \overline{v^2}\right]$ (13.15b)
where N (= nV) is the number of molecules in the sample.

Founders of Kinetic Theory of Gases


James Clerk Maxwell (1831-1879), born in Edinburgh, Scotland, was
among the greatest physicists of the nineteenth century. He derived the
thermal velocity distribution of molecules in a gas and was among the first to
obtain reliable estimates of molecular parameters from measurable quantities
like viscosity, etc. Maxwell's greatest achievement was the unification of the
laws of electricity and magnetism (discovered by Coulomb, Oersted, Ampere
and Faraday) into a consistent set of equations now called Maxwell's
equations. From these he arrived at the most important conclusion that light is
an electromagnetic wave. Interestingly, Maxwell did not agree with the idea
(strongly suggested by Faraday's laws of electrolysis) that electricity was
particulate in nature.

Ludwig Boltzmann (1844-1906), born in Vienna, Austria, worked on the
kinetic theory of gases independently of Maxwell. A firm advocate of atomism,
which is basic to kinetic theory, Boltzmann provided a statistical interpretation of
the Second Law of thermodynamics and the concept of entropy. He is
regarded as one of the founders of classical statistical mechanics. The
proportionality constant connecting energy and temperature in kinetic theory is
known as Boltzmann's constant in his honour.

The quantity in the bracket is the average translational kinetic energy
of the molecules in the gas. Since the internal energy E of an ideal
gas is purely kinetic*,

$E = N \times (1/2)\, m \overline{v^2}$ (13.16)
Equation (13.15) then gives:
$PV = (2/3)\, E$ (13.17)
We are now ready for a kinetic interpretation of temperature.
Combining Eq. (13.17) with the ideal gas Eq. (13.3), we get
$E = (3/2)\, k_B N T$ (13.18)

or $E/N = \frac{1}{2} m \overline{v^2} = (3/2)\, k_B T$ (13.19)

i.e., the average kinetic energy of a molecule is proportional to the
absolute temperature of the gas; it is independent of pressure, volume
or the nature of the ideal gas. This is a fundamental result relating
temperature, a macroscopic measurable parameter of a gas
(a thermodynamic variable as it is called) to a molecular quantity,
namely the average kinetic energy of a molecule. The two domains
are connected by the Boltzmann constant. We note in passing that Eq.
(13.18) tells us that internal energy of an ideal gas depends only on
temperature, not on pressure or volume. With this interpretation of
temperature, kinetic theory of an ideal gas is completely consistent
with the ideal gas equation and the various gas laws based on it.
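
To get a feel for the magnitudes involved in Eqs. (13.18) and (13.19), the short Python sketch below evaluates the average kinetic energy of one molecule and the translational energy of one mole at room temperature. The function names and the choice of 300 K are illustrative assumptions.

```python
K_B = 1.38e-23  # Boltzmann constant, J K^-1
N_A = 6.02e23   # Avogadro's number, mol^-1

def mean_kinetic_energy(temperature):
    """Average translational kinetic energy per molecule, Eq. (13.19)."""
    return 1.5 * K_B * temperature

def translational_energy(n_molecules, temperature):
    """Total translational kinetic energy of N molecules, Eq. (13.18)."""
    return n_molecules * mean_kinetic_energy(temperature)

e_molecule = mean_kinetic_energy(300.0)    # about 6.2e-21 J
e_mole = translational_energy(N_A, 300.0)  # about 3.7e3 J
print(e_molecule, e_mole)
```

Neither result depends on the mass of the molecule or on the pressure, in line with the statement that the energy is fixed by temperature alone.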

For a mixture of non-reactive ideal gases, the total pressure gets
contribution from each gas in the mixture. Equation (13.14) becomes

$P = (1/3)\left[n_1 m_1 \overline{v_1^2} + n_2 m_2 \overline{v_2^2} + \dots\right]$ (13.20)

In equilibrium, the average kinetic energy of the molecules of different
gases will be equal. That is,

$\frac{1}{2} m_1 \overline{v_1^2} = \frac{1}{2} m_2 \overline{v_2^2} = (3/2)\, k_B T$
so that
$P = (n_1 + n_2 + \dots)\, k_B T$ (13.21)
which is Dalton's law of partial pressures.
From Eq. (13.19), we can get an idea of the typical speed of
molecules in a gas. For a molecule of nitrogen gas at a temperature
T = 300 K, the mass and the mean square speed are:

$m = \frac{M_{N_2}}{N_A} = \frac{28 \times 10^{-3}}{6.02 \times 10^{23}} = 4.65 \times 10^{-26}$ kg.

$\overline{v^2} = 3 k_B T/m = (516)^2$ m² s⁻²

The square root of $\overline{v^2}$ is known as root mean square (rms)
speed and is denoted by $v_{rms}$. (We can also write $\overline{v^2}$ as
$\langle v^2 \rangle$.)

$v_{rms} = 516$ m s⁻¹
The speed is of the order of the speed of sound in air. It follows from
Eq. (13.19) that at the same temperature, lighter molecules have
greater rms speed.
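
The rms speed quoted above can be reproduced in a few lines. The Python sketch below evaluates $v_{rms} = \sqrt{3 k_B T/m}$ for nitrogen at 300 K and, for comparison, for hydrogen (molar mass taken as 2 g per mole, a value assumed here for illustration). The function name v_rms is hypothetical.

```python
import math

K_B = 1.38e-23  # Boltzmann constant, J K^-1
N_A = 6.02e23   # Avogadro's number, mol^-1

def v_rms(molar_mass_kg, temperature):
    """Root mean square speed from Eq. (13.19): sqrt(3 k_B T / m)."""
    m = molar_mass_kg / N_A  # mass of one molecule, kg
    return math.sqrt(3 * K_B * temperature / m)

v_n2 = v_rms(28e-3, 300.0)  # nitrogen
v_h2 = v_rms(2e-3, 300.0)   # hydrogen (assumed molar mass 2 g/mol)
print(v_n2, v_h2)
```

The nitrogen value comes out near 516 m s⁻¹, and the lighter hydrogen molecule is faster by a factor of $\sqrt{14}$, as expected from the inverse square-root dependence on mass.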

Example 13.5 A flask contains argon and chlorine in the ratio of 2:1 by mass.
The temperature of the mixture is 27 °C. Obtain the ratio of (i) average kinetic
energy per molecule, and (ii) root mean square speed vrms of the molecules
of the two gases. Atomic mass of argon = 39.9 u; Molecular mass of chlorine =
70.9 u.

Answer The important point to remember is that the average kinetic
energy (per molecule) of any (ideal) gas (be it monatomic like argon,
diatomic like chlorine or polyatomic) is always equal to $(3/2)\, k_B T$. It
depends only on temperature, and is independent of the nature of the
gas.
(i) Since argon and chlorine both have the same temperature in the
flask, the ratio of average kinetic energy (per molecule) of the two
gases is 1:1.
(ii) Now $\frac{1}{2} m v_{rms}^2$ = average kinetic energy per molecule
$= (3/2)\, k_B T$ where m is the mass of a molecule of the gas. Therefore,

$\frac{(v_{rms}^2)_{Ar}}{(v_{rms}^2)_{Cl}} = \frac{(m)_{Cl}}{(m)_{Ar}} = \frac{(M)_{Cl}}{(M)_{Ar}} = \frac{70.9}{39.9} = 1.78$
where M denotes the molecular mass of the gas. (For argon, a
molecule is just an atom of argon.)
Taking square root of both sides,

$\frac{(v_{rms})_{Ar}}{(v_{rms})_{Cl}} = 1.33$
You should note that the composition of the mixture by mass is quite
irrelevant to the above calculation. Any other proportion by mass of
argon and chlorine would give the same answers to (i) and (ii),
provided the temperature remains unaltered.
*E denotes the translational part of the internal energy U that may include energies
due to other degrees of freedom also. See section 13.5.

Maxwell Distribution Function

In a given mass of gas, the velocities of all molecules are not the same, even
when bulk parameters like pressure, volume and temperature are fixed.
Collisions change the direction and the speed of molecules. However in a
state of equilibrium, the distribution of speeds is constant or fixed.

Distributions are very important and useful when dealing with systems
containing large number of objects. As an example consider the ages of
different persons in a city. It is not feasible to deal with the age of each
individual. We can divide the people into groups: children up to age 20 years,
adults between ages of 20 and 60, old people above 60. If we want more
detailed information we can choose smaller intervals, 0-1, 1-2,..., 99-100 of
age groups. When the size of the interval becomes smaller, say half year, the
number of persons in the interval will also reduce, roughly half the original
number in the one year interval. The number of persons dN(x) in the age
interval between x and x + dx is proportional to dx or $dN(x) = n_x\, dx$. We have
used $n_x$ to denote the number of persons at the value of x.
Maxwell distribution of molecular speeds

In a similar way the molecular speed distribution gives the number of
molecules between the speeds v and v + dv:
$dN(v) = 4\pi N a^3 e^{-bv^2} v^2\, dv = n_v\, dv$.
This is called Maxwell distribution. The plot of $n_v$ against v is shown in
the figure. The fraction of the molecules with speeds between v and v + dv is
equal to the area of the strip shown. The average of any quantity like $v^2$ is
defined by the integral
$\langle v^2 \rangle = (1/N) \int v^2\, dN(v) = 3 k_B T/m$, which agrees with
the result derived from more elementary considerations.
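
The statement that the Maxwell distribution reproduces $\langle v^2 \rangle = 3 k_B T/m$ can be verified numerically. The Python sketch below assumes the standard values of the constants, $a = \sqrt{m/(2\pi k_B T)}$ and $b = m/(2 k_B T)$, which the box does not state explicitly, and sets N = 1 so that $n_v$ is the fraction of molecules per unit speed interval. The integration limit and step are illustrative choices.

```python
import math

K_B = 1.38e-23  # Boltzmann constant, J K^-1
T = 300.0       # temperature, K (assumed for illustration)
M = 4.65e-26    # mass of one nitrogen molecule, kg

# Constants of the distribution (standard Maxwell form, assumed here)
a = math.sqrt(M / (2 * math.pi * K_B * T))
b = M / (2 * K_B * T)

def n_v(v):
    """Fraction of molecules per unit speed interval (N set to 1)."""
    return 4 * math.pi * a**3 * math.exp(-b * v * v) * v * v

dv = 1.0                                   # step, m/s
speeds = [i * dv for i in range(1, 5001)]  # sum up to 5000 m/s

norm = sum(n_v(v) * dv for v in speeds)             # should be close to 1
mean_v2 = sum(v * v * n_v(v) * dv for v in speeds)  # <v^2>
print(norm, mean_v2, 3 * K_B * T / M)
```

The simple sum is accurate enough here because the integrand is smooth and negligible beyond a few thousand metres per second.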

Example 13.6 Uranium has two isotopes of masses 235 and 238 units. If both
are present in Uranium hexafluoride gas which would have the larger average
speed ? If atomic mass of fluorine is 19 units, estimate the percentage
difference in speeds at any temperature.

Answer At a fixed temperature the average energy
$= \frac{1}{2} m \langle v^2 \rangle$ is constant. So the smaller the mass
of the molecule, the faster it moves. The ratio of speeds is inversely
proportional to the square root of the ratio of the masses. The masses
are 349 and 352 units. So
$v_{349}/v_{352} = (352/349)^{1/2} = 1.0043$.

Hence difference $\approx$ 0.43 %.
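
The arithmetic of Example 13.6 can be checked directly. The Python sketch below uses only the masses given in the example; the variable names are illustrative.

```python
import math

M_F = 19                   # atomic mass of fluorine, units
m_light = 235 + 6 * M_F    # mass of the hexafluoride of the lighter isotope
m_heavy = 238 + 6 * M_F    # mass of the hexafluoride of the heavier isotope

# Speeds scale as 1/sqrt(mass) at a fixed temperature
ratio = math.sqrt(m_heavy / m_light)
percent = (ratio - 1) * 100
print(m_light, m_heavy, ratio, percent)
```

The lighter molecule is faster by a little over 0.4 %, which is why a single pass through a porous barrier enriches the mixture only slightly.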


[$^{235}$U is the isotope needed for nuclear fission. To separate it from the
more abundant isotope $^{238}$U, the mixture is surrounded by a porous
cylinder. The porous cylinder must be thick and narrow, so that the
molecule wanders through individually, colliding with the walls of the
long pore. The faster molecule will leak out more than the slower one
and so there is more of the lighter molecule (enrichment) outside the
porous cylinder (Fig. 13.5). The method is not very efficient and has to
be repeated several times for sufficient enrichment.]
When gases diffuse, their rate of diffusion is inversely proportional to
square root of the masses (see Exercise 13.12 ). Can you guess the
explanation from the above answer?
Fig. 13.5 Molecules going through a porous wall.

Example 13.7 (a) When a molecule (or an elastic ball) hits a ( massive) wall, it
rebounds with the same speed. When a ball hits a massive bat held firmly, the
same thing happens. However, when the bat is moving towards the ball, the
ball rebounds with a different speed. Does the ball move faster or slower?
(Ch.6 will refresh your memory on elastic collisions.)
(b) When gas in a cylinder is compressed by pushing in a piston, its
temperature rises. Guess at an explanation of this in terms of kinetic theory
using (a) above.
(c) What happens when a compressed gas pushes a piston out and expands.
What would you observe ?
(d) Sachin Tendulkar uses a heavy cricket bat while playing. Does it help him
in any way?
Answer (a) Let the speed of the ball be u relative to the wicket behind
the bat. If the bat is moving towards the ball with a speed V relative to
the wicket, then the relative speed of the ball to bat is V + u towards
the bat. When the ball rebounds (after hitting the massive bat) its
speed, relative to bat, is V + u moving away from the bat. So relative
to the wicket the speed of the rebounding ball is V + (V + u) = 2V + u,
moving away from the wicket. So the ball speeds up after the collision
with the bat. The rebound speed will be less than u if the bat is not
massive. For a molecule this would imply an increase in temperature.
You should be able to answer (b) (c) and (d) based on the answer to
(a).

(Hint: Note the correspondence, piston → bat, cylinder → wicket,
molecule → ball.)
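
The result of part (a) is the heavy-bat limit of the general formula for a head-on elastic collision of two bodies (see the chapter on elastic collisions). The Python sketch below evaluates that formula for a very massive bat and for a bat of ordinary mass. The masses and speeds are made-up illustrative values, and the function name is hypothetical.

```python
def rebound_velocity(m_ball, u, m_bat, V):
    """Velocity of the ball after a head-on elastic collision.

    The positive direction points away from the wicket. Before the
    collision the ball moves towards the wicket with speed u (velocity -u)
    and the bat moves away from the wicket with speed V (velocity +V).
    """
    return ((m_ball - m_bat) * (-u) + 2 * m_bat * V) / (m_ball + m_bat)

u, V = 30.0, 10.0  # speeds in m/s (assumed values)

massive = rebound_velocity(0.16, u, 1e9, V)  # bat effectively infinitely heavy
ordinary = rebound_velocity(0.16, u, 1.2, V) # bat of about 1.2 kg

print(massive, ordinary)
```

For the very heavy bat the ball leaves with speed 2V + u, as found in the answer; for the bat of ordinary mass it still speeds up, but by less. A piston moving into a gas plays the role of the bat, so rebounding molecules gain speed and the temperature rises.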

13.5 LAW OF EQUIPARTITION OF ENERGY


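The kinetic energy of a single molecule is

$\varepsilon_t = \frac{1}{2} m v_x^2 + \frac{1}{2} m v_y^2 + \frac{1}{2} m v_z^2$ (13.22)
For a gas in thermal equilibrium at temperature T the average value of
energy denoted by $\langle \varepsilon_t \rangle$ is

$\langle \varepsilon_t \rangle = \left\langle \frac{1}{2} m v_x^2 \right\rangle + \left\langle \frac{1}{2} m v_y^2 \right\rangle + \left\langle \frac{1}{2} m v_z^2 \right\rangle = \frac{3}{2} k_B T$ (13.23)
Since there is no preferred direction, Eq. (13.23) implies

$\left\langle \frac{1}{2} m v_x^2 \right\rangle = \frac{1}{2} k_B T,\quad \left\langle \frac{1}{2} m v_y^2 \right\rangle = \frac{1}{2} k_B T,\quad \left\langle \frac{1}{2} m v_z^2 \right\rangle = \frac{1}{2} k_B T$ (13.24)

A molecule free to move in space needs three coordinates to specify
its location. If it is constrained to move in a plane it needs two; and if
constrained to move along a line, it needs just one coordinate to
locate it. This can also be expressed in another way. We say that it
has one degree of freedom for motion in a line, two for motion in a
plane and three for motion in space. Motion of a body as a whole from
one point to another is called translation. Thus, a molecule free to
move in space has three translational degrees of freedom. Each
translational degree of freedom contributes a term that contains the
square of some variable of motion, e.g., $\frac{1}{2} m v_x^2$ and similar
terms in $v_y$ and $v_z$. In Eq. (13.24), we see that in thermal
equilibrium, the average of each such term is $\frac{1}{2} k_B T$.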
Molecules of a monatomic gas like argon have only translational
degrees of freedom. But what about a diatomic gas such as O₂ or N₂?
A molecule of O₂ has three translational degrees of freedom. But in
addition it can also rotate about its centre of mass. Figure 13.6 shows
the two independent axes of rotation 1 and 2, normal to the axis
joining the two oxygen atoms about which the molecule can rotate*.
The molecule thus has two rotational degrees of freedom, each of
which contributes a term to the total energy consisting of translational
energy $\varepsilon_t$ and rotational energy $\varepsilon_r$.

$\varepsilon_t + \varepsilon_r = \frac{1}{2} m v_x^2 + \frac{1}{2} m v_y^2 + \frac{1}{2} m v_z^2 + \frac{1}{2} I_1 \omega_1^2 + \frac{1}{2} I_2 \omega_2^2$ (13.25)

where $\omega_1$ and $\omega_2$ are the angular speeds about the axes 1
and 2 and $I_1$, $I_2$ are the corresponding moments of inertia. Note that
each rotational degree of freedom contributes a term to the energy that
contains the square of a rotational variable of motion.

Fig. 13.6 The two independent axes of rotation of a diatomic molecule

We have assumed above that the O₂ molecule is a rigid rotator, i.e.
the molecule does not vibrate. This assumption, though found to be
true (at moderate temperatures) for O₂, is not always valid. Molecules
like CO have a mode of vibration even at moderate temperatures, i.e.
their atoms oscillate along the interatomic axis like a one-dimensional
oscillator, and contribute a vibrational energy term $\varepsilon_v$ to
the total energy:

$\varepsilon_v = \frac{1}{2} m \left(\frac{dy}{dt}\right)^2 + \frac{1}{2} k y^2$
$\varepsilon = \varepsilon_t + \varepsilon_r + \varepsilon_v$ (13.26)
where k is the force constant of the oscillator and y the vibrational
coordinate.
Once again the vibrational energy terms in Eq. (13.26) contain
squared terms of vibrational variables of motion y and dy/dt.

At this point, notice an important feature in Eq. (13.26). While each
translational and rotational degree of freedom has contributed only
one squared term in Eq. (13.26), one vibrational mode contributes two
squared terms: kinetic and potential energies.

Each quadratic term occurring in the expression for energy is a mode
of absorption of energy by the molecule. We have seen that in thermal
equilibrium at absolute temperature T, for each translational mode of
motion, the average energy is $\frac{1}{2} k_B T$. A most elegant principle
of classical statistical mechanics (first proved by Maxwell) states that
this is so for each mode of energy: translational, rotational and
vibrational. That is, in equilibrium, the total energy is equally
distributed in all possible energy modes, with each mode having an average
energy equal to $\frac{1}{2} k_B T$. This is known as the law of
equipartition of energy. Accordingly, each translational and rotational
degree of freedom of a molecule contributes $\frac{1}{2} k_B T$ to the
energy while each vibrational frequency contributes
$2 \times \frac{1}{2} k_B T = k_B T$, since a vibrational mode has
both kinetic and potential energy modes.
The proof of the law of equipartition of energy is beyond the scope of
this book. Here we shall apply the law to predict the specific heats of
gases theoretically. Later we shall also discuss briefly, the application
to specific heat of solids.

13.6 SPECIFIC HEAT CAPACITY


13.6.1 Monatomic Gases
The molecule of a monatomic gas has only three translational degrees
of freedom. Thus, the average energy of a molecule at temperature T
is $(3/2)\, k_B T$. The total internal energy of a mole of such a gas is

$U = \frac{3}{2} k_B T \times N_A = \frac{3}{2} RT$ (13.27)
The molar specific heat at constant volume, $C_v$, is

$C_v$ (monatomic gas) $= \frac{dU}{dT} = \frac{3}{2} R$ (13.28)

* Rotation along the line joining the atoms has very small moment of
inertia and does not come into play for quantum mechanical reasons.
See end of section 13.6.

For an ideal gas,

$C_p - C_v = R$ (13.29)
where $C_p$ is the molar specific heat at constant pressure. Thus,

$C_p = \frac{5}{2} R$ (13.30)

The ratio of specific heats $\gamma = \frac{C_p}{C_v} = \frac{5}{3}$ (13.31)

13.6.2 Diatomic Gases


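As explained earlier, a diatomic molecule treated as a rigid rotator like
a dumbbell has 5 degrees of freedom: 3 translational and 2 rotational.
Using the law of equipartition of energy, the total internal energy of a
mole of such a gas is

$U = \frac{5}{2} k_B T \times N_A = \frac{5}{2} RT$ (13.32)

The molar specific heats are then given by

$C_v$ (rigid diatomic) $= \frac{5}{2} R$, $C_p = \frac{7}{2} R$ (13.33)

$\gamma$ (rigid diatomic) $= \frac{7}{5}$ (13.34)

If the diatomic molecule is not rigid but has in addition a vibrational
mode

$U = \left(\frac{5}{2} k_B T + k_B T\right) N_A = \frac{7}{2} RT$

$C_v = \frac{7}{2} R,\quad C_p = \frac{9}{2} R,\quad \gamma = \frac{9}{7}$ (13.35)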

13.6.3 Polyatomic Gases


In general a polyatomic molecule has 3 translational, 3 rotational
degrees of freedom and a certain number (f) of vibrational modes.
According to the law of equipartition of energy, it is easily seen that
one mole of such a gas has

$U = \left(\frac{3}{2} k_B T + \frac{3}{2} k_B T + f k_B T\right) N_A$

i.e. $C_v = (3 + f)\, R,\quad C_p = (4 + f)\, R,\quad \gamma = \frac{4 + f}{3 + f}$ (13.36)
Note that $C_p - C_v = R$ is true for any ideal gas, whether mono, di or
polyatomic.
Table 13.1 summarises the theoretical predictions for specific heats of
gases ignoring any vibrational modes of motion. The values are in
good agreement with experimental values of specific heats of several
gases given in Table 13.2. Of course, there are discrepancies
between predicted and actual values of specific heats of several other
gases (not shown in the table), such as Cl2, C2H6 and many other
polyatomic gases. Usually, the experimental values for specific heats
of these gases are greater than the predicted values given in
Table 13.1, suggesting that the agreement can be improved by
including vibrational modes of motion in the calculation. The law of
equipartition of energy is thus well verified experimentally at ordinary
temperatures.
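
The equipartition predictions of Sections 13.6.1 to 13.6.3 follow one counting rule, which can be written as a small function. The Python sketch below returns $C_v/R$, $C_p/R$ and $\gamma$ as exact fractions for a given number of translational, rotational and vibrational modes. The function name is hypothetical, and the rule itself is just the one stated in the text: half a unit of R per quadratic term, two quadratic terms per vibrational mode.

```python
from fractions import Fraction

def molar_specific_heats(translational, rotational, vibrational=0):
    """Return (Cv/R, Cp/R, gamma) predicted by equipartition.

    Each translational or rotational degree of freedom counts one quadratic
    term; each vibrational mode counts two (kinetic and potential).
    """
    quadratic_terms = translational + rotational + 2 * vibrational
    cv = Fraction(quadratic_terms, 2)  # Cv in units of R
    cp = cv + 1                        # Cp - Cv = R for an ideal gas
    return cv, cp, cp / cv

monatomic = molar_specific_heats(3, 0)
rigid_diatomic = molar_specific_heats(3, 2)
vibrating_diatomic = molar_specific_heats(3, 2, 1)
polyatomic = molar_specific_heats(3, 3)  # no vibrational modes

for name, values in [("monatomic", monatomic),
                     ("rigid diatomic", rigid_diatomic),
                     ("vibrating diatomic", vibrating_diatomic),
                     ("polyatomic", polyatomic)]:
    print(name, [str(x) for x in values])
```

Multiplying the first two entries by R = 8.31 J mol⁻¹ K⁻¹ gives predicted molar specific heats that can be compared with measured values.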

Table 13.1 Predicted values of specific heat capacities of gases
(ignoring vibrational modes)

Table 13.2 Measured values of specific heat capacities of some gases

Example 13.8 A cylinder of fixed capacity 44.8 litres contains helium gas at
standard temperature and pressure. What is the amount of heat needed to
raise the temperature of the gas in the cylinder by 15.0 °C? (R = 8.31 J mol⁻¹
K⁻¹).

Answer Using the gas law $PV = \mu RT$, you can easily show that 1 mol
of any (ideal) gas at standard temperature (273 K) and pressure
(1 atm = $1.01 \times 10^5$ Pa) occupies a volume of 22.4 litres. This
universal volume is called molar volume. Thus the cylinder in this
example contains 2 mol of helium. Further, since helium is monatomic,
its predicted (and observed) molar specific heat at constant volume is
$C_v = (3/2)\, R$, and its molar specific heat at constant pressure is
$C_p = (3/2)\, R + R = (5/2)\, R$. Since the volume of the cylinder is
fixed, the heat required is determined by $C_v$. Therefore,
Heat required = no. of moles × molar specific heat × rise in
temperature
= 2 × 1.5 R × 15.0 = 45 R
= 45 × 8.31 = 374 J.
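
The calculation in Example 13.8 can be laid out step by step as follows. This Python sketch uses only the numbers given in the example; the variable names are illustrative.

```python
R = 8.31                 # gas constant as given in the example, J mol^-1 K^-1
MOLAR_VOLUME_STP = 22.4  # litres occupied by 1 mol of ideal gas at STP

volume = 44.8            # capacity of the cylinder, litres
delta_T = 15.0           # rise in temperature, K

moles = volume / MOLAR_VOLUME_STP  # amount of helium in the cylinder
c_v = 1.5 * R                      # molar specific heat of a monatomic gas
heat = moles * c_v * delta_T       # heat needed at constant volume, J

print(moles, heat)
```

The result is about 374 J. Using $C_p$ instead of $C_v$ would overestimate the heat, because no work is done when the volume is fixed.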

13.6.4 Specific Heat Capacity of Solids


We can use the law of equipartition of energy to determine specific
heats of solids. Consider a solid of N atoms, each vibrating about its
mean position. An oscillation in one dimension has average energy of
$2 \times \frac{1}{2} k_B T = k_B T$. In three dimensions, the average
energy is $3 k_B T$. For a mole of solid, $N = N_A$, and the total energy is

$U = 3 k_B T \times N_A = 3RT$
Now at constant pressure $\Delta Q = \Delta U + P \Delta V = \Delta U$,
since for a solid $\Delta V$ is negligible. Hence,

$C = \frac{\Delta Q}{\Delta T} = \frac{\Delta U}{\Delta T} = 3R$ (13.37)

Table 13.3 Specific Heat Capacity of some solids at room temperature
and atmospheric pressure

As Table 13.3 shows, the prediction generally agrees with
experimental values at ordinary temperature (carbon is an exception).

13.6.5 Specific Heat Capacity of Water


We treat water like a solid. For each atom the average energy is $3 k_B T$.
A water molecule has three atoms, two hydrogen and one oxygen. So it
has
$U = 3 \times 3 k_B T \times N_A = 9RT$
and $C = \Delta Q/\Delta T = \Delta U/\Delta T = 9R$.
This is the value observed and the agreement is very good. In the
calorie, gram, degree units, water is defined to have unit specific heat.
As 1 calorie = 4.179 joules and one mole of water is 18 grams, the
heat capacity per mole is
~ 75 J mol⁻¹ K⁻¹ ~ 9R. However, with more complex molecules like
alcohol or acetone the arguments, based on degrees of freedom,
become more complicated.
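
The comparison between the predicted and the measured molar heat capacity of water takes only a few lines of arithmetic. The Python sketch below uses the calorie-to-joule factor quoted in the text; the variable names are illustrative.

```python
R = 8.314         # universal gas constant, J mol^-1 K^-1
CAL_TO_J = 4.179  # joules per calorie, as quoted in the text
MOLAR_MASS_G = 18 # grams per mole of water

c_solid = 3 * R       # prediction for a monatomic solid, Eq. (13.37)
c_water_pred = 9 * R  # three atoms per molecule, 3R each

# Measured value: 1 cal per gram per degree, converted to J mol^-1 K^-1
c_water_meas = 1.0 * CAL_TO_J * MOLAR_MASS_G

print(c_solid, c_water_pred, c_water_meas)
```

The predicted and measured values for water differ by less than one per cent, which is the agreement referred to above.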
Lastly, we should note an important aspect of the predictions of
specific heats, based on the classical law of equipartition of energy.
The predicted specific heats are independent of temperature. As we
go to low temperatures, however, there is a marked departure from
this prediction. Specific heats of all substances approach zero as T → 0.
This is related to the fact that degrees of freedom get frozen and
ineffective at low temperatures. According to classical physics
degrees of freedom must remain unchanged at all times. The
behaviour of specific heats at low temperatures shows the inadequacy
of classical physics and can be explained only by invoking quantum
considerations, as was first shown by Einstein. Quantum mechanics
requires a minimum, nonzero amount of energy before a degree of
freedom comes into play. This is also the reason why vibrational
degrees of freedom come into play only in some cases.

13.7 MEAN FREE PATH


Molecules in a gas have rather large speeds of the order of the speed
of sound. Yet a gas leaking from a cylinder in a kitchen takes
considerable time to diffuse to the other corners of the room. The top
of a cloud of smoke holds together for hours. This happens because
molecules in a gas have a finite though small size, so they are bound
to undergo collisions. As a result, they cannot move straight
unhindered; their paths keep getting incessantly deflected.
Seeing is Believing

Can one see atoms rushing about? Almost but not quite. One can see pollen
grains of a flower being pushed around by molecules of water. The size of the
grain is ~ 10⁻⁵ m. In 1827, a Scottish botanist Robert Brown, while examining,
under a microscope, pollen grains of a flower suspended in water noticed that
they continuously moved about in a zigzag, random fashion.

Kinetic theory provides a simple explanation of the phenomenon. Any object
suspended in water is continuously bombarded from all sides by the water
molecules. Since the motion of molecules is random, the number of molecules
hitting the object in any direction is about the same as the number hitting in the
opposite direction. The small difference between these molecular hits is
negligible compared to the total number of hits for an object of ordinary size,
and we do not notice any movement of the object.

When the object is sufficiently small but still visible under a microscope, the
difference in molecular hits from different directions is not altogether negligible,
i.e. the impulses and the torques given to the suspended object through
continuous bombardment by the molecules of the medium (water or some
other fluid) do not exactly sum to zero. There is a net impulse and torque in
this or that direction. The suspended object thus moves about in a zigzag
manner and tumbles about randomly. This motion, now called Brownian
motion, is a visible proof of molecular activity. In the last 50 years or so
molecules have been seen by scanning tunneling and other special
microscopes.

In 1987 Ahmed Zewail, an Egyptian scientist working in the USA, was able to
observe not only the molecules but also their detailed interactions. He did this
by illuminating them with flashes of laser light for very short durations, of the
order of tens of femtoseconds, and photographing them (1 femtosecond =
10⁻¹⁵ s). One could study even the formation and breaking of chemical bonds.
That is really seeing!

Fig. 13.7 The volume swept by a molecule in time t in which any molecule will
collide with it.

Suppose the molecules of a gas are spheres of diameter d. Focus on
a single molecule with the average speed $\langle v \rangle$. It will suffer
collision with any molecule that comes within a distance d between the
centres. In time $\Delta t$, it sweeps a volume
$\pi d^2 \langle v \rangle \Delta t$ wherein any other molecule
will collide with it (see Fig. 13.7). If n is the number of molecules per
unit volume, the molecule suffers $n \pi d^2 \langle v \rangle \Delta t$
collisions in time $\Delta t$. Thus the rate of collisions is
$n \pi d^2 \langle v \rangle$ or the time between two
successive collisions is on the average,
$\tau = 1/(n \pi \langle v \rangle d^2)$ (13.38)
The average distance between two successive collisions, called the
mean free path l, is:

$l = \langle v \rangle \tau = 1/(n \pi d^2)$ (13.39)

In this derivation, we imagined the other molecules to be at rest. But
actually all molecules are moving and the collision rate is determined
by the average relative velocity of the molecules. Thus we need to
replace $\langle v \rangle$ by $\langle v_r \rangle$ in Eq. (13.38). A more
exact treatment gives

$l = 1/(\sqrt{2}\, n \pi d^2)$ (13.40)
Let us estimate l and $\tau$ for air molecules with average speeds
$\langle v \rangle$ = 485 m/s. At STP

$n = \frac{6.02 \times 10^{23}}{22.4 \times 10^{-3}} = 2.7 \times 10^{25}$ m⁻³.
Taking $d = 2 \times 10^{-10}$ m,
$\tau = 6.1 \times 10^{-10}$ s
and $l = 2.9 \times 10^{-7}$ m $\approx 1500d$ (13.41)

As expected, the mean free path given by Eq. (13.40) depends
inversely on the number density and the size of the molecules. In a
highly evacuated tube n is rather small and the mean free path can be
as large as the length of the tube.
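
The estimate for air can be retraced numerically. The Python sketch below evaluates both the simple expression, Eq. (13.39), and the more exact one, Eq. (13.40), using the values of d and the average speed given in the text; the variable names are illustrative. The quoted figures of 2.9 × 10⁻⁷ m and 6.1 × 10⁻¹⁰ s appear to correspond to the simple expression, so both results are printed for comparison.

```python
import math

N_A = 6.02e23              # Avogadro's number, mol^-1
MOLAR_VOLUME_STP = 22.4e-3 # m^3 occupied by 1 mol of ideal gas at STP
d = 2e-10                  # molecular diameter, m (value used in the text)
v_avg = 485.0              # average molecular speed, m/s

n = N_A / MOLAR_VOLUME_STP               # number density, m^-3

l_simple = 1 / (math.pi * n * d**2)      # Eq. (13.39), other molecules at rest
l_exact = l_simple / math.sqrt(2)        # Eq. (13.40), relative motion included
tau_simple = l_simple / v_avg            # Eq. (13.38)

print(n, l_simple, l_exact, tau_simple)
```

Including the factor of $\sqrt{2}$ lowers the mean free path to about 2.1 × 10⁻⁷ m. Either way it is of the order of a thousand molecular diameters.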

Example 13.9 Estimate the mean free path for a water molecule in water
vapour at 373 K. Use information from Exercises 13.1 and Eq. (13.41) above.

Answer The d for water vapour is the same as that of air. The number
density is inversely proportional to absolute temperature.

So $n = 2.7 \times 10^{25} \times \frac{273}{373} = 2 \times 10^{25}$ m⁻³

Hence, mean free path $l = 4 \times 10^{-7}$ m

Note that the mean free path is 100 times the interatomic distance ~
40 Å = $4 \times 10^{-9}$ m calculated earlier. It is this large value of mean
free path that leads to the typical gaseous behaviour. Gases cannot be
confined without a container.
Using the kinetic theory of gases, the bulk measurable properties like
viscosity, heat conductivity and diffusion can be related to the
microscopic parameters like molecular size. It is through such
relations that the molecular sizes were first estimated.

SUMMARY

1. The ideal gas equation connecting pressure (P), volume (V) and absolute
temperature (T ) is

PV = RT = kB NT

where is the number of moles and N is the number of molecules. R and kB


are universal constants.

R = 8.314 J mol1 K1, kB = = 1.38 1023 J K1

Real gases satisfy the ideal gas equation only approximately, more so at low
pressures and high temperatures.

2. Kinetic theory of an ideal gas gives the relation


where n is number density of molecules, m the mass of the molecule and

is the mean of squared speed. Combined with the ideal gas equation it
yields a kinetic interpretation of temperature.

This tells us that the temperature of a gas is a measure of the average kinetic
energy of a molecule, independent of the nature of the gas or molecule. In a
mixture of gases at a fixed temperature the heavier molecule has the lower
average speed.

3. The translational kinetic energy

E= kB NT.

This leads to a relation

PV = E

4. The law of equipartition of energy states that if a system is in equilibrium at
absolute temperature T, the total energy is distributed equally in different
energy modes of absorption, the energy in each mode being equal to $\frac{1}{2} k_B T$.
Each translational and rotational degree of freedom corresponds to one
energy mode of absorption and has energy $\frac{1}{2} k_B T$. Each vibrational frequency
has two modes of energy (kinetic and potential) with corresponding energy
equal to

$2 \times \frac{1}{2} k_B T = k_B T$.

5. Using the law of equipartition of energy, the molar specific heats of gases
can be determined and the values are in agreement with the experimental
values of specific heats of several gases. The agreement can be improved by
including vibrational modes of motion.

6. The mean free path l is the average distance covered by a molecule
between two successive collisions:

$l = \dfrac{1}{\sqrt{2}\, n \pi d^2}$

where n is the number density and d the diameter of the molecule.
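
Several of the relations summarised above can be checked numerically. The following is a minimal sketch, not a definitive treatment: the helper names `v_rms`, `mean_ke` and `molar_cv` are introduced here for illustration, the atomic mass unit is taken as an assumed $1.66 \times 10^{-27}$ kg, and the temperature of 300 K is an arbitrary choice.

```python
import math

K_B = 1.38e-23  # Boltzmann constant, J K^-1 (value from the summary)
R = 8.314       # universal gas constant, J mol^-1 K^-1 (from the summary)
U = 1.66e-27    # atomic mass unit in kg (assumed, not stated in the summary)

def v_rms(mass_kg, temperature):
    """Root mean square speed: sqrt(3 k_B T / m)."""
    return math.sqrt(3.0 * K_B * temperature / mass_kg)

def mean_ke(temperature):
    """Average translational kinetic energy per molecule: (3/2) k_B T."""
    return 1.5 * K_B * temperature

def molar_cv(translational=3, rotational=0, vibrational=0):
    """Molar specific heat at constant volume from equipartition.

    Each translational or rotational degree of freedom contributes R/2;
    each vibrational mode contributes R (kinetic plus potential).
    """
    return (translational + rotational) * R / 2.0 + vibrational * R

T = 300.0  # illustrative temperature in kelvin
for name, mass_u in [("He", 4.0), ("Ar", 39.9)]:
    print(f"{name}: v_rms = {v_rms(mass_u * U, T):.0f} m/s")

print(f"mean KE per molecule = {mean_ke(T):.2e} J")
print(f"Cv monatomic      = {molar_cv():.2f} J mol^-1 K^-1")
print(f"Cv rigid diatomic = {molar_cv(rotational=2):.2f} J mol^-1 K^-1")
```

The loop illustrates the statement in point 2: at the same temperature both gases have the same average kinetic energy per molecule, so the heavier atom has the lower rms speed.
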

POINTS TO PONDER
1. Pressure of a fluid is not only exerted on the wall. Pressure exists
everywhere in a fluid. Any layer of gas inside the volume of a container is in
equilibrium because the pressure is the same on both sides of the layer.

2. We should not have an exaggerated idea of the intermolecular distance in a
gas. At ordinary pressures and temperatures, this is only 10 times or so the
interatomic distance in solids and liquids. What is different is the mean free
path which in a gas is 100 times the interatomic distance and 1000 times the
size of the molecule.

3. The law of equipartition of energy is stated thus: the energy for each degree
of freedom in thermal equilibrium is $\frac{1}{2} k_B T$. Each quadratic term in the total
energy expression of a molecule is to be counted as a degree of freedom.
Thus, each vibrational mode gives 2 (not 1) degrees of freedom (kinetic and
potential energy modes), corresponding to the energy $2 \times \frac{1}{2} k_B T = k_B T$.

4. Molecules of air in a room do not all fall and settle on the ground (due to
gravity) because of their high speeds and incessant collisions. In equilibrium,
there is a very slight increase in density at lower heights (like in the
atmosphere). The effect is small since the potential energy (mgh) for ordinary
heights is much less than the average kinetic energy $\frac{1}{2} m v^2$ of the molecules.
5. $\langle v^2 \rangle$ is not always equal to $(\langle v \rangle)^2$. The average of a squared quantity is
not necessarily the square of the average. Can you find examples for this
statement?
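
Points 4 and 5 can be made concrete with a few numbers. The sketch below is illustrative only: the list of speeds is made up, and the nitrogen molecular mass (28.0 u), room height (3 m), temperature (300 K) and the value of the atomic mass unit are assumed for the sake of the estimate.

```python
# Point 5: the mean of the squares versus the square of the mean.
speeds = [1.0, 2.0, 3.0]  # hypothetical speeds in arbitrary units

mean_v = sum(speeds) / len(speeds)
mean_v_squared = mean_v ** 2
mean_of_squares = sum(v * v for v in speeds) / len(speeds)

print("square of mean :", mean_v_squared)
print("mean of squares:", mean_of_squares)

# Point 4: potential energy m*g*h compared with the average
# translational kinetic energy (3/2) k_B T of a molecule.
K_B = 1.38e-23          # Boltzmann constant, J K^-1
U = 1.66e-27            # atomic mass unit, kg (assumed value)
m_n2 = 28.0 * U         # mass of a nitrogen molecule, kg
g = 9.8                 # acceleration due to gravity, m s^-2
h = 3.0                 # assumed room height, m
T = 300.0               # assumed room temperature, K

ratio = (m_n2 * g * h) / (1.5 * K_B * T)
print(f"m g h / (3/2 k_B T) = {ratio:.1e}")
```

The two averages agree only when every molecule has the same speed; any spread in speeds makes the mean of the squares larger. The small ratio in the second part is why the density of air barely changes over the height of a room.
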

EXERCISES
13.1 Estimate the fraction of molecular volume to the actual
volume occupied by oxygen gas at STP. Take the diameter of an
oxygen molecule to be 3 Å.
13.2 Molar volume is the volume occupied by 1 mol of any (ideal)
gas at standard temperature and pressure (STP : 1 atmospheric
pressure, 0 °C). Show that it is 22.4 litres.
13.3 Figure 13.8 shows plot of PV/T versus P for $1.00 \times 10^{-3}$ kg of
oxygen gas at two different temperatures.

Fig. 13.8

(a) What does the dotted plot signify?
(b) Which is true: $T_1 > T_2$ or $T_1 < T_2$?
(c) What is the value of PV/T where the curves meet on the y-axis?
(d) If we obtained similar plots for $1.00 \times 10^{-3}$ kg of hydrogen,
would we get the same value of PV/T at the point where the
curves meet on the y-axis? If not, what mass of hydrogen yields
the same value of PV/T (for low pressure, high temperature region
of the plot)? (Molecular mass of H$_2$ = 2.02 u, of O$_2$ = 32.0 u, R =
8.31 J mol$^{-1}$ K$^{-1}$.)

13.4 An oxygen cylinder of volume 30 litres has an initial gauge
pressure of 15 atm and a temperature of 27 °C. After some
oxygen is withdrawn from the cylinder, the gauge pressure drops
to 11 atm and its temperature drops to 17 °C. Estimate the mass
of oxygen taken out of the cylinder (R = 8.31 J mol$^{-1}$ K$^{-1}$,
molecular mass of O$_2$ = 32 u).
13.5 An air bubble of volume 1.0 cm$^3$ rises from the bottom of a
lake 40 m deep at a temperature of 12 °C. To what volume does it
grow when it reaches the surface, which is at a temperature of
35 °C?

13.6 Estimate the total number of air molecules (inclusive of
oxygen, nitrogen, water vapour and other constituents) in a room
of capacity 25.0 m$^3$ at a temperature of 27 °C and 1 atm pressure.
13.7 Estimate the average thermal energy of a helium atom at (i)
room temperature (27 °C), (ii) the temperature on the surface of
the Sun (6000 K), (iii) the temperature of 10 million kelvin (the
typical core temperature in the case of a star).
13.8 Three vessels of equal capacity have gases at the same
temperature and pressure. The first vessel contains neon
(monatomic), the second contains chlorine (diatomic), and the
third contains uranium hexafluoride (polyatomic). Do the vessels
contain equal number of respective molecules ? Is the root mean
square speed of molecules the same in the three cases? If not, in
which case is $v_{rms}$ the largest?
13.9 At what temperature is the root mean square speed of an
atom in an argon gas cylinder equal to the rms speed of a helium
gas atom at 20 °C? (Atomic mass of Ar = 39.9 u, of He = 4.0 u.)
13.10 Estimate the mean free path and collision frequency of a
nitrogen molecule in a cylinder containing nitrogen at 2.0 atm and
temperature 17 °C. Take the radius of a nitrogen molecule to be
roughly 1.0 Å. Compare the collision time with the time the
molecule moves freely between two successive collisions
(Molecular mass of N2 = 28.0 u).
Additional Exercises
13.11 A metre long narrow bore held horizontally (and closed at
one end) contains a 76 cm long mercury thread, which traps a 15
cm column of air. What happens if the tube is held vertically with
the open end at the bottom ?
13.12 From a certain apparatus, the diffusion rate of hydrogen has
an average value of 28.7 cm$^3$ s$^{-1}$. The diffusion of another gas
under the same conditions is measured to have an average rate of
7.2 cm$^3$ s$^{-1}$. Identify the gas.
[Hint : Use Graham's law of diffusion: $R_1/R_2 = (M_2/M_1)^{1/2}$, where
$R_1$, $R_2$ are diffusion rates of gases 1 and 2, and $M_1$ and $M_2$ their
respective molecular masses. The law is a simple consequence of
kinetic theory.]
13.13 A gas in equilibrium has uniform density and pressure
throughout its volume. This is strictly true only if there are no
external influences. A gas column under gravity, for example, does
not have uniform density (and pressure). As you might expect, its
density decreases with height. The precise dependence is given
by the so-called law of atmospheres
$n_2 = n_1 \exp[-mg(h_2 - h_1)/k_B T]$
where $n_2$, $n_1$ refer to number density at heights $h_2$ and $h_1$
respectively. Use this relation to derive the equation for
sedimentation equilibrium of a suspension in a liquid column:
$n_2 = n_1 \exp[-mg N_A (\rho - \rho')(h_2 - h_1)/(\rho RT)]$
where $\rho$ is the density of the suspended particle, and $\rho'$ that of
surrounding medium. [$N_A$ is Avogadro's number, and R the
universal gas constant.] [Hint : Use Archimedes' principle to find
the apparent weight of the suspended particle.]

13.14 Given below are densities of some solids and liquids. Give
rough estimates of the size of their atoms :

[Hint : Assume the atoms to be tightly packed in a solid or liquid
phase, and use the known value of Avogadro's number. You
should, however, not take the actual numbers you obtain for
various atomic sizes too literally. Because of the crudeness of the
tight packing approximation, the results only indicate that atomic
sizes are in the range of a few Å.]
