
Course Code: MBCQ-721

Course Name: Quantitative Techniques for Management Applications


UNIVERSITY OF PETROLEUM & ENERGY STUDIES (for this print run only)

Unit 1: The Decision Making Process
Unit 2: Functions and Equations
Unit 3: Matrices and Determinants
Unit 4: Probability Concepts
Unit 5: Decision Theory
Unit 6: Linear Programming
Unit 7: Transportation & Assignment Models
Unit 8: Game Theory
Unit 9: Markovian Model
Unit 10: Data Collection and Presentation
Unit 11: Sampling
Unit 12: Basic Tools of Data Analysis
Unit 13: Forecasting
Appendix

UNIT 1 The Decision Making Process

Objectives


After reading this unit, you will be able to:

Understand how quantitative decisions are made

Understand the importance of variables

Comprehend the importance of eight decision making tools

Every decision making task results in an output which is
the evidence of the decision taken. In industry, it is ultimately
some kind of product, that is, a good, a service or an idea. The
reasoning takes place in the decision making rectangle which
is sometimes referred to as, quite appropriately, the black
box. Here a transformation of the inputs takes place that
results in the output. The transformation process has both
physical and mental properties. On the input side a large
number of variables may be listed. These variables can be
classified in terms of the traditional factors of production,
i.e., land, labour and capital as well as the more recently
emerged complex variables related to systems, technology
and entrepreneurship. Underlying this input-output system
is a feedback loop identified as managerial control system.
Its function is to optimize the transformation of inputs into
the desired output. Seen in a nutshell, in industry
optimization means the minimization of costs and the
maximization of profits subject to legal, social and ideological
constraints.
The computer has forced the decision maker to very carefully
delineate and quantify the variables that make up the building
blocks of the decision task. What is needed and how much is
needed for decision optimization have become the important
questions. In addition, the proper time sequencing of the
decision variables within the decision process had to be
understood. And all answers had to be unequivocally
quantified. It soon became apparent to every decision maker
that quantified variables had different properties and specific
quantitative control mechanisms had to be designed. Not only
was the decision maker confronted with variable-inherent
properties, but the decision tasks themselves also had their
own peculiar quantitative properties.
A variable, the building block of the decision task, may be
seen as a small piece of a complex behaviour. Buying a house,
manufacturing a product, spending money on a show are
examples of variables. Each variable represents a distinct
dimension of the decision making task. So the decision space
is always multidimensional, and it is a major task for the
decision maker to find out which variables make up that
space. If an important variable is overlooked, obviously the
decision will be less than optimal. Furthermore, the
quantitative impact of the variable must be ascertained. And
here the special variable-inherent properties come into play.
The following illustrations may show the differences among
the three types of variables: deterministic, stochastic and
heuristic.
Deterministic variables can be measured with certainty. Thus,
equal measures have equal cumulative impact, or, to use a
simple illustration, a+a = 2a.
Stochastic variables are characterized by uncertainty. Thus,
a+a=2a+X, where X is a value that comes about because of
the uncertainty that is associated with the variable.
Heuristic variables are those that exist in highly complex,
unstructured, perhaps unknown decision making situations.
The impact of each variable may be explained contingent upon
the existence of a certain environment. For example,
a + a = 3a but only if certain conditions hold. Actual industrial
decision making situations of each type may involve the number
of gallons of aviation fuel obtained by cracking a barrel of
crude oil (deterministic), projected product sales given the
amount spent on advertising the product (stochastic) and the
construction of a platform in outer space (heuristic).
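The three illustrations above can be restated as a small Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, and treating the uncertainty term X as a normally distributed deviation is an assumption made here, since the text does not say how X behaves.

```python
import random
from typing import Optional


def deterministic_total(a: float) -> float:
    """Deterministic: equal measures have equal cumulative impact, a + a = 2a."""
    return a + a


def stochastic_total(a: float, rng: random.Random, spread: float = 0.1) -> float:
    """Stochastic: a + a = 2a + X, where X comes from uncertainty.

    Assumption (not from the text): X is drawn from a normal distribution
    with mean 0 and standard deviation spread * |a|.
    """
    x = rng.gauss(0.0, spread * abs(a))
    return 2 * a + x


def heuristic_total(a: float, conditions_hold: bool) -> Optional[float]:
    """Heuristic: a + a = 3a, but only if certain conditions hold.

    When the conditions do not hold, the impact is treated as unknown (None).
    """
    return 3 * a if conditions_hold else None


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so a run can be repeated
    print(deterministic_total(5.0))     # exactly 2a
    print(stochastic_total(5.0, rng))   # near 2a, differs with the seed
    print(heuristic_total(5.0, True))   # 3a, because the conditions hold
    print(heuristic_total(5.0, False))  # None, impact unknown
```

The point of the sketch is that the same input a leads to one exact answer, a spread of answers, or a conditional answer, depending on the type of variable.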

The reason for the existence of a managerial hierarchy, that
is, lower, middle and top management, lies in the different
parameters within which an organization operates. There are
industry-wide and market-wide decisions that have to be
made. Often these decisions must transcend domestic
considerations to incorporate international aspects. Such
decisions, usually made by top management, occur in a
broad-based, complex, ill-defined and non-repetitive problem
situation. Middle management usually addresses itself to
company-wide problems. It sees to it that the objectives and
policies of the organization are properly implemented and
that operations are conducted in such a way that optimization
may occur.
You may note that while most of the quantitative decision
making tools (indeed virtually all of the deterministic
tools) were developed to optimize the decision making
process, actual managerial practice has sometimes moved
away from that objective. The previously mentioned legal or
social constraints often do not permit optimization,
and satisficing has been substituted for it. Satisficing refers
to the attainment of certain minimum objectives. For
example, a company may have the economic and
technological power to smother the competition within its
industry but refrain from doing so because of MRTP
(Monopolies and Restrictive Trade Practices) considerations.
Sheer size may be considered a violation of the law or, in
the international arena, may result in the imposition of
quotas.
Lower management is responsible for the conduct of
operations (the firing line, so to speak) in production,
marketing, finance or any of the staff functions like personnel
or research. This decision environment is usually well-defined
and repetitive. Obviously, with reference to a given
decision making situation, the distinction between top,
middle and lower management may become blurred. In other
words, in any on-going business there is always a certain
overlapping of the managerial decision making parameters.
The study and analysis of the existence and interaction of
these parameters is of great importance to the management
systems designer or communication expert. From the
quantitative managerial decision making point of view, their
importance lies in recognizing their peculiar constraints and
then building the appropriate decision models and selecting
the best suited quantitative decision tools. A brief discussion
of each environment in this light may enhance the
understanding of the tools that are discussed later on. The
company's approach to the domestic or international market
is filtered through industry-wide considerations. What does
the market want, what does the competition already supply?
Where is our field of attack? Do we have the know-how, do
we have the resources? What is the impact of our actions
upon the market, our own industry and other industries?
These are some of the questions that have to be asked,
defined and answered. The problems are unstructured and
complex. Thus, often a heuristic decision making process can
be utilized to good advantage. Forecasting is of major
importance and hence stochastic decision making is widely
employed in this uncertain decision environment. But even
a deterministic tool such as input-output analysis, usually
intended for decision making situations that assume
certainty, can be used effectively in this environment.
Middle management decisions are primarily company-wide
in nature. As mentioned before, these decisions steer the
organization through its life cycle.
Major features of a firm's life are objectives, planning,
operation and the ultimate dissolution. The objectives are
general and specific in nature. Obviously, the top management
establishes the objectives, but middle management functions
as their guardian. Indeed, every decision at this level must
provide feedback control for each of the other components.
Planning refers to both policy execution as well as policy
development. Scale of production, pricing of the product,
product mix, in short the orderly and efficient arrangement
of the input factors is to be decided at this point. Making
these factors into a product is the job of operations. Some
operations have been traditionally called line
(financing, production, and marketing) and others staff
(personnel, research, etc.); yet, in the quantitative decision
systems of the modern firm, such differences are difficult to
trace in the decision patterns, because the same decision
making tools are employed. Since the decision environment
at this level is somewhat more structured than at the top
level but still highly uncertain, stochastic decision tools are
frequently employed. In those finance, production and
marketing situations that are well-defined and perhaps
repetitive, deterministic decision tools are found.
It may appear somewhat odd that the decision environment
includes attention being paid to the dissolution of the firm.
The life cycle concept has been mentioned, and it will be
encountered again as one of the major underlying conceptual
aids in forecasting. It is well known that business
organizations are born, live and die like natural organisms.
Therefore, decision making should always be cognizant of
the possibility of dissolution. That moment comes when, to
use the vernacular, good money is thrown after bad. Market
forces and the application of quantitative analysis
normally show the approaching occurrence of that
moment, even if the management involved shuts its eyes to
the facts or is ignorant of them. At this point the decision
is made, or imposed, to opt for a turnaround or
dissolution. Public agencies unfortunately are rarely subject
to such stress-producing alternatives.
The lower management decision making environment
represents a specialized, narrowly defined area within a
company's total decision or operational field. Supervisory
personnel of all types operate in this environment. The
decision tasks are normally well defined and repetitive.
While the element of uncertainty never leaves the decision
environment, here uncertainty can often be programmed into
a general or sub-routine and stochastic decisions taken as if
they were deterministic in nature. A good example is the
pricing system of clothing discounters. Merchandise is put
on the floor at price A on day one. On, say, day ten the price
is automatically reduced to price B and so on until the article
is either sold or given to charity after thirty days. This is
known as programmed decision making. It should be noted
that while the nature of the decision environment remains
intact, the decision maker's tasks have been greatly reduced.
The complex variables and unstructured decision
environment of the merchandising task have been placed
first into a model and then into a decision making sequence
(an algorithm). This is the general idea behind model building
and the development of algorithms.
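The discounter's pricing rule described above can be written as a short routine. This is a minimal sketch, not the book's own procedure: the function name, the 25 percent markdown, and the assumption that the price drops again every ten days are all hypothetical. The text fixes only price A on day one, a reduction on about day ten, and disposal to charity after thirty days.

```python
from typing import Optional


def programmed_price(day: int, start_price: float, markdown: float = 0.25,
                     step_days: int = 10, charity_day: int = 30) -> Optional[float]:
    """Price of an unsold article on a given day, or None once it goes to charity.

    Assumptions (hypothetical): the price falls by `markdown` every
    `step_days` days, and the article leaves the floor after `charity_day`.
    """
    if day < 1:
        raise ValueError("day counts from 1")
    if day > charity_day:
        return None  # given to charity, no longer for sale
    reductions = day // step_days  # how many automatic markdowns have occurred
    return start_price * (1 - markdown) ** reductions


if __name__ == "__main__":
    for day in (1, 9, 10, 20, 31):
        print(day, programmed_price(day, 100.0))
```

Once the rule is fixed in this way, nobody has to decide anything on day ten: the uncertain question of when an article will sell is handled as if it were deterministic.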
It is highly important that every decision maker has a firm
understanding of the philosophy upon which quantitative
decision making is based. Under no circumstances is it
sufficient to just know how to perform a certain quantitative
analysis and to obtain a solution to be able to make a decision.
To turn to the specific aspects of the quantitative decision
making process, it is possible to recognize three distinct
phases in every decision situation. Given a carefully defined
problem, a conceptual model is generated first. This is
followed by the selection of the appropriate quantitative
model that may lead to a solution. Lastly, a specific algorithm
is selected. Algorithms are the orderly delineated sequences
of mathematical operations that lead to a solution given the
quantitative model that is to be used. The algorithms
generate the decision which is subsequently implemented
by managerial action programs.

Problem definition is a cultural artifact which is especially
visible in a society's economic and industrial decision making
process.
Obviously, if such cultural determinants are operative in the
first phase of managerial decision making, their effect can
be noticed at various stages in the process irrespective of
the quantitative, thus hopefully objective, methods that are
used in the design of the models and algorithms as well as
the decision itself.
A brief digression into problem identification may be in order
at this point. For purposes of this book and for quantitative
management decision methodology in general, it is
presupposed that a problem has been identified.
In the private sectors of free enterprise economies, however,
a manager's ability to recognize problems and even to
anticipate problems that may emerge at some future time is
vital to the survival of the firm. Those managers that make
effective decisions concerning a known problem are good
administrators; those that in addition can recognize and
anticipate problems are creative. It is known that creativity
is partially inborn and partially acquired. Thus, the
quantitative decision maker will not only try to master the
methodology but also attempt to sharpen his or her problem
identification skills, that is, his or her creativity.

The conceptual model represents the logic that underlies a
decision. Based on this logic the quantitative model and
specific algorithms are constructed. The logic may be a priori
or empirical in nature. For example, when shooting craps in a
casino, a gambler has pre-established a conceptual model
concerning the odds of the game. On a priori grounds, using
only his or her intellect in determining the odds of every roll
of the dice, the concept dictates that a win with a seven or
eleven on the first roll has likelihoods of 6/36 and 2/36,
respectively.
(There are 6 possible combinations of spots showing on 2 dice
that yield a seven and 2 combinations that yield an eleven
with 36 combinations for all spots from two through twelve.)
Given this conceptual model, quantitative models and
algorithms can be designed that facilitate the betting
decision.
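The a priori figures quoted above can be checked by listing all 36 equally likely outcomes of two fair dice. The sketch below assumes fair dice, which is exactly the a priori model; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def first_roll_probability(target: int) -> Fraction:
    """A priori probability that two fair dice total `target`."""
    outcomes = list(product(range(1, 7), repeat=2))  # 36 ordered pairs
    favourable = sum(1 for first, second in outcomes if first + second == target)
    return Fraction(favourable, len(outcomes))


if __name__ == "__main__":
    # Fraction reduces automatically, so 6/36 is shown as 1/6 and 2/36 as 1/18.
    print(first_roll_probability(7))
    print(first_roll_probability(11))
```

Exact fractions are used instead of floating point numbers so that the results can be compared directly with 6/36 and 2/36.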
Now suppose that our gambler stumbles across a floating
craps game in some dark alley. After observing the action on
the pavement for a while, he notices that sevens and elevens
do not occur on the first roll with the likelihood dictated by
his conceptual model. Rather there seems to be a
preponderance of twos, threes or twelves, which he knows
are losses. Crooked dice, he may very quietly think to himself.
For crooked dice, an a priori logic is unsuitable, because it is
based on the ideal situation in which every spot on a die has
an equal probability of occurring (1/6), and so does every
combination of spots on two dice (1/6 × 1/6 = 1/36, according
to the multiplication theorem). Rather, he will now ascertain
by observation (by experiment) the empirical probabilities
which are determined by the weights that have been cleverly
or crudely (it is a dark alley) concealed in or on the dice.
Once this empirical conceptual model has been generated, our
gambler
may continue the betting decision process in terms of the
amount of the bets at each roll, etc. He may also redefine the
problem and leave.
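An empirical model of this kind is built by counting what is actually observed. The sketch below simulates the observation step and then computes relative frequencies. The face weights are hypothetical stand-ins for the concealed loading, and all names are illustrative only; in the alley the gambler would record real rolls instead of simulating them.

```python
import random
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Roll = Tuple[int, int]


def observe_rolls(n_rolls: int, weights: Sequence[float], seed: int = 0) -> List[Roll]:
    """Simulate watching n_rolls throws of two identically loaded dice.

    weights[i] is the relative weight of face i + 1 (hypothetical values).
    """
    rng = random.Random(seed)
    faces = [1, 2, 3, 4, 5, 6]
    rolls = []
    for _ in range(n_rolls):
        first, second = rng.choices(faces, weights=weights, k=2)
        rolls.append((first, second))
    return rolls


def empirical_probabilities(rolls: Sequence[Roll]) -> Dict[int, float]:
    """Relative frequency of each total among the observed rolls."""
    counts = Counter(first + second for first, second in rolls)
    return {total: counts[total] / len(rolls) for total in sorted(counts)}


if __name__ == "__main__":
    loaded = [3, 1, 1, 1, 1, 3]  # assumed loading that favours ones and sixes
    observed = observe_rolls(10_000, loaded, seed=1)
    for total, share in empirical_probabilities(observed).items():
        print(total, round(share, 3))
```

Comparing these frequencies with the a priori values of 6/36 for a seven and 2/36 for an eleven is what tells the gambler whether the ideal model still applies.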
Conceptual models may take many forms. In every case the
general design intent is to understand and to depict the
reality that relates to the problem. Most conceptual models
show a functional relationship in graphic or matrix format.
All models that are used in this book are of this type. But it
is also possible, indeed necessary in some decision cases, to
build a physical model. In the natural and engineering
sciences, it is the usual form. If the decision involves mass
production of some item, the physical model is known as a
prototype.
In the design of the conceptual model, it is important
that the decision maker clearly delineates the
interrelationships that make up the reality, or the system, in
which the problem occurs. But in the model building process
it is virtually impossible to include all variables that have a
bearing on the decision. The model includes only the major
variables (endogenous variables) as seen from the decision
maker's vantage point. There will always be decision-related
variables that exist outside of the decision space (exogenous
variables) because of their unrecognized status or conscious
exclusion due to time, cost or limited impact considerations.
Such variables should be kept mentally ready because over
a set decision horizon they may indeed become sufficiently
important to be included into the system.
Once the conceptual model has been designed and its logic
expressed in terms of some systems configuration such as
the graph or matrix or perhaps network or flow diagram,
the quantitative models are simply superimposed by
quantifying the logic. Once that has been accomplished a
relatively minor task remains in the selection of the
algorithms and the computerization of the process. Many a
decision process has been commenced needlessly, and most of
the time injuriously to some extent, because of faulty
problem definition or poor conceptual model building. Then
there is no optimal or even satisficing outcome. To put it
simply, number crunching and possible error correction is
relatively easy, even though the reader may not immediately
share this view as he or she does just that in the chapters
that follow. Only the difficult tasks, that is, sound insight
into the problem and its careful definition as well as proper
logic employed in the conceptual model building process, will
yield sound decisions and outcomes. Here errors are very
difficult to correct.

Once the conceptual model has been properly designed, the
quantitative model and its algorithms should almost flow
out of it. The transition is natural, smooth, and almost
automatic. The quantitative model is selected from the many
such models that have been designed by mathematicians. So
while the decision maker will always build a conceptual
model, the quantitative model is typically selected from an
available pool of such decision making tools. The selection is
made on the basis of the predominantly stochastic,
deterministic or heuristic nature of the variables. There are
available quantitative models for each kind as discussed in
the following chapters, and the decision maker's task is to
select the appropriate one for a given decision situation.
"Know thy tools" should be inscribed on every decision
maker's desk. Just as it is possible to build a wall with a spade
when the trowel would be the more appropriate tool, decision
makers may sometimes misuse quantitative tools.

A decision always involves choice among several alternatives.
In the most basic sense a decision always involves the answer
to the question "to do or not to do?" Not to do (inaction)
determines that decision. To do (action) usually involves
different options. The mathematical model identifies the
optimal way, but for a variety of reasons, other satisficing
options may be selected and acted upon. These other options
are firmly rooted in an organization's objectives and planning
activities. As shown in greater detail later, a decision maker
always has control over setting the objective and planning
which interfaces with policies, strategies and tactics. But
one has no control over the reaction to the decision within
the market environment. Here various states, collectively
known as the states of nature, emanating from customers,
suppliers, competitors, public agencies, etc., render the final
judgment about the soundness of the decision.
The decision is the end product of a sequence of mental
activities as illustrated in the preceding pages. To make a
decision does
not necessarily mean that it gets carried out. In order to
accomplish that, numerous managerial action programs are
necessary. They represent the physical extension to the
decision making process. This book stops at the point when
the decision is rendered. The action programs, the physical
component, cannot be discussed because they must be
specifically designed for each situation. A good decision
maker, however, will try to place the seeds for proper
implementation into the decision.

UNIT 2 Functions and Equations

Objectives

After reading this unit, you will be able to:

Understand the basic rules of functions

Analyze how equations are formed and solved

Study various types of functions applicable in management

When we are talking about functions, we are not talking
about marriage ceremonies or birthday parties but about
certain types of quantitative relationships between
different variables, expressed mathematically. For example,
sales revenue is a function of items sold and their price. We
all know that. If we express it in function form it would look
like this:
Sales Revenue = f (No. of items sold)    (Function Form)
Sales Revenue = f (Price per item)       (Function Form)
Function form only tells us that there is a relationship
between the variables. Here sales revenue, items sold and
price per item are variables.
Variables are the terms used for mathematical quantities that
can assume any values within a given set. The set of values
of the variable is known as the domain of the variable, which
could be limited (as is the case with water temperature, which
can vary between 0 °C and 100 °C) or can be unlimited (as
with distance).

Variables can be independent or dependent. Independent
variables are those variables whose values are not governed
by the values of another variable. In the case above, number
of items sold and price per item are independent variables.
Variables whose values are dependent on the values taken
by another variable are called the dependent variables, e.g.
sales revenue above.
In mathematics we say that whenever any variable Y is
dependent on another variable X for its values then variable
Y is a function of variable X. This means that whenever the
value of variable X would change then there would be a
corresponding change in the value of variable Y.
Mathematically this is denoted as:
Y = f (X)
Now this relation between X and Y can take up any form,
e.g., let us say Y is double of X. This means when X takes the
value 2 then Y takes the value 2 × 2 = 4. So in the above
function X is called the independent variable and Y is called
the dependent variable.
Note that the function does not specify the exact relationship
between X and Y, it only tells you that a relationship exists.
The exact relationship is defined by an equation, something
that we will see later. As stated above, the set of values of
the independent variable is called the domain. The set of
values of the dependent variable is called the range.
In the problems of business applications, generally the range
and domain will be real numbers.
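The doubling example, together with the ideas of domain and range, can be illustrated in a few lines of Python. The sample domain below is an arbitrary choice made for illustration; only the rule that Y is double of X comes from the text.

```python
def f(x: float) -> float:
    """Y = f(X), where Y is double of X."""
    return 2 * x


# Assumed sample values of the independent variable X (the domain).
domain = [0, 1, 2, 3]

# Corresponding values of the dependent variable Y (the range).
range_of_f = [f(x) for x in domain]

if __name__ == "__main__":
    for x, y in zip(domain, range_of_f):
        print(f"X = {x}  ->  Y = {y}")
```

Every value of X in the domain produces exactly one value of Y, which is what it means for Y to depend on X.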
How do we apply these functions and equations in
business? The sales revenue example gives us a clue. Most
of the activities in a business are dependent upon
some other events happening. For example, if we want
to know how share prices move, we have to understand
that stock prices are dependent upon the
economic situation, industry conditions and company
performance.

So the share price would be a function of all these factors
and the function would look like the following:
Share Price = f (economy, industry situation, company
performance)
Therefore, we can use functions to define any relationships
that exist between different processes and which can be
measured mathematically. These models can be applied in
making decisions when the business situations are complex.
An important point to remember here is that the functions
do not imply a simple direct relationship.
The applications of these models are manifold and as diverse
as the business situations we are in. The biggest benefit of
using functions and equations for defining the relationships
is that we understand how exactly the processes work and
what are the constraints we are dealing with. For example,
suppose we define production as a function of the number of
machines and time, and we know that we can produce a
maximum of 500 units of a product per hour per machine.
This becomes our constraint and helps us in planning and
scheduling. When we know that we need 5000 units of that
product in 2 hours, we can either use five machines (so that
each machine gives us 1000 pieces in 2 hours) or we get the
work done from outside. Of course, this is a very simple
situation and we will learn much more complex methods of
mathematical treatment for solving complex situations.
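The machine-count reasoning in this example can be written as a small calculation. The sketch below uses the figures from the text (500 units per hour per machine, 5000 units, 2 hours); the function name and the rounding up to whole machines are illustrative choices.

```python
import math


def machines_needed(units_required: int, hours_available: float,
                    units_per_machine_hour: int = 500) -> int:
    """Smallest whole number of machines that meets the requirement in time."""
    if hours_available <= 0:
        raise ValueError("hours_available must be positive")
    per_machine = units_per_machine_hour * hours_available  # output of one machine
    return math.ceil(units_required / per_machine)


if __name__ == "__main__":
    # 5000 units in 2 hours: each machine yields 1000 units, so five machines.
    print(machines_needed(5000, 2))
```

If fewer machines are available than this function returns, the constraint cannot be met in-house and the remaining work has to be done outside.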
Writing the relationship in function and equation form
makes it easy for us to understand how the values of the
dependent variable change because of a change in any of
the independent variables on which it is based.
Most of the quantitative methods we will be dealing with in
this book make use of mathematical models, some of
which we will develop as we go along. We will take only the
basic mathematical concepts for granted. The rest of the
concepts will be dealt with as if we were talking to a novice.

We said earlier that the mathematical models of the
functions can take up different forms but we did not discuss
the forms there. Now let us understand the different
forms that these mathematical functions can take up.
1. Constant Function: Let A denote a fixed number.
   Consider the function that takes this value A for every
   value of X. This constant function can be denoted by:
   f (X) = A for all X
   or more briefly by f (X) = A, and we call it the constant
   function A. Since A can be any fixed number, there can be
   many constant functions.

2. Identity Function: The function that associates to each
   number X the same number X is called the identity
   function. This is denoted by:
   Y = f (X) = X for all X

3. Linear Function: Consider the expression Y = f (X) = 2X + 10.
   For each number A, this associates the number 2A + 10 to Y,
   got by substituting A for X. If X = 2 then the
   value of Y is 4 + 10 = 14. This function tells us about the
   relationship between the two variables X and Y.

4. Quadratic/Polynomial Function: Consider the
   expression Y = f (X) = X^2 + 2X + 10. For each number A, this
   associates the number A^2 + 2A + 10 to Y. This is a
   polynomial function and is used for solving complex
   situations.

5. Exponential Function: The function that associates
   the number e^x to each real number x is called the
   exponential function. The properties of this function are
   given below:
   e^(x+y) = e^x × e^y
   e^(-x) = 1/e^x
   e^(log x) = x
   Here e = 1 + 1/1! + 1/2! + 1/3! + ...
   and e^x = 1 + x/1! + x^2/2! + x^3/3! + ...
   e^x is always positive.

6. Logarithmic Function: Log x is that number y such
   that e^y = x. Log x is defined only when x is positive. The
   function that associates log x to x is called the logarithmic
   function. Its domain is the set of positive real numbers.
   Log x = y means log_e x = y, read as "log of x to the base e
   is equal to y", which implies e^y = x; in everyday reading
   and writing we tend to take the base as understood. But
   if we are using two different bases then the base must be
   mentioned. The two popular bases are e and 10. The
   properties of the log function are given below:
   Log 1 = 0
   Log (xy) = Log x + Log y

7. Modulus Function: For each real number X let |X|
   denote the absolute value of X. It is also known as mod
   X.
   |X| = X if X ≥ 0, and
   |X| = -X if X < 0
   For example |4| = 4 and |-4| = -(-4) = 4

These are only some of the examples of the different types of
functions that can be there. These functions are translated
into equations for use so that exact relationships are defined.
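The seven function types listed above can be tried out directly. The sketch below mirrors the examples in the list, using Python's standard `math` module for the exponential and the natural logarithm. The function names and the default constant A = 10 are illustrative choices, not part of the text.

```python
import math


def constant(x: float, a: float = 10.0) -> float:
    """Constant function: f(X) = A for all X (A = 10 is an arbitrary default)."""
    return a


def identity(x: float) -> float:
    """Identity function: f(X) = X."""
    return x


def linear(x: float) -> float:
    """Linear function from the text: Y = 2X + 10."""
    return 2 * x + 10


def quadratic(x: float) -> float:
    """Quadratic function from the text: Y = X^2 + 2X + 10."""
    return x ** 2 + 2 * x + 10


def exponential(x: float) -> float:
    """Exponential function: e^x, always positive."""
    return math.exp(x)


def logarithm(x: float) -> float:
    """Natural logarithm (base e), defined only for positive x."""
    return math.log(x)


def modulus(x: float) -> float:
    """Modulus function: |X|."""
    return abs(x)


if __name__ == "__main__":
    print(linear(2))                    # 2*2 + 10
    print(quadratic(2))                 # 2^2 + 2*2 + 10
    print(modulus(-4), modulus(4))      # both give the absolute value
    # Properties from the list, checked numerically with a tolerance:
    print(math.isclose(exponential(1 + 2), exponential(1) * exponential(2)))
    print(math.isclose(logarithm(2 * 3), logarithm(2) + logarithm(3)))
```

The properties are compared with `math.isclose` and not with `==`, because floating point arithmetic carries small rounding errors.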

While the functions tell us that a relationship exists,
equations give us the exact relationship between the
variables. These equations take many forms and have one or
more variables. For example, we can say that Sales Revenue
Y = Number of items sold N × price per item P. This is an
equation with three variables Y, N, P and defines an exact
relationship amongst them. Here sales revenue Y is a
dependent variable and the other two are independent
variables.
Here we will discuss only linear and quadratic forms of
equations.

Linear Equation may be defined as an equation where the


power of the variable(s) is one, and no cross or product terms
are present.


The general form of such a linear equation is:
AX + B = 0

Here X is the variable and A and B are numeric
coefficients.
This is a working definition.
Note that it is an accepted convention in mathematics
that letters from the beginning of the alphabet are used to
represent known quantities and letters from the end of the
alphabet are used to represent unknown quantities. In a
linear equation, A and B are real numbers which can be either
positive or negative and may involve fractions or decimals.
B can be zero, but A cannot: if A were zero, X would drop out
and what remains would no longer be an equation in X (it
would hold only if B were also zero).

A dealer sells a table and seven chairs for Rs 6,000. The price
of the table is known to be Rs 2,500. What is the price of one
chair?

If X represents the price of each chair in rupees, then
we can say that
7 chairs × (cost of each chair, X) + cost of table
(Rs 2,500) = Rs 6,000
Rs 7X + Rs 2,500 = Rs 6,000, or simply
7X + 2,500 = 6,000
7X + 2,500 - 6,000 = 0
7X - 3,500 = 0

(i.e., the form AX + B = 0)

7X = 3,500
X = 3,500 / 7 = 500
This means that each chair costs Rs 500.
The number of variables in a linear equation can be one or
more than one. For example, the equation
Y = AX + B
can be rewritten as:
Y - AX - B = 0
This has the same form as the one discussed earlier, but it is an
equation with two variables.

The purpose of this form of the linear equation, with two
variables, is not to solve a problem but to
state the relationship between Y and X. The value of X can
vary from time to time, or some hypothetical future value of
X may be under consideration; in each case, the
corresponding value of Y is determined from the equation.

Extending illustration 2.1, let Y represent the value of
the table. The equation can then be rewritten as:
7X + Y = 6,000
If X takes up the values 200, 300, 400 and 500 we can easily
see that Y will take the values 4,600, 3,900, 3,200, 2,500
respectively, which are calculated by substituting the value
of X in the above equation.
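The substitution just described can be reproduced in a few lines. This is only an illustrative sketch; the function name table_value is an assumption made for the example.

```python
def table_value(chair_price):
    """Value of the table Y for a given chair price X, from 7X + Y = 6,000."""
    return 6000 - 7 * chair_price


# Tabulate Y for the chair prices used in the text.
for x in (200, 300, 400, 500):
    print(x, table_value(x))
```

Each printed pair (X, Y) is one point on the straight line 7X + Y = 6,000.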
Let us now plot this equation on a graph. The value of a
chair (X), in rupees, is represented along the
horizontal axis and the value of the table (Y) along the vertical axis.
Any convenient scales may be chosen to represent the two
variables; there is no reason why the distance used
to represent an increase of Rs 100 on the horizontal axis cannot
be used to represent Rs 1,000 on the vertical axis.
Figure 2.1: Graph of 7X + Y = 6,000 (horizontal axis: X, 0 to 500; vertical axis: Y, 0 to 7,000)


The point to be noted here is that X and Y depend
on each other. If the value of X changes, the value of Y also
changes, and vice versa. This is because the total 7X + Y is a
constant.
These kinds of equations are very easy to draw on a graph.
You substitute zero for one variable to get the corresponding
value of the other variable. In this example when we
substitute zero for X we get Y = 6000 and when we substitute
zero for Y we get X = 6000 / 7 = 857.14. When we want to plot
this line on the graph we mark 6,000 on the vertical axis as
this represents X = 0.
Similarly, we mark 857.14 on the horizontal axis as this
represents Y = 0. Connecting these two points by a straight
line gives us the line which can be used to find the value of
one chair corresponding to value of one table, satisfying the
above equation.
The word linear is used to represent a function, which can
be represented by a straight line (and not any function which
can be represented by a line). Another point that must be
remembered is that although we are using letter Y to
represent the dependent variable, other letters can also be
employed for denoting variables. Like S can represent Sales,
D can represent Distance and so on.
Once we have the equation plotted on a graph we can very
easily find the value of one variable from a given value
of the other. In illustration 2.1, we were given
the value of Y, i.e., the value of one table, as Rs 2,500. So we draw
a straight line from the vertical axis, parallel to the horizontal
axis, at the value Rs 2,500. From the point where this line meets
the graph of the equation, we draw a second line parallel to the
vertical axis. The point where this second line meets the
horizontal axis gives us the value of one chair (Rs 500). So the
use of graphs makes it very easy for us
to solve these linear equations.

Figure 2.2: Graph of 7X + Y = 6,000
What happens when we are not given the value of either
variable? With only a single equation we cannot find a unique
solution; we can only give the range of pairs of values which
satisfy the equation in
hand. But if we are given two simultaneous equations, then
we can calculate the exact values of the
two variables which satisfy both equations
simultaneously.
In general, it is possible to solve a set of equations if
the number of variables is equal to the number of equations,
except where the equations overlap (that is, where one equation
can be derived from the others).

A manufacturer of printed fabrics has three machines that
prepare raw fabric and five machines that print on it. Two
types of printed fabric are produced: type A requires 3
minutes per meter to prepare and 6 minutes per meter to
print, while type B requires 11 and 17 minutes per meter
respectively. How much of each type of fabric should be
produced per hour in order to keep all the machines fully
occupied?

The quantities to be produced per hour can be represented
by X meters of type A and Y meters of type B. The
situation above can then be summarized in two simultaneous
linear equations, one for each type of machine:
3X + 11Y = 180     (1)
6X + 17Y = 300     (2)

The right-hand sides of these equations are obtained from
the fact that there are 180 machine-minutes available per
hour for preparing fabric (60 minutes × 3 machines) and 300
machine-minutes for printing (60 minutes × 5 machines).


There are two ways of solving any pair of simultaneous linear
equations. The first method is elimination and the second
is substitution.

Elimination Method:
It will be observed that 6X is exactly twice 3X, and so the
first equation can be doubled to give:
6X + 22Y = 360
This is then subtracted from equation (2) to eliminate the
terms involving X:
-5Y = -60
Y = 12
Substituting this value of Y in equation (1):
3X + 132 = 180
X = 16
Substitution Method:
The second method of solution is by substitution. Equation
(1) is rearranged so that one of the unknowns is expressed
in terms of the other:
3X = 180 - 11Y
X = (180 - 11Y) / 3
This formula for X is then substituted in equation (2):
6 ((180 - 11Y) / 3) + 17Y = 300
360 - 22Y + 17Y = 300
360 - 5Y = 300
Y = 12
The value of X is then found using equation (1), and both
values can be checked by substituting them in equation (2).


The stepwise general procedure for solving these linear
equations is given below for your reference:
Stepwise Procedure for solving 2 x 2 simultaneous equations

1.

Eliminate one of the variables using either or both of the properties
specified below:

Any linear equation can be multiplied or divided on both sides by
any non-zero number without altering its truth or meaning.

Any two linear equations can be added or subtracted (one from
the other) to give a third, equally valid, equation.

2.

Solve the resulting simple equation (to yield the value of the other
variable).

3.

Substitute this value back into one of the original equations, say
equation 1 (to yield the value of the first variable).

4.

Check the solutions (by substituting both values into original equation
2).
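The four steps above can be sketched in code. This is a minimal illustration rather than a definitive implementation: the function name solve_2x2 and its argument order are assumptions, and exact fractions are used so that the check in step 4 is not disturbed by rounding.

```python
from fractions import Fraction


def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*X + b1*Y = c1 and a2*X + b2*Y = c2 by elimination."""
    a1, b1, c1, a2, b2, c2 = map(Fraction, (a1, b1, c1, a2, b2, c2))
    # Step 1: multiply equation 1 by a2 and equation 2 by a1, then subtract,
    # which eliminates X and leaves a simple equation in Y.
    y_coefficient = b1 * a2 - b2 * a1
    right_hand_side = c1 * a2 - c2 * a1
    if y_coefficient == 0:
        raise ValueError("equations overlap or conflict: no unique solution")
    # Step 2: solve the resulting simple equation.
    y = right_hand_side / y_coefficient
    # Step 3: substitute back into an equation that still contains X.
    if a1 != 0:
        x = (c1 - b1 * y) / a1
    else:
        x = (c2 - b2 * y) / a2
    # Step 4: check both values in both original equations.
    assert a1 * x + b1 * y == c1 and a2 * x + b2 * y == c2
    return x, y


# Printed-fabric example: 3X + 11Y = 180 and 6X + 17Y = 300.
x, y = solve_2x2(3, 11, 180, 6, 17, 300)
print(x, y)
```

For the fabric example this reproduces X = 16 and Y = 12, the values obtained by hand.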

Of course, the graphical method can also be used to solve
2 x 2 simultaneous equations. The first step is to draw the
two lines represented by the two given equations on the same
graph, and the second step is to identify the X and Y values
at the intersection of the lines. These X and Y values are the
required solution for the pair of simultaneous equations.

The situation above was summarized in two simultaneous
linear equations:
3X + 11Y = 180     (1)
6X + 17Y = 300     (2)

Plotting the two equations on the same graph, we
find that they intersect at X = 16 and Y = 12. This
is the solution to the problem.

Figure 2.3: Graph of 3X + 11Y = 180 and 6X + 17Y = 300

Similarly, 3 x 3 simultaneous equations can be solved
using an extension of the technique used
above. There are other methods of solving these simultaneous
equations, which we will discuss in subsequent chapters
on matrices and determinants and on linear programming. The
basic method for solving these 3 x 3 equations mathematically
is given below:
Procedure for solving 3 x 3 simultaneous equations
1.

Using any two of the given equations, eliminate one of the variables
(using the equation-manipulating techniques previously described)
to obtain an equation in two variables.

2.

Using another pair of equations, eliminate the same variable as in


(1), which will give a second equation in two variables.

3.

Solve this 2 x 2 system of equations in the normal way.

4.

Substitute into one of the three original equations to find the value of
the third variable.

5.

Check the solutions by substituting the values of these variables in


the three equations.

A furniture manufacturer sends Company A a bill for Rs 10,700
to cover 3 tables, 4 chairs and 3 stools. Company B is
charged Rs 14,800 for 2 tables, 5 chairs and 7 stools. Company
C is charged Rs 15,100 for 5 tables, 9 chairs and 2 stools.
What are the respective prices of each of these items?

Representing the prices of one table, one chair and one stool
by x, y and z respectively (in units of Rs 100), the problem gives rise to
three simultaneous linear equations:
3x + 4y + 3z = 107     (1)
2x + 5y + 7z = 148     (2)
5x + 9y + 2z = 151     (3)

These equations are still called linear even though each
could only be represented by a plane in a three-dimensional
model and not by a straight line on a two-dimensional graph.
The first step in their solution is to multiply the first
equation by 2 and the second equation by 3, and then to
subtract the first result from the second in order to
eliminate x:
6x + 8y + 6z = 214
6x + 15y + 21z = 444
7y + 15z = 230
The second and third equations are then multiplied by 5 and
2 respectively in order to obtain a second equation from which
x has been eliminated. The two equations involving only y
and z are then solved as in illustration 2.2, to give y = 5, z =
13. Substituting these values in the first of the original
equations gives x = 16. All three values can be checked by
substituting them in the other two original equations. In
rupees, a table costs Rs 1,600, a chair Rs 500 and a stool Rs 1,300.
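The same elimination idea can be automated for any number of equations. The sketch below uses Gaussian elimination with back substitution, which is a systematic form of the pairwise elimination described above and not necessarily the order of steps followed in the text; the function name solve_linear_system is an assumption.

```python
from fractions import Fraction


def solve_linear_system(coefficients, constants):
    """Solve a square system of linear equations by elimination and back substitution."""
    n = len(constants)
    # Build the augmented rows [a1, a2, ..., an, constant] with exact fractions.
    rows = [[Fraction(v) for v in row] + [Fraction(c)]
            for row, c in zip(coefficients, constants)]
    for col in range(n):
        # Find an equation that still contains the variable in this column.
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            raise ValueError("no unique solution")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Eliminate this variable from every later equation.
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    # Back substitution: solve for the last variable first.
    solution = [Fraction(0)] * n
    for i in reversed(range(n)):
        known = sum(rows[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (rows[i][n] - known) / rows[i][i]
    return solution


# Furniture example, with prices in units of Rs 100.
x, y, z = solve_linear_system([[3, 4, 3], [2, 5, 7], [5, 9, 2]], [107, 148, 151])
print(x, y, z)
```

Applied to the furniture equations this gives x = 16, y = 5 and z = 13, in agreement with the hand calculation.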
Such equations can be solved much more easily using Matrix
concept, which is discussed later.

Let us start with an illustration.

The total production costs of a packaging machinery
manufacturer are found to average Rs 60,000 per
day. The cost accountant finds that the fixed costs are Rs
32,000 per day and the direct costs average Rs 7,000 per
machine. Calculate the average number of machines
produced per day.

This employs the accountant's terms "fixed costs" and "direct
costs" and uses the accountant's model:
Total costs = fixed costs + (direct costs × quantity produced),
i.e., T = F + Dx
Letting x represent the number of machines produced, the above model
becomes:
Rs 60,000 = Rs 32,000 + (Rs 7,000 × x)     (1)
Dividing the equation by Rs 1,000, it reduces to
60 = 32 + 7x     (2)
so that 7x = 28 and x = 4 machines per day.

In the above example, the model is very useful though
approximate, since the direct costs per machine will probably
vary quite widely. Models of this type are used a great
deal and are often regarded as absolute truth by top management.
But these models have their limitations, as we will see.
Now if all the machines are sold at the same price, then the
revenue is a linear function of the quantity produced. Putting
R for revenue and p for price the function becomes
R = px
In the case of machine manufacturer, let us assume that the
selling price is Rs 18,000. By putting p = 18 (again assuming
Rs 1000 as the unit) a graph (Figure 2.4) can be drawn of
the revenue function. However, it is much more informative
to draw the line representing the cost function
and revenue function on the same graph, as shown in the
graph.
Extending the lines beyond the range in which their
practical usefulness has been established is called extrapolation; it is
bad practice to extrapolate too far. The unreliable parts of
the lines on the graph are shown by broken lines and the
meaningless parts by dotted lines.


Figure 2.4: Machine Manufacturer's Cost and Revenue Functions

The difference between revenue and total production costs
can be described as gross profit G:
G = R - T
= px - (F + Dx) = 18x - (32 + 7x)
= 11x - 32
The break-even point is where profit = 0, that is, where revenue
is equal to costs. Putting G = 0 in the above equation we
get:
11x - 32 = 0
11x = 32
x = 32/11 = 2.91
So average production must be 2.91 machines per day
to cover all costs while making no profit.
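The break-even calculation generalizes to any price, fixed cost and direct cost. The following is a small sketch under the linear model T = F + Dx and R = px used above; the function name break_even_quantity is an assumption.

```python
def break_even_quantity(price, fixed_cost, direct_cost):
    """Quantity x at which revenue p*x equals total cost F + D*x."""
    margin = price - direct_cost  # contribution of each unit towards fixed costs
    if margin <= 0:
        raise ValueError("price must exceed direct cost for a break-even point to exist")
    return fixed_cost / margin


# Machine manufacturer, in units of Rs 1,000: p = 18, F = 32, D = 7.
print(round(break_even_quantity(18, 32, 7), 2))
```

With p = 18, F = 32 and D = 7 the function returns 32/11, which rounds to the 2.91 machines per day found above.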


This breakeven level can also be found from the graph where
your revenue and costs curves cross each other.
Alternatively, you can plot the equation (11x-32=0) and find
the value of x where the line meets the x-axis as at that point
the value of the function would be zero.
There are two possible ways of obtaining
the information on fixed and direct costs that the cost
accountant found. One is to take all the accounting records
and classify each cost under one of the two headings, a tedious and
time-consuming process which is prone to error because of
limited accounting knowledge and problems of classification.

A quicker and better method is to record the actual
total costs at two different levels of production and then find
the linear cost function which fits these actual costs. For
instance, the records might show that the average total cost
per day was Rs 49,500 when production averaged 2.5
machines per day and that it rose to Rs 63,500 when production
rose to an average of 4.5 machines per day.
All that is necessary is to insert these two values of T and
the corresponding values of x into the linear cost function
defined above:
T = F + Dx
This gives two equations involving the two unknowns F and D
(in units of Rs 1,000):
49.5 = F + 2.5D
63.5 = F + 4.5D
Solving these equations using the techniques already
described, we get D = 7 and F = 32.
Substituting these values in the above function we get T = 32
+ 7x, i.e., the same equation the cost accountant obtained with all
his information.
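Fitting a linear cost function through two observed points takes only two lines of arithmetic. The sketch below is illustrative; the function name fit_linear_cost and its argument order are assumptions.

```python
def fit_linear_cost(x1, t1, x2, t2):
    """Return (F, D) such that T = F + D*x passes through (x1, t1) and (x2, t2)."""
    if x1 == x2:
        raise ValueError("two different production levels are needed")
    direct_cost = (t2 - t1) / (x2 - x1)  # subtracting the two equations eliminates F
    fixed_cost = t1 - direct_cost * x1   # substitute D back into the first equation
    return fixed_cost, direct_cost


# Observations in units of Rs 1,000: (2.5 machines, 49.5) and (4.5 machines, 63.5).
print(fit_linear_cost(2.5, 49.5, 4.5, 63.5))
```

The result, F = 32 and D = 7, matches the cost function T = 32 + 7x.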

Supply and demand functions are an important field of study
for economists. The amount of a particular product
which a firm is willing to supply at a specified price will
depend on the firm's cost function and also on its marketing
policy. The firm may be concerned with maximizing profit,
increasing market share, or just keeping the factory going in
times of economic slowdown. Once the firm decides what
its policy is, the amount of product which can be supplied
to the market is clearly a function of the price at which the
product can be sold. This forms the supply
function of the firm. If the quantities that can be
supplied by all the firms in the industry are totalled for
each price level of the product, this gives the total supply
function for the market as a whole.
As an illustration, let us assume that total supply of a
particular type of phones in the market is 29,000 pieces per
month when the price is Rs500 per piece. The same
manufacturers are prepared to supply a total of 52,000 pieces
per month if the price is raised to Rs600 per piece. A further
rise in the price per piece would justify working overtime in
the factory and also bring in foreign suppliers who were
earlier not interested in selling at low prices in the market.
It is found that a total of 75,000 pieces per month can be
supplied when the price is Rs 700 per piece.
These three individual points can be plotted on a graph
(Figure 2.5), letting P represent the price per phone (in
hundreds of rupees) and X the total quantity (in thousands of pieces
per month) which would be supplied at that price. Although
P is here the independent variable and X the dependent
variable, it is customary for economists to plot prices on the
vertical axis and quantities on the horizontal axis, and this
practice is followed here. In the simplified example
being considered, the three points are found to lie on a straight
line, and so it can be assumed that the supply function for
the market is approximately linear. The function is then
found to be:
X = 23P - 86

Figure 2.5: Demand & Supply Curves
For example, by substituting Rs 500 as the price (P = 5) we get X
= 29 (i.e., 29,000 phones). This line represents the quantities
which will be produced at different prices, provided all the
quantity produced can be sold. But to find out what can be
sold in the market we need to look at the demand function of
the market. The demand function indicates the total
quantity that will be purchased at a particular price and
therefore represents the sum of the individual demand functions
of all the individual buyers.
Normally, large quantities are bought when the price
is low, and as the price goes up the quantities purchased
come down. In this particular case it was found that only
24,000 telephone pieces can be sold at Rs 700 per piece, but
that sales would increase to 35,000 and 46,000 pieces per
month at prices of Rs 600 and Rs 500 respectively. These
three points can be plotted on the same graph as the supply curve
so as to get the demand curve. The demand function is then
found to be:
X = 101 - 11P
It would be wrong to assume that we can extend these lines
indefinitely on either side for the supply and demand functions. It would be
absurd to assume that demand is 2,000 pieces when the
price is Rs 900, and equally wrong to assume that demand is
approximately 90,000 pieces when the price is Rs 100.
The reason for plotting supply and demand on the same graph
is to find the point of market equilibrium, which is the
point of intersection of the two lines. It can also be found
using the simple equation-solving techniques mentioned
earlier, by finding the value of P which makes the value of X
the same for both the demand and the supply functions. The price and
quantity at the point of market equilibrium are known as the
equilibrium price and the equilibrium quantity. Under
conditions of free competition, the equilibrium quantity will
be the quantity actually produced and the equilibrium price
will be the price in the market.
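The equilibrium point follows from setting the supply function X = 23P - 86 equal to the demand function X = 101 - 11P. The sketch below works this out with exact fractions; the function names supply, demand and equilibrium are assumptions made for the example.

```python
from fractions import Fraction


def supply(p):
    """Quantity supplied (thousands of pieces) at price p (hundreds of rupees)."""
    return 23 * p - 86


def demand(p):
    """Quantity demanded (thousands of pieces) at price p (hundreds of rupees)."""
    return 101 - 11 * p


def equilibrium():
    """Price and quantity at which supply equals demand."""
    # 23P - 86 = 101 - 11P  gives  (23 + 11)P = 101 + 86
    price = Fraction(101 + 86, 23 + 11)
    return price, supply(price)


price, quantity = equilibrium()
print(float(price), float(quantity))
```

On these two linear functions the equilibrium is P = 5.5 and X = 40.5, that is, a price of Rs 550 and a quantity of 40,500 pieces per month.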
Fitting demand and supply curves is much more tedious than
solving other business problems. It is much more difficult
to assess how supply will respond to a change in price than to
assess how the total production cost within a firm will vary
with the quantity produced. Demand is also complicated
by the presence of substitute products in the market.
These difficulties explain why mathematical economists need
a lot of training and experience and why forecasts
sometimes differ vastly from the actual situation
in the market.

It is sometimes assumed that the statement "y is a function of x" means that
there is a single formula connecting y with x. While it is easy
to discuss functions which are described by a single formula,
the only correct interpretation is that y is a function of x if
the value of x determines the value of y, irrespective of
whether the relationship moves in steps, requires multiple formulae
or has constraints attached.
For example, since the price of a commodity determines
the quantity supplied and the quantity demanded, these
quantities are functions of the price, even in cases
where the relationship is so irregular that it can be
described only by a list of prices with the corresponding
quantities.

A certain car is so expensive that wealthy people buy it for


prestige reasons because it is expensive. The demand falls
from 30 per month when the price is Rs 20,00,000 to 20 per
month when the price is Rs16,00,000 and then rises to 100
per month when the price falls to Rs 12,00,000.


Figure 2.6: Demand for a Luxury Car

The methods discussed earlier can be used to find the
equation of the straight line which passes through the points
(30, 20 lacs) and (20, 16 lacs), and again of the straight line
through the points (20, 16 lacs) and (100, 12 lacs). It is
standard practice to denote points on a graph in the form
(x, y). In order to state the range of values for which each of
these equations is valid, it is necessary to use one of the family
of symbols known as inequalities. The most important
inequalities are
a > b means a is greater than b
a ≥ b means a is greater than or equal to b
a < b means a is less than b
a ≤ b means a is less than or equal to b
Only the last one is needed for the present. The formulae for
the demand curve shown in the graph are:
x = (p/40,000) - 20 when 16,00,000 ≤ p ≤ 20,00,000
x = 340 - (p/5,000) when 12,00,000 ≤ p ≤ 16,00,000
The function is now defined for all values of p in the range
12 lacs ≤ p ≤ 20 lacs. It is perfectly sound mathematically to
leave the function undefined outside this range if no
information is available, though a keen sales manager would
like to find out whether it would be profitable to fix
the price higher than Rs 20 lacs.
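A function defined by different formulae on different ranges translates directly into a conditional. The following is a minimal sketch of the luxury-car demand function; the name car_demand and the choice to return None outside the stated range are assumptions made for illustration.

```python
def car_demand(price):
    """Monthly demand x for the luxury car at a given price in rupees.

    Returns None where the text leaves the function undefined.
    """
    if 1_600_000 <= price <= 2_000_000:
        return price / 40_000 - 20   # upper branch: demand rises with price
    if 1_200_000 <= price < 1_600_000:
        return 340 - price / 5_000   # lower branch: demand falls as price rises
    return None                      # no information outside 12 to 20 lacs


for p in (2_000_000, 1_600_000, 1_200_000, 2_500_000):
    print(p, car_demand(p))
```

Both formulae give x = 20 at p = 16,00,000, so it does not matter which branch is used at that boundary.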


In this example, it is not correct to say that price p is a


function of quantity x. The value of x does not determine the
value of p, since p could be either of two values if the value
of x is, for instance, 25. For normal commodities the demand
curves slope downwards throughout their length and it is
then correct to regard price and quantity each as a function
of the other.
Some functions are represented by two or more lines which
do not meet each other. A good example is a schedule of
postage rates where the first slab is Rs 2 up to 20 gms and
then Re 1 for every additional 10 gms.
Figure 2.7: Postal Rates Graph (horizontal axis: X = weight in grams, 0 to 100; vertical axis: Y = price of the postage, 0 to 12)

Clearly, the postage rate is a function of the weight of the


letter, since the latter determines the former. Any function
which consists of two or more lines which do not meet is
called a discontinuous function. There is said to be a
discontinuity at each of the values of the independent variable
at which there is a gap in the values of the dependent
variable.
It must be noted that, under this schedule, the graph includes
the points (20, 2) and (30, 3) but does not include the points
(20, 3) and (30, 4). Sometimes appropriate marks are added to a graph
to indicate whether or not these boundary points belong to
the function.
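A step function of this kind can be written with a ceiling operation. The sketch below assumes the schedule quoted earlier (Rs 2 up to 20 g, then Re 1 for every additional 10 g) and further assumes that any part of an additional 10 g is charged as a full slab; the function name postage is illustrative.

```python
import math


def postage(weight_g):
    """Postage in rupees for a letter of the given weight in grams."""
    if weight_g <= 0:
        raise ValueError("weight must be positive")
    if weight_g <= 20:
        return 2  # first slab
    # Assumption: each started block of 10 g beyond 20 g costs Re 1.
    extra_slabs = math.ceil((weight_g - 20) / 10)
    return 2 + extra_slabs


for w in (20, 21, 30, 31, 100):
    print(w, postage(w))
```

The jump from Rs 2 at 20 g to Rs 3 just above 20 g is the kind of discontinuity described above.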
Discontinuities and inequalities introduce awkward
complications, and so mathematicians and scientists usually
ignore them for as long as they can. Unfortunately, they arise
far too frequently in managerial problems to be so lightly
disregarded. For this reason, we use limitations and
constraints whenever we develop problems and analyze
them.


We saw that it is usually possible to sell larger quantities of
a commodity when the price is lower. For a monopolist, the
demand curve for the market is the price curve to be used in
calculating the revenue of the firm. Where there is no
monopoly, the amount a manufacturer can sell is still a
function of the price at which he offers his goods, although
in this case the price curve will not be the same as the
demand curve for the market as a whole.
Let us consider an example.

The same machine manufacturer finds that he could sell an
average of four machines per day at a price of Rs 18,000 per
machine. Stepping up his production to an average of 4.5
machines per day, he finds that he has to reduce the price to
Rs 17,500 per machine in order to sell all that he produces.
Find the profit function.

Now this problem abandons the unrealistic assumption of
traditional cost accounting that the price is a constant and
the revenue function therefore linear. We express the number of
machines sold per day, x, as a linear function of the price p
(in units of Rs 1,000):
x = ap + b
The method learned earlier makes it easy to find a and
b. Substituting the two observations (p = 18, x = 4 and
p = 17.5, x = 4.5) gives a = -1 and b = 22, so the equation
reduces to:
x = 22 - p

The revenue R is the price multiplied by the number of
machines sold:
Revenue R = Price p × Quantity x
R = px
In order to find the break-even point, it is simplest to express
p as a function of x; R then becomes a quadratic function of x:
p = 22 - x, from the equation x = 22 - p above
Substituting this value of p in R = px we get
R = px = (22 - x) x = 22x - x^2
Taking the linear cost function to be 32 + 7x, as found
earlier, the gross profit G becomes a quadratic function of x:
G = R - T
= (22x - x^2) - (32 + 7x) = -x^2 + 15x - 32
This is the profit function for this manufacturer.
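The profit function can be evaluated and its break-even points located numerically. This is a sketch only: the function names are assumptions, and the roots are found with the standard quadratic formula x = (-b ± sqrt(b^2 - 4ac)) / 2a, which is a well-known result rather than something derived in this passage.

```python
import math


def price_for_quantity(x):
    """Price (in Rs 1,000) at which x machines per day can be sold: p = 22 - x."""
    return 22 - x


def gross_profit(x):
    """Gross profit G = R - T with R = p*x and T = 32 + 7x (in Rs 1,000)."""
    revenue = price_for_quantity(x) * x
    total_cost = 32 + 7 * x
    return revenue - total_cost


def break_even_points():
    """Roots of -x^2 + 15x - 32 = 0 by the standard quadratic formula."""
    a, b, c = -1, 15, -32
    root = math.sqrt(b * b - 4 * a * c)
    return sorted(((-b + root) / (2 * a), (-b - root) / (2 * a)))


print(gross_profit(4))
print(break_even_points())
```

Unlike the linear model, the quadratic profit function has two break-even quantities, and profit is positive only between them.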
This quadratic function approximates the real-life
situation more closely. The question now is how to solve such
quadratic equations. There are three basic
methods:
1.

Factorization

2.

Using Graphs

3.

Using Formula

If the quadratic expression can be written as a product of
two linear expressions (known as factors), the equation can be solved
by factorization. This is possible only if the solutions of
the equation are whole numbers or fractions. For
example:
2x^2 - 17x + 21 ≡ (x - 7)(2x - 3) = 0
The identity sign (≡) is used as a reminder that the two sides
are equal for all values of x, which can be confirmed by
multiplying out the right-hand side.


If the product of any two expressions is zero, then at least
one of these expressions must be zero. So recognizing the
factors immediately leads to the solution of the equation:
2x^2 - 17x + 21 = 0
(x - 7)(2x - 3) = 0
Either x - 7 = 0 or 2x - 3 = 0
x = 7 or x = 3/2

The main difficulty in finding the solution of a quadratic


equation by factorization lies in finding the factors. It is
possible to do this by a routine procedure which will either
find the factors systematically or prove that none exists.

Let the factors be (px + q) and (rx + s), where p, q, r and s are
positive or negative integers and the product of the two
factors is ax^2 + bx + c. Multiplying the factors we get:
prx^2 + psx + qrx + qs = ax^2 + bx + c
The coefficients must be the same on both sides, as this must
be true for all values of x. This gives equations relating the
unknown quantities to the coefficients in the expression to
be factorized: pr = a; ps + qr = b; qs = c. It follows that the
product of ps and qr is ac, and so the first task is to find these
two numbers, whose sum (b) and product (ac) are known.

Find the factors of 10.8x^2 + 93x + 140.

The first stage is to take out the fractional factor, resulting
in the expression 0.2 (54x^2 + 465x + 700). It is then necessary
to look for two numbers whose sum is 465 and whose product
is 54 × 700 = 37,800.
The simplest method is to start with any two numbers whose
sum is 465, such as 80 and 385. If the product is too small, a
suitable amount is added to the smaller of the two and
an equal amount is subtracted from the larger. A little
trial will show that the numbers are 105 and 360, since
105 × 360 = 37,800.

Putting ps = 105 and using the fact that pr = 54, the highest
common factor, 3, is then equated to p. Calculation then
shows that s = 35, r = 18 and q = 20, and the factors of the
above expression are:
0.2 (3x + 20) (18x + 35)


Find the factors of $11x^2 + 56x + 21$.

Two numbers are to be found whose sum is 56 and whose product
is $11 \times 21 = 231$. Following the above procedure, it will be
found that $4 \times 52$ is too small and $5 \times 51$ is too large. It can
immediately be concluded that there are no integral factors.

Find the factors of $12x^2 + 39x - 105$.

Here two numbers have to be found whose sum is 39 and whose
product is $12 \times (-105) = -1260$.
A little thought shows that the one which is larger
numerically (that is, disregarding any negative sign) must
be positive and the other negative. Trying $70 \times (-31) = -2170$ is
too big; $60 \times (-21)$ proves correct. It is pointless to try any pair
of numbers which does not include a multiple of 5 because
the product is required to be a multiple of 5.
Depending on which number is chosen to represent $ps$, the
factors are either $(12x - 21)(x + 5)$ or $(3x + 15)(4x - 7)$, both of
which can be reduced to $3(4x - 7)(x + 5)$. It would have been
better to take out the factor 3 right at the beginning if it had
been recognized as a common factor of all the coefficients.
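The search for two numbers with a known sum and product can also be automated. The sketch below is one possible way to do it, not the trial-and-adjust routine described in the text: it uses the fact that the two numbers are the roots of $t^2 - bt + ac = 0$, so integer factors exist only when $b^2 - 4ac$ is a perfect square. The function name `factor_quadratic` is hypothetical, and integer coefficients with $a \neq 0$ are assumed.

```python
from math import gcd, isqrt

def factor_quadratic(a, b, c):
    """Return (p, q, r, s) with (p*x + q)*(r*x + s) == a*x^2 + b*x + c,
    or None when no integer factors exist. Assumes integers and a != 0."""
    # The two numbers ps and qr have sum b and product a*c, so they are
    # the roots of t^2 - b*t + a*c = 0.
    disc = b * b - 4 * a * c
    if disc < 0:
        return None
    root = isqrt(disc)
    if root * root != disc:
        return None                  # no pair of integers has this sum and product
    ps = (b - root) // 2             # the numerically smaller candidate
    qr = b - ps
    p = gcd(ps, a)                   # highest common factor of ps and pr
    s, r = ps // p, a // p
    q = qr // r
    return p, q, r, s

print(factor_quadratic(54, 465, 700))    # the expression inside 0.2(...)
print(factor_quadratic(11, 56, 21))      # no integral factors
print(factor_quadratic(12, 39, -105))
```

For the three illustrations above this reproduces the factors $(3x + 20)(18x + 35)$, the conclusion that $11x^2 + 56x + 21$ has no integral factors, and $(3x + 15)(4x - 7)$.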

One way of finding the solution of any equation which is in
the form, or can be rearranged in the form
$f(x) = 0$
is to draw the graph of the function:
$y = f(x)$

If this graph cuts the x-axis at any point, the value of $x$ at
that point is a solution of the original equation, since it is
the value of $x$ at which $y = 0$. This was the method discussed
earlier for linear equations and can be applied to the
quadratic function $ax^2 + bx + c$ as well. Here $a$, $b$ and $c$ can be
positive or negative and may involve fractions or decimals.
It is also possible for $b$ or $c$ to be zero, but if $a$ were zero, the
function would become a linear function.

Plot the function $G = -x^2 + 15x - 32$ on a graph.

To draw the graph of the function:
$G = -x^2 + 15x - 32$
it is necessary to choose a range of values of $x$ and calculate
the corresponding values of $G$. In this illustration it is enough
to consider values of $x$ between 0 and 14:

x      0     2     4     6     8    10    12    14
G    -32    -6    12    22    24    18     4   -18
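The tabulation can be reproduced with a few lines of code. The sketch below evaluates $G = -x^2 + 15x - 32$ at even values of $x$ from 0 to 14 (the step of 2 is an assumption about how the table was laid out) and reports the intervals in which $G$ changes sign, which is where the graph must cross the x-axis. The names `gross_profit`, `table` and `brackets` are illustrative only.

```python
def gross_profit(x):
    """G = -x^2 + 15x - 32, the function plotted in this illustration."""
    return -x**2 + 15 * x - 32

xs = list(range(0, 15, 2))                      # 0, 2, ..., 14
table = [(x, gross_profit(x)) for x in xs]
for x, g in table:
    print(f"x = {x:2d}   G = {g:4d}")

# A root lies between two consecutive x values where G changes sign.
brackets = [(x0, x1)
            for (x0, g0), (x1, g1) in zip(table, table[1:])
            if g0 * g1 < 0]
print("G changes sign in:", brackets)
```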

It can be seen from the graph below that $G$ is zero when $x$ is
about 2.6 or about 12.4. These two values are said to be the
roots of the equation $-x^2 + 15x - 32 = 0$.
This means that the manufacturer will make a profit when he
makes between about 2.6 and 12.4 machines, and would make the
maximum amount of profit when he makes 7 or 8 machines (the
curve peaks at $x = 7.5$).
Every quadratic function in which the coefficient of $x^2$ is
negative gives a graph of the shape shown in Figure 2.8, which
is termed a parabola. If the highest point is above the x-axis, the
corresponding equation has two roots, but if the highest point
of the function is below the x-axis there is no real solution for the
corresponding quadratic equation.

Figure 2.8: Graph of $G = -x^2 + 15x - 32$

It is necessary to have a method of solving an equation in
which the quadratic expression is difficult to factorize, such
as illustrations 2.6 and 2.8. The method used is equally
applicable to equations where factorization is simple.
The derivation of the formula is of interest only to
mathematicians; the solution is given below for a quadratic
equation $ax^2 + bx + c = 0$:

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$

This formula can be applied to any quadratic equation
irrespective of whether the coefficients are positive
or negative.

Taking the equation given in illustration 2.10, here
$a = -1$, $b = 15$ and $c = -32$. The solution by formula is:

$$x = \frac{-15 \pm \sqrt{15^2 - 4(-1)(-32)}}{2(-1)} = \frac{-15 \pm \sqrt{97}}{-2}$$

= 12.425 or 2.575 (taking $\sqrt{97} \approx 9.85$).
The formula gives the exact solution, whereas the graph gave
only an approximate one. The sum of the roots is always equal
to $-b/a$ and their product is always equal to $c/a$. In this
illustration the sum 15.00 and the product 31.99 (equal to
$c/a = 32$ apart from rounding) can be
checked with the original equation. This check makes it
unnecessary for you to check the roots separately by
substitution in the equation.
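The formula and the sum-and-product check translate directly into code. The following is a minimal sketch; the function name `solve_quadratic` is hypothetical, and it simply returns an empty tuple when the equation has no real roots.

```python
import math

def solve_quadratic(a, b, c):
    """Real roots of a*x^2 + b*x + c = 0 (smaller first); () if there are none."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return ()
    r = math.sqrt(disc)
    return tuple(sorted(((-b - r) / (2 * a), (-b + r) / (2 * a))))

a, b, c = -1, 15, -32
x1, x2 = solve_quadratic(a, b, c)
print(round(x1, 3), round(x2, 3))

# Check: the roots should add up to -b/a and multiply to c/a.
print(x1 + x2, -b / a)
print(x1 * x2, c / a)
```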

A linear function is seldom an ideal model of production costs.
It is usually possible to obtain a much better model by
fitting a quadratic curve, as we did in illustration
2.10 above:
$T = ax^2 + bx + c$
The values of $a$, $b$ and $c$ will be positive. The term in $x^2$ implies
that costs increase more steeply as the production goes up.
If there was no such effect, it would never pay to enlarge a
factory.
It must not be assumed that a quadratic curve will be a
perfect model of production costs. There is nothing
magical about the $x^2$ term. A quadratic function will always
be at least as good as a linear function and will nearly always
be better.

The machinery manufacturer, discussed earlier, found that
the total production costs averaged Rs 60,000 per day when
an average of 4 machines per day is produced. An accurate
assessment of costs when the average production is 3.5
machines per day and again at 4.5 machines per day gives
figures of Rs 56,600 and Rs 63,600 respectively. Fit a
quadratic cost curve.

Just as fitting a linear curve to two known points was shown
earlier to give two simultaneous linear equations, so fitting
a quadratic curve to three known points gives three
simultaneous linear equations. From the given information,
again using units of Rs 1000, the equations are:

$a(3.5)^2 + b(3.5) + c = 56.6$    (1)
$a(4)^2 + b(4) + c = 60.0$    (2)
$a(4.5)^2 + b(4.5) + c = 63.6$    (3)

This set of equations can be solved very easily by elimination,
eliminating first $c$ and then $b$ to give $a = 0.4$, $b = 3.8$ and
$c = 38.4$.
The cost function is therefore:
$T = 0.4x^2 + 3.8x + 38.4$    (4)
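The elimination can be sketched in code as follows. This is an illustration only: it assumes the three observations are at 3.5, 4 and 4.5 machines per day, which are the production levels that reproduce the coefficients quoted above, and the function name `fit_quadratic` is hypothetical.

```python
def fit_quadratic(points):
    """Coefficients (a, b, c) of T = a*x^2 + b*x + c through three (x, T) points."""
    (x1, t1), (x2, t2), (x3, t3) = points
    # Subtracting one equation from the next eliminates c. Dividing each
    # difference by the gap in x leaves two equations of the form
    #   a*(x1 + x2) + b = s12   and   a*(x2 + x3) + b = s23.
    s12 = (t2 - t1) / (x2 - x1)
    s23 = (t3 - t2) / (x3 - x2)
    # Subtracting these two eliminates b.
    a = (s23 - s12) / (x3 - x1)
    b = s12 - a * (x1 + x2)
    c = t1 - a * x1**2 - b * x1
    return a, b, c

# Costs in units of Rs 1000 at the assumed production levels.
a, b, c = fit_quadratic([(3.5, 56.6), (4.0, 60.0), (4.5, 63.6)])
print(round(a, 3), round(b, 3), round(c, 3))
```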

When the quadratic revenue curve found in illustration 2.7
is applied, the gross profit function is found to be:
$G = -1.4x^2 + 18.2x - 38.4$    (5)

The breakeven point is then approx. 2.65, compared with 2.58
when a linear cost curve is assumed.
More advanced techniques based on more complicated models
are available for managerial problems, but the practical
benefits of increased accuracy will be negligible in most cases.
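As a check on the quoted breakeven point, the quadratic formula can be applied to the gross profit function with $a = -1.4$, $b = 18.2$ and $c = -38.4$. The working below is a sketch added for illustration; the intermediate figures are rounded.

```latex
x = \frac{-18.2 \pm \sqrt{18.2^2 - 4(-1.4)(-38.4)}}{2(-1.4)}
  = \frac{-18.2 \pm \sqrt{331.24 - 215.04}}{-2.8}
  = \frac{-18.2 \pm \sqrt{116.2}}{-2.8}
  \approx 2.65 \ \text{or} \ 10.35
```

The smaller root, about 2.65, is the breakeven point referred to above.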

2.1 Gwalior Drums Ltd. is a medium scale company, engaged
in the manufacture of drums of different qualities and
sizes. It has a fixed cost of Rs 10,000,000. The average
cost of manufacturing a drum is Rs 60, and the company
sells each drum at Rs 100. Assuming that every drum
produced is sold, find a formula for the company's
profit. Find the minimum number of drums that the
company should produce and sell to cover its costs
exactly.

2.2 M/s Kalyani Forge pays its workers Rs 70 for an 8-hour
shift. In addition each worker is paid Rs 10 for every
hour of overtime. However, overtime cannot exceed 4
hours per day.
(a) Express the total wage paid to the worker as a function
of overtime.
(b) Draw the graph of this function.

2.3 Ash Lubes sells X units of Supreme Lubes each day at
the rate of Rs 50 per unit of 100 gm. The cost of
manufacturing and selling these units is Rs 35 per unit
plus a fixed daily overhead cost of Rs 10,000. Determine
the profit function. How would you interpret the
situation if the company manufactures and sells 400
units of the lubes a day?
2.4 Parker India Ltd., manufacturers of quality stationery
items, have introduced a new variety of pen in the
market. The market supply function of the pen is
represented by the function $Q = 160 - 8P$, where $Q$
denotes the quantity supplied and $P$ denotes the market
price per unit. It costs Rs 4 to produce a pen. If the total
profit required is Rs 500, what should be the market price
per unit?
2.5 Ishaan Petrochemicals has introduced in the market its
latest lube. The marketing manager has worked out
the demand function of this product, which can be
expressed as:
$Q = 30 - 4P$,
where $Q$ is the quantity and $P$ is the per kilogram price.
(a) Write the total revenue as a function of price.
(b) Draw the graph of this function.
2.6 The monthly supply of 2T Oil in Delhi is estimated to be
95,000 tons when the price is Rs 13,000 per ton and
1,10,000 tons when the price is Rs 16,000 per ton. The
monthly demand is estimated to be 1,09,000 tons at Rs
13,000 per ton and 99,000 tons at Rs 16,000 per ton.
Assuming that the supply and demand functions are both
linear, find these functions and hence determine the
equilibrium price and quantity.

2.7 A manufacturer of petrochemicals finds that his total
production cost is Rs 1,20,66,000 per week when he is
producing 1240 tons per week. The fixed costs are
Rs 67,34,000 per week, and the selling price is Rs 11,700 per
ton. Find (a) the weekly revenue, (b) the weekly gross
profit, and (c) the weekly production and total
production cost at the break-even point.

2.8 Yarn is prepared from cotton by being passed
successively through slubbing, roving and spinning
frames. Each 20 kg of yarn A requires 6 minutes in a
slubbing frame, 18 minutes in a roving frame, and 106
minutes in a spinning frame. For yarn B the times are 7,
27, and 150 minutes respectively, and for yarn C they
are 8, 30 and 181 minutes respectively. If the plant
consists of eight slubbing frames, 28 roving frames and
162 spinning frames, how much of each type of yarn
should be produced per hour in order to keep all the
machines fully occupied?
2.9 A rubber glove manufacturer finds that he can sell
1,38,000 packs of gloves per week at Rs 190 per pack.
He increases the price to Rs 200 per pack and finds that
he can sell only 1,28,000 packs per week. Assuming
that the price curve is linear, find (a) the price and (b) the
weekly revenue as functions of the number of packs sold
per week, and (c) the prices and quantities for which the
weekly revenue will be Rs 60,990 per week.
2.10 A switch manufacturer finds that his total monthly
production costs are Rs 10,600 when production is 16,000
units per month, Rs 17,800 when it is 26,000 units and
Rs 27,000 when the production is 36,000 per month. He
can sell 16,000 units per month at Rs 104 each, but has
to reduce the price to Rs 94 each in order to sell 26,000
pieces. He can sell 36,000 pieces only at Rs 80.
Assuming that both cost curve and price curve are
quadratic, find (a) the monthly total cost, (b) the price,
(c) the monthly revenue, and (d) the monthly gross profit
as functions of the quantity sold. Find also (e) the
quantity sold, (f) the price and (g) the monthly revenue
at the breakeven point and confirm that the monthly
total cost is then equal to the monthly revenue.

Objectives

After reading this unit, you will be able to:
Understand how matrices and determinants are developed and solved.
Explain how these make solving equations easier.
Analyze how these are applied in business.

Matrices form one of the most powerful tools of management
and of modern mathematics. They have innumerable
applications in the analysis of material and machine
requirements and the solution of problems in planning and
organization. An understanding of matrices is also essential
for most branches of advanced mathematics and statistics.
As vectors lie at the base of matrices, let us start by
understanding them first.

The use of vectors can be illustrated by a very simple
example. A small firm uses sheeting fabric to manufacture
white sheets and pillowcases for hospitals and hotels, which
are sold by the dozen. Orders received in the office are passed
by telephone to the packing department, which is interested
only in the quantity to be packed in each parcel.
Typical orders would be 4 dozen sheets and 2 dozen
pillowcases, 18 dozen sheets and 6 dozen pillowcases, 12
dozen sheets only, 6 dozen pillowcases only, and so on. It would
not be long before speaker and hearer agree to save a lot of time
and breath by giving simply a pair of numbers for each order:
[4 2]    [18 6]    [12 0]    [0 6]


Here the first number stands for dozens of sheets and the
second number stands for dozens of pillowcases. The four
brackets denote four different orders. As long as the zero is
inserted when necessary, there can be no confusion as to the
meanings of these figures. As the orders are packed, the
quantities can be added up.
These pairs of numbers are examples of vectors. A vector is
any row or column of figures in a specified sequence. The
fact that [12 0] is an order for 12 dozen sheets while [0 12]
would be an order for 12 dozen pillowcases indicates that
the numbers acquire meaning from their positions in the
sequence. A vector is normally printed between square or
curved brackets or between a pair of double vertical lines.
The sum of the four orders is an example of vector addition.
Two vectors are added together by adding the first number
in the first vector to the first number in the second vector,
the second number in the first vector to the second number
in the second vector and so on. Each number is called an
element of the vector. Vectors can have more than two
elements, but two vectors can only be added together if both
have the same number of elements. Clearly the sum of the
above four orders is [34 14], i.e., 34 dozen sheets and 14
dozen pillowcases.
If the firm started to sell blankets also, a new convention
would be needed by which [4 2 3] means 4 dozen sheets, 2
dozen pillowcases and 3 dozen blankets. The convention
would have to be adopted completely for all orders, inserting
0 whenever an order did not include any blankets. The total
quantities ordered would be given by the sum of these
three-element vectors, which would itself be a three-element
vector. A vector is, thus, an ordered arrangement of
numbers, which can be written in a row or a column.

If the customer responsible for the order [4 2] asked for it
to be doubled, this would be interpreted as [8 4]. If he asked
for it to be tripled, it would become [12 6]. This is the rule
for multiplying a vector by an ordinary number, which is
called a scalar to distinguish it from a vector. Hence, the
result of multiplying a vector [a b c] by a scalar k is the vector
[ka kb kc].
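Vector addition and multiplication by a scalar are easy to express in code. The sketch below uses plain Python lists to stand for vectors; the function names `add_vectors` and `scale_vector` are illustrative and not part of any library.

```python
def add_vectors(u, v):
    """Element-by-element sum; both vectors must have the same number of elements."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of elements")
    return [a + b for a, b in zip(u, v)]

def scale_vector(k, v):
    """Multiply every element of v by the scalar k."""
    return [k * a for a in v]

orders = [[4, 2], [18, 6], [12, 0], [0, 6]]   # [dozen sheets, dozen pillowcases]

total = [0, 0]
for order in orders:
    total = add_vectors(total, order)

print(total)                      # sum of the four orders
print(scale_vector(2, [4, 2]))    # the order [4 2] doubled
print(scale_vector(3, [4, 2]))    # the order [4 2] tripled
```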

A vector may also be multiplied by a vector. But it would be
meaningless to multiply together two vectors, both of which
represent orders for goods. The definition of vector
multiplication will be seen to make sense only when it is
applied in a sensible situation.

When sheets and pillowcases have been ordered and packed,
the next stage is to invoice them. If the prices are Rs 1,800
for a dozen sheets and Rs 700 for a dozen pillowcases, then
the amount due for the order [4 2] will be:

$(4 \times 1{,}800) + (2 \times 700)$ = Rs 8,600


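This suggests a use for the multiplication of vectors. The
prices can be represented by a new vector, which, because it
is a different kind of vector, will be written as a column:

$$\begin{bmatrix} 1{,}800 \\ 700 \end{bmatrix}$$

Then multiplying an order vector by this price vector can be
defined as multiplying the first element of the order vector
by 1,800 and the second element of the order vector by 700
and adding the results together:

$$\begin{bmatrix} 4 & 2 \end{bmatrix} \begin{bmatrix} 1{,}800 \\ 700 \end{bmatrix} = 8{,}600$$

$$\begin{bmatrix} 18 & 6 \end{bmatrix} \begin{bmatrix} 1{,}800 \\ 700 \end{bmatrix} = 36{,}600$$

$$\begin{bmatrix} 12 & 0 \end{bmatrix} \begin{bmatrix} 1{,}800 \\ 700 \end{bmatrix} = 21{,}600$$

$$\begin{bmatrix} 0 & 6 \end{bmatrix} \begin{bmatrix} 1{,}800 \\ 700 \end{bmatrix} = 4{,}200$$

$$\begin{bmatrix} 34 & 14 \end{bmatrix} \begin{bmatrix} 1{,}800 \\ 700 \end{bmatrix} = 71{,}000$$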

It is obvious that $(34 \times 1{,}800)$ is the total value of all the sheets
in the preceding four order vectors and $(14 \times 700)$ is the total
value of all the pillowcases, so that the sum of these products,
71,000, must be the sum of all the separate orders:
8,600 + 36,600 + 21,600 + 4,200 = Rs 71,000
Two vectors can be multiplied together only if both have the
same number of elements. Multiplication of a row vector by
a column vector, which always results in a scalar, is called
the scalar multiplication of vectors.
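The invoicing calculation can be sketched in code. The function name `scalar_product` is hypothetical; the order and price figures are those used in the text.

```python
def scalar_product(row, column):
    """Multiply a row vector by a column vector; the result is a single number."""
    if len(row) != len(column):
        raise ValueError("vectors must have the same number of elements")
    return sum(r * c for r, c in zip(row, column))

prices = [1800, 700]                          # Rs per dozen: sheets, pillowcases
orders = [[4, 2], [18, 6], [12, 0], [0, 6]]

values = [scalar_product(order, prices) for order in orders]
print(values)                                 # invoice value of each order
print(sum(values))                            # total of the four invoices
print(scalar_product([34, 14], prices))       # value of the summed order vector
```

The last two lines print the same figure, which is the check described in the paragraph above: the value of the summed order equals the sum of the separate invoice values.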


Note that the product has meaning in this case only because
the first element in both the order vector and the price vector
represents dozens of sheets and the second element in each
type of vector represents dozens of pillowcases. Yet they
are also distinctive; one vector gives the number of units
ordered and the other vector the price per unit, so that the
product gives the value of the units ordered. It would be
just as meaningless to add an order vector to a price vector
as it would be to multiply an order vector by another order
vector. Vectors have, in fact, been implicit in some of the
earlier examples in this book even though they were not
made explicit.
Now let us turn our attention to matrices.

A matrix is a rectangular or square array of numbers
arranged into rows and columns, where the numbers acquire
meaning from their position in the array. This means that
the vectors we discussed earlier are just simple examples of
matrices. Let us take up an example.
Illustration 3.1
Let us assume that the manufacturer of sheets and
pillowcases discussed earlier has three types of machines.
There is one machine for cutting the fabric, three machines
for sewing and one for folding. The manufacturing times in
minutes per dozen are:

               Cutting   Sewing   Folding
Sheets            8        38       14
Pillowcases       6        32        4

To form these facts into a matrix, it is only necessary to
arrange the numbers between brackets or double vertical
lines:

$$\begin{bmatrix} 8 & 38 & 14 \\ 6 & 32 & 4 \end{bmatrix}$$

Single vertical lines will not do, as these are used to


represent a determinant. Even when it has the same number
of rows as columns, a matrix is not at all the same thing as a
determinant. There is no way of expanding or evaluating a
matrix, since each element has its own distinctive meaning.
The production time for an order [4 2] can be calculated by
multiplying the order with the manufacturing time.

Activity 3A
Find out whether matrices are
used for production planning in
your organization.

For this purpose, each column of the matrix will be treated
as a column vector and the scalar multiplication of the order
vector by these column vectors would give the three results:

$(4 \times 8) + (2 \times 6) = 44$ minutes cutting
$(4 \times 38) + (2 \times 32) = 216$ minutes sewing
$(4 \times 14) + (2 \times 4) = 64$ minutes folding

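However, it is not necessary to separate out the column
vectors; a convention of matrix multiplication is adopted
which gives the same result:

$$\begin{bmatrix} 4 & 2 \end{bmatrix} \begin{bmatrix} 8 & 38 & 14 \\ 6 & 32 & 4 \end{bmatrix} = \begin{bmatrix} 44 & 216 & 64 \end{bmatrix}$$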

The production times now appear as the elements in a new


row vector. Vectors are really simple examples of matrices.
In general, a matrix has m rows and n columns. If m = 1, then
the matrix is a row vector with n elements. If n = 1, then the
matrix is a column vector with m elements. All general
statements that we made or will make about matrices will
also apply to vectors as well.
Two matrices can be multiplied together only if the number
of columns in the first matrix is equal to the number of rows
in the second matrix. The first row of the first matrix is then
multiplied by the first column of the second matrix, following
the rules for the scalar multiplication of vectors and the
result becomes the first element in the first row of the matrix
forming the answer. In general, the product of the ith row of
the first matrix and the jth column of the second matrix
becomes the element in the ith row and the jth column of the
matrix forming the answer.


Therefore, it follows that a matrix with m rows and n columns
can be multiplied by a matrix with p rows and q columns
only when n is equal to p. The product is then a matrix with
m rows and q columns. In the above multiplication of a row
vector by the manufacturing time matrix, m = 1, n = 2, p = 2,
and q = 3. Let us now apply the rules of matrix multiplication.
The original four order vectors can be formed into a matrix
in which each of the four rows represents a different order.
This can then be multiplied by the manufacturing time matrix,
and the answer is a matrix in which the columns represent the
three types of machine and the rows represent the time
taken to process each of the four orders.

$$\begin{bmatrix} 4 & 2 \\ 18 & 6 \\ 12 & 0 \\ 0 & 6 \end{bmatrix} \begin{bmatrix} 8 & 38 & 14 \\ 6 & 32 & 4 \end{bmatrix} = \begin{bmatrix} 44 & 216 & 64 \\ 180 & 876 & 276 \\ 96 & 456 & 168 \\ 36 & 192 & 24 \end{bmatrix}$$
It is often useful to represent a matrix by a single symbol. A
capital letter is usually printed in bold type to emphasize
that it is a matrix. Putting the matrix of the four orders as A
and the manufacturing time matrix as B, the product can be
written as:
AB = C
Therefore, C is a matrix giving the total production time on
each type of machine for each of the four orders.
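The row-by-column rule can be sketched as a short function. Matrices are represented here as lists of rows, and the name `matmul` is illustrative rather than taken from any library.

```python
def matmul(A, B):
    """Product of an m x n matrix A and a p x q matrix B; requires n == p."""
    n, p = len(A[0]), len(B)
    if n != p:
        raise ValueError("columns of the first matrix must equal rows of the second")
    q = len(B[0])
    # Element (i, j) is the scalar product of row i of A and column j of B.
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(q)]
            for i in range(len(A))]

A = [[4, 2], [18, 6], [12, 0], [0, 6]]    # the four orders
B = [[8, 38, 14], [6, 32, 4]]             # minutes per dozen: cutting, sewing, folding

C = matmul(A, B)
for row in C:
    print(row)
```

Each printed row gives the cutting, sewing and folding minutes for one order, in the same sequence as the rows of A.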
The equation would no longer be true if B were written before
A. In fact, it is impossible to multiply B by A since B has
three columns while A has four rows. If the fourth row of A
was disregarded, there would then be two matrices which
could be multiplied in either order, but the results would be
different:

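$$\begin{bmatrix} 4 & 2 \\ 18 & 6 \\ 12 & 0 \end{bmatrix} \begin{bmatrix} 8 & 38 & 14 \\ 6 & 32 & 4 \end{bmatrix} = \begin{bmatrix} 44 & 216 & 64 \\ 180 & 876 & 276 \\ 96 & 456 & 168 \end{bmatrix}$$

OR

$$\begin{bmatrix} 8 & 38 & 14 \\ 6 & 32 & 4 \end{bmatrix} \begin{bmatrix} 4 & 2 \\ 18 & 6 \\ 12 & 0 \end{bmatrix} = \begin{bmatrix} 884 & 244 \\ 648 & 204 \end{bmatrix}$$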

The products are entirely different in the two cases. The
second product is, in fact, completely meaningless, since it
includes terms such as $(8 \times 2)$ where 8 is the cutting time for
sheets and 2 is an order for pillowcases!
We showed earlier how a row vector could be multiplied by
a column vector to obtain a scalar product. From the rules of
matrix multiplication, we can now see that it is impossible
to multiply a row vector by another row vector or to multiply
a column vector by another column vector, except in the trivial case
of vectors with only a single element. Multiplying a column
vector by a row vector gives a matrix instead of a scalar
product:

$$\begin{bmatrix} 1{,}800 \\ 700 \end{bmatrix} \begin{bmatrix} 4 & 2 \end{bmatrix} = \begin{bmatrix} 7{,}200 & 3{,}600 \\ 2{,}800 & 1{,}400 \end{bmatrix}$$

In this matrix, the elements 3,600 and 2,800 are completely
meaningless. For instance, 3,600 is obtained by multiplying
the price for sheets by the order for pillowcases.
The above result can be summarized by saying that the
multiplication of ordinary numbers is commutative ($ab = ba$)
while the multiplication of matrices is in general
non-commutative ($AB \neq BA$). This means that changing the order of
multiplication of two matrices will generally change the
answer. It will be seen later that there are a few cases where
changing the order does not change the answer. A little
thought will show that this can only be true for square
matrices, that is, matrices with equal numbers of rows and
columns; but matrix multiplication is not in general
commutative even for square matrices.
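The effect of changing the order of multiplication can be checked numerically. The sketch below repeats a small, illustrative `matmul` helper (a hypothetical name) so that it runs on its own, and then forms the products discussed above in both orders.

```python
def matmul(A, B):
    """Row-by-column product of two matrices given as lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("columns of the first matrix must equal rows of the second")
    return [[sum(a * B[k][j] for k, a in enumerate(row)) for j in range(len(B[0]))]
            for row in A]

A3 = [[4, 2], [18, 6], [12, 0]]           # the first three orders
B = [[8, 38, 14], [6, 32, 4]]             # manufacturing time matrix

print(matmul(A3, B))                      # a 3 x 3 matrix
print(matmul(B, A3))                      # a 2 x 2 matrix with different entries

order_row = [[4, 2]]                      # 1 x 2 row vector
price_col = [[1800], [700]]               # 2 x 1 column vector
print(matmul(order_row, price_col))       # row times column: a single number
print(matmul(price_col, order_row))       # column times row: a 2 x 2 matrix
```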
Since multiplying a vector by k means multiplying every
element by k, irrespective of whether it is a row vector or a
column vector, it is to be expected that the same rule would
apply to matrices.


This is the case:

$$k \begin{bmatrix} 4 & 2 \\ 18 & 6 \\ 12 & 0 \\ 0 & 6 \end{bmatrix} = \begin{bmatrix} 4k & 2k \\ 18k & 6k \\ 12k & 0 \\ 0 & 6k \end{bmatrix}$$

If k = 0, one obtains a matrix in which all the elements are
zero, termed a zero matrix. A zero matrix is also obtained if
two matrices are multiplied together, one of which is a zero
matrix. But it is also possible to obtain a zero matrix as the
product of two matrices neither of which is a zero matrix:

$$\begin{bmatrix} -2 & 1 \\ 6 & -3 \end{bmatrix} \begin{bmatrix} 1 & -4 \\ 2 & -8 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$$

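Matrix addition has not been defined so far. It is possible to
add together two matrices only when both have the same
number of rows and the same number of columns. The sum
is then obtained simply by adding together the corresponding
elements:

$$\begin{bmatrix} 3 & 8 & 5 \\ 2 & 7 & 2 \end{bmatrix} + \begin{bmatrix} 5 & 30 & 9 \\ 4 & 25 & 2 \end{bmatrix} = \begin{bmatrix} 3+5 & 8+30 & 5+9 \\ 2+4 & 7+25 & 2+2 \end{bmatrix} = \begin{bmatrix} 8 & 38 & 14 \\ 6 & 32 & 4 \end{bmatrix}$$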

Here, the first matrix could represent the machine loading
and unloading times and the second matrix the machine
running times for the sheets and pillowcases in the above
illustration, so that the sum would be the manufacturing time
matrix already employed.
It will be seen that the rule for vector addition conforms to
this rule for matrix addition. Unlike matrix multiplication,
matrix addition is commutative; changing the order of the
matrices, which are added together, does not change the
result.
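Matrix addition can be sketched in the same style. The function name `add_matrices` is hypothetical; the two matrices are the loading and running times implied by the element sums in the example above.

```python
def add_matrices(X, Y):
    """Element-by-element sum of two matrices with the same numbers of rows and columns."""
    if len(X) != len(Y) or any(len(rx) != len(ry) for rx, ry in zip(X, Y)):
        raise ValueError("matrices must have the same numbers of rows and columns")
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

loading = [[3, 8, 5], [2, 7, 2]]      # loading and unloading minutes per dozen
running = [[5, 30, 9], [4, 25, 2]]    # machine running minutes per dozen

print(add_matrices(loading, running))
# Changing the order of the two matrices gives the same sum.
print(add_matrices(loading, running) == add_matrices(running, loading))
```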
A few words must be added on the equality, subtraction and
division of matrices. Two matrices are said to be equal only
if they are identical; they must have the same number of
rows and the same number of columns and every element in
the second matrix must be equal to the corresponding
element in the first matrix.


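A matrix can be subtracted from another matrix only when
both have the same number of rows and the same number of
columns. Subtraction is then simply the reverse of addition:

$$\begin{bmatrix} 8 & 38 & 14 \\ 6 & 32 & 4 \end{bmatrix} - \begin{bmatrix} 5 & 30 & 9 \\ 4 & 25 & 2 \end{bmatrix} = \begin{bmatrix} 3 & 8 & 5 \\ 2 & 7 & 2 \end{bmatrix}$$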

Matrix subtraction is non-commutative, but this is to be


expected since the subtraction of ordinary numbers is also
non-commutative.


Matrix division is quickly dealt with, as it is impossible to
divide a matrix by another matrix directly. There is a
roundabout method which we will learn later in the chapter.

Use of Matrices for Production Planning

Before the advent of the computer, the use of matrix methods in
production planning was of theoretical interest rather than
real practical value. The position is now completely reversed.
Any firm which has access to a computer and does not use
matrix methods must be regarded as backward and
inefficient. By these standards most Indian firms would
be!
If a firm makes m products using n different types of
machines, matrix A can represent the machine time
requirements with m rows and n columns. One such matrix
was shown earlier.
The total costs per minute for running each type of machine,
including both capital and labour costs, form the n elements
in a column vector. Small letters in bold type can represent
such vectors; let c be the machine-cost vector. Then Ac will
be a column vector with m elements, giving the total machine
cost per unit for each product.
If the manufacturer of sheets and pillowcases in the
illustration above finds that the machine costs are Re 0.2
per minute for cutting, Re 0.1 for sewing and Re 0.3 for
folding, then the total machine costs per product are given
by:

$$\begin{bmatrix} 8 & 38 & 14 \\ 6 & 32 & 4 \end{bmatrix} \begin{bmatrix} 0.2 \\ 0.1 \\ 0.3 \end{bmatrix} = \begin{bmatrix} 9.6 \\ 5.6 \end{bmatrix}$$


For most manufacturers there would, of course, be much


larger numbers of machines and products. In this section
only a very simple example can be followed through as it
makes it easier to understand the process. Computers can
handle matrices with dozens of rows and columns, each
element having three or four digits and they still use the
same process.
Matrix B with m rows and q columns may represent the
material contents of the different products. With four
ingredients such as sheeting fabric, thread, labels and
packing material, there would be a four-column ingredients
matrix for sheets and pillowcases, such as:

$$\begin{bmatrix} 46 & 7 & 12 & 28 \\ 16 & 3 & 12 & 13 \end{bmatrix}$$

The units of measurement may be different for each column,
being chosen to suit the nature of the ingredient. The same
unit will be used in each case when preparing the
ingredient-cost vector. This will be a column vector with q elements
and may be represented by d. Then Bd will be a column
vector with m elements, giving the total ingredient cost per
unit for each product:

$$\begin{bmatrix} 46 & 7 & 12 & 28 \\ 16 & 3 & 12 & 13 \end{bmatrix} \begin{bmatrix} 2.5 \\ 0.3 \\ 0.1 \\ 0.2 \end{bmatrix} = \begin{bmatrix} 123.9 \\ 44.7 \end{bmatrix}$$
Any labour or other costs not already included in the
machine-cost and ingredient-cost vectors will be computed
for each product to form an additional cost vector e with m
elements. The total cost per unit for each product is then
obtained by adding together the three vectors each with m
elements:

Ac + Bd + e = total cost vector

$$\begin{bmatrix} 9.6 \\ 5.6 \end{bmatrix} + \begin{bmatrix} 123.9 \\ 44.7 \end{bmatrix} + \begin{bmatrix} 2.0 \\ 1.5 \end{bmatrix} = \begin{bmatrix} 135.5 \\ 51.8 \end{bmatrix}$$
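The whole costing calculation, Ac + Bd + e, can be sketched as follows. The helper name `mat_vec` is hypothetical; the figures are those quoted in the text, and results are rounded to two decimals only to suppress floating-point noise.

```python
def mat_vec(M, v):
    """Multiply matrix M (a list of rows) by the column vector v."""
    if any(len(row) != len(v) for row in M):
        raise ValueError("each row of M must have as many elements as v")
    return [sum(m * x for m, x in zip(row, v)) for row in M]

A = [[8, 38, 14], [6, 32, 4]]              # machine minutes per dozen
c = [0.2, 0.1, 0.3]                        # machine cost per minute
B = [[46, 7, 12, 28], [16, 3, 12, 13]]     # ingredient contents per dozen
d = [2.5, 0.3, 0.1, 0.2]                   # cost per unit of each ingredient
e = [2.0, 1.5]                             # other costs per dozen

machine_cost = mat_vec(A, c)               # Ac
ingredient_cost = mat_vec(B, d)            # Bd
total_cost = [round(m + i + x, 2)
              for m, i, x in zip(machine_cost, ingredient_cost, e)]

print([round(m, 2) for m in machine_cost])
print([round(i, 2) for i in ingredient_cost])
print(total_cost)
```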

All the information about machine time requirements,


material contents and costs is collected by work study and

costing staff and the matrices A and B and vectors c, d and e


are stored in the computer. When new techniques or changed
prices or wage rates make it necessary, the matrices and
vectors are brought up to date. The products Ac and Bd and
the total cost vector Ac + Bd + e are also computed and kept
up to date. Depending on the pricing policy of the firm, there
may also be a selling price vector p, which is computed from
the total cost vector by adding a suitable percentage for fixed
costs and profit.


When an enquiry is received, the prices and delivery dates
are quoted by reference to the computer. If this results in an
order, it is recorded as a row vector with m elements. A row
vector is distinguished from a column vector by a distinctive
mark:
x = [4 2]

The order vector x may be multiplied by A to obtain a row
vector giving the production times for the order, as calculated
earlier. At the same time, x may be multiplied by Ac to give
the total machine cost for the order:

$$\begin{bmatrix} 4 & 2 \end{bmatrix} \begin{bmatrix} 9.6 \\ 5.6 \end{bmatrix} = 49.6$$

One would expect to obtain the same result if the
production-time vector xA is multiplied by the machine-cost vector c
and this is in fact the case:

$$\begin{bmatrix} 44 & 216 & 64 \end{bmatrix} \begin{bmatrix} 0.2 \\ 0.1 \\ 0.3 \end{bmatrix} = 49.6$$

The fact that xA multiplied by c always gives the same result


as x multiplied by Ac is called the associative property of
matrix multiplication. The product can be written simply as
xAc.
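
The associative property can be checked numerically. The sketch below is only an illustration: the machine-time matrix A is a hypothetical one, chosen so that it is consistent with the figures 9.6, 5.6 and 49.6 quoted above, and the helper name `matmul` is not from the text. Exact fractions are used so that the two products can be compared without rounding error.

```python
from fractions import Fraction as F


def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]


# Hypothetical machine-time matrix: 2 products (rows) x 3 machine types (columns).
A = [[F(6), F(42), F(14)],
     [F(10), F(24), F(4)]]
# Assumed machine-cost column vector c (cost per minute on each machine type).
c = [[F(2, 10)], [F(1, 10)], [F(3, 10)]]
# The order as a row vector x.
x = [[F(4), F(2)]]

left = matmul(x, matmul(A, c))    # x multiplied by (Ac)
right = matmul(matmul(x, A), c)   # (xA) multiplied by c

print(float(left[0][0]), float(right[0][0]))
```

Both groupings give the same scalar, which is why the product can be written simply as xAc.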
The computer stores a vector of running totals of machine-time
commitments, to which the vector xA is added. As each
order is completed, its production times are deducted from
the running totals. The commitment for each type of machine
may then be divided by the number of machines of that type,


obtaining the number of minutes and hence the number of
weeks it will take to produce all outstanding orders. This
information forms the basis for quoting delivery dates for
new orders and perhaps also for planning overtime work or
the purchase of additional machines.
The vector xB is computed and similarly incorporated in
the running totals of ingredient requirements. The scalar
quantity xBd gives the total ingredient cost of the order and
may also be added to a continuous running total if it is
necessary to keep a check on the amount of capital needed
to finance work in progress. When the order is delivered,
the amount of money due is given by xp. This serves as a
check on the invoice total.
This presentation is far from complete, but it is enough to
show how computers using matrices can keep a check on
production commitments and stocks of materials. The most
important extension necessary in most firms will take into
account production dates. The final delivery date of each
order will determine the dates by which various stages of
manufacture must be completed. Running totals of
commitments will be kept by dates so that no type of machine
can be overcommitted at any stage.
Ideally, the computer will be used to print out the production
orders for each department at the appropriate times. Because
it makes all calculations extremely rapidly and keeps a
complete check on all machine and material requirements,
the computer will print each production order only when it
is time to commence production. This makes it possible to
accept any rush order, which does not conflict with existing
commitments; such orders are usually very profitable if they
can be successfully handled.
The computer will similarly record all ingredient
requirements by dates, so that supplies are not obtained
unnecessarily early. It will be programmed to give a warning
if supplies do not arrive when they are due.
Properly used, the computer is the manager's most efficient
assistant, reminding him of any supplies that need to be
chased and warning him of any future production
bottlenecks. It is a complete waste of the computer's powers
to employ it in churning out masses of detail which the
manager has to pore through in order to find where things
may go wrong. This laborious and inefficient management
by detail must give way to management by exception.
Let us now turn our attention to other applications of
matrices.


There are three methods that can be used to solve linear


equations: (1) Matrices (2) Row operations and (3) Determinants.
A few hints have already been given that matrices can be
used in the solution of sets of simultaneous linear equations.
But before considering the role of matrices, it is useful to
consider a technique known as row operations.

Solving linear equations by row operations is in principle


the same as solving them by elimination. The difference is
that row operations, in turn, aim systematically at a
coefficient of 1 for each unknown. For instance, in illustration
2.2:
\[
\begin{aligned}
3x + 11y &= 180 \qquad (1)\\
6x + 17y &= 300 \qquad (2)
\end{aligned}
\]

The first stage in row operations is to divide the first equation


by 3 in order to make the coefficient of x into 1. This equation
is then multiplied by 6 and the result subtracted from the
second equation to eliminate x from that equation. So the
equations become:
\[
\begin{aligned}
x + 3\tfrac{2}{3}y &= 60 \qquad (3)\\
0x - 5y &= -60 \qquad (4)
\end{aligned}
\]

The next stage is to divide the second equation by -5, in
order to make the coefficient of y equal to 1. This equation is
then multiplied by 3 2/3 and the result subtracted from the
first equation in order to eliminate y from that equation:

\[
\begin{aligned}
x + 0y &= 16 \qquad (5)\\
0x + y &= 12 \qquad (6)
\end{aligned}
\]

This gives the solution, x = 16, y = 12. If one is able to multiply


and subtract mentally, the whole procedure is very rapid.
The other special feature of row operations is that it is
unnecessary to keep writing the letters and the addition
signs. All that is needed is to write down the figures, keeping
the zeros as in the above equations and to insert a vertical
line to separate the two sides of each equation:

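\[
\left[\begin{array}{cc|c}
3 & 11 & 180\\
6 & 17 & 300
\end{array}\right]
\]

\[
\left[\begin{array}{cc|c}
1 & 3\tfrac{2}{3} & 60\\
0 & -5 & -60
\end{array}\right]
\]

\[
\left[\begin{array}{cc|c}
1 & 0 & 16\\
0 & 1 & 12
\end{array}\right]
\]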

A set of three equations in three unknowns requires three


stages. It is necessary at each stage to write first the row,
which has to be divided through to give the coefficient 1.
This is the first row in the first stage, the second row in the
second stage and so on. This row is then used to eliminate
the corresponding coefficients in all the other rows, both
above and below. Hence, after the first stage the first column
of figures reads 1 0 0; after the second stage the second
column of figures reads 0 1 0, while the first column remains
as 1 0 0 and so on.
The solution of illustration 2.3 by row operations reads:
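\[
\left[\begin{array}{ccc|c}
3 & 4 & 3 & 107\\
2 & 5 & 7 & 148\\
5 & 9 & 2 & 151
\end{array}\right]
\]

\[
\left[\begin{array}{ccc|c}
1 & 1\tfrac{1}{3} & 1 & 35\tfrac{2}{3}\\
0 & 2\tfrac{1}{3} & 5 & 76\tfrac{2}{3}\\
0 & 2\tfrac{1}{3} & -3 & -27\tfrac{1}{3}
\end{array}\right]
\]

\[
\left[\begin{array}{ccc|c}
1 & 0 & -1\tfrac{6}{7} & -8\tfrac{1}{7}\\
0 & 1 & 2\tfrac{1}{7} & 32\tfrac{6}{7}\\
0 & 0 & -8 & -104
\end{array}\right]
\]

\[
\left[\begin{array}{ccc|c}
1 & 0 & 0 & 16\\
0 & 1 & 0 & 5\\
0 & 0 & 1 & 13
\end{array}\right]
\]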

The obvious disadvantage of the method is that it introduces


fractions even when the final solution does not include
any fractions. Its advantage is that it follows a strict
routine, which is always necessary if computers are to be
used.
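
Because the routine is so strict, it translates almost line for line into a program. The following is a minimal sketch, not a production solver: the function name `solve_by_row_operations` is an illustrative choice, exact fractions are used so that the intermediate fractions mentioned above appear without rounding, and the system is assumed to have a unique solution.

```python
from fractions import Fraction


def solve_by_row_operations(rows):
    """Reduce an augmented matrix [A | b] to [I | x] and return x.

    Assumes the equations have a unique solution.
    """
    m = [[Fraction(v) for v in row] for row in rows]
    n = len(m)
    for stage in range(n):
        # If the leading figure is zero, interchange with a lower row
        # that has a non-zero figure in this column.
        pivot_row = next(r for r in range(stage, n) if m[r][stage] != 0)
        m[stage], m[pivot_row] = m[pivot_row], m[stage]
        # Divide the row through to give the coefficient 1.
        pivot = m[stage][stage]
        m[stage] = [v / pivot for v in m[stage]]
        # Eliminate this coefficient from every other row, above and below.
        for r in range(n):
            if r != stage:
                factor = m[r][stage]
                m[r] = [a - factor * b for a, b in zip(m[r], m[stage])]
    return [row[-1] for row in m]


print(solve_by_row_operations([[3, 11, 180], [6, 17, 300]]))
print(solve_by_row_operations([[3, 4, 3, 107], [2, 5, 7, 148], [5, 9, 2, 151]]))
```

The two calls correspond to illustrations 2.2 and 2.3 and give x = 16, y = 12 and x = 16, y = 5, z = 13 respectively.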
Since each row represents an equation, it is permissible to
rearrange the rows in order to get out of a difficulty. For
instance, the equations:
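\[
\begin{aligned}
2x - 4y + 3z &= 14 \qquad (1)\\
3x - 6y + 2z &= 11 \qquad (2)\\
x - 3y + z &= 2 \qquad (3)
\end{aligned}
\]

after the first stage of row operations give:

\[
\left[\begin{array}{ccc|c}
1 & -2 & 1\tfrac{1}{2} & 7\\
0 & 0 & -2\tfrac{1}{2} & -10\\
0 & -1 & -\tfrac{1}{2} & -5
\end{array}\right]
\]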

To obtain 1 in the second position of the second row, the


simplest procedure is first to interchange the second and
third rows. It would alternatively be permissible to add the
third row to the second row. The reader should follow
through both methods, obtaining the solution x = 7, y = 3,
z = 4.
Row operations can deal with a set of equations that are not
all independent.


In illustration 3.6 after collecting all the terms involving


unknowns on the left, the initial set of rows is:


After three stages of row operations they become:


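\[
\left[\begin{array}{cccc|c}
1 & 0 & 0 & -1.575 & 0\\
0 & 1 & 0 & -0.75 & 0\\
0 & 0 & 1 & -1.05 & 0\\
0 & 0 & 0 & 0 & 0
\end{array}\right]
\]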

It is now clear that the fourth row contributes no information


other than that contained in the first three rows; in other
words, the equations were not all independent. There has
been no need to discard an equation arbitrarily, as was done
when this illustration was first solved. If the solution of a set of
equations using determinants gives zero divided by zero for
each of the unknowns, row operations will always lead to
discarding the correct row since it will eventually produce
a row of zeros. If the equations had been contradictory,
row operations would have produced a row of zeros to
the left of the vertical line with a non-zero value on the
right.
Since illustration 3.6 leads to three independent equations
in four unknowns, it is only possible to express each
unknown in terms of one of the others. The first row in the
final set of rows can be interpreted as b = 1.575w and the
other rows give the corresponding solutions for f and h.
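
The three possible outcomes described above (a unique solution, a row of zeros, or a row of zeros with a non-zero value on the right) can be detected mechanically. The sketch below is an illustration under stated assumptions: the function name `classify_system` and the small test systems are hypothetical and are not taken from the illustrations in the text.

```python
from fractions import Fraction


def classify_system(rows):
    """Row-reduce an augmented matrix and report what kind of system it is.

    Returns 'unique', 'dependent' (no unique solution, but consistent)
    or 'contradictory'.
    """
    m = [[Fraction(v) for v in row] for row in rows]
    n_rows, n_unknowns = len(m), len(m[0]) - 1
    pivot_row = 0
    for col in range(n_unknowns):
        if pivot_row == n_rows:
            break
        candidates = [r for r in range(pivot_row, n_rows) if m[r][col] != 0]
        if not candidates:
            continue  # nothing to divide through by in this column
        first = candidates[0]
        m[pivot_row], m[first] = m[first], m[pivot_row]
        pivot = m[pivot_row][col]
        m[pivot_row] = [v / pivot for v in m[pivot_row]]
        for r in range(n_rows):
            if r != pivot_row:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
    # Zeros on the left of the line with a non-zero value on the right.
    for row in m:
        if all(v == 0 for v in row[:-1]) and row[-1] != 0:
            return "contradictory"
    # pivot_row now counts the independent equations.
    return "unique" if pivot_row == n_unknowns else "dependent"


print(classify_system([[3, 11, 180], [6, 17, 300]]))  # illustration 2.2
print(classify_system([[1, 2, 3], [2, 4, 6]]))        # second row repeats the first
print(classify_system([[1, 2, 3], [2, 4, 7]]))        # rows contradict each other
```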

Before we understand this method it is necessary for us to


understand the concept of 'inverse' of a matrix.


Inverse of a Square Matrix

Illustration 3.2

Solution

The set of simultaneous equations used in illustrations 2.2


and 2.3 can be written as:

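\[
\begin{bmatrix} 3 & 11 \\ 6 & 17 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix} =
\begin{bmatrix} 180 \\ 300 \end{bmatrix}
\]

\[
\begin{bmatrix} 3 & 4 & 3 \\ 2 & 5 & 7 \\ 5 & 9 & 2 \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \end{bmatrix} =
\begin{bmatrix} 107 \\ 148 \\ 151 \end{bmatrix}
\]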

It is only necessary to follow out the rules for matrix


multiplication to see that these matrix products are identical
to the sets of equations. If the symbol A is used to represent
the matrix of coefficients, x to represent the vector of
unknowns and B to represent the vector on the right-hand
side, any set of simultaneous linear equations can be
expressed in the form:
Ax = B
Both A and B are known, but x is unknown. If there was a
simple rule for dividing B by A, the result x would be
obtained. However, it is impossible to divide a matrix directly
by another matrix.
The operation of division is best understood as the
reverse of multiplication. 'Divide B by A' is a way of saying,
'find the matrix or vector which when pre-multiplied by A
will give B'. Since matrix multiplication is in general
non-commutative, it is necessary to say 'when pre-multiplied by
A' to indicate that A precedes x in the above equation.
If the required matrix or vector exists, it can be found. But
the procedure is not very straightforward. It involves the
new concepts termed a unit matrix and the inverse of a
matrix.
The unit matrix of order n, written $I_n$, is the square matrix
with n rows and columns which has the figure 1 for each
element in the principal diagonal (the diagonal from the top
left-hand corner to the bottom right-hand corner) and 0 for
all other elements. Thus:

\[
I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\qquad
I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

In a set of n simultaneous linear equations in n unknowns, A
will always be a square matrix. If I is the unit matrix with
the same number of rows and columns as A, which is also
the number of elements in x, it is easy to see that:

\[
AI = A, \qquad IA = A, \qquad Ix = x
\]

These equations can be confirmed by writing any elements


one chooses in A and x and then following the rules of matrix
multiplication. Since the product of I with any matrix or
vector is identical to the matrix or vector itself, I is sometimes
called the identity matrix.
For a given matrix A, it is usually possible to find a matrix
which when multiplied by A gives the answer I. This matrix
is a sort of reciprocal of A. The matrix which has this
property is represented by $A^{-1}$ and is called the reciprocal
or more commonly the inverse of A.
The inverse is accordingly defined by the equation:
\[
A^{-1}A = I
\]
It can be shown that the multiplication of a square matrix
and its inverse always commutes, that is:
\[
AA^{-1} = I
\]
In this discussion, it has been assumed that a matrix can
have only one inverse. Although academic proofs are not the
main purpose of this book, it is of interest to see how this
can be proved. Let B be any matrix, which satisfies the
equation:
\[
BA = I
\]
Then:
\[
B = BI = BAA^{-1} = IA^{-1} = A^{-1}
\]
Each step in this proof uses one of the results previously
obtained, including the associative property of matrix
multiplication mentioned earlier. Since two matrices are only
equal if they are identical in all respects, this proves that B
is identical to $A^{-1}$ and therefore, that there is only one inverse
of A.


This proof gives a glimpse at the foothills of what


mathematicians term matrix algebra. The usefulness of
matrices lies in the fact that one can employ them in all kinds
of proofs and manipulations, simply representing each matrix
by a bold letter without specifying what its elements are or
even how many rows and columns it has, provided that one
always conforms to the rules of matrix multiplication and
addition.
If the inverse of A can be found, it is easy to solve the
equation:
\[
Ax = B
\]
It follows that:
\[
A^{-1}Ax = A^{-1}B
\]
\[
Ix = A^{-1}B
\]
which implies
\[
x = A^{-1}B
\]

So the method of solving the equations is to find the inverse


of A and then multiply this by B. This is not the same thing
as dividing B by A, but it is the nearest one can get to division
of matrices.
The simplest method of finding the inverse of a square matrix
is to use row operations. Using the matrix interpretation of
a set of equations, row operations involved converting the
matrix A on the left of the vertical line into the matrix I. The
effect of these operations was to convert the vector B on the
right of the vertical line into the solution vector, which is
now seen to be equal to $A^{-1}B$. It is reasonable to deduce that
the same operations will convert the matrix I into $A^{-1}I$, which
is the same thing as $A^{-1}$.

For illustration 2.2, the matrix of coefficients is inverted as
follows:

\[
\left[\begin{array}{cc|cc}
3 & 11 & 1 & 0\\
6 & 17 & 0 & 1
\end{array}\right]
\]

\[
\left[\begin{array}{cc|cc}
1 & 3\tfrac{2}{3} & \tfrac{1}{3} & 0\\
0 & -5 & -2 & 1
\end{array}\right]
\]

\[
\left[\begin{array}{cc|cc}
1 & 0 & -\tfrac{17}{15} & \tfrac{11}{15}\\
0 & 1 & \tfrac{2}{5} & -\tfrac{1}{5}
\end{array}\right]
\]

As a check, the final matrix obtained on the right of the


vertical line may be multiplied by the original matrix on the
left to show that it produces the unit matrix. It is then
multiplied by the vector B to obtain the solution vector:
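\[
\begin{bmatrix} -\tfrac{17}{15} & \tfrac{11}{15} \\ \tfrac{2}{5} & -\tfrac{1}{5} \end{bmatrix}
\begin{bmatrix} 180 \\ 300 \end{bmatrix} =
\begin{bmatrix} 16 \\ 12 \end{bmatrix}
\]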

In using the inverse, it is often convenient to keep fractions


outside the brackets:

\[
\frac{1}{15}\begin{bmatrix} -17 & 11 \\ 6 & -3 \end{bmatrix}
\]

To multiply this by the vector with elements 180 and 300 it


is then obviously easiest to start by dividing the latter
numbers by 15.
For illustration 2.3 it is necessary to invert the matrix:
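\[
\begin{bmatrix} 3 & 4 & 3 \\ 2 & 5 & 7 \\ 5 & 9 & 2 \end{bmatrix}
\]

Using row operations the inverse proves to be:

\[
\frac{1}{56}\begin{bmatrix} 53 & -19 & -13 \\ -31 & 9 & 15 \\ 7 & 7 & -7 \end{bmatrix}
\]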


Rather large numbers become involved when this matrix
multiplies the vector formed by the right-hand sides of the
original equations, but eventually the number 56 cancels out:

\[
\frac{1}{56}\begin{bmatrix} 53 & -19 & -13 \\ -31 & 9 & 15 \\ 7 & 7 & -7 \end{bmatrix}
\begin{bmatrix} 107 \\ 148 \\ 151 \end{bmatrix} =
\begin{bmatrix} 16 \\ 5 \\ 13 \end{bmatrix}
\]

It is unnecessary to check that the inverse was correct


provided that the solution vector is now checked in the
original equations:
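\[
\begin{bmatrix} 3 & 4 & 3 \\ 2 & 5 & 7 \\ 5 & 9 & 2 \end{bmatrix}
\begin{bmatrix} 16 \\ 5 \\ 13 \end{bmatrix} =
\begin{bmatrix} 107 \\ 148 \\ 151 \end{bmatrix}
\]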

Although following this example involves a lot of arithmetic,
it must be remembered that all modern mathematical
methods assume that computers are available, so these large
amounts of routine calculations are no drawback.
If row operations are used to try to find the inverse of the
matrix from illustration 3.6, it will be found that the third
stage gives a fourth row consisting entirely of zeros on the
left of the vertical line. This matrix has no inverse and is
known as a singular matrix. Whenever a set of equations
has no unique solution, it will be found that the coefficients
form a singular matrix.
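
The inversion procedure can be sketched in code by carrying the unit matrix along on the right of the vertical line. This is a minimal illustration, assuming exact fractions and a hypothetical function name `invert`; a singular matrix is reported by raising an error when no row can be divided through.

```python
from fractions import Fraction


def invert(matrix):
    """Invert a square matrix by row operations on [A | I]."""
    n = len(matrix)
    # Write the unit matrix to the right of the vertical line.
    m = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(matrix)]
    for stage in range(n):
        pivot_row = next((r for r in range(stage, n) if m[r][stage] != 0), None)
        if pivot_row is None:
            raise ValueError("singular matrix: no inverse exists")
        m[stage], m[pivot_row] = m[pivot_row], m[stage]
        pivot = m[stage][stage]
        m[stage] = [v / pivot for v in m[stage]]
        for r in range(n):
            if r != stage:
                factor = m[r][stage]
                m[r] = [a - factor * b for a, b in zip(m[r], m[stage])]
    # The left half is now I; the right half is the inverse.
    return [row[n:] for row in m]


inverse = invert([[3, 11], [6, 17]])
solution = [sum(a * b for a, b in zip(row, [180, 300])) for row in inverse]
print(inverse, solution)
```

For illustration 2.2 this reproduces the inverse with elements -17/15, 11/15, 2/5 and -1/5, and multiplying it by the vector with elements 180 and 300 gives 16 and 12.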

Solution of Linear Equations using Determinants


Any pair of simultaneous equations in two unknowns can be
written in the form:
\[
\begin{aligned}
a_1 x + b_1 y &= h_1 \qquad (1)\\
a_2 x + b_2 y &= h_2 \qquad (2)
\end{aligned}
\]

The symbols, other than x and y, represent known quantities


and the solution by elimination can be carried out in the same
way as done earlier. The first step is to multiply the first
equation throughout by $a_2$ and the second equation
throughout by $a_1$:

\[
\begin{aligned}
a_1 a_2 x + a_2 b_1 y &= a_2 h_1 \qquad (3)\\
a_1 a_2 x + a_1 b_2 y &= a_1 h_2 \qquad (4)
\end{aligned}
\]

Then, subtract equation (3) from equation (4) in order to
eliminate x:

\[
a_1 b_2 y - a_2 b_1 y = a_1 h_2 - a_2 h_1
\]

\[
y = \frac{a_1 h_2 - a_2 h_1}{a_1 b_2 - a_2 b_1} \qquad (5)
\]

Substituting the value of y in the first equation and


rearranging we find that
\[
x = \frac{h_1 b_2 - h_2 b_1}{a_1 b_2 - a_2 b_1} \qquad (6)
\]

These two formulae are called the general formulae for the
solution of any pair of simultaneous linear equations.
The next important step is to introduce a particular way of
representing this solution so that it is easy to remember.
The method adopted is to write:
\[
\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} = a_1 b_2 - a_2 b_1
\]

The left-hand side of this equation is known as a determinant


and the symbols between the two vertical lines are termed
the elements of the determinant. Any set of four symbols or
numbers arranged in this way between vertical lines is
always interpreted as the difference of two products in the
order shown. It should also be noted that this is a special
kind of equation called an identity and is true for any values
or symbols. Therefore, it is a general mathematical truth or
definition rather than a statement about particular values
or symbols. An identity is often represented by a triple
bar (≡) in place of the equals sign (=) used for an ordinary
equation.
Two simple properties of determinants are immediately
obvious. The value of the determinant is unchanged if the
first row becomes the first column and the second row
becomes the second column; but the value is multiplied by
-1 if the first and second rows are interchanged.

These two properties can be represented by the identities:


\[
\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} \equiv
\begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix}
\qquad
\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} \equiv
-\begin{vmatrix} a_2 & b_2 \\ a_1 & b_1 \end{vmatrix}
\]

These identities can be proved by multiplying both sides in


accordance with the original definition of a determinant. This
is known as expanding the determinants. There is a third
simple property which can be similarly proved; if any
multiple of one row or column is added to the other row or
to the other column respectively, the value of the
determinant is unchanged:
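\[
\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix} \equiv
\begin{vmatrix} (a_1 + p a_2) & (b_1 + p b_2) \\ a_2 & b_2 \end{vmatrix} \equiv
\begin{vmatrix} (a_1 + q b_1) & b_1 \\ (a_2 + q b_2) & b_2 \end{vmatrix}
\]

whatever the values of p or q.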


So determinants have properties which can be proved and
utilized in solving linear equations. So far the determinants
considered have two rows and two columns and are called second
order determinants, used in solving a pair of simultaneous
linear equations in two unknowns. Once the method of
solution is understood, it can be extended to larger systems
of equations where the practical merits of the method are
much more apparent.
Given the two equations at the beginning of this section, the
first step is to take the four coefficients on the left-hand sides
and write them as a determinant. This forms the
denominator of the formula for x. The numerator is the same
determinant with the coefficients of x replaced by the column
of values from the right-hand sides of the equations. So the
formula is:
\[
x = \frac{\begin{vmatrix} h_1 & b_1 \\ h_2 & b_2 \end{vmatrix}}
{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}}
\]

By expanding these determinants, it can be seen that this is


the same solution as that obtained by elimination. Since the
setting-up of the determinants is a routine procedure and
their evaluation is also routine, the original objective has
been achieved. The equations can now be solved using a
computer, which is able to carry out routine calculations but
unable to think for itself.
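
As a sketch of how such a routine might look for a pair of equations, the code below evaluates the determinant formula for x and the analogous formula for y, in which the right-hand sides replace the coefficients of y. The function names `det2` and `solve_pair` are illustrative choices, and integer or fractional coefficients are assumed.

```python
from fractions import Fraction


def det2(a1, b1, a2, b2):
    """Second order determinant | a1 b1 ; a2 b2 | = a1*b2 - a2*b1."""
    return a1 * b2 - a2 * b1


def solve_pair(a1, b1, h1, a2, b2, h2):
    """Solve a1*x + b1*y = h1 and a2*x + b2*y = h2 by determinants."""
    denominator = det2(a1, b1, a2, b2)
    if denominator == 0:
        raise ValueError("determinant of coefficients is zero: no unique solution")
    # Numerator for x: coefficients of x replaced by the right-hand sides.
    x = Fraction(det2(h1, b1, h2, b2), denominator)
    # Numerator for y: coefficients of y replaced by the right-hand sides.
    y = Fraction(det2(a1, h1, a2, h2), denominator)
    return x, y


x, y = solve_pair(3, 11, 180, 6, 17, 300)
print(x, y)
```

Applied to the equations of illustration 2.2, the denominator is -15 and the routine returns x = 16, y = 12.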
We still need to point out that the value of y is represented
in the same way as the value of x. The denominator is the
same and the numerator is obtained by replacing the
coefficients of y by the column of values on the right-hand
sides of the equations:
\[
y = \frac{\begin{vmatrix} a_1 & h_1 \\ a_2 & h_2 \end{vmatrix}}
{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}}
\]

Three different systematic techniques have been presented


for the solution of sets of simultaneous linear equations:
determinants, row operations and matrices. It is reasonable
to ask which is the best method.
The method of determinants is usually the quickest if one
requires the value of a few unknowns. This often applies
when the equations arise from problems in probability. The
determinant of the coefficients on the left-hand side of the
equations should always be evaluated if there is a doubt
whether a set of equations has a unique solution.
The quickest way to obtain a complete solution for all the
unknowns is usually by row operations. If the method of
determinants gives zero divided by zero for each of the
unknowns, only row operations will provide the solution,
which expresses the unknowns in terms of one another, there
being no unique solution. Row operations are obviously
quicker than matrix methods, since they obtain the whole
solution with no more work than is involved in inverting
the matrix.
Matrix methods are extremely valuable if there are a large
number of different sets of equations, all of which have the

same matrix of coefficients. For instance, it might be desired


in illustration 3.2 to consider all the possible product mixes
obtainable by adding or subtracting one or two machines of
each of the four kinds. There would then be several hundred
possible combinations of machines, each giving rise to a set
of equations with the same matrix of coefficients. The inverse
of this matrix can be found and then each set of equations is
solved simply by multiplying this inverse matrix by the
appropriate vector.
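
The saving can be seen in a small sketch. It reuses the inverse already found for illustration 2.2 rather than the machine problem mentioned above, and the alternative right-hand sides are hypothetical values chosen only to show that no further row operations are needed once the inverse is known.

```python
from fractions import Fraction as F

# Inverse of the coefficient matrix [[3, 11], [6, 17]], found earlier:
# (1/15) * [[-17, 11], [6, -3]].
A_INVERSE = [[F(-17, 15), F(11, 15)],
             [F(6, 15), F(-3, 15)]]


def apply_inverse(inverse, rhs):
    """Multiply the stored inverse by a right-hand-side vector."""
    return [sum(a * b for a, b in zip(row, rhs)) for row in inverse]


# The first vector is from illustration 2.2; the others are assumed.
for rhs in ([180, 300], [150, 240], [14, 23]):
    print(rhs, apply_inverse(A_INVERSE, rhs))
```

Each new set of equations costs only one matrix-vector multiplication.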
All three methods can be applied using a computer, but in
practice computer programs tend to prefer matrix methods
because of the wider applications of matrices.
It will be observed that a set of equations which has no unique
solution always gives a determinant of value zero and always
provides a singular matrix. It is customary to speak of the
determinant of a matrix provided that it is a square matrix.
A singular matrix may be defined either as a matrix which
has no inverse or as a matrix whose determinant is zero.

The practical importance of determinants increases when it


is necessary to solve large numbers of equations with
correspondingly large number of unknowns. It would be very
convenient if the solution of a large set of linear equations
could be written out in determinant form by the same routine
as has been described for a pair of equations. If we followed
this procedure, the solution of the equations:
\[
\begin{aligned}
a_1 x + b_1 y + c_1 z &= h_1 \qquad (1)\\
a_2 x + b_2 y + c_2 z &= h_2 \qquad (2)\\
a_3 x + b_3 y + c_3 z &= h_3 \qquad (3)
\end{aligned}
\]

ought to be:

\[
x = \frac{\begin{vmatrix} h_1 & b_1 & c_1 \\ h_2 & b_2 & c_2 \\ h_3 & b_3 & c_3 \end{vmatrix}}
{\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}}
\]

But is it that simple?


Until now, a determinant with three rows and three columns,
termed a Third Order determinant, has not been defined;
there are, therefore, no rules for evaluating it. Since the
purpose of determinants is to enable the solution of a set of
equations to be determined, the rules for evaluating
determinants of higher order are designed to fulfil this
purpose.
The value of x in the solution of the given set of equations is
found by elimination to be:

\[
x = \frac{h_1 (b_2 c_3 - b_3 c_2) - b_1 (h_2 c_3 - h_3 c_2) + c_1 (h_2 b_3 - h_3 b_2)}
{a_1 (b_2 c_3 - b_3 c_2) - b_1 (a_2 c_3 - a_3 c_2) + c_1 (a_2 b_3 - a_3 b_2)}
\]

The terms have here been arranged so that all the expressions
in brackets can be expressed as second order determinants.
It can now be seen that this will be the same formula as given
previously. So the rule for evaluating a third order
determinant is:
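\[
\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}
= a_1 \begin{vmatrix} b_2 & c_2 \\ b_3 & c_3 \end{vmatrix}
- b_1 \begin{vmatrix} a_2 & c_2 \\ a_3 & c_3 \end{vmatrix}
+ c_1 \begin{vmatrix} a_2 & b_2 \\ a_3 & b_3 \end{vmatrix}
\]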

The values of y and z are obtained by expressing the


information in the original equations in the form of
determinants by the same routine as for a pair of equations;
that is, the denominator is the same as in the formula for x,
and the column of values from the right hand sides of the
equations replaces the coefficients of y or z in the
determinants forming the respective numerators.
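\[
y = \frac{\begin{vmatrix} a_1 & h_1 & c_1 \\ a_2 & h_2 & c_2 \\ a_3 & h_3 & c_3 \end{vmatrix}}
{\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}}
\qquad
z = \frac{\begin{vmatrix} a_1 & b_1 & h_1 \\ a_2 & b_2 & h_2 \\ a_3 & b_3 & h_3 \end{vmatrix}}
{\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}}
\]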

The same principles can be applied to solve larger sets of

equations, the determinants being set up by exactly the same


routine. A determinant always has the same number of rows
as columns, corresponding to the number of equations and
unknowns that have to be solved.
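
The routine for setting up and evaluating the determinants can be sketched for any number of equations. The code below is an illustration only: `det` expands by the first row, alternately adding and subtracting the products, and `solve_by_determinants` replaces one column at a time with the right-hand sides. Both names are hypothetical, and the method is practical only for small systems because the amount of work grows very quickly with the order of the determinant.

```python
from fractions import Fraction


def det(m):
    """Evaluate a determinant by expanding along its first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for k, element in enumerate(m[0]):
        # Minor: miss out the first row and the column containing the element.
        minor = [row[:k] + row[k + 1:] for row in m[1:]]
        sign = 1 if k % 2 == 0 else -1   # add odd-numbered, subtract even-numbered
        total += sign * element * det(minor)
    return total


def solve_by_determinants(coefficients, rhs):
    """Solve a square set of linear equations using determinants."""
    denominator = det(coefficients)
    if denominator == 0:
        raise ValueError("determinant of coefficients is zero: no unique solution")
    solution = []
    for k in range(len(rhs)):
        # Replace the coefficients of the k-th unknown by the right-hand sides.
        numerator = [row[:k] + [h] + row[k + 1:]
                     for row, h in zip(coefficients, rhs)]
        solution.append(Fraction(det(numerator), denominator))
    return solution


A = [[3, 4, 3], [2, 5, 7], [5, 9, 2]]
print(det(A), solve_by_determinants(A, [107, 148, 151]))
```

For the equations of illustration 2.3 the determinant of coefficients is -56 and the solution is x = 16, y = 5, z = 13, in agreement with the other two methods.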
The rule for evaluating a determinant of n rows and columns
is to take the n elements in the first row and multiply each
by the smaller determinant obtained by missing out the
first row and the column which contains the elements in
question. Thus, the first element is multiplied by the original
determinant reduced by the first row and the first column;
the second element is multiplied by the original determinant
reduced by the first row and the second column and so on.
Finally, the products are collected together by adding all
the odd ones and subtracting all the even ones. So a fourth
order determinant is expanded as:
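\[
\begin{vmatrix}
a_1 & b_1 & c_1 & d_1 \\
a_2 & b_2 & c_2 & d_2 \\
a_3 & b_3 & c_3 & d_3 \\
a_4 & b_4 & c_4 & d_4
\end{vmatrix}
= a_1 \begin{vmatrix} b_2 & c_2 & d_2 \\ b_3 & c_3 & d_3 \\ b_4 & c_4 & d_4 \end{vmatrix}
- b_1 \begin{vmatrix} a_2 & c_2 & d_2 \\ a_3 & c_3 & d_3 \\ a_4 & c_4 & d_4 \end{vmatrix}
+ c_1 \begin{vmatrix} a_2 & b_2 & d_2 \\ a_3 & b_3 & d_3 \\ a_4 & b_4 & d_4 \end{vmatrix}
- d_1 \begin{vmatrix} a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \\ a_4 & b_4 & c_4 \end{vmatrix}
\]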

Third order determinants, which occur in this expansion are


termed as minors of the original determinant. If the minor
is prefaced by the appropriate sign, it is termed as the
co-factor of the element by which it is multiplied. Thus, the
cofactor of b1 is:

\[
-\begin{vmatrix} a_2 & c_2 & d_2 \\ a_3 & c_3 & d_3 \\ a_4 & c_4 & d_4 \end{vmatrix}
\]

Unless a computer is being used, a larger determinant is


hardly ever multiplied out as it stands. It is much easier to


start off by simplifying it, using the principal properties of


determinants to reduce large numbers to small numbers or
to zero wherever possible.
Three of the principal properties have already been
mentioned when discussing second order determinants and
it can be proved that they remain true for determinants of
higher order. The main properties of the determinants can
be summarized as under:
1. The value of the determinant is unchanged if the rows
   and columns are interchanged with each other.

2. The value of the determinant is multiplied by -1 if any
   row is interchanged with any other row. It follows from
   these first two properties that the value of the determinant
   is multiplied by -1 if any two columns are interchanged.

3. If any multiple of any row or column is added to (or
   subtracted from) any other row or any other column
   respectively, the value of the determinant is unchanged.
   However, one cannot add a multiple of a row to a column,
   or vice versa.

4. If any row or column has a factor common to all its
   elements, then this factor may be divided out. For
   instance, it can be seen by expanding both sides that:

\[
\begin{vmatrix} a_1 & b_1 & c_1 \\ p a_2 & p b_2 & p c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}
= p \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}
\]

5. It follows from (4) that if any determinant has a row of
   zeros or a column of zeros, the value of the determinant is
   zero, because the factor 0 can be taken out of that row or
   column, so that the whole determinant is multiplied by zero.

6. It follows from (3) and (5) that if any row is identical to
   any other row or a multiple of any other row (or if any
   column is identical to any other column or a multiple of
   any other column) then the value of the determinant is
   zero.

We would like to illustrate the use of these six properties


before we introduce the final three properties.


Illustration 3.3
1


is obviously zero, using the properties (5) or (6) discussed


above.


Illustration 3.4
3


is zero, using (6)


2


The first row is 1 1/2 times the second row.


Illustration 3.5
4

is equal to
11

13

using (3)
3

-1

By subtracting twice the first row from the second row, the
numbers are made smaller and so easier to manipulate.
In evaluating a large determinant, property (4) is first applied
wherever possible. Then property (3) is used where possible
to make the numbers small and in particular to obtain as
many zeros as possible.
Illustration 3.6
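\[
\begin{aligned}
\begin{vmatrix} 12 & 17 & 36 & 19 \\ -11 & 13 & 28 & 14 \\ 9 & 15 & 12 & 6 \\ 13 & -5 & -8 & 11 \end{vmatrix}
&= 3 \begin{vmatrix} 12 & 17 & 36 & 19 \\ -11 & 13 & 28 & 14 \\ 3 & 5 & 4 & 2 \\ 13 & -5 & -8 & 11 \end{vmatrix}
= 12 \begin{vmatrix} 12 & 17 & 9 & 19 \\ -11 & 13 & 7 & 14 \\ 3 & 5 & 1 & 2 \\ 13 & -5 & -2 & 11 \end{vmatrix} \\
&= 12 \begin{vmatrix} 12 & -1 & 9 & 1 \\ -11 & -1 & 7 & 0 \\ 3 & 3 & 1 & 0 \\ 13 & -1 & -2 & 15 \end{vmatrix}
= 12 \begin{vmatrix} 13 & -1 & 9 & 1 \\ -10 & -1 & 7 & 0 \\ 0 & 3 & 1 & 0 \\ 14 & -1 & -2 & 15 \end{vmatrix} \\
&= 12 \begin{vmatrix} 12 & -1 & 9 & 1 \\ -10 & -1 & 7 & 0 \\ 0 & 3 & 1 & 0 \\ -1 & -1 & -2 & 15 \end{vmatrix}
\end{aligned}
\]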

The numbers have now been reduced quite a lot and there is
a row which includes two zeros. The next step is to make a
third zero in this row. The third row is then interchanged
with the second row and again interchanged with the first
row. The fourth order determinant then reduces to a single
third order determinant.
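\[
\begin{aligned}
12 \begin{vmatrix} 12 & -28 & 9 & 1 \\ -10 & -22 & 7 & 0 \\ 0 & 0 & 1 & 0 \\ -1 & 5 & -2 & 15 \end{vmatrix}
&= -12 \begin{vmatrix} 12 & -28 & 9 & 1 \\ 0 & 0 & 1 & 0 \\ -10 & -22 & 7 & 0 \\ -1 & 5 & -2 & 15 \end{vmatrix}
= 12 \begin{vmatrix} 0 & 0 & 1 & 0 \\ 12 & -28 & 9 & 1 \\ -10 & -22 & 7 & 0 \\ -1 & 5 & -2 & 15 \end{vmatrix} \\
&= 12 \begin{vmatrix} 12 & -28 & 1 \\ -10 & -22 & 0 \\ -1 & 5 & 15 \end{vmatrix}
\end{aligned}
\]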

The same procedure can now be applied to this smaller


determinant. A second zero is obtained in the final column,
this column is moved to the first column by applying property
(2) twice and property (1) is then applied before the final
evaluation.
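\[
\begin{aligned}
12 \begin{vmatrix} 12 & -28 & 1 \\ -10 & -22 & 0 \\ -1 & 5 & 15 \end{vmatrix}
&= 24 \begin{vmatrix} 12 & -28 & 1 \\ -5 & -11 & 0 \\ -1 & 5 & 15 \end{vmatrix}
= 24 \begin{vmatrix} 12 & -28 & 1 \\ -5 & -11 & 0 \\ -181 & 425 & 0 \end{vmatrix} \\
&= 24 \begin{vmatrix} 1 & 12 & -28 \\ 0 & -5 & -11 \\ 0 & -181 & 425 \end{vmatrix}
= 24 \begin{vmatrix} 1 & 0 & 0 \\ 12 & -5 & -181 \\ -28 & -11 & 425 \end{vmatrix}
= 24 \begin{vmatrix} -5 & -181 \\ -11 & 425 \end{vmatrix} \\
&= 24 \left[ (-5 \times 425) - (-11 \times -181) \right] \\
&= 24 (-2125 - 1991) = 24 \times (-4116) \\
&= -98{,}784
\end{aligned}
\]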
This illustration is a long one, since the original determinant
was quite a large one. It should also be remembered that it

is much easier to apply the rules for simplifying and


evaluating determinants oneself than to follow the steps
which someone else has chosen. Often there is more than
one route to the answer and the route adopted may be a
matter of choice.

The last three of the principal properties of determinants


combine together some of the earlier properties, enabling
one to amalgamate two or three steps into one.

__________________

7.

__________________

If any row is moved up or down an even number of rows,


the value of the determinant remains unchanged. If it is
moved an odd number of rows, the value of the
determinant is multiplied by 1. Similarly, if any column
is moved an even number of columns, the value is
unchanged; if it is moved an odd number of columns, the
value is multiplied by 1.
\[
\begin{vmatrix} 12 & -28 & 9 & 1 \\ -10 & -22 & 7 & 0 \\ 0 & 0 & 1 & 0 \\ -1 & 5 & -2 & 15 \end{vmatrix}
= \begin{vmatrix} 0 & 0 & 1 & 0 \\ 12 & -28 & 9 & 1 \\ -10 & -22 & 7 & 0 \\ -1 & 5 & -2 & 15 \end{vmatrix}
\]
\[
\begin{vmatrix} 12 & -28 & 1 \\ -5 & -11 & 0 \\ -181 & 425 & 0 \end{vmatrix}
= \begin{vmatrix} 1 & 12 & -28 \\ 0 & -5 & -11 \\ 0 & -181 & 425 \end{vmatrix}
\]

8. A determinant can be expanded using the elements in
the first column instead of the elements in the first row.
Each element is multiplied by its minor, the smaller
determinant obtained by deleting the first column and
the row which contains the element. The products
are then alternately added and subtracted in the same
way as when expanding by the first row.
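\[
\begin{vmatrix} 12 & -28 & 1 \\ -5 & -11 & 0 \\ -1 & 5 & 15 \end{vmatrix}
= 12\begin{vmatrix} -11 & 0 \\ 5 & 15 \end{vmatrix}
- (-5)\begin{vmatrix} -28 & 1 \\ 5 & 15 \end{vmatrix}
+ (-1)\begin{vmatrix} -28 & 1 \\ -11 & 0 \end{vmatrix}
\]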

9. If there is any row or column in which all the elements,
except one, are zero, the determinant is equal to the
product of that one non-zero element and its cofactor. For
an element in the jth row and the kth column, the cofactor
is obtained by deleting the jth row and kth column to obtain
the minor and then multiplying by -1 if (j + k) is odd but
leaving it unchanged if (j + k) is even.

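\[
\begin{vmatrix} 12 & -28 & 9 & 1 \\ -10 & -22 & 7 & 0 \\ 0 & 0 & 1 & 0 \\ -1 & 5 & -2 & 15 \end{vmatrix}
= \begin{vmatrix} 12 & -28 & 1 \\ -10 & -22 & 0 \\ -1 & 5 & 15 \end{vmatrix}
\]
\[
\begin{vmatrix} 12 & -28 & 1 \\ -5 & -11 & 0 \\ -181 & 425 & 0 \end{vmatrix}
= \begin{vmatrix} -5 & -11 \\ -181 & 425 \end{vmatrix}
\]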

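\[
\begin{vmatrix} 2 & 7 & 0 & 5 \\ 1 & 3 & 0 & 4 \\ 4 & 2 & 0 & 2 \\ 5 & 4 & 7 & 1 \end{vmatrix}
= -7\begin{vmatrix} 2 & 7 & 5 \\ 1 & 3 & 4 \\ 4 & 2 & 2 \end{vmatrix}
\]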

All these properties of determinants can be proved, but the
proofs are of interest only to mathematicians. They are all
intuitively reasonable and the manager who needs to use
determinants should be content to accept that they are valid
for determinants of any order.
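
Although the proofs are omitted, properties (7), (8) and (9) can at least be checked numerically on the illustration. The following is a minimal Python sketch, not part of the original text. It assumes the fourth-order determinant of the illustration has the entries and minus signs used here, and the function name `det` and the variable names are illustrative only.

```python
def det(m):
    """Determinant by expansion along the first column (property 8)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        if m[j][0] == 0:
            continue  # a zero element contributes nothing
        # minor: delete the first column and the row containing the element
        minor = [row[1:] for i, row in enumerate(m) if i != j]
        sign = -1 if j % 2 else 1  # products are alternately added and subtracted
        total += sign * m[j][0] * det(minor)
    return total

# Fourth-order determinant from the illustration (entries assumed as stated above)
d4 = [[12, -28, 9, 1],
      [-10, -22, 7, 0],
      [0, 0, 1, 0],
      [-1, 5, -2, 15]]

value = 12 * det(d4)  # the factor 12 was taken out earlier in the illustration

# Property 7: moving the third row up two rows keeps the value,
# moving it up one row changes the sign.
moved_two = [d4[2], d4[0], d4[1], d4[3]]
moved_one = [d4[0], d4[2], d4[1], d4[3]]

# Property 9: the third row has a single non-zero element, 1, in position
# (3, 3); j + k is even, so the determinant equals 1 times the minor.
reduced = [[12, -28, 1],
           [-10, -22, 0],
           [-1, 5, 15]]

print(value, det(moved_two), det(moved_one), det(reduced))
```

The recursive expansion is slow for large determinants, but it follows the manual method of the text step by step, which is the point of the sketch.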

The systematic nature of determinants makes it possible to
use them in solving sets of simultaneous linear equations
with the aid of a computer and so take the drudgery out of
mathematics. It is still necessary to understand the nature
of determinants and their principal properties in order to
know what the computer is doing and why it may sometimes
fail to produce a solution. It is also an advantage to be able
to solve fairly simple sets of equations without the
complications of computers, either by calculation on paper
or by using a calculator.
An obvious managerial application of determinants is in the
type of problem illustrated below, where it is required to
find the correct product mix in order to make full use of
machine capacities.

Activity 3B
How do determinants simplify the matrices?

Illustration 3.7
In a petroleum engineering workshop there are seven
machines for drilling, two for turning, three for milling and
one for grinding. Four types of brackets are made. Type A is
found by work study to require 7 minutes drilling, 3 minutes
turning, 2 1/2 minutes milling, and 1 1/2 minutes grinding, and
the corresponding times in minutes for the other types are:
B: 5, 0, 1 1/2, 1/2; C: 14, 6, 9, 3 1/2; D: 26, 9, 11, 1 1/2. How many of
each type of bracket should be produced per hour in order
to keep all the machines fully occupied?

Solution
The four equations could be set up in the same way as done
earlier, each equation representing the total minutes of work
per hour on a particular type of machine. The numbers on
the right-hand sides would therefore be 420, 120, 180, and
60 respectively. The coefficients on the left-hand sides would
form the determinant constituting the denominator in the
solution. This can be written out directly from the
information given in the question and then evaluated:
\[
\begin{vmatrix} 7 & 5 & 14 & 26 \\ 3 & 0 & 6 & 9 \\ \tfrac{5}{2} & \tfrac{3}{2} & 9 & 11 \\ \tfrac{3}{2} & \tfrac{1}{2} & \tfrac{7}{2} & \tfrac{3}{2} \end{vmatrix}
= \frac{3}{4}\begin{vmatrix} 7 & 5 & 14 & 26 \\ 1 & 0 & 2 & 3 \\ 5 & 3 & 18 & 22 \\ 3 & 1 & 7 & 3 \end{vmatrix}
= \frac{3}{4}\begin{vmatrix} 0 & 5 & 0 & 5 \\ 1 & 0 & 2 & 3 \\ 0 & 3 & 8 & 7 \\ 0 & 1 & 1 & -6 \end{vmatrix}
\]

The operations have been as follows. A factor of 3 has been
taken out of the second row, and a factor of 1/2 out of each
of the third and fourth rows. Then, in the second determinant,
the second row multiplied by 7, 5 and 3 respectively is
subtracted from the first, third and fourth rows. In the fifth
determinant the first column is subtracted from the third
column.
\[
= -\frac{3}{4}\begin{vmatrix} 5 & 0 & 5 \\ 3 & 8 & 7 \\ 1 & 1 & -6 \end{vmatrix}
= -\frac{15}{4}\begin{vmatrix} 1 & 0 & 1 \\ 3 & 8 & 7 \\ 1 & 1 & -6 \end{vmatrix}
= -\frac{15}{4}\begin{vmatrix} 1 & 0 & 0 \\ 3 & 8 & 4 \\ 1 & 1 & -7 \end{vmatrix}
= -\frac{15}{4}\begin{vmatrix} 8 & 4 \\ 1 & -7 \end{vmatrix}
= -15\begin{vmatrix} 2 & 1 \\ 1 & -7 \end{vmatrix}
= -15(-14 - 1) = 225
\]

Activity 3C
Why are matrices more suitable than determinants for solving equations?


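The numerator for type A appears more formidable at first,
but can be quickly simplified:
\[
\begin{aligned}
\begin{vmatrix} 420 & 5 & 14 & 26 \\ 120 & 0 & 6 & 9 \\ 180 & \tfrac{3}{2} & 9 & 11 \\ 60 & \tfrac{1}{2} & \tfrac{7}{2} & \tfrac{3}{2} \end{vmatrix}
&= 60\begin{vmatrix} 7 & 5 & 14 & 26 \\ 2 & 0 & 6 & 9 \\ 3 & \tfrac{3}{2} & 9 & 11 \\ 1 & \tfrac{1}{2} & \tfrac{7}{2} & \tfrac{3}{2} \end{vmatrix}
= 15\begin{vmatrix} 7 & 5 & 14 & 26 \\ 2 & 0 & 6 & 9 \\ 6 & 3 & 18 & 22 \\ 2 & 1 & 7 & 3 \end{vmatrix} \\
&= 15\begin{vmatrix} -3 & 0 & -21 & 11 \\ 2 & 0 & 6 & 9 \\ 0 & 0 & -3 & 13 \\ 2 & 1 & 7 & 3 \end{vmatrix}
= 15\begin{vmatrix} -3 & -21 & 11 \\ 2 & 6 & 9 \\ 0 & -3 & 13 \end{vmatrix}
= 45\begin{vmatrix} -3 & -7 & 11 \\ 2 & 2 & 9 \\ 0 & -1 & 13 \end{vmatrix} \\
&= 45\begin{vmatrix} -3 & -7 & -80 \\ 2 & 2 & 35 \\ 0 & -1 & 0 \end{vmatrix}
= 45\begin{vmatrix} -3 & -80 \\ 2 & 35 \end{vmatrix}
= 225\begin{vmatrix} -3 & -16 \\ 2 & 7 \end{vmatrix}
\end{aligned}
\]
= 225 (-21 + 32) = 225 × 11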
Since the denominator is 225, the number of brackets per
hour of type A in the solution is 11. There are no new
problems involved in finding the solution for the other three
types of bracket and so this task is left for you. You should
note that it has not been necessary either to introduce
symbols for the four unknowns or to write out the four
equations.
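
As a check on the hand calculation, the same ratio of determinants can be evaluated mechanically. The sketch below is not from the original text: it uses exact fractions from the Python standard library, and the names `det`, `cramer`, `coeffs` and `capacity` are illustrative. It can also be used to check one's own working for the other three types.

```python
from fractions import Fraction as F

def det(m):
    """Determinant by expansion along the first column."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, row in enumerate(m):
        if row[0] != 0:
            minor = [r[1:] for i, r in enumerate(m) if i != j]
            total += (-1) ** j * row[0] * det(minor)
    return total

def cramer(a, b):
    """Return the denominator determinant and the solution of a x = b."""
    d = det(a)
    solution = []
    for k in range(len(a)):
        # numerator: replace column k of the coefficients by the right-hand sides
        a_k = [row[:k] + [b[i]] + row[k + 1:] for i, row in enumerate(a)]
        solution.append(det(a_k) / d)
    return d, solution

# Minutes per bracket of types A, B, C, D on each kind of machine
coeffs = [
    [F(7), F(5), F(14), F(26)],             # drilling
    [F(3), F(0), F(6), F(9)],               # turning
    [F(5, 2), F(3, 2), F(9), F(11)],        # milling
    [F(3, 2), F(1, 2), F(7, 2), F(3, 2)],   # grinding
]
# Machine-minutes available per hour: 7, 2, 3 and 1 machines times 60
capacity = [F(420), F(120), F(180), F(60)]

denominator, brackets = cramer(coeffs, capacity)
print("denominator:", denominator)
for name, count in zip("ABCD", brackets):
    print("type", name, "per hour:", count)
```

Exact fractions are used so that the half-minute times do not introduce rounding error.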

Most managerial applications of determinants are as straightforward as this example.

Determinants are useful when you have to solve a number
of equations, but are difficult to use when you have to look
at computerization of the firm's operations. We then turn to
matrices, to see how these can be used to solve managerial
problems.

Let us take up a managerial application of matrices, which
is of theoretical as well as practical interest. This is mostly
used by advertising agencies and big companies in brand
management.
Illustration 3.8
Three brands of detergent share the market, 40% of
customers buying brand A, 50% brand B, and 10% brand C.
Each week there are changes in the customers' choices. Of
those who bought brand A the previous week, 50% buy it again,
but 15% change to brand B and 35% to brand C. Of those
who bought brand B, 60% buy it again, 10% buy brand A and
30% buy brand C. Of those who bought brand C, 85% buy it
again, 5% buy brand A and 10% buy brand B. What proportion
of the market will each of the three brands eventually hold?
Solution
It is simplest to express the brand switching percentages as
decimals, keeping percentage figures for the market shares.
The change in market shares in the first week can be obtained
as the product of a matrix representing the brand switching
and a vector representing the initial market shares:
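\[
\begin{bmatrix} 0.50 & 0.10 & 0.05 \\ 0.15 & 0.60 & 0.10 \\ 0.35 & 0.30 & 0.85 \end{bmatrix}
\begin{bmatrix} 40 \\ 50 \\ 10 \end{bmatrix}
= \begin{bmatrix} 25.5 \\ 37.0 \\ 37.5 \end{bmatrix}
\]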

It will easily be seen that the terms involved in this product,
(0.50 × 40), (0.10 × 50), etc., are the correct calculations from
the information given in the example in order to obtain the
new market shares. In this type of model, each column of the
matrix adds up to 1 and the elements in each vector total 100.
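
The first-week product and the two totals just mentioned can be confirmed with a few lines of Python. This is an illustrative sketch only. The helper name `apply` is an assumption, and exact fractions are used so that the decimals compare without rounding error.

```python
from fractions import Fraction as F

# Brand switching matrix: column = brand bought last week, row = brand bought this week
M = [[F("0.50"), F("0.10"), F("0.05")],
     [F("0.15"), F("0.60"), F("0.10")],
     [F("0.35"), F("0.30"), F("0.85")]]
a = [F(40), F(50), F(10)]  # initial market shares in per cent

def apply(matrix, vector):
    """Pre-multiply a column vector by a matrix."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

b = apply(M, a)  # market shares after the first week

column_totals = [sum(row[k] for row in M) for k in range(3)]
print([float(x) for x in b], column_totals, sum(b))
```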


For the following week, the new market-share vector must be
pre-multiplied by the same brand switching matrix:
\[
\begin{bmatrix} 0.50 & 0.10 & 0.05 \\ 0.15 & 0.60 & 0.10 \\ 0.35 & 0.30 & 0.85 \end{bmatrix}
\begin{bmatrix} 25.5 \\ 37.0 \\ 37.5 \end{bmatrix}
= \begin{bmatrix} 18.325 \\ 29.775 \\ 51.900 \end{bmatrix}
\]

As could have been guessed from the original information,
brand C is getting a steadily larger share of the market. It
cannot, however, obtain a monopoly. A little arithmetic will
show that if brand C has 80% of the market one week it
cannot have more than 75% the next week. Clearly its
eventual share will be somewhere between 51.9% and 75%,
but a direct method of finding the eventual share
is desirable.
Let the brand switching matrix be M and the successive
market-share vectors be a, b, c so that the above two equations
can be written as:
Ma = b
Mb = c
A vector x will represent the final market shares such that
pre-multiplying it by the brand switching matrix produces
an unchanged result. That is:
Mx = x
Using I for the unit matrix and 0 for a zero matrix or vector,
this can be rearranged:
Mx = Ix
Mx - Ix = 0
(M - I) x = 0
The fact that the last equation is equivalent to the preceding
one depends on the property of matrix multiplication termed
distributive. The full summary of the properties of matrix
multiplication is that it is associative and distributive but
not, in general, commutative. Because it is not commutative,
it is essential to place M and I before x in both equations to
be sure that they are equivalent.

Since M must be a square matrix and I is chosen to have the
same number of rows and columns as M, there is no difficulty
in finding the matrix (M - I):
\[
\begin{bmatrix} 0.50 & 0.10 & 0.05 \\ 0.15 & 0.60 & 0.10 \\ 0.35 & 0.30 & 0.85 \end{bmatrix}
- \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} -0.50 & 0.10 & 0.05 \\ 0.15 & -0.40 & 0.10 \\ 0.35 & 0.30 & -0.15 \end{bmatrix}
\]

One cannot find x by obtaining the inverse of (M - I), because
it is a singular matrix. Since each column of M totals 1, each
column of (M - I) must total zero. The determinant of this
matrix must therefore be equal to zero, applying properties
(3) and (5) of the determinants section: adding the second and
subsequent rows to the first row produces a row of zeros. It
follows that (M - I) will always be a singular matrix in this
type of problem.
The equations represented by the matrix equation are clearly
not all independent. When it has been stated what
proportions of previous customers are retained or gained by
all brands except the last, the equation giving the market
share of the last brand is made up of what is left.
There is an additional fact not included in the matrix
equation. Expressing the market shares as percentages, all
the elements in the vector x must total 100. This gives an
additional row to permit a unique solution to be found by
row operations:
\[
\left[\begin{array}{rrr|r}
-0.50 & 0.10 & 0.05 & 0 \\
0.15 & -0.40 & 0.10 & 0 \\
0.35 & 0.30 & -0.15 & 0 \\
1 & 1 & 1 & 100
\end{array}\right]
\]

In the process of solution, one of the rows will become a row
of zeros and the remaining rows will give a unique solution.
It is left to you to confirm that the final shares are
respectively 12/109, 23/109 and 74/109, or approximately 11.0%,
21.1% and 67.9%.
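
For readers who want to confirm the result by machine rather than by hand, the row operations can be sketched in Python as follows. This is a minimal illustration, not the method of the text in every detail: the function name `solve_augmented` is an assumption, and plain Gauss-Jordan elimination with exact fractions stands in for whatever sequence of row operations one would choose on paper.

```python
from fractions import Fraction as F

def solve_augmented(rows):
    """Gauss-Jordan elimination on an augmented matrix.

    Redundant equations reduce to rows of zeros. Returns the solution
    and the fully reduced rows.
    """
    rows = [r[:] for r in rows]
    unknowns = len(rows[0]) - 1
    pivot_row = 0
    for col in range(unknowns):
        # find a row, at or below the current one, with a non-zero entry here
        src = next((r for r in range(pivot_row, len(rows))
                    if rows[r][col] != 0), None)
        if src is None:
            continue
        rows[pivot_row], rows[src] = rows[src], rows[pivot_row]
        pivot = rows[pivot_row][col]
        rows[pivot_row] = [x / pivot for x in rows[pivot_row]]
        for r in range(len(rows)):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y
                           for x, y in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return [rows[i][-1] for i in range(unknowns)], rows

augmented = [
    [F("-0.50"), F("0.10"), F("0.05"), F(0)],
    [F("0.15"), F("-0.40"), F("0.10"), F(0)],
    [F("0.35"), F("0.30"), F("-0.15"), F(0)],
    [F(1), F(1), F(1), F(100)],   # market shares total 100 per cent
]

shares, reduced = solve_augmented(augmented)
print(shares)                       # exact percentages as fractions
print([round(float(s), 1) for s in shares])
```

The exact percentages come out as multiples of 100/109, which correspond to the fractions 12/109, 23/109 and 74/109 of the market.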
Activity 3D
Can Markov chains be used in petroleum industry? Explain.

The solution can be checked by forming it into a vector and
pre-multiplying this vector by the brand switching matrix. It
will be noted that the information about the initial market
shares is not used in finding the solution. The final market
shares will be exactly the same whatever shares the brands
started with.
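
The claim that the starting shares do not matter can be illustrated by repeating the weekly multiplication from two quite different starting vectors. The sketch below uses ordinary floating-point arithmetic. The second starting vector and the choice of 100 weeks are arbitrary assumptions made for the demonstration.

```python
M = [[0.50, 0.10, 0.05],
     [0.15, 0.60, 0.10],
     [0.35, 0.30, 0.85]]

def step(matrix, shares):
    """One week of brand switching: pre-multiply the share vector by the matrix."""
    return [sum(m * s for m, s in zip(row, shares)) for row in matrix]

def iterate(matrix, shares, weeks):
    for _ in range(weeks):
        shares = step(matrix, shares)
    return shares

from_given = iterate(M, [40.0, 50.0, 10.0], 100)   # shares given in the illustration
from_other = iterate(M, [0.0, 0.0, 100.0], 100)    # hypothetical: brand C starts with everything

print([round(s, 3) for s in from_given])
print([round(s, 3) for s in from_other])
```

Both runs settle on the same vector, which agrees with the shares found by row operations.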
As a practical application, the brand switching example has
two drawbacks. First, repeated matrix multiplications will
eventually involve fractions of customers, which is
impossible. Second, it is highly improbable that there will
be a persistent pattern of brand switching; either customers
will become less inclined to switch brands, or
the pattern of switching will be disrupted by special sales
campaigns or dynamic market forces.
However, the model has many other applications more
realistic than brand switching, particularly when the
discussion is transferred from proportions to probabilities.
Where an operator attends several machines which are
subject to random stoppages at differing average frequencies,
so that two or more may be stopped at the same time causing
machine interference, the technique here discussed enables
the average productivity of each machine to be accurately
calculated. It can also be used to calculate average stock
levels and the probability of running out of stock when
demand and supply are random and so assist in finding
the optimum stockholding policy. The model has the
impressive title ergodic Markov Chains.
You would have noticed that one or two theoretical
difficulties have been ignored in presenting the technique.
How can it be proved that the market shares will settle down
to a stable vector? Is it always permissible to use an equation,
which assumes the existence of the stable vector?
The technique is, in fact, valid for the types of problem
which have been discussed. But it would be very unwise to
apply it unthinkingly in quite different situations, where not
all the elements in the matrix are positive. Pure
mathematicians go deeply into all questions of validity and
it is necessary to seek their guidance whenever the
appropriateness of certain techniques is in doubt.
Several examples of problems involving sets of simultaneous
linear equations were discussed in unit 2, which could be
solved either by elimination or by substitution. However,
all these methods require a certain amount of ingenuity and a
great deal of calculation. Attempting to solve ten or twelve
simultaneous equations by elimination would be a task not
meant for the faint-hearted and usually managers do not have
the time to do it. Yet in many important applications, such
as finding the product mix that will keep a number of
different types of machines fully occupied, it is not unusual
to have large numbers of equations and unknowns. If this
amount of detail cannot be avoided, it is reasonable to look
for a routine method of solving equations which is so
automatic that a computer can be employed to find the
solution. Such routine methods, involving determinants and
matrices, are discussed here.
Keeping all the machines fully occupied is not necessarily
the manager's main objective. There may be other product
mixes which leave some machines partly unoccupied but
yield a greater total profit. The manager's usual objective is
to find the product mix which will maximize the total profit
without exceeding the capacities of the machines.

3.1 Nahar Chemical Mills produces three varieties of base
oil: superfine grade (A grade), fine grade (B grade) and
coarse grade (C grade). The total annual sales, in lacs of
rupees, of these products for the years 1999 and 2000 in
the four cities are given below. Find the total sales of the
three varieties of base oil for the two years.

For the year 1999
Product                         Calcutta   Mumbai   Chennai   Delhi
Superfine base oil (A grade)       30        16        12       24
Fine base oil (B grade)            10        48        14       16
Coarse base oil (C grade)          16         8        62       12

For the year 2000
Product                         Calcutta   Mumbai   Chennai   Delhi
Superfine base oil (A grade)       34        20        10       14
Fine base oil (B grade)            10        44        22        8
Coarse base oil (C grade)          26        12        78       10

3.2 A 2T oil manufacturer produces three products A, B, C
which he sells in two markets. Annual sale volumes are
indicated as follows:

Market            Products
             A          B          C
I          8,000     10,000     15,000
II        10,000      2,000     20,000

(i) If the unit sale prices of A, B and C are Rs 2.25, Rs 1.50 and
Rs 1.25 respectively, find the total revenue in each
market with the help of matrices. (ii) If the unit costs of
the above three products are Rs 1.60, Rs 1.20 and Re 0.90
respectively, find the gross profit with the help of
matrices.


3.3 In a development plan of a cracker complex, a contractor
has taken a contract to construct certain buildings for
which he needs building materials like stone, sand, etc.
There are three firms A, B, C that can supply him these
materials. At one time these firms A, B, C supplied him
40, 35 and 25 truck loads of stone and 10, 5 and 8 truck
loads of sand respectively. If the costs of one truck load of
stone and sand are Rs 1,200 and Rs 500 respectively,
find the total amount paid by the contractor to each
of these firms A, B and C separately.
3.4 HEG Ltd. maintains the records of the daily cost C of
operating the "stress relieving furnaces division",
which is a linear function of the number of incoming
electrodes l and outgoing electrodes p, plus a fixed cost
a, i.e.,
C = a + bp + dl
Given the following data for 3 days, find the values of a,
b, and d by setting up a linear system of equations and
using the matrix inverse.

Day   Cost (in Rs)   No. of incoming electrodes   No. of outgoing electrodes
                     to stress relieving           from stress relieving
                     furnaces division             furnaces division
1.    6,950          40                            10
2.    6,725          35
3.    7,100          40                            12


3.5 Mr Bhattacharya has retired from service in 1999 from
IOC. He received Rs 14 lacs as the provident fund and
retirement benefits from the company. He decides to
invest a sum of Rs 40,000 in three different stocks that
yield 10%, 12% and 15% respectively. The income from
the third stock (which yields 15%) is twice the income
from the first stock (which yields 10%). After one year,
Mr Bhattacharya earned an income of Rs 5000 from his
investments. What is the amount that he has invested
in each type of stock?
3.6 Robin Singh & Company Ltd stocks lubes of Castrol
brand and Mak brand. The matrix of transition
probabilities of the lubes is shown below:

            Castrol    Mak
Castrol       0.9      0.1
Mak           0.3      0.7

Determine the market share of each brand in the
equilibrium position.
3.7 By expanding the determinants, prove the identities:
(a) a1 b1

a2 b2
c)

a1 b1
a2 b2

a1

a2

(b) a1

b1

b1

b2

a2

b2

a1 + pa2

b1 + pb2

a2

b2

a2

b1

a1

b1

a1 + qb1

b1

a2 + qb2

b2

3.8 Evaluate the determinants:


a)

(b) 1

(c ) 1

(e) 3

-1

(g ) 3

-3

(f)

(d)

(h)

11

13


3.9 Evaluate the following determinants, first using


property (9) and then the other properties as
appropriate:

__________________

(a) 3

__________________

__________________

__________________

(b)

-2

-1

-6

(c)

9 21

11

-3

-2

21

5 15

-7

(e)

10

-5

__________________
__________________

(d) 3

__________________

-2

-8

__________________

-1

-3

__________________

-10 17

-3

3.10 Evaluate the following determinants, first using


property (4) three times, then property (3) once or twice
to make the number smaller and finally property (9), or
other properties as appropriate.
(a) 2

1 (b)

12 24

35

(d) 1

-3

-44

30

32

35

-18

28

10

-63

34

40

(e)

(c)

6 12

12

-1

-3

12

-6

12

3.11 Four boys order in a fish-and-chips restaurant. A orders
fish, chips and coke. B orders two fish with chips. C
orders fish and coke. D orders chips and coke. The prices
are Rs 50 for fish, Rs 18 for chips, and Rs 15 for coke.
(a) Express each boy's order as a row vector.
(b) Add together these four vectors to obtain a fifth row
vector representing the total quantities ordered.
(c) Express the prices as a column vector.
(d) Multiply each of the five row vectors by the price
vector, to obtain the amount owed by each boy and
the total amount owed.
(e) Check that the fifth result in (d) is equal to the sum
of the other four results.

Objectives
After reading this unit, you will be able to:

Understand how probability matters in business

Understand the basic methods of solving probability problems

Understand what a probability distribution is and why it is important

If all business decisions could be made under conditions of


certainty, the only valid justification for a poor decision would
be failure to consider all the pertinent facts. With certainty,
one can make a perfect forecast of the future. Unfortunately,
however, the manager rarely, if ever, operates in a world of
certainty. Usually, the manager is forced to make decisions
when there is uncertainty as to what will happen after the
decisions are made. In this latter situation, the mathematical
theory of probability furnishes a tool that can be of great
help to the decision maker.
The idea of probability is normally associated and
remembered in connection with games or gambling. If a
bookmaker offers odds of 2 to 1 against a team winning the
world cup, it means that he considers that the probability,
that the team will win, is not more than 1/3. When he accepts
a bet of Re 1, he believes he has a probability of at least 2/3
that he will gain the money and a probability of not more
than 1/3 that he will have to pay out Rs 2 as well as return the
original Re 1. So he expects to make a profit, not necessarily
on this particular bet but on all the bets he takes when
applying the same general principles.
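
The bookmaker's reasoning can be put into numbers. The following Python sketch is an illustration, not part of the original text: it converts odds against an event into the break-even probability and computes the bookmaker's expected gain per Re 1 bet for an assumed true probability of the team winning. The function names and the trial value 1/4 are hypothetical.

```python
from fractions import Fraction

def implied_probability(against, in_favour):
    """Break-even probability for odds of `against` to `in_favour` against an event."""
    return Fraction(in_favour, against + in_favour)

def bookmaker_expected_gain(p_win, stake=1, against=2, in_favour=1):
    """Expected gain to the bookmaker per bet if the team wins with probability p_win."""
    payout = Fraction(stake * against, in_favour)  # paid out in addition to returning the stake
    return (1 - p_win) * stake - p_win * payout

break_even = implied_probability(2, 1)                       # odds of 2 to 1 against
gain_at_break_even = bookmaker_expected_gain(break_even)
gain_if_less_likely = bookmaker_expected_gain(Fraction(1, 4))

print(break_even, gain_at_break_even, gain_if_less_likely)
```

When the true probability is below the break-even value the expected gain is positive, which is why the bookmaker offers these odds only if he believes the probability is not more than 1/3.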
Herein lies the basis for a definition of probability as the
degree of belief that an event will occur. Most statisticians
would not accept this as a satisfactory definition, but there
is an important minority, usually known as Bayesians, who
do in fact take degree of belief as their definition of
probability. The fact that there are differences of opinion
even over the definition of probability will serve as a warning
to the reader of the difficulties of the subject, for there were
no such disagreements over the subjects covered in the
previous chapters.

It may be added that it is only the early quotations of betting
odds that are based on the bookmaker's degree of belief. When
a reasonable number of bets have been taken, the bookmaker
will adjust the odds in such a way that he makes a profit
whichever team wins. It is the balance of opinion among
punters that really decides the odds.
An alternative definition of probability is based on counting
numbers of equally likely events. In a situation in which there
are n possible outcomes, all equally likely, the probability
of each of these outcomes is 1 divided by n. The outcomes
must be defined in such a way that exactly one will occur;
that is, it must be impossible for none of them or for more
than one of them to occur. They are termed as simple events.
If an ordinary six-sided die is thrown, the possible outcomes
are the numbers 1, 2, 3, 4, 5 and 6. One and only one of these
will in fact occur. If the die is unbiased, which is a way of
saying that all the outcomes are equally likely, then the
probability of each simple event is 1/6.
It is now possible to discuss compound events, such as "less
than 3", "3 or more", or "an even number" when throwing an
unbiased die. If a compound event is defined to include m
equally likely simple events, then the probability of the
compound event is m/n. The first example given consists of
the simple events 1, 2; the second consists of 3, 4, 5, 6; and
the third consists of 2, 4, 6. So their respective probabilities
are 1/3, 2/3, and 1/2.
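
The counting rule m/n is easy to mechanise. The short Python sketch below is illustrative only, and the helper name `probability` is an assumption. It counts the simple events belonging to each of the three compound events for an unbiased die.

```python
from fractions import Fraction

outcomes = [1, 2, 3, 4, 5, 6]  # the n equally likely simple events

def probability(event):
    """Classical probability m/n: m simple events satisfy `event`, out of n."""
    m = sum(1 for o in outcomes if event(o))
    return Fraction(m, len(outcomes))

p_less_than_3 = probability(lambda o: o < 3)
p_3_or_more = probability(lambda o: o >= 3)
p_even = probability(lambda o: o % 2 == 0)

print(p_less_than_3, p_3_or_more, p_even)
```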
This definition of probability is called the classical or a
priori definition of probability. It is a priori because the
probability can be determined before the die has even been
thrown. But it has two great weaknesses. Its theoretical
weakness is that the phrase "equally likely" is synonymous
with "equally probable", and so the definition is circular. Its
practical weakness is that it cannot cope with situations
where the simple events are not all equally likely; it would
be an unusual horse-race in which one could say with
confidence that all the horses are equally likely to win.
Later in this chapter a third definition of probability will be
introduced. However, the classical definition is perfectly
adequate for discussing permutations and combinations, and
nothing said later will detract from the validity of the
conclusions reached using the classical definition.

Most of us are familiar with the laws of chance regarding
coin flipping. If someone asks about the probability of a head
on one toss of a coin, the answer will be one-half, or 0.50.
This answer is based on common experience with coins, and
assumes that the coin is a fair coin and that it is fairly
tossed. This is an example of objective probability. There
are two interpretations of objective probability. The first
relies on the symmetry of outcomes and implies that
outcomes that are identical in essential aspects should have
the same probability. A fair coin is defined to be one that is
evenly balanced and has two sides that are identical (except
for minor differences in the image). Hence, each side should
have equal probability of one-half (ignoring the possibility of
the coin landing on its edge). If the coin were bent, weighted,
or two-tailed, the answer would be different. As another
example, suppose we have a box containing three red and
seven black balls (that are of the same size, have the same
feel, and are otherwise identical except for colour), and the
balls are thoroughly mixed. The symmetry of outcomes
interpretation would assign a 0.10 probability to each ball,
and hence a 0.30 chance of drawing a red ball.
The relative frequency interpretation of objective probability
relies on historical experience in identical situations. Thus,
if a coin has been flipped 10,000 times with 4,998 heads, we
would conclude that the probability was 0.50 (i.e. 4,998/10,000
rounded) for a head the next time the coin was flipped in the
same manner as before.

A subjective interpretation of probabilities is often useful


for business decision making. In the case of objective
probability, definitive historical information, common
experience (objective evidence), or rigorous analysis lie
behind the probability assignment. In the case of subjective
interpretation, quantitative historical information may not
be available; and instead of objective evidence, personal
experience becomes the basis of the probability assignment.
For managerial decision-making purposes, the subjective
interpretation is frequently required, since reliable objective
evidence may not be available.
Assume that a manager is trying to decide whether or not to
build a new factory, and the success of the factory depends
largely on whether or not there is a recession in the next
five years. A probability assigned to the occurrence of a
recession would be a subjective weight, which would be
assigned after. There would certainly be less agreement on
this probability than there would be on the probabilities of
drawing a red ball, or of a fair coin coming up heads. Since
we are primarily concerned in this book with management
decisions, we shall often assign subjective probabilities to
events that have a critical bearing on the management
decision. This procedure aims to ensure consistency between
a decision maker's judgment about the likelihood of the
possible states of nature and the decision that is
made.
One important objective of the suggested decision process is
to allow the decision maker to think in terms of the possible
events that may occur after a decision, the consequences of
these events and the probabilities of these events and
consequences, rather than having the manager jump
immediately to the question of whether or not the decision
is desirable.

Two fundamental statements about probabilities are:

1. Probabilities of all the various possible outcomes of a
trial must sum to one.

2. Probabilities are always greater than or equal to zero
(i.e., probabilities are never negative) and are less than
or equal to one. The smaller the probability, the less
likely the event is to happen.

The first statement indicates that if A and B are the only
candidates for an office, the probability that A will win plus
the probability that B will win must sum to one (assuming a
tie is not possible).


The second statement results in the following
interpretations. If an event has a positive probability, it may
possibly occur; the event may be impossible, in which case it
has a zero probability; or the event may be certain to occur,
in which case the probability is equal to one. Regardless of
whether probabilities are interpreted as objective
probabilities or as subjective weights, it is useful to think in
terms of a weighting scale running from zero to one. If
someone tosses a coin of unknown characteristics 500 times
to obtain an estimate of objective probabilities and the results
are 225 heads and 275 tails, the range of possible results may
be converted to a zero-to-one scale by dividing by 500. The
actual results are 225/500 = 0.45 heads and 275/500 = 0.55
tails. Hence, if we wish to derive probabilities, we shall
manipulate the data so as to adhere to the zero-to-one scale.
The 0.45 and the 0.55 may be used as the estimators of the
true probabilities of heads and tails (the true probabilities
are unknown).


Two or more events are mutually exclusive if only one of the


events can occur on any one trial. The probabilities of
mutually exclusive events can be added to obtain the
probability that one of a given collection of the events will
occur.
Illustration 4.1
The probabilities shown in Table 4.1 reflect the subjective
estimate of a newspaper editor regarding the relative chances
of four candidates for a public office (assume a tie is not
possible).


Table 4.1: Election Probabilities

Event: Elect        Probability
Candidate A            0.18
Candidate B            0.42
Candidate C            0.26
Candidate D            0.14
                       1.00


These events are mutually exclusive, since in one election
(or in one trial) only one event may occur; therefore, the
probabilities are additive. Taking candidates A and B to be
the Democrats and candidates C and D the Republicans, the
probability of a Democratic victory is 0.18 + 0.42 = 0.60; of a
Republican victory, 0.26 + 0.14 = 0.40; or of either B or
C winning, 0.42 + 0.26 = 0.68. The probability of both B and C
winning is zero, since only one of the mutually exclusive
events can occur on any one trial.
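
The addition rule for mutually exclusive events amounts to summing table entries. The Python sketch below illustrates this with the figures of Table 4.1. Grouping A with B and C with D by party is an assumption inferred from the totals 0.60 and 0.40, and the names `elect` and `p_any` are illustrative.

```python
from fractions import Fraction

# Subjective probabilities from Table 4.1
elect = {
    "A": Fraction("0.18"),
    "B": Fraction("0.42"),
    "C": Fraction("0.26"),
    "D": Fraction("0.14"),
}

def p_any(candidates):
    """Probability that one of the listed (mutually exclusive) candidates wins."""
    return sum(elect[c] for c in candidates)

p_democrat = p_any(["A", "B"])    # assumed party grouping
p_republican = p_any(["C", "D"])  # assumed party grouping
p_b_or_c = p_any(["B", "C"])
total = p_any(elect)

print(float(p_democrat), float(p_republican), float(p_b_or_c), float(total))
```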

Events may be either independent or dependent. If two


events are (statistically) independent, the occurrence of one
event will not affect the probability of the occurrence of the
second event.
When two (or more) events are independent, the probability
of both events (or more than two events) occurring is equal
to the product of the probabilities of the individual events.
That is:
P(A and B) = P(A) x P(B) if A, B independent
where
P(A and B) = probability of events A and B both occurring
P(A) = probability of event A occurring
P(B) = probability of event B occurring

The above equation indicates that the probability of A and B
both occurring is equal to the probability of A multiplied by
the probability of B, if A and B are independent. If A is the
event of a head on the first toss of a fair coin and B is the
event of a head on the second toss of the coin, then:
P(A) = 1/2
P(B) = 1/2
P(A and B) = 1/2 x 1/2 = 1/4
The probability of A and B occurring (two heads) is one-fourth.


P(A and B) is the joint probability of events A and B. Where
appropriate, the word "and" can be omitted to simplify the
notation and the joint probability can be written simply as
P(AB).
To define independence mathematically, we need the symbol
P(B|A). The symbol P(B|A) is read "the probability of
event B, given that event A has occurred." P(B|A) is the
conditional probability of event B, given that event A has
taken place. Note that P(B|A) does not mean the probability
of event B divided by A; the vertical line followed by A means
"given that event A has occurred."
With independent events:
P(B|A) = P(B) if A, B independent
That is, the probability of event B, given that event A has
occurred, is equal to the probability of event B if the two
events are independent. With two independent events, the
occurrence of the one event does not affect the probability of
the occurrence of the second [in like manner, P(A|B) = P(A)].
Two events are dependent if the occurrence of one of the
events affects the probability of the occurrence of the second
event.
Let us take an example.
Flip a fair coin and determine whether the result is heads or
tails. If heads, flip the same coin again. If tails, flip an unfair
coin that has a three-fourths probability of heads and a
one-fourth probability of tails. Is the probability of heads on the
second toss in any way affected by the results of the first
toss? The answer here is yes, since the result of the first
toss affects which coin (fair or unfair) is to be tossed the
second time.
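
The dependence in this example can be checked with a short calculation. The sketch below is illustrative only: the variable names are assumptions, and the overall probability of heads on the second toss (which the text does not state) is obtained by weighting the two conditional probabilities by the outcomes of the first toss.

```python
from fractions import Fraction

# First toss: a fair coin.
p_first = {"H": Fraction(1, 2), "T": Fraction(1, 2)}

# Second toss: the coin used depends on the first result.
# Heads first -> fair coin again; tails first -> unfair coin (3/4 heads).
p_heads_second_given = {"H": Fraction(1, 2), "T": Fraction(3, 4)}

# Unconditional probability of heads on the second toss.
p_heads_second = sum(p_first[o] * p_heads_second_given[o] for o in p_first)

print(p_heads_second_given["H"], p_heads_second_given["T"])  # 1/2 3/4
print(p_heads_second)  # 5/8
```

Because the two conditional probabilities differ, the second toss is dependent on the first.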


Another example of dependent events involves mutually
exclusive events. If events A and B are mutually exclusive
(and each has a nonzero probability), they are dependent. Given
that event A has occurred, the conditional probability of B
occurring must be zero, since the two events are mutually
exclusive. Let us take up two examples that clarify the
difference between joint probabilities, conditional
probabilities and unconditional (or marginal) probabilities.


Illustration 4.2
Assume we have three boxes, which contain red and black
balls as follows:
Box 1: 3 red and 7 black
Box 2: 6 red and 4 black
Box 3: 8 red and 2 black

Suppose we draw a ball from box 1; if it is red, we draw
a ball from box 2. If the ball drawn from box 1 is black, we
draw a ball from box 3. Consider the following probability
questions about this game:
1. What is the probability of drawing a red ball from box
   1? This probability is an unconditional or marginal
   probability; it is 0.30. (The marginal probability of
   getting a black is 0.70.)
2. Suppose we draw a ball from box 1, and it is red; what is
   the probability of another red ball when we draw from
   box 2 on the second draw? The answer is 0.60. This is
   an example of a conditional probability. That is, the
   probability of a red ball on the second draw if the draw
   from box 1 is red is a conditional probability.
3. Suppose our first draw from box 1 was black; then the
   conditional probability of a red ball on the second draw
   (now from box 3) is 0.80. The draw from box 1 (the
   conditioning event) is very important in determining
   the probabilities of red (or black) on the second draw.
4. Suppose, before we draw any balls, we ask the question:
   What is the probability of drawing two red balls? This
   would be a joint probability; the event would be a red
   ball on both draws. The computation of this joint
   probability is a little more complicated than the above
   questions, and some analysis will be of value.
   Computations are as follows:
   P(A and B) = P(B|A) x P(A)


Figure 4.1

Table 4.2 shows the joint probability of two red balls as 0.18
[i.e., P(R and R) or more simply P(RR), the top branch of the
tree]. The joint probabilities may be summarized as follows:
Two red balls: P(RR) = 0.18
A red ball on first draw and a black ball on second draw: P(RB) = 0.12
A black ball on first draw and a red ball on second draw: P(BR) = 0.56
Two black balls: P(BB) = 0.14
Total = 1.00

Table 4.2: Probabilities Calculations

Event   Marginal        Conditional       Joint
        P(A)            P(B|A)            P(A and B)
RR      P(R) = 0.30     P(R|R) = 0.60     P(RR) = 0.18
RB      P(R) = 0.30     P(B|R) = 0.40     P(RB) = 0.12
BR      P(B) = 0.70     P(R|B) = 0.80     P(BR) = 0.56
BB      P(B) = 0.70     P(B|B) = 0.20     P(BB) = 0.14


The tree diagram (Figure 4.1) is a very useful device for
illustrating uncertain situations. The first fork shows that
either a red or a black
may be drawn, and the probabilities of these events are given.
If a red is drawn, we go to box 2, where again a red or black
may be drawn, but with probabilities determined by the fact
that the draw will take place in box 2. For the second forks,
we have conditional probabilities (the probabilities depend
on whether a red or a black ball was chosen on the first draw).
At the end of each path are the joint probabilities of following
that path. The joint probabilities are obtained by multiplying
the marginal (unconditional) probabilities of the first branch
by the conditional probabilities of the second branch.
Table 4.3 presents these results in a joint probability table;
the intersection of the rows and columns are joint
probabilities. The column on the right gives the unconditional
probabilities (marginals) of the outcome of the first draw;
the bottom row gives the unconditional or marginal
probabilities of the outcomes of the second draw. Table 4.3
effectively summarizes the tree diagram.
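
The multiplication along each path of the tree can be sketched in a few lines of Python. This is a minimal illustration under the figures of Illustration 4.2; the dictionary names (`p_first`, `p_second_given_first`, `joint`, `p_second`) are assumptions chosen for readability, not notation from the text.

```python
# Marginal probabilities for the first draw (box 1: 3 red, 7 black).
p_first = {"R": 0.30, "B": 0.70}

# Conditional probabilities for the second draw, given the first.
# Red first -> box 2 (6 red, 4 black); black first -> box 3 (8 red, 2 black).
p_second_given_first = {
    "R": {"R": 0.60, "B": 0.40},
    "B": {"R": 0.80, "B": 0.20},
}

# Joint probability of each path = marginal x conditional.
joint = {
    first + second: p_first[first] * p_second_given_first[first][second]
    for first in p_first
    for second in ("R", "B")
}

# Marginal probabilities for the second draw: add the two paths
# that end in the same colour.
p_second = {s: joint["R" + s] + joint["B" + s] for s in ("R", "B")}

for path, p in joint.items():
    print(path, round(p, 2))
print({colour: round(p, 2) for colour, p in p_second.items()})
```

The four joint probabilities correspond to the ends of the four paths, and the second-draw marginals correspond to the bottom row of the joint probability table.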
Now, let us compute some additional probabilities:
1. Probability of one red and one black ball, regardless of
   order:
   = 0.56 + 0.12 = 0.68
2. Probability of a black ball on draw 2:
   Explanatory calculation:
   Probability of red-black       = 0.12
   Probability of black-black     = 0.14
   Probability of black on draw 2 = 0.26
3. Probability of second draw being red if first
   draw is red:
   = 0.60
   If first draw is red, we are in the R row of
   Table 4.3, which totals 0.30. The question is,
   What proportion is 0.18 of 0.30? The answer is
   0.60; or in terms of the appropriate formula:
   P(R2|R1) = P(R2 and R1) / P(R1) = 0.18 / 0.30 = 0.60

Table 4.3: Joint Probability Table

                           Second Draw
First Draw                 R               B               Marginal Probability of
                                                           Outcome on First Draw
R                          P(RR) = 0.18    P(RB) = 0.12    0.30
B                          P(BR) = 0.56    P(BB) = 0.14    0.70
Marginal Probability of
Outcome on Second Draw     0.74            0.26            1.00

Illustration 4.3
We give a further illustration of the basic probability
definitions. A survey is taken of 100 families; information is
obtained about family income and about whether the family
purchases a speciality food product. The results are shown
in Table 4.4.
Table 4.4: Survey of 100 Families, Classified by Income and
Buying Behaviour

Family is:                           Low Income          High Income         Total
                                     (family income      (family income      number of
                                     below Rs.30,000)    of Rs.30,000        families
                                                         or more)
Buyer of speciality food products    18                  20                  38
Non-buyer                            42                  20                  62
Total number of families             60                  40                  100


Suppose a family is to be selected at random from this group.
1. What is the probability that the family selected will be
   a buyer? Since 38 of the 100 families overall are buyers,
   the probability is 0.38. Note that this is a marginal or
   unconditional probability.
2. What is the probability that the selected family is both
   a buyer and with high income? Note that this is a joint
   probability. P(Buyer and High income) = 20/100 = 0.20.
3. Suppose that a family is selected at random and you are
   informed that it has high income. What is the probability
   that this family is a buyer? Note that this asks for the
   conditional probability P(Buyer|High income). Of the
   40 families with High income, 20 are Buyers. Hence, the
   probability is 20/40 = 0.50.
4. Are the events Buyer and High income independent for
   this group of families? Note from question 1 that
   P(Buyer) = 0.38; from question 3, P(Buyer|High income)
   = 0.50. These are not the same. Hence, the two events
   are dependent. Knowing the family has high income
   affects the probability that it is a buyer. Another way of
   expressing this dependence is to say that the percentage of
   buyers is not the same for the high and low-income families.
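
The four answers above can be reproduced directly from the counts in Table 4.4. The following is a minimal sketch; the dictionary layout and variable names are assumptions made for illustration.

```python
# Counts from Table 4.4, indexed by (buying behaviour, income group).
counts = {
    ("buyer", "low"): 18,
    ("buyer", "high"): 20,
    ("non-buyer", "low"): 42,
    ("non-buyer", "high"): 20,
}
total = sum(counts.values())

# Marginal probability of a buyer.
p_buyer = (counts[("buyer", "low")] + counts[("buyer", "high")]) / total

# Joint probability of buyer and high income.
p_buyer_and_high = counts[("buyer", "high")] / total

# Conditional probability of buyer, given high income.
n_high = counts[("buyer", "high")] + counts[("non-buyer", "high")]
p_buyer_given_high = counts[("buyer", "high")] / n_high

# Independence would require P(Buyer|High income) to equal P(Buyer).
independent = abs(p_buyer_given_high - p_buyer) < 1e-12

print(p_buyer, p_buyer_and_high, p_buyer_given_high, independent)
# 0.38 0.2 0.5 False
```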


Having discussed joint and conditional probabilities, let us
investigate how probabilities are revised to take account of
new information.
Suppose we do not know whether a particular coin is fair or
unfair. If the coin is fair, the probability of a tail is 0.50; but
if the coin is unfair, the probability of a tail is 0.10. Assume
we assign a prior probability to the coin being fair of 0.80
and a probability of 0.20 to the coin being unfair. The event
"fair coin" will be designated A1, and the event "unfair coin"
will be designated A2. We toss the coin once; say, a tail is the
result. What is the probability that the coin is fair?
The conditional probability of a tail, given that the coin is
fair, is 0.50; that is P(tail|A1) = 0.50. If the coin is unfair, the
probability of a tail is 0.10; P(tail|A2) = 0.10.

Let us compute the joint probability P(tail and A1). There is
an initial 0.80 probability that A1 is the true state; and if A1
is the true state, there is a 0.50 conditional probability that
a tail will result. The joint probability of state A1 being true
and obtaining a tail is (0.80 x 0.50) = 0.40. Thus:
P(tail and A1) = P(A1) x P(tail|A1) = 0.80 x 0.50 = 0.40
The joint probability of a tail and A2 is equal to:
P(tail and A2) = P(A2) x P(tail|A2) = 0.20 x 0.10 = 0.02
A tail can occur in combination with the state "fair coin" or
in combination with the state "unfair coin." The probability
of the former combination is 0.40; of the latter, 0.02. The
sum of the probabilities gives the unconditional probability
of a tail on the first toss; that is, P(tail) = 0.40 + 0.02 = 0.42:
P(tail and A1) = 0.40
P(tail and A2) = 0.02
P(tail)        = 0.42

If a tail occurs, and if we do not know the true state, the
conditional probability of state A1 being the true state is:
P(A1|tail) = P(tail and A1) / P(tail) = 0.40 / 0.42 = 0.95
Thus, 0.95 is the revised or posterior probability of A1, given
that a tail has occurred on the first toss.
Similarly:
P(A2|tail) = P(tail and A2) / P(tail) = 0.02 / 0.42 = 0.05
In more general symbols:
P(Ai|B) = P(Ai and B) / P(B)
Conditional probability expressed in this form is known as
Bayes' theorem. It has many important applications in
evaluating the worth of additional information in decision
problems.
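
The revision of the prior probabilities can be sketched as a small function. This is an illustrative sketch only: the function name `posterior` and the dictionary keys are assumptions, while the numbers are those of the coin example.

```python
def posterior(priors, likelihoods):
    """Return revised probabilities P(Ai|B) from P(Ai) and P(B|Ai)."""
    # Joint probability of each state together with the observed event B.
    joint = {state: priors[state] * likelihoods[state] for state in priors}
    # Unconditional probability of B: sum of the joint probabilities.
    p_b = sum(joint.values())
    return {state: joint[state] / p_b for state in joint}


priors = {"A1 (fair)": 0.80, "A2 (unfair)": 0.20}
p_tail_given = {"A1 (fair)": 0.50, "A2 (unfair)": 0.10}

revised = posterior(priors, p_tail_given)
for state, p in revised.items():
    print(state, round(p, 2))
```

The same function could be applied again after a second toss by passing the revised probabilities in as the new priors.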


In this example, the revised probabilities for the coin are
0.95 that it is fair and 0.05 that it is unfair (the probabilities
were initially 0.80 and 0.20). These revised probabilities exist
after one toss when the toss results in a tail. It is reasonable
that the probability that the coin is unfair has decreased,
since a tail appeared on the first toss, and the unfair coin
has only a 0.10 probability of a tail.


It is sometimes necessary to know the number of different
ways in which r objects can be selected from n objects
without regard to sequence. For instance, the number of
different permutations of five cards from a pack of 52 cards
is 52P5. The same five cards dealt in different sequences are
different permutations. In practice, the interest is usually
in the number of different possible hands irrespective of the
sequence in which they were dealt. To obtain this, one must
divide by the number of ways in which five cards can be
arranged among themselves, which is 5P5.
Each different way of selecting r objects from n without regard
to the sequence in which they were selected is termed a
combination, and each such combination consists of rPr
permutations. The number of combinations is represented
by nCr, and so nCr is nPr divided by rPr. Since rPr is r!:
nCr = nPr / r! = n! / (r! (n - r)!)
It will be seen that nCr works out the same as nC(n-r). This is
reasonable, since the number of ways of selecting five cards
from a pack of 52 is obviously the same as the number of
ways of selecting 47 cards and leaving a hand of five cards
behind.
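
The relationship between permutations and combinations can be verified numerically. The sketch below assumes Python 3.8 or later, where the standard library provides `math.perm` and `math.comb`; the variable names are illustrative.

```python
import math

n, r = 52, 5

permutations = math.perm(n, r)   # 52P5: ordered deals of five cards
arrangements = math.perm(r, r)   # 5P5 = 5!: orderings of one hand
hands = permutations // arrangements  # 52C5: distinct hands

print(hands)  # 2598960
print(hands == math.comb(n, r))                 # True
print(math.comb(n, r) == math.comb(n, n - r))   # True
```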
Illustration 4.5
In how many different ways can three bolts be selected from
a box containing eight bolts?
Solution
The answer is 8C3. It is convenient to adopt the practice of
using dots in place of multiplication signs when several
numbers are all multiplied together:
8C3 = (8 . 7 . 6) / (3 . 2 . 1) = 56
There is no need to write out the factorials in full. The number
of integers to be multiplied in both numerator and
denominator is the smaller of r and (n-r). The denominator
always cancels completely, since nCr must be an integer.

Illustration 4.6
If three bolts are selected at random from a box containing
six sound and two faulty bolts, what is the probability of
obtaining (i) three sound, (ii) two sound and one faulty, (iii)
one sound and two faulty bolts?

Solution
The word "random" indicates that all the 56 different ways of
selecting three bolts from a total of eight bolts are equally
likely, and so the classical definition of probability can be
applied. The number of ways of obtaining three sound bolts
is 6C3, which is 20, and dividing this by 56 gives 0.357.
The number of ways of obtaining two sound bolts is 6C2, which
is 15. Each of these combinations can be associated with
either of the 2C1 ways of obtaining one faulty bolt, and so 15
is multiplied by two and divided by 56 to give 0.536.
Following the same principles, the probability of obtaining
one sound and two faulty bolts is:
(6C1 x 2C2) / 56 = (6 x 1) / 56 = 0.107
A check confirms that the three probabilities calculated for the
possible combinations of sound and faulty bolts total 1, which
must be so since one of the three results must occur.
Probabilities may be similarly calculated in any situation
involving permutations or combinations, provided that all
the outcomes are equally likely. Using illustration 4.4, the
probability of choosing the winning list from ten models in a
fashion competition is 1 divided by 1,51,200 if six models have
to be listed, provided that no skill is involved.
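
The three probabilities in Illustration 4.6 can be checked by counting combinations. This is a minimal sketch; the constant and function names (`SOUND`, `FAULTY`, `DRAWN`, `prob_sound`) are assumptions made for illustration.

```python
import math
from fractions import Fraction

SOUND, FAULTY, DRAWN = 6, 2, 3
total_ways = math.comb(SOUND + FAULTY, DRAWN)  # 8C3 = 56


def prob_sound(k):
    """Probability of exactly k sound bolts among the three drawn."""
    ways = math.comb(SOUND, k) * math.comb(FAULTY, DRAWN - k)
    return Fraction(ways, total_ways)


probs = {k: prob_sound(k) for k in (3, 2, 1)}
for k, p in probs.items():
    print(k, "sound:", round(float(p), 3))
print("total:", sum(probs.values()))  # total: 1
```

Exact fractions are used so that the check on the total is free of rounding error.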


A probability function is a rule that assigns probabilities to
each element of a set of events that may occur. If, in turn,
we can assign a specific numerical value to each element of
the set of events, a function that assigns these numerical
values is termed a random variable. The value of a random
variable is the general outcome of a random (or probability)
experiment. It is useful to distinguish between the random
variable itself and the values that it can take on. The value
of a random variable is unknown until the event occurs (i.e.,
until the random experiment has been performed). However,
the probability that the random variable will be any specific
value is known in advance. The probability of each value of
the random variable is equal to the sum of the probabilities
of the events assigned to that value of the random variable.
For example, suppose we define the random variable Z to be
the number of heads in two tosses of a fair coin. Then the
possible values of Z, and the corresponding probabilities, are:

Possible Values of Z    Probability of Each Value
0                       1/4
1                       1/2
2                       1/4
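
The distribution of Z can be obtained by listing the four equally likely outcomes of two tosses and adding the probabilities of the outcomes assigned to each value. The sketch below is illustrative; the names `outcomes`, `p_each` and `dist` are assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two tosses: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))
p_each = Fraction(1, len(outcomes))

# Z = number of heads; add up the probabilities of the outcomes
# that are assigned to each value of Z.
dist = Counter()
for outcome in outcomes:
    dist[outcome.count("H")] += p_each

for z in sorted(dist):
    print(z, dist[z])
```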

Random variables can be grouped into probability
distributions, which can be either discrete or continuous.
Discrete probability distributions are those in which the
random variable can take on only specific values. The table
above is an example of such a distribution since the random
variable Z can be only 0, 1, or 2. A continuous probability
distribution is one in which the value of the random variable
can be any number within some given range of values, say,
between zero and infinity. For example, if the random
variable was the height of members of a population, a person
could be 5.3 feet, 5.324 feet, 5.32431 feet, and so on, depending
on the ability of instruments to measure. Some additional
examples of random variables are shown in Table 4.5.
A discrete probability distribution is sometimes called a
probability mass function (p.m.f.) and a continuous one is
called a probability density function (p.d.f.). Graphs of the
two types of distributions are shown in Figures 4.3 and 4.4.

For a discrete distribution, the height of each line represents
the probability for that value of the random variable. For
example, 0.30 is the probability that tomorrow's demand will
be 0.2 tons in Figure 4.3. For a continuous random variable,
the height of the probability density function is not the
probability for an event. Rather, the area under the curve
over any interval on the horizontal axis represents the
probability of taking on a value in that interval.
Table 4.5: Examples of Random Variables

Description of the Values of the        Discrete or    Values of the
Random Variable                         Continuous     Random Variable
Possible outcomes from throwing         Discrete       2, 3, . . ., 12
a pair of dice
Possible number of heads, tossing       Discrete       0, 1, 2, 3, 4, 5
a coin five times
Possible daily sales of a newspaper,    Discrete       0, 1, 2, . . ., S
where S represents the inventory
available
Time between arrivals of calls at       Continuous     0 to infinity
911 Emergency Call Center
Life of an electronic component         Continuous     0 to infinity
of a computer

Figure 4.3: Discrete Probability Distribution
(probability mass function or p.m.f.)


Now let us take a look at the first discrete probability
distribution, the binomial probability distribution, and how
it is derived. The binomial probability distribution is the base
on which several other probability distributions are built.


Figure 4.4: Continuous Probability Distribution

Understanding of the Bernoulli process is necessary before
we understand the binomial distribution. The characteristics
of a Bernoulli process are described below:
1. The outcomes or results of each trial in the process are
   characterized as one of two types of possible outcomes,
   such as:
   a. Success, failure.
   b. Yes, no.
   c. Heads, tails.
   d. Zero, one.
2. The probability of the outcome of any trial is stable
   and does not change throughout the process. For
   example, the probability of heads, given a fair coin, is
   0.50 and does not change, regardless of the number of
   times the coin is tossed.
3. The outcome of any trial is independent of the outcome
   of any previous trial. In other words, the past history of
   the process would not change the probability assigned
   to the next trial. In our coin example, we would assign
   a probability of 0.50 to the next toss coming up heads,
   even if we had recorded heads on the last 10 trials (we
   assume the coin is fair).
4. The number of trials is discrete and can be represented
   by an integer such as 1, 2, 3, and so on.

Given a process, we may know that it is Bernoulli, but we
may or may not know the stable probability characteristic of
the process. With a fair coin, we may know the process is
Bernoulli, with probability 0.50 of a success (say heads) and
probability 0.50 of a failure (tails). However, if we are given
a coin and told it is not fair, the process (flipping the coin)
may still be Bernoulli, but we do not know the probability
characteristic. Hence, we may have a Bernoulli process with
a known or unknown probability characteristic.
Many business processes can be characterized as Bernoulli
for analytical purposes, even though they are not "true"
Bernoulli in every respect. If the "fit" is close enough, we
may assume that the Bernoulli process is a reasonable
characterization. Let us discuss some examples.
Suppose we are concerned with a production process where
a certain part (or product) is produced on a machine. We
may be interested in classifying the parts as "good" or
"defective," in which case the process may be Bernoulli. If
the machine is not subject to fast wear and tear (that is, if a
setting will last for a long run of parts), the probability of
good parts may be sufficiently stable for the process to qualify
as Bernoulli. If, on the other hand, more defects occur as the
end of the run approaches, the process is not Bernoulli. In
many such processes, the occurrence of good and defective
parts is sufficiently stable (no pattern over time is observable)
to call the process Bernoulli. The probability of good and
defective parts may remain stable through a production run,
but it may vary from run to run (because of machine setting,
for example). Here, the process could still be considered
Bernoulli, but the probability of a success (or failure) will
change from run to run.


A different example of a Bernoulli process is a survey to
determine whether or not consumers prefer liquid to
powdered soaps. The outcome of a survey interview could
be characterized as "yes" (success) or "no" (failure) answers
to the question. If the sample of consumers was sufficiently
randomized (no pattern to the way in which the yes or no
answers occur), Bernoulli (with an unknown probability) may
be a useful description of the process.


Note that if the probability of a success in a Bernoulli process
is 0.50, the probability of a failure is also 0.50 (since the
probabilities of the event happening and the event not
happening add to one). If the probability of a success is p,
the probability of a failure is (1 - p).

Let us start by using an example.
Illustration 4.7
The probability that a salesman makes a sale on a visit to a
prospect is 0.2.
What is the probability, in 2 visits, of:
making no sales?
making one sale?
making two sales?
Solution
Here p = probability of sale = 0.2
And q = probability of no sale = 1 - 0.2 = 0.8
The various outcome possibilities are:

Visit 1    Visit 2    Probabilities
Sale       Sale       p x p = p^2 = 0.2^2 = 0.04
Sale       No sale    p x q = 0.2 x 0.8 = 0.16
No sale    Sale       q x p = 0.8 x 0.2 = 0.16
No sale    No sale    q x q = q^2 = 0.8^2 = 0.64
Total                 1.00

Thus P(no sales) = 0.64
P(one sale) = 0.32
P(two sales) = 0.04
In this simple example, it is easy to show the whole process
but this becomes lengthy and cumbersome where the number
of trials (visits, in the above example) becomes larger.
Fortunately there is a simpler approach which is by the
expansion of the binomial expression. The general form of
the binomial expression is


(p + q)^n
where
p = probability of an event occurring
q = probability of an event not occurring
and
n = number of trials
In the above example p = probability of a sale, i.e., 0.2,
q = probability of no sale, i.e., 0.8 and n = number of visits, i.e., 2.
Use the binomial expansion to confirm the probabilities.
(p + q)^2 = p^2 + 2pq + q^2
= 0.2^2 + 2(0.2 x 0.8) + 0.8^2
= 0.04 + 0.32 + 0.64, i.e., the values previously obtained.
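
The agreement between listing every outcome and expanding the binomial expression can be sketched in code. This is an illustration under the figures of Illustration 4.7; the names `prob_sales` and `expansion` are assumptions.

```python
from itertools import product

p, q = 0.2, 0.8  # probability of a sale and of no sale on one visit

# Approach 1: list every sequence of outcomes over two visits and
# add up the probabilities by number of sales.
prob_sales = {0: 0.0, 1: 0.0, 2: 0.0}
for visits in product(("sale", "no sale"), repeat=2):
    prob = 1.0
    for outcome in visits:
        prob *= p if outcome == "sale" else q
    prob_sales[visits.count("sale")] += prob

# Approach 2: the terms of (p + q)^2 = p^2 + 2pq + q^2.
expansion = {2: p ** 2, 1: 2 * p * q, 0: q ** 2}

for sales in (0, 1, 2):
    print(sales, round(prob_sales[sales], 2), round(expansion[sales], 2))
```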
Where n becomes larger it is useful to be able to calculate
the coefficients of each part of the expansion in a direct
manner rather than writing out the whole expansion.
It is easy to show by direct multiplication that:
(a + b)^2 = a^2 + 2ab + b^2
(a + b)^3 = a^3 + 3a^2 b + 3ab^2 + b^3
These are called expansions of the expressions on the left,
which are called binomial expressions, because they each
contain two terms within the bracket. The binomial theorem
is concerned with the general form of the expansion of (a + b)^n.


It is not difficult to see from the above examples that the
first term in the expansion of (a + b)^n is a^n, the last term is b^n,
and there are (n + 1) terms in all. In general, the jth term
consists of a^(n+1-j) b^(j-1) preceded by an appropriate coefficient,
where j takes all integral values from 1 to (n + 1).
To find the appropriate coefficient, one can investigate why
the coefficient of a^2 b^2 in the expansion of (a + b)^4 works out
as 6. Writing out the four terms, which have to be multiplied
together:
(a + b) (a + b) (a + b) (a + b)
It is easy to see that multiplying a in the first bracket by a in
the second, b in the third, and b in the fourth will give the
term a^2 b^2. But the same result is obtained by taking the
sequence a, b, a, b, from the four brackets, or by taking any of
the other possible sequences which include a from any two
of the brackets and b from the remaining two brackets.
The number of different ways of selecting two brackets from
four is 4C2, which is six. The brackets selected are the ones
which contribute the symbol a, whilst the remaining brackets
contribute the symbol b.
We can now consider the general term in the expansion of
(a + b)^n. It is usual to refer to the jth term as the (r + 1)th term, so
that the (r + 1)th term has the symbol a^(n-r) b^r and r takes all
integer values from 0 to n. The coefficient in the (r + 1)th
term is then the number of ways of selecting (n - r) brackets
from n, which is nC(n-r).
It would be equally correct to select the r brackets which are
to provide the symbol b, letting the remaining (n - r) brackets
provide the symbol a. The coefficient is then nCr, but it has
already been seen earlier that nCr is always equal to nC(n-r)
and so there is no contradiction. It follows that the series of
coefficients is always symmetrical; the coefficient of a^(n-r) b^r is
always the same as the coefficient of a^r b^(n-r).
Hence the binomial theorem states that for any positive
integer n:
(a + b)^n = a^n + nC1 a^(n-1) b + nC2 a^(n-2) b^2 + .... + nCr a^(n-r) b^r + .... + nC(n-1) a b^(n-1) + b^n

Sometimes nC1 and nC(n-1) are written simply as n. On the other
hand, one can complete the uniformity of the expansion by
writing the coefficients in the first and last terms as nC0 and nCn
respectively, these both being equal to 1.
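
The coefficients of the expansion and their symmetry can be checked numerically. The sketch below assumes Python 3.8 or later for `math.comb`; the helper names `binomial_coefficients` and `expand` are hypothetical and serve only this illustration.

```python
import math


def binomial_coefficients(n):
    """Coefficients nC0, nC1, ..., nCn of the expansion of (a + b)^n."""
    return [math.comb(n, r) for r in range(n + 1)]


def expand(a, b, n):
    """Evaluate (a + b)^n term by term using the binomial theorem."""
    return sum(math.comb(n, r) * a ** (n - r) * b ** r for r in range(n + 1))


coeffs = binomial_coefficients(4)
print(coeffs)                     # [1, 4, 6, 4, 1]
print(coeffs == coeffs[::-1])     # True: the series is symmetrical
print(expand(2, 3, 5), (2 + 3) ** 5)  # 3125 3125
```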
Illustration 4.8

For components of a certain type, the probability that a
component is faulty when it leaves the production line is
0.05. If 10 components are selected at random, what is the
probability of obtaining two faulty components?

Solution
This is the same problem as in illustration 4.8 except that
the size of the batch must now be regarded as infinite. It is
not difficult to see that the previous approximate formula
now becomes exact, so that the answer is 0.0746.
The exact probability distribution for the number of faulty
components in illustration 4.8 is called the hypergeometric
distribution. It is not very important in practice. If a sample
is drawn from a large batch, the binomial distribution is
usually a good enough approximation. If only a small batch
is involved, it can be examined in its entirety and then no
probability problem arises.
In some types of problems the binomial distribution can be
derived directly from a consideration of probabilities without
reference to batch sizes.
Illustration 4.9
The probability of meeting the buyer at a random visit to a
certain firm is 0.05. If a salesman makes 10 random visits,
what is the probability that he will meet the buyer on two
occasions?
Solution
The probability that the salesman will meet the buyer on
both the first two visits is 0.05 multiplied by 0.05. The
probability law of multiplication states that the probability
that both of two events will occur is the product of their
separate probabilities, provided that the events are
independent. Two events are said to be independent if the
fact that one has occurred makes no difference to the
probability that the second will occur. We can assume that
this will apply to random visits by the salesman.

Further visits are again independent, the probability of not
meeting the buyer being 0.95. Multiplying all the
probabilities, it follows that the probability of meeting the
buyer on the first two occasions and not on any of the
following eight occasions is:
(0.05)^2 (0.95)^8
The probability of meeting the buyer on none of the first eight
visits but on both of the last two visits is similarly found to
be:
(0.95)^8 (0.05)^2
This is the same as the previous probability, and in fact any
arrangement of two successes among the 10 visits will have
the same probability.
The probability law of addition states that the probability
that either of two events will occur is the sum of their
separate probabilities, provided that they are mutually
exclusive; that is, provided that it is impossible for both to
occur. This applies to any two different arrangements of two
successes among 10 visits, and so adding the probabilities
for the 10C2 different arrangements gives the total probability
of two successes in 10 visits as:
10C2 (0.95)^8 (0.05)^2
which is again 0.0746.


In this example the concept of probability was not derived
from counting equally likely events. The figure 0.05 indicates
that on an average one will meet the buyer five times in every
100 visits. This does not mean that there will be exactly five
successes in each 100 visits, but that the relative frequency
of successes will tend to the figure 0.05 in an unlimited
number of visits. So probability is here defined as the limiting
relative frequency of a success; this is the definition of
probability most widely applied by statisticians.


A probability distribution is a rule that assigns a probability
to every possible outcome of an experiment. A quantity
whose numerical value is determined by the outcome of
an experiment is called a variate or, often, a random
variable.
There are two kinds of probability distributions, discrete and
continuous. Discrete probability distributions are those
where the possible outcomes can be counted and listed. For
example, the throw of a die has a discrete probability
distribution as only six outcomes are possible and are
known beforehand. Continuous probability distributions are
those which represent a continuously variable random
variable.

The Binomial Probability Function


If the assumptions of the Bernoulli process are satisfied and
if the probability of a success on one trial is p, then the
probability distribution of the number of successes, r, in n
trials, is a binomial distribution. The binomial probability
distribution function has been derived in the fifth chapter
and is given below:
$P(r \mid n, p) = {}^{n}C_r \, p^r q^{n-r}$, where q = 1 - p
Performing computations using the above equation can be
tedious if the number of trials is large.

The Excel and Quattro spreadsheet programs have a function
that can be used for evaluating binomial probabilities, both
individual terms and cumulative probabilities. The form of
the function is:
= BINOMDIST(r, n, p, 0 or 1)
where r is the number of successes, n is the number of trials,
and p is the probability of success on each trial. The last
term is either a zero or a one; if a zero is entered, the
individual binomial term is given; if a one is used, the
cumulative value (of the "r or fewer" type) is given. Suppose,
for example, we want the probability of exactly three
successes in five trials, with a probability of success of
p = 0.4, that is, P(r = 3 | n = 5, p = 0.4). This


is:
= BINOMDIST(3,5,0.4,0), which equals 0.2304

For the probability of 3 or fewer successes in five trials, with
p = 0.4, that is, P(r ≤ 3 | n = 5, p = 0.4), use:
= BINOMDIST(3, 5, 0.4, 1), which equals 0.9130
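
For readers without a spreadsheet to hand, the same figures can be checked with a few lines of Python. This is only a minimal sketch: the function name binom_dist is a hypothetical stand-in chosen to mirror the spreadsheet call, and it assumes the binomial term has the usual form nCr p^r q^(n-r).

```python
from math import comb

def binom_dist(r, n, p, cumulative):
    """Hypothetical helper mirroring the call BINOMDIST(r, n, p, 0 or 1)."""
    def term(k):
        # Probability of exactly k successes in n independent trials.
        return comb(n, k) * p**k * (1 - p)**(n - k)
    if cumulative:
        # Sum the individual terms for 0, 1, ..., r successes.
        return sum(term(k) for k in range(r + 1))
    return term(r)

print(round(binom_dist(3, 5, 0.4, 0), 4))    # exactly 3 successes in 5 trials
print(round(binom_dist(3, 5, 0.4, 1), 4))    # 3 or fewer successes in 5 trials
print(round(binom_dist(2, 10, 0.05, 0), 4))  # Illustration 4.9: 2 meetings in 10 visits
```

The last line reuses the salesman example of Illustration 4.9 with r = 2, n = 10 and p = 0.05.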

Illustration 4.10
Suppose we plan to toss a fair coin three times and would
like to compute the following probabilities:
a. The probability of three heads in three tosses.
b. The probability of two or more heads in three tosses.
c. The probability of fewer than two heads in three tosses.


Possible Outcomes
HHH
HHT
HTH
THH
TTH
THT
HTT
TTT

In this example, the Bernoulli process probability p is 0.50,
and a head constitutes a success. The number of trials (n) is
three.
The first probability is the probability of three heads
(successes) in three tosses (three trials), given that the
probability of a head on any one toss is 0.50. This probability
can be written as follows:
P(r = 3 | p = 0.50, n = 3) = ?
where P = Probability, r = Number of successes, n = Number
of trials, and p = Probability of success on any one trial. The
left side of the equation should be read "the probability of
three successes, given a process probability of 0.50 and three
trials".

In answering the probability questions, let us first list all
the possible outcomes of the three trials and compute the
probabilities (see Table 7.8).

The above probabilities can also be calculated using the
equation. You can try it yourself.
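
The listing approach can also be sketched in code. The snippet below is an illustration only: it assumes a fair coin, so that each of the eight outcomes has probability 1/8, and the helper name prob is hypothetical.

```python
from itertools import product
from fractions import Fraction

# All 2 x 2 x 2 = 8 equally likely outcomes of three tosses, e.g. "HHT".
outcomes = ["".join(tosses) for tosses in product("HT", repeat=3)]
prob_each = Fraction(1, len(outcomes))

def prob(event):
    """Add up the probabilities of the outcomes whose head count satisfies event."""
    return sum(prob_each for o in outcomes if event(o.count("H")))

p_three = prob(lambda heads: heads == 3)           # part a
p_two_or_more = prob(lambda heads: heads >= 2)     # part b
p_fewer_than_two = prob(lambda heads: heads < 2)   # part c
print(p_three, p_two_or_more, p_fewer_than_two)
```

Exact fractions are used so that the answers can be compared directly with a hand count of the outcomes.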

Illustration 4.11
A very large lot of manufactured goods is to be sampled as a
check on its quality. Suppose it is assumed that 10 percent
of the items in the lot are defective and that a sample of 20
items is drawn from the lot. What are the following
probabilities?
1. Probability of exactly zero defectives in the sample?
2. Probability of more than one defective in the sample?
3. Probability of fewer than two defectives in the sample?

Solution
We can answer as follows. Let p = 0.10 and n = 20. Then:
a. P(r = 0 | p = 0.10, n = 20) = P(r ≥ 0) - P(r ≥ 1)
   = 1.0 - 0.8784 = 0.1216
   The probability of zero or more defectives is 1.0, and
   P(r ≥ 1) is read directly from Appendix 9.
b. P(r > 1) = P(r ≥ 2) = 0.6083
c. P(r < 2) = 1.0 - P(r ≥ 2) = 1.0 - 0.6083 = 0.3917

Mean and Standard Deviation of the Binomial Distribution
The mean of a binomial distribution is found by multiplying
the probability of the event in which we are interested by n,
the number of trials:
Mean = np
This value is the same as the expected value as previously
discussed.

The variance of a binomial distribution is calculated as:
Variance = npq, where q = 1 - p, so that the standard
deviation is $\sqrt{npq}$.
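
As a check on these two formulas, the mean and standard deviation can be computed directly from the individual binomial probabilities and compared with np and the square root of npq. The sketch below uses n = 20 and p = 0.10 from Illustration 4.11; the function names are hypothetical.

```python
from math import comb, sqrt

def binomial_pmf(n, p):
    """List of P(r) for r = 0, 1, ..., n."""
    q = 1 - p
    return [comb(n, r) * p**r * q**(n - r) for r in range(n + 1)]

def mean_and_sd(n, p):
    """Mean and standard deviation computed from the definition of expected value."""
    pmf = binomial_pmf(n, p)
    mean = sum(r * pr for r, pr in enumerate(pmf))
    variance = sum((r - mean) ** 2 * pr for r, pr in enumerate(pmf))
    return mean, sqrt(variance)

n, p = 20, 0.10
mean, sd = mean_and_sd(n, p)
print(mean, n * p)                  # both should be np
print(sd, sqrt(n * p * (1 - p)))    # both should be sqrt(npq)
```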


Characteristics of the Binomial Distribution
a) It is a discrete distribution of the occurrences of an event
   with two outcomes - success or failure, good or bad.
b) The trials must be independent of one another. This
   assumption implies sampling from an infinite
   population. Sampling with replacement fulfils this
   requirement, but where sampling without replacement
   is used, the binomial distribution is still useful provided
   that the sample size is less than 20.
c) As the number of trials grows, and if p is close to 0.5,
   the binomial distribution approaches the normal
   distribution. For the normal distribution to be an
   appropriate approximation, np should be greater than 5.

Illustration 4.12 (unsolved)


Components are placed into bins containing 100. After
inspection of a large number of bins the average number of
defective parts was found to be 10 with a standard deviation
of 3.
Assuming that the same production conditions continue,
except that bins containing 300 were used:
a) What would be the average number of defective
   components per larger bin?
b) What would be the standard deviation of the number of
   defectives per larger bin?
c) How many components must each bin hold so that the
   standard deviation of the number of defective
   components is equal to 1% of the total number of
   components in the bin?

The probability of getting a specified number of successes
from a repeated number of trials can be obtained with the
help of the binomial distribution, provided the probability of
success or failure is known. However, the number of trials is
finite in this case. As the number of trials approaches
infinity, with the expected number of successes held
constant, the Poisson distribution is the limit of the
binomial distribution:
$P(c) = \frac{a^c e^{-a}}{c!}$
Here c is the number of defects per sample, a is the expected
number of defects per sample, and e = 2.71828, the base of
the natural logarithm.

To explain this, we may think about the defects as being
distributed over the area of a surface, any unit being defective
if it contains one or more defects. If the surface area is divided
into very small units, so that no unit of the area has more
than one defect, the distinction between defect c and
defective d disappears. As the number n of units increases,
nP remaining constant, probability P must get smaller and
smaller, and the binomial distribution gradually approaches
the Poisson.
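
The limiting behaviour can be seen numerically. The sketch below is an illustration under stated assumptions: it holds the expected number a = nP fixed at 0.5, lets n grow, and compares the binomial probability of exactly one defect with the Poisson value a^c e^(-a) / c!. The function names are hypothetical.

```python
from math import comb, exp, factorial

def binomial_term(c, n, p):
    """Probability of exactly c successes in n trials."""
    return comb(n, c) * p**c * (1 - p)**(n - c)

def poisson_term(c, a):
    """Poisson probability of exactly c occurrences when a are expected."""
    return a**c * exp(-a) / factorial(c)

a = 0.5   # expected number of defects, held constant
c = 1     # number of defects whose probability is wanted
for n in (10, 100, 1000, 10000):
    p = a / n   # p shrinks as n grows so that n * p stays equal to a
    print(n, round(binomial_term(c, n, p), 5))
print("Poisson", round(poisson_term(c, a), 5))
```

The binomial figures settle towards the Poisson figure as n increases, which is the sense in which the Poisson is the limit of the binomial.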
Table 4.9 Calculation of Poisson Distribution Probabilities for
c = 0 through 5 and a = 0.5

c    a^c       e^-a     c!     Prob(c) = a^c e^-a / c!    Cum. Prob.(c)
0    1.0       0.6065   1      0.6065                     0.6065
1    0.5       0.6065   1      0.3033                     0.9098
2    0.25      0.6065   2      0.0758                     0.9856
3    0.125     0.6065   6      0.0126                     0.9982
4    0.0625    0.6065   24     0.0016                     0.9998
5    0.03125   0.6065   120    0.0002                     1.0000

The Poisson distribution is useful in ways other than as an


approximation to the binomial; for example, the analysis of
queues, or arrival and waiting line patterns at bridges and
airports, tool distribution points in factories, etc.

a) It is a discrete distribution and is a limiting form of the
   binomial distribution when n is large and p or q is small.
b) Mean and variance are equal.
c) It is positively skewed; it cannot be negatively skewed.
d) As n becomes very large, the Poisson distribution
   approximates to the normal distribution.
e) The mean = np.

The Poisson distribution is similar to the binomial but
is used when n, the number of items or events, is large
or unknown and p, the probability of an occurrence, is
very small relative to q, the probability of non-occurrence.
A rule of thumb is that the Poisson distribution may be used
when n is greater than 50 and the mean np is less than 5.
Some examples follow, but it is important to realize that
the Poisson distribution only applies when the events occur
randomly, i.e., they are independent of one another.
Illustration 4.13
Customers arrive randomly at a service point at an average
rate of 30 per hour. Assuming a Poisson distribution, calculate
the probability that:
a) no customer arrives in any particular minute.
b) exactly one customer arrives in any particular minute.
c) two or more customers arrive in any particular minute.
d) three or fewer customers arrive in any particular minute.

Solution
The time interval to be used is one minute, with a mean of
30/60 = 0.5 customers per minute.
a) P(no customer) = 0.6065 from Table VI(a)
b) P(1 customer) = 0.3033 from Table VI(a)
c) P(2 or more) = 1 - 0.9098 = 0.0902
   The value of 0.9098 is the cumulative probability of 1 or
   fewer customers arriving in a particular minute. As the
   sum of the probabilities of every possible number of
   arrivals equals 1, the probability of 2 or more = 1 - P(1 or
   fewer).
d) P(3 or fewer) = 0.9982
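
The four answers can be reproduced without tables. The following is a minimal sketch, assuming the Poisson formula a^c e^(-a) / c! with a = 0.5 arrivals per minute; the function names are hypothetical.

```python
from math import exp, factorial

def poisson_term(c, a):
    """Probability of exactly c arrivals when a are expected."""
    return a**c * exp(-a) / factorial(c)

def poisson_cumulative(c, a):
    """Probability of c or fewer arrivals."""
    return sum(poisson_term(k, a) for k in range(c + 1))

a = 30 / 60   # mean arrivals per one-minute interval
p_none = poisson_term(0, a)                    # part a
p_one = poisson_term(1, a)                     # part b
p_two_or_more = 1 - poisson_cumulative(1, a)   # part c
p_three_or_fewer = poisson_cumulative(3, a)    # part d
for label, value in [("a", p_none), ("b", p_one),
                     ("c", p_two_or_more), ("d", p_three_or_fewer)]:
    print(label, round(value, 4))
```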

Illustration 4.14
A firm buys springs in very large quantities and from past
records it is known that 0.2% are defective. The inspection
department samples the springs in batches of 500. It is
required to set a standard for the inspectors so that if more
than the standard number of defectives is found in a batch
the consignment can be rejected with at least 90% confidence
that the supply is truly defective.
How many defectives per batch should be set as the standard?
Solution
With 0.2% defective and a sample size of 500, the mean is
m = 500 × 0.002 = 1.
To find the probability of 0, 1, 2, 3, etc. or more defectives,
the respective cumulative probabilities are deducted from 1.
P(0 or more defectives) = certainty = 1
P(1 or more defectives) = 1 - 0.3679 = 0.6321
P(2 or more defectives) = 1 - 0.7358 = 0.2642
P(3 or more defectives) = 1 - 0.9197 = 0.0803
P(4 or more defectives) = 1 - 0.9810 = 0.0190
These probabilities mean, for example, that there is a 26.42%
chance that 2 or more defectives will occur at random in a
batch of 500 with a 0.2% defect rate. If batches with 2 or
more were rejected then there can be 73.58% (1 - 0.2642)
confidence that the supply is defective.
As the firm wishes to be at least 90% confident, the standard
should be set at 3 or more defectives per batch. This level
could only occur at random on 8.03% of occasions, so that
the firm can be 91.97% confident that the supply is truly
defective.
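
The search for the standard can be written as a short loop. This is a sketch only, assuming a Poisson distribution with mean 1 as in the solution; rejection_standard is a hypothetical name, and it returns the smallest number k for which "k or more defectives" occurs at random with probability no greater than 1 minus the required confidence.

```python
from math import exp, factorial

def poisson_cumulative(c, a):
    """Probability of c or fewer defectives when a are expected."""
    return sum(a**k * exp(-a) / factorial(k) for k in range(c + 1))

def rejection_standard(a, confidence):
    """Smallest k such that P(k or more defectives) <= 1 - confidence."""
    k = 0
    # P(k or more) = 1 - P(k - 1 or fewer); for k = 0 the sum is empty, giving 1.
    while 1 - poisson_cumulative(k - 1, a) > 1 - confidence:
        k += 1
    return k

mean_defectives = 500 * 0.002   # 0.2% of a batch of 500
k = rejection_standard(mean_defectives, 0.90)
print(k, round(1 - poisson_cumulative(k - 1, mean_defectives), 4))
```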

Until now, we have been drawing curves which illustrated
some of the forms that a frequency distribution may assume.
These curves were based upon data of a few tens or hundreds
of cases; each was a sample drawn from a much larger,
possibly infinite, universe. Being a sample, a given curve
would not necessarily have exactly the same shape as the
curve for the universe, but if the sample is properly selected,
the curve for the sample will tend to be of the same general
shape as the curve for the universe.
The normal curve represents a distribution of values that
may occur, under certain conditions, when chance is given
full play. In every case the necessary conditions include the
existence of a large number of causes, each operating
independently in a random manner.
Graphically, the normal distribution looks like a bell-shaped
curve:
There are certain properties of the normal distribution that
you can notice from the above curve:
1. It is symmetrical on both sides of the mean, i.e., mean,
   median and mode coincide at the central value.
2. The curve never touches the x-axis and extends from
   -infinity on the left hand side to +infinity on the right
   hand side.

The normal distribution is an extremely important
distribution. It is easier to manipulate mathematically than
many other distributions and is a good approximation for
several of the others. In many cases, the normal distribution
is a reasonable approximation for the binomial probability
distribution for business decision purposes; and in the
following chapters, we shall use the normal distribution in
many of the applications. Despite its general application, it
should not be assumed that every process can be described
as having a normal distribution.

The normal distribution is a function of z, the standard normal
variate, and is defined as:
$f(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$
Here the value of z is given by:
$z = \frac{X - m}{s}$

The normal distribution is completely determined by its
expected value or mean (denoted by m) and standard
deviation (s); that is, once we know the mean and standard
deviation, the shape and location of the distribution are set.
The curve reaches a maximum at the mean of the
distribution. One half of the area lies on either side of the
mean. The greater the value of s, the standard deviation, the
more spread out the curve.
With any normal distribution, approximately 0.50 of the
area lies within 0.67 standard deviations of the mean;
about 0.68 of the area lies within 1.0 standard deviation;
and 0.95 of the area lies within 1.96 standard deviations.

The Excel and Quattro spreadsheet function NORMSDIST
provides exactly the same values as in Appendix 3. That
is, for any standardized value of Z, it provides
the left tail cumulative normal probability. For example, for
P(Z ≤ 0.67), use:
NORMSDIST(0.67) = 0.7486
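
Outside a spreadsheet, the left tail cumulative probability can be obtained from the error function. The sketch below is an assumption-labelled equivalent rather than the spreadsheet's own method: norm_s_dist is a hypothetical name, and it relies on the standard identity that the cumulative standard normal probability equals 0.5 × (1 + erf(z / √2)).

```python
from math import erf, sqrt

def norm_s_dist(z):
    """Left tail cumulative probability P(Z <= z) for the standard normal."""
    return 0.5 * (1 + erf(z / sqrt(2)))

print(round(norm_s_dist(0.67), 4))

# Area within z standard deviations on either side of the mean.
for z in (0.67, 1.0, 1.96):
    print(z, round(2 * norm_s_dist(z) - 1, 4))
```

The loop checks the areas quoted for 0.67, 1.0 and 1.96 standard deviations.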
The next logical question might be: Given a normal
distribution with mean m and standard deviation s, what
percentage of the total area lies to the right or the left of a
given X value? Alternatively, we may ask: What is the
probability of obtaining a value of X that is as large as or
larger than one specified? Since there are infinitely many
normal curves, each depending upon a particular
combination of mean and SD, the answer would vary from
normal curve to normal curve. The problem would be much
less difficult if we could cause all the normal distributions
to have the same mean and SD.
This is exactly what we do when we standardize
distributions; i.e., we cause them to have a mean of zero and
a variance of one. Thus, if we transform the given X value
into a z value, where z is defined as given above, and call
Q(z) the probability of obtaining a value of z that is equal to
or larger than the one specified, tables are already calculated
to allow us to find Q(z), given z. Such a table is given in
Appendix 3.
Illustration 4.15
Assume that your working hours X are distributed normally
with m = 5 and s = 2. What is the probability of your working
9 hours or more?
Solution
First we standardize the specified X value:
z = (X - m)/s = (9 - 5)/2 = 2
Looking in Appendix 3, we find that Q(z), the probability
of obtaining a value of z as large as or larger than specified, is
Q(z) = Q(2) = 0.02275
Alternatively, we may say that 2.275 per cent of the area in
the distribution is to the right of z = 2. This means that there
is a 2.275 per cent probability that you would be working for
9 hours or more in the day.
This could be looked at in another way. What is the
probability that you would be working 9 hours or less
in a day? The solution of this question can be found by
subtracting the above value from 1, so that we get the value
which lies on the left of the 9 hour line and not on the right
in the normal distribution. The answer is 0.97725, or a 97.725
per cent probability that you would be working for 9 hours or
less on any given day.


Illustration 4.16
An assembly line contains 2,000 components each one of
which has a limited life. Records show that the life of the
components is normally distributed with a mean of 900 hours
and a standard deviation of 80 hours.
a) What proportion of components will fail before 1,000
   hours?
b) What proportion will fail before 750 hours?
c) What proportion of components fail between 850 and
   880 hours?
d) Given that the standard deviation will remain at 80 hours,
   what would the average life have to be to ensure that
   not more than 10% of components fail before 900 hours?

Solution
a) z = (1,000 - 900)/80 = 1.25
   (i.e., the value being investigated, 1,000 hours, is 1.25
   standard deviations away from the mean of 900 hours).
   If Appendix 3 is examined it will be seen that the value
   for a z score of 1.25 is 0.3944. As one half of the
   distribution is less than 900, the proportion which fail
   before 1,000 hours is 0.5 + 0.3944 = 0.8944, i.e., 89.44%.
   If required this could be expressed as the number of
   components which are expected to fail, thus
   2,000 × 0.8944 = 1,788.8, which would be rounded to 1,789.
b) z = (750 - 900)/80 = -1.875
   From the tables in Appendix 3 we obtain the value 0.4696.
   In this case, as we require the proportion that will fail
   before 750 hours, the table value is deducted from 0.5.
   Proportion expected to fail before 750 hours
   = 0.5 - 0.4696 = 0.0304, i.e., 3.04%


c) When it is required to find the proportion between two
   values (neither of which is the mean) it is necessary to
   use the tables to find the proportion between the mean
   and one value and the proportion between the mean and
   the other value. Then find the difference between the
   two proportions.
   z = (850 - 900)/80 = -0.625, which gives a proportion of
   0.2340.
   z = (880 - 900)/80 = -0.25, which gives a proportion of
   0.0987.
   Proportion between 850 and 880 is
   0.2340 - 0.0987 = 0.1353, i.e., 13.53%
   Note: This part of the example illustrates the proportion
   between two values on the same side of the mean. If the
   two values are on opposite sides of the mean, the calculated
   proportions would be added.

d) This problem is the reverse of the earlier questions but is
   based on the same principles. The earlier problems started
   with the mean and standard deviation, found the z score and
   then the proportion from the tables. We now start with the
   proportion and work back, through the tables, to find a new
   mean value.
   If not more than 10% should be under 900 it follows that 90%
   of the area of the curve must be greater than 900.
   Bearing in mind that the tables only show values for half the
   distribution (because both halves are identical) we have to
   look in the tables for a value close to 0.4 (i.e. 0.9 - 0.5).
   It will be seen that there is a value in the Table in Appendix
   3 of 0.3997, i.e. virtually 0.4. This value has a z score of
   1.28. Thus
   1.28 = (mean - 900)/80
   102.4 = mean - 900
   mean = 1002.4 hours.
   Thus if the mean life of the components is 1002.4 hours with
   a standard deviation of 80 hours, only about 10% of the
   components will fail before 900 hours.
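
The four parts of this illustration can be cross-checked in Python. This is a sketch under the assumption that the standard library class statistics.NormalDist (Python 3.8 or later) is available; the variable names are hypothetical. Note that the inverse calculation uses the exact z value for 10% rather than the rounded table value of 1.28, so the required mean comes out marginally above 1002.4.

```python
from statistics import NormalDist

life = NormalDist(mu=900, sigma=80)   # component life in hours

before_1000 = life.cdf(1000)                    # part a
before_750 = life.cdf(750)                      # part b
between_850_880 = life.cdf(880) - life.cdf(850)  # part c

# Part d: find the mean for which only 10% of lives fall below 900 hours.
z_10 = NormalDist().inv_cdf(0.10)   # z score with 10% of the area to its left
required_mean = 900 - z_10 * 80     # rearranged from z = (900 - mean) / 80

print(round(before_1000, 4), round(before_750, 4),
      round(between_850_880, 4), round(required_mean, 1))
print(round(2000 * before_1000))    # expected number failing before 1,000 hours
```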

It is not always easy to recognize when a Normal, Binomial
or Poisson distribution should be used. The following hints
might help.

Binomial distribution
a) Outcomes have discrete values and do not have
   continuous ranges of possible values. For example, in
   dealing with people, there can only be discrete values;
   1, 2, 3 etc., 2.35 people is not possible.
b) There are only two possible conditions; good/bad, black/
   white, male/female, acceptable/not acceptable and so on.
c) The probability of an item having one of the two possible
   conditions is p, thus the probability of the item having
   the other condition is (1 - p). (1 - p) is usually referred to
   as q, so that (p + q) = 1.
d) When the number of items, n, is large and p is not close
   to 0 or 1 so that the distribution is approximately
   symmetric, the binomial probabilities can be
   approximated using a Normal Distribution with the
   same mean (m = np) and standard deviation
   (s = $\sqrt{npq}$).

Poisson distribution
Similar to a binomial distribution, but used for rare events.
a) The number of items, n, is large; say greater than 50.
b) p is small in relation to q so that np (the mean of a Poisson
   Distribution) is less than, say, 5.

Normal distribution
The most commonly applied probability distribution.
a) It applies to variables with a continuous range of possible
   values. Examples include: time, weights, distances,
   sizes, growth rates, etc.
b) Where the quantities are large, the Normal Distribution
   can also be used for discrete variables (see note d) above
   under the binomial distribution).

4.1. A batch of 5,000 electric lamps have a mean life of 1,000
hours and a standard deviation of 75 hours. Assume a
Normal Distribution.
a) How many lamps will fail before 900 hours?
b) How many lamps will fail between 950 and 1,000
hours?
c) What proportion of lamps will fail before 925 hours?
d) Given the same mean life, what would the standard
deviation have to be to ensure that not more than
20% of lamps fail before 916 hours?
4.2 A mail-order company is analyzing a random sample of
its computer records of customers. Among the results
are the following distributions:
Size of order              Number of customers
(No. of lubes)             April     September
Less than 1
1 and less than 5          19        18
5 and less than 10         38        39
10 and less than 15        40        69
15 and less than 20        22        41
20 and less than 30        13        20
30 and over
Total                      144       196

Required:
a) Calculate the arithmetic mean and standard
deviation order size for the April sample;
b) Find 95% confidence limits for the overall mean
order size for the April customers and explain their
meaning;
c) Compare the two distributions, given that the
arithmetic mean and standard deviation for the
September sample were 13.28 and 7.05 orders
respectively.

4.3 The chief accountant of the Hotels Group is analyzing
the profitability of the group's smallest hotel, the Unity.
The Unity has 120 bedrooms, each of which can be let as
either a single or a double room. The price of a single
room is Rs 45 per night whereas for a double room it is
Rs 35 per person per night. The hotel's other main
source of revenue is the restaurant where the average
price of an evening meal is Rs 9. A proportion of
residents can be expected to have an evening meal in
the restaurant, and experience has shown that for the
Unity this proportion is usually about 60%.

Required:
a) Develop an expression for the average daily revenue
(R) from residents at the hotel, including both the
letting of rooms and evening meals in the restaurant.
Assume that:
i) the overall proportion of rooms occupied is p, and
ii) a proportion q of the rooms that are occupied are
let as single rooms.
b) The proportion (q) of rooms which are let as single
rooms is usually between 10% and 30%, and it can
be assumed that this is independent of the overall
proportion of rooms occupied (p).
Determine the average daily revenue (R) in terms of
the parameter p for values of q of 10%, 20% and 30%
and plot these three functions on a graph. What is
the lowest percentage occupancy (p) which will yield
an average daily revenue of Rs 8,000, and for which
of the given values of q does this occur?
c) What is the overall relationship between R and p if
the percentage of rooms let as single rooms is given
by the following probability distribution?
q        Probability
10%      0.3
20%      0.5
30%      0.2
d) Alternatively, if q has a normal distribution with a mean
of 19% and a standard deviation of 7%, determine a 95%
confidence interval for R in terms of the parameter p.
(For the purposes of this calculation, you may ignore
any variability in the proportion of residents who take
an evening meal in the restaurant.)

4.4. Castrol Company Limited are planning to introduce a
new motor lube. The company's marketing department
estimates that the prior distribution for likely sales is
normal with a mean of 10,000 tonnes. In addition it has
determined that there is a probability of one half
that the likely sales will lie between 8,000 and
12,000 tonnes.
The lube will sell for Rs 10 per 100 gms but the publishing
company pays the parent company 10% of revenue in
royalties and the fixed costs of printing and marketing
the lube are calculated to be Rs 25,000. Using current
facilities, the variable production costs are Rs 4 per
100 gms. However, the Castrol Company has the option
of hiring a special machine for Rs 14,000 which will
reduce the variable production costs to Rs 2.50 per
100 gms.
Required:
a) Show that the standard deviation of likely sales is
approximately s = 3,000.
b) Using s = 3,000 determine the probability that the
company will at least break even if
i) existing facilities are used,
ii) the special machine is hired.
c) By comparing expected profits, decide whether or
not the company should hire the special
machine.
d) By using the normal distribution it can be shown that
the following probability distribution may be applied
to lube sales:
Sales ('000)    0-5     5-8     8-10    10-12   12-15   15-20
Probability     0.05    0.20    0.25    0.25    0.20    0.05
By assuming that the actual sales can only take the
midpoints of these classes, determine the expected
value of perfect information and interpret its value.
4.5. In each of the following three situations, use the
binomial, poisson or normal distribution according to
which it is most appropriate. In each case, explain why
you selected the distribution and draw attention to any
feature which supports or casts doubt on the choice of
distribution.
a) Situation 1
The lifetimes of a certain type of electrical component
are distributed with a mean of 800 hours and a
standard deviation of 160 hours.
Required:
i) If the manufacturer replaces all components that
fail before the guaranteed minimum lifetime of
600 hours, what percentage of the components
have to be replaced?
ii) If the manufacturer wishes to replace only the
1% of components that have the shortest life,
what value should be used as the guaranteed
lifetime?
iii) What is the probability that the mean lifetime of
a sample of 25 of these electrical components
exceeds 850 hours?
b) Situation 2
A greengrocer buys peaches in large consignments
directly from a wholesaler. In view of the perishable
nature of the commodity, the greengrocer accepts
that 15% of the supplied peaches will usually be
unsaleable. As he cannot check all the peaches
individually, he selects a single batch of 10 peaches
on which to base his decision of whether to purchase
a large consignment or not. If no more than two of
these peaches are unsatisfactory, the greengrocer
purchases the consignment.

Required:
Determine the probability that, under normal supply
conditions, the consignment is purchased.
c) Situation 3
Vehicles pass a certain point on a busy single-carriageway
road at an average rate of two per ten-second interval.
Required:
Determine the probability that more than three
cars pass this point during a twenty-second
interval.
4.6. Thirty chief executive officers in an oil and gas
industry are classified by age and by their
previous functional position as shown in the table
below:
Previous Functional

Age

position

Under 55

55 and older

Total

Finance

14

18

Marketing

Other

Total

21

30

Suppose an executive is selected at random from this


group.
a. What is the probability that the executive chosen is
under 55? What type (marginal, conditional, joint)
of probability is this?
b. What is the probability that an executive chosen at
random is 55 or older and with Marketing as the
previous functional position? What type of
probability is this?
c. Suppose an executive is selected, and you are told
that the previous position was in Finance. What is
the probability that the executive is under 55? What
kind of probability is this?

d. Are age and previous functional position
independent factors for this group of executives?
4.7 Assume that the probability of a salesperson making a
sale at a randomly selected petrol station is 0.1. If a
salesperson makes 20 calls a day, determine the
following:
a. The probability of no sales.
b. The probability of one sale.
c. The probability of four or more sales.
d. The probability of more than four sales.
e. The probability of four sales.
4.8 Daily sales of a certain product are known to have a
normal distribution with a mean of 20 per day and a standard
deviation of 6 per day.
a. What is the probability of selling fewer than 16 on a
given day?
b. What is the probability of selling between 15 and 25
units on a given day?
c. How many units would have to be in hand at the start
of a day in order to have less than a 10 percent chance
of running out?
4.9 On a midterm exam, the scores were distributed
normally with mean of 72 and standard deviation of 10.
Student Wright scored in the top 10 percent of the class
on the midterm.
a. Wright's midterm score was at least how much?
b. The final exam also had a normal distribution, but
with mean of 150 and standard deviation of 15. At
least what score should Wright get in order to keep
the same ranking (i.e., top 10 percent)?
4.10 An investor wishes to invest in one of two projects. The returns for both projects are uncertain, and the probability distribution for returns can be expressed by a normal distribution in each case. Project A has a mean return of Rs 240,000 with a standard deviation of Rs 20,000. Project B has a mean return of Rs 250,000 and a standard deviation of Rs 40,000.
a. Consider a return of Rs 280,000. Which project has a
higher chance of returning this much or more?
b. Consider a return of Rs 220,000. Which project has
a higher chance of returning this much or more?

4.11 A survey was conducted among the readers of a certain
magazine. The results showed that 60 percent of the
readers were homeowners and had incomes in excess of
Rs 25,000 per month; 20 percent were homeowners but
had incomes of less than Rs 25,000; 10 percent had
incomes in excess of Rs 25,000 but were not homeowners;
and the remaining 10 percent were neither homeowners
nor had incomes in excess of Rs 25,000.
a. Suppose a reader of this magazine is selected at
random and you are told that the person is a
homeowner. What is the probability that the person
has income in excess of Rs 25,000?
b. Are home ownership and income (measured only as
above or below Rs 25,000) independent factors for
this group?
4.12 A survey was conducted of families in an urban area and the
surrounding suburban area. The families were classified
according to whether or not they customarily watch two
TV programs. The data are shown in the table in
percentages of the total.
a. If a family is selected from this group at random,
what is the probability that it views both programs?
b. If the family selected views program A, what is the
probability that it also views program B?
c. Are the events (views program A) and (views
program B) independent events?
d. Is the event (views program B) independent of the
event (urban)?

e. Consider the event (views either program A or B or both). Is this event independent of the event (urban)?

                            Watch Program A
                      Yes                   No
Watch Program B   Urban   Suburban   Urban   Suburban   Total
Yes                10%      14%        5%       1%       30%
No                 15%      21%       20%      14%       70%
Total              25%      35%       25%      15%      100%

4.13 Newspaper articles frequently cite the fact that in any
one year, a small percentage (say, 10 percent) of all
drivers are responsible for all automobile accidents. The
conclusion is often reached that if only we would single
out these accident-prone drivers and either retrain them
or remove them from the roads, we could drastically
reduce auto accidents. You are told that of 100,000
drivers who were involved in one or more accidents in
one year, 11,000 of them were involved in one or more
accidents in the next year.
a. Given the above information, complete the entries
in the joint probability table below.
b. Do you think searching for accident-prone drivers
is an effective way to reduce auto accidents? Why?
Table of accidents
                                  Second Year
First Year                  Accident   No Accident   Marginal Probability of Event in First Year
Accident                                             0.10
No Accident                                          0.90
Marginal Probability of
Event in Second Year        0.10       0.90          1.00

4.14 Gati India Ltd. maintains kilometer records on all of its rolling equipment. Here are the weekly kilometer records of its trucks.
 810   450   756   789   210   657   589   488   876   689
1450   560   469   890   987   559   788   943   447   775

(a) Calculate the median number of kilometers travelled by a truck.
(b) Calculate the mean for the 20 trucks.
(c) Compare parts (a) and (b) and explain which one is the better measure of central tendency for the data.
4.15 Premier Automobiles Ltd. does statistical analysis for an automobile racing team. Here are the fuel consumption figures in kilometers per litre for the team's cars in the recent races.
4.77   6.11   6.11   5.05   5.99   4.91   5.27   6.01
5.75   4.89   6.05   5.22   6.02   5.24   6.11   5.02
(a) Calculate the median fuel consumption.
(b) Calculate the mean fuel consumption.
(c) Group the given data into equally sized classes. What is the fuel consumption value of the modal class?
(d) Which of the three measures of central tendency is best for Premier to use when it orders fuel? Explain.
4.16 A machine is assumed to depreciate 40% in value in the first year, 25% in the second year and 10% per annum for the next three years, each percentage being calculated on the diminishing value. What is the average percentage depreciation for the five years?
4.17 Philips India Ltd. manufactures the famous Philips tube lights of 40 watts. The company has developed a new variety of fluorescent 24-watt tube lights for specific applications in control equipment used in defence components. Before it is commercially launched, the R&D manager wants to ensure its reliability and quality. The results of tests conducted on 400 such tube lights are shown below. Compute the coefficient of skewness.

Life Time (hours)    No. of Tubes
300-400                   14
400-500                   46
500-600                   58
600-700                   76
700-800                   68
800-900                   62
900-1000                  48
1000-1100                 22
1100-1200                  6

4.18 From past experience it is known that a machine is set up correctly on 90% of occasions. If the machine is set up correctly then 95% of the parts are expected to be good, but if the machine is not set up correctly then the probability of a good part is only 30%.
On a particular day the machine is set up and the first component produced is found to be good. What is the probability that the machine is set up correctly?
4.19 If the probability of obtaining heads when tossing a
certain coin is , what is the probability of obtaining
heads four times in nine tosses?
4.20 If the probability of obtaining a 6 when throwing a certain
dice is , what is the probability of obtaining a 6 four
times in nine throws?
4.21 In how many different ways can the first three places be filled in a race in which there are 11 horses?
4.22 From four statisticians and seven engineers a committee is to be formed which must consist of two statisticians and two engineers. In how many different ways can the committee be formed if (i) there is no other restriction on the membership of the committee, (ii) a particular statistician must be included, (iii) two of the engineers are unable to serve on the committee?
4.23 Five bolts are selected at random from a box containing
six sound and three faulty bolts. What is the probability
of obtaining (i) five sound, (ii) four sound and one faulty,
(iii) three sound and two faulty, (iv) two sound and three
faulty bolts?

Objectives

After reading this unit you will be able to understand:

The significance of decision theory in the decision-making environment.

How to solve decision theory problems by calculating EMV, EVPI or EOL.

How to solve problems by applying a decision tree.

Introduction: Decision theory assists managers and executives in making decisions. It deals with methods that help decision makers select the best course of action from among the alternative plans of action. It provides a method of rational decision making when the consequences are not fully known, and a framework for better understanding of decision situations and for evaluating alternatives.

Decision Maker: An individual or group responsible for choosing an appropriate course of action.
Objectives: What the decision maker wants to achieve.
Preference/Value System: The criteria that the decision maker uses in choosing the best course of action.
Courses of Action: The decision alternatives that are under the control of, and available to, the decision maker.
States of Nature: The decision maker prepares a list of possible future events before applying the theory. These future events are called states of nature; they are mutually exclusive and collectively exhaustive. A state of nature may or may not have a numerical description (low/high demand of items, lockout or strike etc.).
Payoff: It is the effectiveness of a particular combination of a course of action and a state of nature. Payoffs are also called conditional values/profits.

Opportunity Loss: It is the loss incurred by failing to adopt the most favourable course of action or strategy.
Decisions: Broadly, there are three types of decisions:
Strategic Decisions are concerned with the external environment of the organization (selection of location, product, market or technology etc.).
Administrative Decisions are concerned with structuring and acquisition of the organization's resources so as to optimize the performance of the organization (layout, distribution channel, purchase of assets etc.).
Operating Decisions are primarily concerned with the day-to-day operations of the organization (production scheduling, inventory, packing and dispatching).

(i) Decision making environment
(ii) Objective of a decision maker
(iii) Alternative plans of actions or strategies
(iv) Decision Payoff

(i) Decision making environment

Deterministic situation
Here the information is completely known and the outcome of a specified decision can be predetermined with certainty. The decision maker has complete information on the impact of each course of action. The techniques used in such situations are: (i) Linear Programming, (ii) Input-output analysis, (iii) Transportation and assignment models, (iv) CPM.

Stochastic situation
In this situation, the decision maker knows the likelihood of occurrence of each state of nature, i.e. the probability of occurrence of each state is known. The techniques used are Decision Theory, PERT, Simulation, Markov Chains and Bayes' Theorem.

Situation of uncertainty
Here the probabilities associated with the states of nature are unknown, for example the success of a new product launched in the market or of a branch office opened abroad. Game Theory is used to analyse such situations.

(ii) Objective of a decision maker
The objective should be defined explicitly: whether the decision maker wants to continue in the existing state or switch over to another state, and whether the problem relates to a single goal or to multiple goals. Example: maximizing return or profit, minimizing loss or wastage.
(iii) Alternative plans of actions or strategies
There must be more than one course of action, otherwise there is no need for any decision. An exhaustive list of alternatives is prepared, and then the feasible alternatives are considered for the analysis.
(iv) Decision Payoff: It represents the effectiveness of the strategies. Generally, it is measured in monetary terms. It can be fixed in advance or can be a random variable.
Payoff Table: A table listing the states of nature (events), which are mutually exclusive and collectively exhaustive, against a given set of courses of action (strategies). A payoff is calculated for each combination of state of nature and course of action.

                        Courses of action
State of nature    S1     S2     S3     ..     Sn
O1                 a11    a12    a13    ..     a1n
O2                 a21    a22    a23    ..     a2n
O3                 a31    a32    a33    ..     a3n
..                 ..     ..     ..     ..     ..
Om                 am1    am2    am3    ..     amn

EMV (Expected Monetary Value): The EMV of a given course of action is its weighted average payoff, i.e. the sum, over all states of nature, of the payoff of that course of action under each state multiplied by the probability of occurrence of that state.
EMV(Sj) = Summation over i of a(i,j) . P(Oi)

(i) Define systematically the states of nature (Oi) and the courses of action (Sj).
(ii) List the payoff associated with each combination of state of nature and course of action, along with the probability of each state of nature.
(iii) Calculate the EMV for each course of action by multiplying the conditional payoffs by the associated probabilities and summing these weighted values for each course of action.
(iv) On the basis of the specified decision objective, determine the course of action corresponding to the optimal EMV.
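As an illustration, the four steps above can be sketched in Python. The payoff table, the probabilities and the names (payoffs, probabilities, emv) are hypothetical and chosen only for this example; they do not come from this unit. Exact fractions are used so that the arithmetic is free of rounding noise.

```python
from fractions import Fraction

# Step (i): states of nature (Oi) and courses of action (Sj) -- hypothetical.
states = ["high demand", "medium demand", "low demand"]
actions = ["S1", "S2", "S3"]

# Step (ii): probability of each state and payoff of each combination
# (illustrative figures, in Rs. thousand).
probabilities = {
    "high demand": Fraction(3, 10),
    "medium demand": Fraction(5, 10),
    "low demand": Fraction(2, 10),
}
payoffs = {
    "high demand": {"S1": 50, "S2": 80, "S3": 100},
    "medium demand": {"S1": 40, "S2": 60, "S3": 30},
    "low demand": {"S1": 30, "S2": 10, "S3": -40},
}


def emv(action):
    """Step (iii): probability-weighted sum of the payoffs of one action."""
    return sum(probabilities[s] * payoffs[s][action] for s in states)


emv_by_action = {a: emv(a) for a in actions}

# Step (iv): for a profit objective, the optimal EMV is the largest one.
best_action = max(emv_by_action, key=emv_by_action.get)

for a in actions:
    print(a, float(emv_by_action[a]))
print("choose", best_action)
```

For a cost objective the same code would apply with min in place of max in step (iv).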
EPPI (Expected Profit with Perfect Information): It is
the maximum obtainable expected monetary value based on
perfect information as to which state of nature will occur.
EVPI (Expected Value of Perfect Information): It is the
maximum amount one would be willing to pay to obtain
perfect information.
EVPI = EPPI - EMV (where EMV is that of the optimal course of action)
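A minimal sketch of the EPPI and EVPI calculation follows, assuming a hypothetical payoff table with three states of nature (rows) and three courses of action (columns). The figures and variable names are illustrative assumptions only.

```python
from fractions import Fraction

# Hypothetical data: one probability per state of nature, and one payoff
# row per state with one column per course of action (Rs. thousand).
prob = [Fraction(3, 10), Fraction(5, 10), Fraction(2, 10)]
payoff = [
    [50, 80, 100],   # state O1
    [40, 60, 30],    # state O2
    [30, 10, -40],   # state O3
]
n_actions = len(payoff[0])

# EMV of each course of action: probability-weighted column sums.
emv = [sum(p * row[j] for p, row in zip(prob, payoff)) for j in range(n_actions)]

# EPPI: with perfect information the best action is picked in every state,
# so the best payoff of each row is weighted by that state's probability.
eppi = sum(p * max(row) for p, row in zip(prob, payoff))

# EVPI: the most one should pay for perfect information.
evpi = eppi - max(emv)

print("EMV:", [float(v) for v in emv])
print("EPPI:", float(eppi), "EVPI:", float(evpi))
```

EVPI can never be negative, because the best payoff in each row is at least as large as the payoff of any single fixed action in that row.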
EOL (Expected Opportunity Loss): It occurs due to lack of perfect information. It is the expected difference between the payoff of the right decision (the maximum payoff for the state of nature that occurs) and the payoff of the actual decision.
EOL(Sj) = Summation over i of COL(Oi, Sj) . P(Oi)
where COL is the conditional opportunity loss.
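The EOL formula can be sketched as follows, again on a hypothetical payoff table (rows are states of nature, columns are courses of action; all figures are assumptions made for illustration). The conditional opportunity loss of a cell is taken as the best payoff in its row minus the payoff in that cell.

```python
from fractions import Fraction

# Hypothetical probabilities and payoff table (Rs. thousand).
prob = [Fraction(3, 10), Fraction(5, 10), Fraction(2, 10)]
payoff = [
    [50, 80, 100],   # state O1
    [40, 60, 30],    # state O2
    [30, 10, -40],   # state O3
]
n_actions = len(payoff[0])

# COL(Oi, Sj): best payoff obtainable in state Oi minus the payoff of Sj.
col = [[max(row) - value for value in row] for row in payoff]

# EOL(Sj): probability-weighted sum of the conditional opportunity losses.
eol = [sum(p * row[j] for p, row in zip(prob, col)) for j in range(n_actions)]

# The course of action with the smallest EOL is preferred.
best_index = min(range(n_actions), key=lambda j: eol[j])

print("COL table:", col)
print("EOL:", [float(v) for v in eol], "best action index:", best_index)
```

On this table the smallest EOL equals the EVPI computed from the same data, and the action that minimizes EOL is the one that maximizes EMV.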

The decision tree approach is used for complicated situations. A decision tree is a decision flow diagram that includes branches leading to the alternatives one can select, in addition to the usual branches leading to events that depend on probabilities. The expectation principle is used, i.e. the alternative chosen is the one that maximizes the expected profit or minimizes the expected cost.
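The expectation principle on a tree can be sketched as a "rollback" from the end payoffs to the root. The tree below, its node names and its payoffs are hypothetical, and the tuple-based node encoding is just one possible representation chosen for this example.

```python
from fractions import Fraction


def rollback(node):
    """Return (expected value, chosen options) for a node of the tree."""
    kind = node[0]
    if kind == "payoff":
        return Fraction(node[1]), {}
    if kind == "chance":
        # Chance node: probability-weighted average of the branches.
        expected, chosen = Fraction(0), {}
        for probability, child in node[1]:
            value, sub_choices = rollback(child)
            expected += probability * value
            chosen.update(sub_choices)
        return expected, chosen
    if kind == "decision":
        # Decision node: keep the alternative with the highest expected value.
        name, options = node[1], node[2]
        evaluated = {opt: rollback(child) for opt, child in options.items()}
        best = max(evaluated, key=lambda opt: evaluated[opt][0])
        value, sub_choices = evaluated[best]
        return value, {name: best, **sub_choices}
    raise ValueError(f"unknown node type: {kind}")


# Hypothetical two-stage problem (payoffs in Rs. thousand).
tree = ("decision", "first move", {
    "sell": ("payoff", 50),
    "develop": ("chance", [
        (Fraction(6, 10), ("decision", "after success", {
            "expand": ("chance", [(Fraction(1, 2), ("payoff", 300)),
                                  (Fraction(1, 2), ("payoff", 100))]),
            "hold": ("payoff", 150),
        })),
        (Fraction(4, 10), ("payoff", -80)),
    ]),
})

value, plan = rollback(tree)
print(float(value), plan)
```

For a cost-minimization tree, max would be replaced by min at the decision nodes.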

(i) It provides a graphical presentation of sequential actions or decisions.
(ii) It makes the analysis simple as the computed values can be written on the tree diagram.
(iii) It clearly depicts when decisions are expected to be made along with their possible consequences and results.
(iv) Complex problems can be solved with a tree diagram.

(i) The probability estimates may not be accurate and thus may not give true results.
(ii) Certain extraneous variables may be out of the control of the decision maker, and thus the objective may not be achieved completely.
(iii) When the number of states of nature is large, the problem becomes complicated.

1. A distributor, Doon Enterprises, buys tankers of diesel for Rs. 10,00,000 each and sells them at Rs. 12,00,000 each. All the tankers left at the end of the day are worthless. Over the last 100 days he recorded the daily sales, and the distribution of sales is as follows:

Tankers Sold    No. of Days
20                   5
21                  20
22                  30
23                  35
24                  10

Determine the number of tankers the distributor should buy so as to maximize his profit.
2. An auto parts retailer sells headlights for Rs. 35 each, which he buys at Rs. 30 each. He cannot return the unsold headlights. The daily demand has the following distribution:

No. of Headlights   23     24     25     26     27     28     29     30     31
Probability         0.01   0.03   0.06   0.1    0.2    0.25   0.1    0.050  0.05

If each day's demand is independent of the previous day's demand, how many headlights should he order each day?
3. An oil exploration company has to decide whether or not to take a site on lease. The cost has been calculated for the two states of nature, getting oil and not getting oil. Draw a decision tree for the data given in the cost matrix:

State of Nature      Getting Oil      Not getting Oil
Prob.                0.01             0.99
Go for Lease         Rs. 1,00,000     Rs. 1,00,000
Not go for Lease     Rs. 80,00,000    Rs. 0

Also advise the company which decision it should take.
(Ans. Rs. 80,000)

4. A company owns a lease on a certain property in a European country. It may sell the lease for US $ 18,000 or it may drill this property to explore for gas. The possible drilling results were obtained from research conducted by its engineers. The data is given in the following table:

Possible Results            Probability    Outcomes (US$)
Dry well                       0.10          - 1,00,000
Oil well                       0.20              45,000
Gas well only                  0.40              98,000
Oil and gas combination        0.30            1,99,000

Construct a decision tree diagram for the above problem and determine the EMV for the company. Would you advise the company to sell or to drill?
5. A company wants to increase its production beyond its existing capacity. It has finally arrived at two approaches to increase the capacity: (1) Expansion, at a cost of Rs. 80 Lakh, or (2) Modernization, at a cost of Rs. 50 Lakh. Both approaches would require 8 months for implementation. The Board of Directors feels that during implementation and thereafter the demand will either be very high or moderate. The probability of very high demand is estimated as 0.35 and of moderate demand as 0.65. If demand is very high, expansion would result in an additional profit of 120 Lakh, whereas modernization would bring an additional 60 Lakh only. It is estimated that when demand is moderate, the comparable profit would be 70 Lakh for expansion and 50 Lakh for modernization.
(a) Construct the Payoff (Profit) Table.
(b) What is the optimum strategy for the company?
(c) Calculate EMV and EVPI.
(d) Calculate EOL.
(Ans. (b) Expansion, (c) 7.5 Lakh, 6.5 Lakh (d) 6.5 Lakh)

6. A medical practitioner purchases a specific vaccine on Sunday evening each week. The vaccine has to be used in the following week, otherwise it becomes worthless. It costs Rs. 20 per dose and he charges only Rs. 22 per dose. The practitioner administered the vaccine in the following quantities in the last 50 weeks:

Number of doses/week    20    25    40    60
Number of weeks               15    25

Calculate the number of doses the practitioner must buy every week.
7. The marketing department of a certain petrochemicals manufacturing company has worked out the payoff (given in the following table), in terms of profit expressed in million dollars, of a technical proposal depending upon the rate of technological change.

Technological
Changes

Course of action
Accept

Reject

Fast

Slow

None

-1

Should the technical proposal be accepted or not?

Objectives

After reading this unit you will be able to understand:

The significance and advantages of Linear Programming.

The conversion of an LP problem into a mathematical model.

How to solve LP problems of minimization and maximization using graphical and simplex methods.

How to obtain the dual of a primal problem.

Linear Programming (LP) was introduced by George Dantzig in 1947. It is a scientific approach to decision making and an optimization technique used in operations research. It is applied to minimize cost or maximize profit.
A Linear Programming problem is a special case of a Mathematical Programming problem. From an analytical perspective, a mathematical program tries to identify an extreme (i.e., minimum or maximum) point of a function which furthermore satisfies a set of constraints. Linear programming is the specialization of mathematical programming to the case where both the function Z, called the objective function, and the problem constraints are linear.
It allows the rationalization of many managerial and/or technological decisions required by contemporary techno-socio-economic applications. An important factor for the applicability of the mathematical programming methodology in various application contexts is the computational tractability of the resulting analytical models. With the advent of modern computing technology, this tractability requirement translates to the existence of effective and efficient algorithmic procedures able to provide a systematic and fast solution to these models.
Let us discuss the meaning of the terms Linear and Program used in LP:

Linear: The relationship between the variables is directly proportional. For example, if a wooden table requires 30 cubic feet of wood, then 10 tables would require 300 cubic feet of wood.
Program: A program is a set of instructions arranged in a logical sequence.
Optimal: A programme is optimal if it maximizes or minimizes some measure or criterion of effectiveness, e.g. maximization of profit or sales, or minimization of cost or distance.
Limited: Resources are available only in limited quantities during the planning horizon.
The related problem of Integer Programming requires some
or all of the variables to take integer (whole number) values.
Integer programs (IPs) often have the advantage of being
more realistic than LPs, but the disadvantage of being much
harder to solve. The most widely used general-purpose
techniques for solving IPs use the solutions to a series of
LPs to manage the search for integer solutions and to prove
optimality. Thus, most IP software is built upon LP software.
Linear programming has proved valuable for modeling many
and diverse types of problems in planning, routing,
scheduling, assignment, and design. Industries that make
use of LP and its extensions include transportation, energy,
telecommunications, and manufacturing of many kinds.

Certainty- in Linear Programming, it is assumed that all model parameters such as available resources, coefficients of the objective function and coefficients of the constraints are known with certainty.
Proportionality- all relationships in the objective function and constraints are linear. In economic terminology, this means that there are constant returns to scale, i.e. if one unit of a product contributes Rs. 10 as profit, then five units will contribute Rs. 50.
Additivity- the total of all the activities is given by the sum of each activity performed separately. For example, the total profit in the objective function is the sum of the profits contributed by each of the products separately.
Continuity- the decision variables are continuous. Accordingly, the values of decision variables and resources may be whole numbers or fractions. When only integers are acceptable, such as the number of tables/books/workers/machines, Integer Programming is more suitable.
Finite Choices- a finite number of choices is available to the decision maker, and the decision variables do not assume negative values.

It is used to determine the product mix.
It helps in attaining the optimum use of production factors.
It gives possible and practical solutions, and identifies the optimal solution for the decision maker.
It improves the quality of decisions as these are objective in nature.
It is also helpful in re-evaluating a basic plan under changing conditions.

Linearity: It treats all relationships among decision variables in the objective function and constraints as linear, but in real-world situations this may not be so.
Integer Value: The solution of an LPP may not result in integer values. When the values are fractions, rounding off to the nearest integers does not necessarily give the optimal solution and may not even give a feasible one. Integer Programming is the modified LPP which overcomes this limitation.
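The following sketch illustrates the integer-value limitation on a small hypothetical problem that does not come from this unit: maximize Z = 5X1 + 8X2 subject to X1 + X2 <= 6, 5X1 + 9X2 <= 45 and X1, X2 >= 0. The corner points of the feasible region are listed by hand (the last one is where the two constraint lines cross), and the integer optimum is found by simple enumeration.

```python
from fractions import Fraction
from itertools import product


def objective(x1, x2):
    return 5 * x1 + 8 * x2


def feasible(x1, x2):
    return x1 >= 0 and x2 >= 0 and x1 + x2 <= 6 and 5 * x1 + 9 * x2 <= 45


# Corner points of the feasible region of the hypothetical LP.
corners = [(0, 0), (6, 0), (0, 5), (Fraction(9, 4), Fraction(15, 4))]
lp_best = max(corners, key=lambda p: objective(*p))

# Rounding the LP optimum to the nearest integers.
rounded = (round(lp_best[0]), round(lp_best[1]))

# Integer optimum by brute force over the small grid of candidates.
integer_points = [p for p in product(range(7), range(6)) if feasible(*p)]
ip_best = max(integer_points, key=lambda p: objective(*p))

print("LP optimum:", lp_best, "Z =", float(objective(*lp_best)))
print("rounded:", rounded, "feasible?", feasible(*rounded))
print("integer optimum:", ip_best, "Z =", objective(*ip_best))
```

Here the rounded point violates the second constraint, while the true integer optimum lies at a different corner of the region altogether. Enumeration is workable only for tiny problems; it stands in here for a proper Integer Programming method.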


Multiple Goals: It deals with only a single objective (minimizing the cost or maximizing the profit), but in practice a manager has to make decisions considering various goals at the same time, e.g. maximizing sales, minimizing idle labour time and maximizing capacity utilization. Goal Programming is the technique which is suitable for such situations.
Constant Parameters: All the parameters are assumed to be constant in an LPP, but this is not so in real-life situations.
Number of Variables: An LPP can be solved manually when there are two or three variables; with a large number of variables, the computation can be done only through software.
Deterministic Nature: It is assumed that all the coefficients and resources are known with certainty and are not affected by time and uncertainty.
Problem Formulation (Mathematical Model): When a decision has to be taken regarding the number of units of two or more products to be manufactured under certain constraints such as manpower, material or space, the mathematical or analytical model of linear programming has to be used. There are three components of the mathematical model:
(i) Objective Function (Z): The linear function which is to be optimized is called the Objective Function. The objective function can be in terms of minimization or maximization. In the case of minimization it can represent cost, time, distance etc., whereas in the case of maximization it can represent profit, revenue or sales etc. For example:
Max. Z = 100X1 + 200X2
where X1 and X2 are the quantities (number of units) of two products P1 and P2, and Rs. 100 and Rs. 200 are the profits per unit of P1 and P2.
(ii) Constraint (Inequality) Equations: These represent the linear relationships of the constraints in a given situation. Constraints can be of types such as material, manpower, space, machine or budget. The equations will have the signs less than or equal to (<=), greater than or equal to (>=) or equality (=).
2X1 + 5X2 <= 2000
Here 2 and 5 are the quantities of material consumed (cubic feet of wood) per unit of P1 and P2, and 2000 is the total quantity of material available.
3X1 + 4X2 <= 4200
Here 3 and 4 are the man-hours required per unit of P1 and P2, and 4,200 is the total man-hours available.
(iii) Non-negative Equations: The decision variables take zero or positive values, not negative ones. These are also called variable sign restrictions.
X1, X2 >= 0
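Because the example has only two decision variables, it can be solved by checking the corner points of the feasible region, which is what the graphical method does. The sketch below is an assumption-laden illustration, not a general solver: it takes the coefficients as written in the model (Max. Z = 100X1 + 200X2, 2X1 + 5X2 <= 2000, 3X1 + 4X2 <= 4200), writes the sign restrictions as -X1 <= 0 and -X2 <= 0, and uses hypothetical helper names.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint is (a1, a2, rhs), meaning a1*X1 + a2*X2 <= rhs.
constraints = [
    (2, 5, 2000),   # material
    (3, 4, 4200),   # man-hours
    (-1, 0, 0),     # X1 >= 0
    (0, -1, 0),     # X2 >= 0
]
profit = (100, 200)


def intersection(c1, c2):
    """Point where the boundary lines of two constraints cross (Cramer's rule)."""
    (a, b, e), (c, d, f) = c1, c2
    det = a * d - b * c
    if det == 0:
        return None  # parallel lines never cross
    return (Fraction(e * d - b * f, det), Fraction(a * f - e * c, det))


def feasible(point):
    return all(a * point[0] + b * point[1] <= rhs for a, b, rhs in constraints)


def z(point):
    return profit[0] * point[0] + profit[1] * point[1]


# Candidate corners: crossings of every pair of boundary lines that
# satisfy all constraints.
corners = set()
for c1, c2 in combinations(constraints, 2):
    point = intersection(c1, c2)
    if point is not None and feasible(point):
        corners.add(point)

best = max(corners, key=z)
best_z = z(best)
print(sorted(corners), "-> best", best, "Z =", best_z)
```

With these particular coefficients the man-hours constraint is never binding, so the feasible region is a triangle bounded by the material constraint and the two axes.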

Two families of solution techniques are in wide use today. Both visit a progressively improving series of trial solutions, until a solution is reached that satisfies the conditions for an optimum. Simplex methods visit basic solutions computed by fixing enough of the variables at their bounds to reduce the constraints to a square system, which can be solved for unique values of the remaining variables. Basic solutions represent extreme boundary points of the feasible region. The simplex method can be viewed as moving from one such point to another along the edges of the boundary. Barrier or interior-point methods, by contrast, visit points within the interior of the feasible region. These methods derive from techniques for nonlinear programming that were developed and popularized in the 1960s by Fiacco and McCormick, but their application to linear programming dates back only to Karmarkar's innovative analysis in 1984.

Simplex Method for Standard Maximization Problem

Step 1. Convert to a system of equations by introducing slack variables to turn the constraints into equations, and rewrite the objective function in standard form.
Step 2. Write down the initial tableau.
Step 3. Calculate Zj and then Cj - Zj (the index row). Select the pivot column: choose the positive number with the largest magnitude in the index row. Its column is the pivot column. (If there are two candidates, choose either one.) If all the numbers in the index row are zero or negative (excluding the rightmost entry, which belongs to the Minimum Ratio column), then the solution obtained is the optimal solution.
Step 4. Select the pivot in the pivot column: the pivot must always be a positive number. For each positive entry b in the pivot column, compute the ratio a/b, where a is the number in the Answer column in that row. Of these test ratios, choose the smallest one. The corresponding number b is the pivot.
Step 5. Divide the pivot row by the pivot element to make it unity. Construct the new tableau by writing the previous pivot row first, at the same position it occupied previously.
Step 6. Rewrite the other rows so that the corresponding element in the pivot column becomes zero. Recalculate Zj and Cj - Zj and repeat from step 3 until the optimal solution is reached.
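The six steps can be sketched in Python as follows. This is a minimal illustration for a maximization problem whose constraints are all of the <= type with non-negative right-hand sides; the function name simplex_max is hypothetical, and no anti-cycling rule is included, so degenerate problems may need more care. It is applied to the two-product model Max. Z = 100X1 + 200X2 with 2X1 + 5X2 <= 2000 and 3X1 + 4X2 <= 4200.

```python
from fractions import Fraction


def simplex_max(c, A, b):
    """Maximize c.x subject to A x <= b, x >= 0, with every b[i] >= 0."""
    m, n = len(A), len(c)
    # Steps 1-2: one slack variable per constraint; each tableau row holds
    # the coefficients of the x and slack variables followed by the answer.
    rows = [[Fraction(v) for v in A[i]]
            + [Fraction(int(i == k)) for k in range(m)]
            + [Fraction(b[i])] for i in range(m)]
    cost = [Fraction(v) for v in c] + [Fraction(0)] * m
    basis = [n + i for i in range(m)]  # the slacks form the first basis
    while True:
        # Step 3: index row Cj - Zj, with Zj = sum of (basic cost * entry).
        index = [cost[j] - sum(cost[basis[i]] * rows[i][j] for i in range(m))
                 for j in range(n + m)]
        pivot_col = max(range(n + m), key=lambda j: index[j])
        if index[pivot_col] <= 0:
            break  # no positive entry left: the solution is optimal
        # Step 4: minimum ratio test over positive pivot-column entries.
        ratios = [(rows[i][-1] / rows[i][pivot_col], i)
                  for i in range(m) if rows[i][pivot_col] > 0]
        if not ratios:
            raise ValueError("the problem is unbounded")
        _, pivot_row = min(ratios)
        # Step 5: divide the pivot row by the pivot element.
        pivot = rows[pivot_row][pivot_col]
        rows[pivot_row] = [v / pivot for v in rows[pivot_row]]
        # Step 6: make every other entry of the pivot column zero.
        for i in range(m):
            if i != pivot_row:
                factor = rows[i][pivot_col]
                rows[i] = [v - factor * p
                           for v, p in zip(rows[i], rows[pivot_row])]
        basis[pivot_row] = pivot_col
    x = [Fraction(0)] * n
    for i, var in enumerate(basis):
        if var < n:
            x[var] = rows[i][-1]
    return x, sum(ci * xi for ci, xi in zip(cost, x))


solution, z_max = simplex_max([100, 200], [[2, 5], [3, 4]], [2000, 4200])
print([float(v) for v in solution], float(z_max))
```

Exact fractions are used instead of floating-point numbers so that the tableau entries stay exact from one iteration to the next.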

For every given linear programming problem, there exists an intimately related L.P. problem referred to as its Dual. The given (original) problem is known as the Primal. The duality theorem states that for every maximization (minimization) problem there is a unique similar problem of minimization (maximization) involving the same data which describes the original problem. The DUAL of a DUAL is the PRIMAL.
The characteristics of the dual problem:

If the objective of the primal is maximization, the objective of the dual is minimization.

The primal has m constraints while its dual has m unknowns, and vice-versa.

The coefficients of the objective function of the primal become the right-hand-side constants of the constraints of its dual, and vice-versa.

The variables of the primal are replaced by the new variables of its dual.

The sign of the inequalities in the set of restrictions of the primal (<=) is reversed in the set of restrictions in its dual (>=).

For finding the dual of a given maximization problem, all the constraint inequalities should be of the (<=) type, and for minimization these should be of the (>=) type.

Example: For obtaining the dual of the following primal problem:

Max. Z = 3X1 + X2 + 2X3 - X4
St:
2X1 - X2 + 3X3 + X4 = 10
X1 + X2 - X3 + X4 = 11
X1, X2 >= 0, X3, X4 unrestricted in sign


For the coefficients of the objective function, the matrix
is: [3   1   2   -1]
The matrices for the constraint coefficients and resources are:
[2   -1    3   1]   [X1]       [10]
[1    1   -1   1]   [X2]   =   [11]
                    [X3]
                    [X4]

The variables X3 and X4 in the primal are unrestricted in sign;
therefore, the third and fourth constraints in the dual will
have equality sign. As both the constraints in the primal are of
equality sign, the corresponding dual variables will be
unrestricted in sign.
Let W1, W2 be the corresponding dual variables. The dual is
as follows:
Min. Z* = 10W1 + 11W2
St:
2W1 + W2 >= 3
-W1 + W2 >= 1
3W1 - W2 = 2
W1 + W2 = -1
W1, W2 are unrestricted in sign.
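
The construction used in this example can be sketched in Python. The sketch assumes a maximization primal whose constraints are all equalities, as in the example; the function name dual_of_max_primal and the argument x_nonneg are hypothetical names chosen here for illustration.

```python
def dual_of_max_primal(c, A, b, x_nonneg):
    """Dual of:  max c.x  subject to  A x = b,  x[j] >= 0 where x_nonneg[j] is True.

    Because every primal constraint is an equality, every dual variable is
    unrestricted in sign.  Returns (objective coefficients, constraints) of
    the minimization dual; each constraint is (coefficients, relation, rhs).
    """
    m, n = len(A), len(c)
    constraints = []
    for j in range(n):
        column = [A[i][j] for i in range(m)]   # column j of A becomes dual row j
        # a non-negative primal variable gives '>=', an unrestricted one gives '='
        relation = ">=" if x_nonneg[j] else "="
        constraints.append((column, relation, c[j]))
    return list(b), constraints


objective, constraints = dual_of_max_primal(
    c=[3, 1, 2, -1],
    A=[[2, -1, 3, 1], [1, 1, -1, 1]],
    b=[10, 11],
    x_nonneg=[True, True, False, False],
)
for coeffs, relation, rhs in constraints:
    print(coeffs, relation, rhs)
```

The four printed rows correspond to the four dual constraints in W1 and W2, and the objective coefficients are the primal resources 10 and 11.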

1. A home decorator, Glamour Enterprises, manufactures
two types of lamps which pass through two technicians: first
a cutter, second a finisher. Lamp A requires 2 hours of
the cutter's time and 1 hour of the finisher's time; Lamp B
requires 1 hour of the cutter's time and 2 hours of the finisher's
time. The cutter has 104 hours and the finisher 76 hours of
time available each month. Profit on one lamp A is Rs. 6
and on one lamp B is Rs. 11. Assuming that he can sell
all the lamps he produces, formulate the problem.

2. A firm manufactures three products A, B and C. The
profits are Rs. 30, Rs. 20 and Rs. 40 respectively. The
firm has two machines M1 and M2 and the required processing
time in minutes on each product is given below:

Product
Machine

M1

M2

Machines M1 and M2 have 2,000 and 2,500 machine
minutes respectively. The firm must manufacture 100 A's,
200 B's and 50 C's but not more than 150 A's. Formulate
the problem as an LPP.
3. The objective of a diet problem is to ascertain the
quantities of foods that should be eaten to meet certain
nutritional requirements at a minimum cost. The
consideration is limited to milk, beef and eggs, and to
vitamins A, B and C. The number of grams of each of
these vitamins contained in a unit of each food is given
below:

Vitamin

Gallon of Milk Pound of Beef Dozens of Eggs Minimum daily


requirement

10

100

10

10

50

10

100

10

10

Cost (Rs.)

40

90

20

Formulate the mathematical model.

4. A textile unit has two grades of inspectors, I and II, who
are to be assigned for the quality control inspection. It
is required that 2,000 pieces be inspected per 8-hour
day. Grade I inspectors can check pieces at the rate of
50 per hour with an accuracy of 97%, and grade II
inspectors can check pieces at the rate of 40 per hour
with an accuracy of 95%. The wage rate of grade I
inspectors is Rs. 4.50 per hour and that of grade II is
Rs. 2.50 per hour. Each time an error is made by an
inspector, the cost to the company is one rupee. The
company has available 10 grade I and 5 grade II
inspectors for the inspection job. Formulate the problem
to minimize the total cost of inspection.
5. Solve the following problems graphically:
(a) Max Z = 20 X1 + 30 X2
Sub. to:
X1 + X2 <= 1
3X1 + X2 <= 4
X1, X2 >= 0
(Ans. X1=0, X2=1, Z=30)
(b) Max Z = 30 X1 + 50 X2
Sub. to:
X1 + 2X2 <= 2000
X1 + X2 <= 1500
X2 <= 600
X1, X2 >= 0
(Ans. X1=1000, X2=500, Z=55,000)
(c) Max. Z = 4X1 + 5X2
Sub. to:
X1 + X2 >= 1
-2X1 + X2 <= 1
4X1 - 2X2 <= 1
X1, X2 >= 0
(Ans. Unbounded Solution)
(d) Min. Z = -X1 + 2X2
Sub. to:
-X1 + 3X2 <= 10
X1 + X2 <= 6
X1 - X2 <= 2
(Ans. X1=2, X2=0, Z=-2)
(e) Min. Z = 20 X1 + 10 X2
Sub. to:
X1 + 2X2 <= 40
3X1 + X2 >= 30
4X1 + 3X2 >= 60
X1, X2 >= 0
(Ans. X1=6, X2=12, Z=240)
(f) Min. Z = 30 X1 + 15 X2
sub to:
5X1 + X2 >= 10
X1 + X2 >= 6
X1 + 4X2 >= 12
X1, X2 >=0
(Ans. X1=1, X2=5, Min. Z=105)
(g) Min. Z = 12 X1 + 15 X2
sub to:
X1 <= 5
X2 >= 3
X1 + X2 = 6
X1, X2 >= 0
(Ans. X1=3, X2=3, Z=81)


6. Solve the following problems by simplex method:
Max. Z = 2X1 + 4X2
Sub. to:
2X1 + 3X2 <= 48
X1 + 3X2 <= 42
X1 + X2 <= 21
X1, X2 >= 0
(Ans. X1=6, X2=12, Z=60)
7. Max. Z = 4X1 + 10X2
Sub. to:
2X1 + X2 <= 50
2X1 + 5X2 <= 100
2X1 + 3X2 <= 90
X1, X2 >= 0
(Ans. X1=0, X2=20, Z=200)
8. Min. Z = 3X1 + 2X2 + X3
Sub. to:
-3X1 + 2X2 + 2X3 = 8
-3X1 + 4X2 + X3 = 7
X1, X2, X3 >= 0
(Ans. Unbounded Solution. Hint: all the elements in the key
column have negative sign and Cj-Zj > 0)

9. An advertising agency wants to finalise its campaign and
it plans to target two types of audiences: customers with
cell phones (Type A) and customers not having cell phones
(Type B). The total ad budget is Rs. two lakhs. One
insertion of a TV ad on a movie channel costs Rs. 50,000 and
one insertion on FM radio costs Rs. 20,000. As per
agreement, there must be at least three insertions on the
movie channel and they cannot exceed five in number. As
per the findings of a research agency, a single TV ad
reaches 3,50,000 customers in target audience A and
1,50,000 in target audience B. One FM radio ad reaches
10,000 in target audience A and 90,000 in target audience
B. Determine the media mix to maximize the total reach.
(Formulate the problem and use simplex method)
(Ans. 4 insertions in TV & 0 in FM radio)

10. Obtain the dual of the following:
Max. Z = -3X1 - 2X2
sub to:
X1 + X2 >= 1
X1 +2X2 >= 10
X1 + X2 <= 7
X2 <= 3
and X1, X2 >=0
11. Find out the dual of the primal given below:
Min. Z = 10 W1 + 8 W2
sub to:
W1 + W2 >= 1
W1 + 3W2 >= 4
2W1 - W2 <= 12
and W1, W2 >=0

12. Write the dual of the following primal problem:
Max. Z = 5W1 + 6W2
sub to:
W1 + 2W2 = 5
-W1 + 5W2 >= 3
4W1 + 7W2 <= 8
W1 unrestricted in sign and W2 >= 0

Objectives
After reading this unit you will be able to understand:

The significance, types and applications of transportation models.

How to solve the transportation problems using IFS and optimality test.

The objective and applications of assignment models.

The Hungarian method for solving the assignment problems.

The TRANSPORTATION model of linear programming is a
flow optimization technique. It is a special case that is
somewhat easier to solve than the general L.P. model. It is
used to produce optimal assignments of origin quantities to
destinations. A condition is that the sum of the origin
quantities must equal the sum of the destination demands.
F. L. Hitchcock contributed significantly in developing
transportation models. Since transportation is an economic
activity, all the various accounting, financial, economic and
econometric models are used. Much of the transport activity
that uses civil engineering is involved with public works.
Economic and accounting models for private and public
enterprises are generally used to assist major decisions and
policy formulation.
The transportation problem is a special case of Linear
Programming which is concerned with the distribution of a
certain product/commodity from a number of sources
(origins) to a number of destinations. It can be tabulated
in a matrix called the transportation matrix. In this matrix, each
row denotes a source (factory), each column denotes a
destination (warehouse) and each cell shows the cost of
transportation per unit (Cij) from the ith factory to the jth warehouse.
Ai and Bj denote the supply and demand respectively.

Factory/Warehouse   D1    D2    D3    D4    Supply
O1                  C11   C12   C13   C14   A1
O2                  C21   C22   C23   C24   A2
O3                  C31   C32   C33   C34   A3
O4                  C41   C42   C43   C44   A4
Demand              B1    B2    B3    B4    Total

Objective: The objective of the transportation problem is to
minimize the cost of transportation under the given supply
and demand constraints.

At any one price, there is some quantity of a product which
an individual consumer is willing and able to purchase over
a given period of time. If the price changes, the quantity
purchased will change too. Economists call the relationship
between the PRICE of a commodity and the QUANTITY
purchased during some specified period of time the
DEMAND for that commodity.
When a consumer is willing and able to purchase some
quantity of a commodity at the existing market price, he is
said to have an EFFECTIVE DEMAND for that good. This
means that the buyer has the:
1. desire to make a purchase,
2. willingness to pay the price,
3. ability to pay the price.

Transport Demand is usually categorized as:

Commodity or Goods movements, i.e. quantified by mass,
weight, volume or number of items.

Person Traveling, Passengers, Passenger Trips, People
Movements, etc.

Demand or Market Models tend to be disaggregated by trip
purpose, or commodity. These models are either:

Cross Section, i.e. they quantify the movements for a
short time period for a number of Origins and
Destinations.

Time Series, i.e. they represent one movement over a
number of time periods.

General, i.e. they attempt to combine cross section and
time series aspects much like space-time
representations.

The models may focus on a single market segment or several.
Some further categorization:

Aggregate, i.e. the producing activity is treated as a single
stratum.

Disaggregate, which subdivides the activity into a
number of strata.

Multimodal, where a number of competing and
complementary modes are used to satisfy a market.

Abstract Mode, an econometric approach that is useful
in multimodal models.

The common uses are for planning, design, operations and
management. Each of these functions requires that the models
reflect the behaviour of the chosen system and scenario. Each of
the comments below applies equally to the four activities
mentioned above. However, the costs and difficulties of
complicated modeling should be matched with the expected
payoff from the exercise.
1. Identify technically efficient solutions to transport
resource allocation. The models can be run as:

TRANSPORT COSTING MODELS (also net revenue)

TRANSPORT PRODUCTION MODELS

RESOURCE COSTING MODELS (also net revenue)

RESOURCE PRODUCTION MODELS

2. They can be used to OPTIMIZE the system for a given
level of production for a given amount of service. This
can lead to development of cost effective relationships
with different levels of production with fixed levels of
plant, labour or investment.

3. Given the ability to model as described above, the domain
can be increased to simulate performance with some of
the inputs as random variables, and for different
scenarios.

(i) Minimization (Cost or Distance)
(ii) Maximization (Return or Revenue)
(iii) Balanced (Supply = Demand)
(iv) Unbalanced (Supply not equal to Demand)
(v) Restricted
(vi) Transshipment
Rim Condition: This is the necessary and sufficient
condition for the existence of a feasible solution to any
transportation problem. The condition is:
Total Supply = Total Demand
Sum of ai = Sum of bj

Degeneracy occurs in a transportation problem when,
on finding the IFS, we observe that the number of occupied cells
is less than the total number of rows plus columns minus
one. Thus the condition becomes:
Co < m + n - 1
Where Co = No. of occupied cells
m = No. of rows
n = No. of columns
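
As a small illustration, the degeneracy condition can be checked mechanically. The function name is_degenerate and the representation of the allocation as a list of rows are assumptions made for this sketch.

```python
def is_degenerate(allocation):
    """True when the number of occupied cells Co is less than m + n - 1."""
    m, n = len(allocation), len(allocation[0])
    occupied = sum(1 for row in allocation for units in row if units > 0)
    return occupied < m + n - 1


# 2 rows and 2 columns need 2 + 2 - 1 = 3 occupied cells
print(is_degenerate([[5, 0], [0, 5]]))   # two occupied cells
print(is_degenerate([[5, 2], [0, 3]]))   # three occupied cells
```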

The steps for solving the transportation problem:
I. Formulate the problem and set it up in the matrix form.
II. Obtain the IFS by applying any of the methods.
III. Apply the optimality test (MODI or STEPPING STONE
METHOD) for obtaining the optimal solution.
The transportation problem can be solved in two phases. In
the first phase, the IFS (Initial Feasible Solution) is calculated
and in the second phase, the optimal solution is calculated.
For obtaining the IFS, the following methods are used:
(i) NWCM (North West Corner Method)
(ii) LCM/MMM (Least Cost Method/Matrix Minima Method)
(iii) VAM (Vogel's Approximation Method)
For the optimal solution, the methods used are:
(i) MODI (Modified Distribution Method)
(ii) SSM (Stepping Stone Method)
Before obtaining the IFS, check the rim condition (Sum of ai = Sum of bj), which
must be satisfied.
NWCM (North West Corner Method):
(I) Choose the cell in the north west corner.
(II) Find out the minimum of supply and demand, i.e. Min. (ai, bj).
(III) Allocate Min. (ai, bj) in the north west cell, exhaust
the row (column) if the supply (demand) is satisfied and
adjust the balance.
(IV) Repeat from step (I) onwards till all the supply and
demand are satisfied.
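
The four NWCM steps above can be sketched as a short Python routine. The function name north_west_corner is chosen here for illustration only; the supplies (200, 160, 90) and demands (180, 120, 150) in the usage line are the capacities and requirements quoted in exercise 2 of this unit.

```python
def north_west_corner(supply, demand):
    """Initial feasible solution by the North West Corner Method."""
    supply, demand = list(supply), list(demand)
    if sum(supply) != sum(demand):
        raise ValueError("rim condition violated: total supply != total demand")
    m, n = len(supply), len(demand)
    alloc = [[0] * n for _ in range(m)]
    i = j = 0                                  # start at the north west cell
    while i < m and j < n:
        q = min(supply[i], demand[j])          # step (II)
        alloc[i][j] = q                        # step (III)
        supply[i] -= q
        demand[j] -= q
        if supply[i] == 0:
            i += 1                             # row exhausted, move down
        else:
            j += 1                             # column exhausted, move right
    return alloc


print(north_west_corner([200, 160, 90], [180, 120, 150]))
```

Note that the method ignores the unit costs entirely, which is why it usually gives a more expensive starting solution than LCM or VAM.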

LCM (Least Cost Method):
(I) Choose the cell with minimum cost.
(II) Find out the minimum of supply and demand, i.e. Min. (ai, bj).
(III) Allocate Min. (ai, bj) in the minimum cost cell, exhaust
the row (column) if the supply (demand) is satisfied and
adjust the balance.
(IV) Repeat from step (I) onwards till all the supply and
demand are satisfied.
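
A minimal sketch of the least cost steps above follows. The function name least_cost and the 3 x 3 cost matrix, supplies and demands in the usage lines are hypothetical values invented for illustration; ties between equal costs are broken by row and then column order, which is an assumption of this sketch.

```python
def least_cost(cost, supply, demand):
    """Initial feasible solution by the Least Cost (Matrix Minima) Method."""
    supply, demand = list(supply), list(demand)
    m, n = len(supply), len(demand)
    alloc = [[0] * n for _ in range(m)]
    # visiting cells in increasing order of cost and skipping exhausted rows
    # or columns is equivalent to repeatedly choosing the cheapest open cell
    cells = sorted((cost[i][j], i, j) for i in range(m) for j in range(n))
    for _, i, j in cells:
        q = min(supply[i], demand[j])
        if q > 0:
            alloc[i][j] = q
            supply[i] -= q
            demand[j] -= q
    return alloc


cost = [[4, 6, 8], [3, 5, 2], [7, 1, 9]]          # hypothetical unit costs
alloc = least_cost(cost, [30, 40, 30], [35, 25, 40])
total = sum(cost[i][j] * alloc[i][j] for i in range(3) for j in range(3))
print(alloc, total)
```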

VAM (Vogel's Approximation Method):
(I) Calculate the penalty for each row and column by
subtracting the smallest element from the next smallest
element.
(II) Choose the highest penalty out of all the penalties and
choose the cell with minimum cost in the corresponding
row (column). In case of a tie between two highest
penalties, choose arbitrarily.
(III) Find out the minimum of supply and demand, i.e. Min. (ai,
bj), corresponding to the chosen cell.
(IV) Allocate Min. (ai, bj) in the cell with minimum cost,
exhaust the row (column) if the supply (demand) is satisfied
and adjust the balance.
(V) Repeat from step (I) onwards till all the supply and
demand are satisfied.
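
The penalty-based steps above can be sketched as follows. The function name vogel and the data are hypothetical. Two conventions in the sketch are assumptions not stated in the text: ties between equal penalties go to the first row or column encountered, and when only one cell remains in a row or column its cost is used as the penalty.

```python
def vogel(cost, supply, demand):
    """Initial feasible solution by Vogel's Approximation Method (balanced problem)."""
    supply, demand = list(supply), list(demand)
    m, n = len(supply), len(demand)
    alloc = [[0] * n for _ in range(m)]
    rows, cols = list(range(m)), list(range(n))    # rows/columns still open

    def penalty(values):
        s = sorted(values)
        return s[1] - s[0] if len(s) > 1 else s[0]

    while rows and cols:
        # step (I)/(II): find the row or column with the highest penalty
        best = None
        for i in rows:
            p = penalty([cost[i][j] for j in cols])
            if best is None or p > best[0]:
                best = (p, "row", i)
        for j in cols:
            p = penalty([cost[i][j] for i in rows])
            if p > best[0]:
                best = (p, "col", j)
        if best[1] == "row":
            i = best[2]
            j = min(cols, key=lambda k: cost[i][k])
        else:
            j = best[2]
            i = min(rows, key=lambda k: cost[k][j])
        # steps (III)/(IV): allocate and close the exhausted row/column
        q = min(supply[i], demand[j])
        alloc[i][j] = q
        supply[i] -= q
        demand[j] -= q
        if supply[i] == 0:
            rows.remove(i)
        if demand[j] == 0:
            cols.remove(j)
    return alloc


cost = [[4, 6, 8], [3, 5, 2], [7, 1, 9]]          # hypothetical unit costs
alloc = vogel(cost, [30, 40, 30], [35, 25, 40])
print(alloc)
```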

MODI Method:
Prior to applying this method, the following condition must
be satisfied:
Co = m + n - 1
If this condition is not satisfied, degeneracy occurs.
Degeneracy can be removed by placing a very small quantity, Delta, in the
unoccupied cell having minimum cost.

I. Construct a transportation table with the given costs and
allocations as per the IFS obtained through any of these methods.
II. For occupied cells, calculate index numbers Ui and Vj
for rows and columns respectively. Setting one index number
(say U1) to zero, the values of the others are calculated from:
Cij = Ui + Vj
III. Opportunity cost is computed for all the unoccupied cells
by using the following equation:
Dij = Cij - (Ui + Vj)
IV. Examine the unoccupied cell evaluations for opportunity
cost (Dij):
(a) If Dij > 0, the cost of transportation will increase; the
solution is optimal.
(b) If Dij < 0, the cost of transportation will decrease; the
solution is not optimal.
(c) If Dij = 0, the cost of transportation will not change; an
alternate solution exists.
In case of Dij < 0, a loop is constructed, for which the following
steps are required:
V. Select the unoccupied cell with the largest negative
opportunity cost.
VI. Construct a closed path for the unoccupied cell
determined in the previous step and assign plus (+) and
minus (-) signs alternately, beginning with a plus sign for
the selected unoccupied cell.
VII. Assign as many units as possible to the unoccupied cell
while satisfying the rim condition. The smallest allocation in
a cell with -ve sign on the closed path indicates the number
of units that can be shipped to the unoccupied cell. This
quantity is added to all the cells on the path marked with
+ve sign and subtracted from those marked with -ve sign.
VIII. Go to step II and iterate all the steps until no Dij is
negative, which gives the optimal solution. Then calculate
the transportation cost.
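
Steps II to V of the MODI method can be sketched in Python. The sketch covers only the index numbers and the opportunity costs, not the closed-path reallocation of steps VI and VII. The function name modi_opportunity_costs and the data are hypothetical, and fixing U1 = 0 is the usual starting convention assumed here.

```python
def modi_opportunity_costs(cost, alloc):
    """Return (U, V, D) where D maps each unoccupied cell to Cij - (Ui + Vj)."""
    m, n = len(cost), len(cost[0])
    occupied = [(i, j) for i in range(m) for j in range(n) if alloc[i][j] > 0]
    u, v = [None] * m, [None] * n
    u[0] = 0                                   # assumed convention: U1 = 0
    changed = True
    while changed:                             # propagate Cij = Ui + Vj
        changed = False
        for i, j in occupied:
            if u[i] is not None and v[j] is None:
                v[j] = cost[i][j] - u[i]
                changed = True
            elif v[j] is not None and u[i] is None:
                u[i] = cost[i][j] - v[j]
                changed = True
    if None in u or None in v:
        raise ValueError("degenerate allocation: index numbers cannot all be found")
    d = {(i, j): cost[i][j] - (u[i] + v[j])
         for i in range(m) for j in range(n) if alloc[i][j] == 0}
    return u, v, d


cost = [[4, 6, 8], [3, 5, 2], [7, 1, 9]]           # hypothetical unit costs
alloc = [[30, 0, 0], [5, 25, 10], [0, 0, 30]]      # a north west corner IFS
u, v, d = modi_opportunity_costs(cost, alloc)
entering = min(d, key=d.get)                       # step V: most negative Dij
print(u, v, d, entering)
```

In this illustration the cell in row 3, column 2 has the most negative opportunity cost, so the solution is not yet optimal and that cell would start the closed path.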

* The method discussed here is applicable for the case of
minimization. In case of maximization, the problem is
converted into minimization by subtracting all the elements
from the highest element.

An assignment problem is a special type of transportation
problem. The method to solve assignment problems was
introduced by D. Konig, a Hungarian mathematician. In such
models, only one unit can be supplied to each destination
from each source. For example, one job is assigned to each
facility in order to achieve the minimum possible cost.
Objective: The objective of the assignment model is to assign a
number of resources to an equal number of activities so as to
minimize the cost or maximize the profit by optimal
allocation.

(i) Assignment of machines to jobs.
(ii) Assignment of workers to various tasks.
(iii) Assignment of sales representatives to sales territories.
(iv) Assignment of contracts to bidders.
(v) Assignment of buses/airlines/trains to various routes.
(vi) Assignment of officers to various offices.
Types of assignment problems:
(i) Minimization (Cost, Time or Distance)
(ii) Maximization (Profit or Revenue)
(iii) Balanced (No. of rows = No. of columns)
(iv) Unbalanced (No. of rows not equal to No. of columns)
(v) Restricted
(vi) Crew assignment

The precondition for solving the assignment models is that the
matrix must be a square matrix, i.e. the number of rows and
columns should be equal.

In case of an unbalanced problem, add a dummy row (column) so
as to make it a square matrix.

I. Subtract the smallest element of the row from all the
corresponding elements of the row and repeat this for
each row.
II. Repeat the same step for all the columns.
III. Start making the assignments by considering the rows
first. Start from the first row and see if any single zero
is there; if it occurs, make an assignment (square over
that element) there and strike off any other zero in that
column, and continue the same for the other rows.
IV. Repeat the same procedure as mentioned in the previous
step for the columns till all the zeros are either assigned
or struck off.
V. Count the number of assignments; if it is equal to the number
of rows/columns, the optimal solution is obtained.
Otherwise, go to step VI.
VI. Draw the minimum number of lines (horizontal and
vertical) as given below:
(a) Mark all the rows that do not have assignments.
(b) Mark all the columns that have a zero in the marked
rows.
(c) Mark all the rows that have assignments in marked
columns.
(d) Repeat the above steps from (a) to (c) until no more rows/
columns can be marked.
(e) Draw straight lines through unmarked rows and
marked columns.

VII. Select the smallest element from all the uncovered
elements. Subtract this smallest element from all the
uncovered elements and add it to the elements which
lie at the intersection of two lines. Then return to step III.
* The above procedure is applicable for the case of
minimization. In case of maximization, the profit matrix is
converted into the cost matrix by subtracting all the elements
from the highest element of the matrix.
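
The row and column reduction of steps I and II can be sketched in Python. This is not a full implementation of the Hungarian method: the line-drawing steps are omitted, and the optimal assignment is instead found by simple enumeration, which is practical only for small matrices. The function names and the 3 x 3 cost matrix are hypothetical.

```python
from itertools import permutations


def reduce_matrix(cost):
    """Steps I and II: subtract each row minimum, then each column minimum."""
    rows = [[x - min(r) for x in r] for r in cost]
    col_min = [min(col) for col in zip(*rows)]
    return [[x - col_min[j] for j, x in enumerate(r)] for r in rows]


def best_assignment(cost):
    """Brute-force check: the column assigned to each row at minimum total cost."""
    n = len(cost)
    return min(permutations(range(n)),
               key=lambda p: sum(cost[i][p[i]] for i in range(n)))


cost = [[9, 2, 7], [6, 4, 3], [5, 8, 1]]     # hypothetical cost matrix
reduced = reduce_matrix(cost)
print(reduced)
print(best_assignment(cost), best_assignment(reduced))
```

The reduction leaves the optimal assignment unchanged, and in the reduced matrix that assignment falls entirely on zeros, which is the situation step V tests for.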

1. An oil company has three refineries located in the
country. The daily oil production (in million tonnes) is
as follows:

__________________
__________________
__________________

Refineries

II

III

Oil produced

10

__________________
__________________
__________________

Each day, the refineries must satisfy the needs for their
distribution centres. Minimum requirement at each
centre is as under:

__________________
__________________
__________________

Distribution Centre

II

III

IV

Oil supply

The cost in thousands of rupees of shipping one million


tonnes from each refinery to each distribution centre is
as per the table given below:
Dist. Centres\ Refineries

D1

D2

D3

D4

R1

11

R2

R3

15

Determine initial feasible solution by a) NWCM b) LCM


c) VAM method
(Ans.Rs.11,600, Rs.11,200, Rs.10,200)
2.

A company has plants P1, P2, P3 which supply to


distribution centres D1, D2 and D3. Monthly factory
capacities are 200, 160 and 90 units respectively whereas
requirement of distribution centres are 180, 120 and 150
respectively. Unit transportation cost is given in the
following table:

Distribution

D1

D2

D3

P1

16

20

12

P2

14

18

P3

26

24

16

Centres\Plants

Determine the optimal solution for the company.


(Ans. Rs.5,920)

3. A car hire company has one car at each of five depots
A, B, C, D and E. A customer requires a car in each town
P, Q, R, S, T. Distances (given in kms) between depots and
towns are given in the following table:

Depot/Town   P     Q     R     S     T
A            160   130   175   190   200
B            135   120   130   160   175
C            140   110   155   150   185
D            150   50    80    80    110
E            55    35    70    80    105

(Ans. 570 kms)

4.

Due to absence of a workman, an officer has to assign


four out of five different jobs to four workers with the
performance (cost) matrix given below, determine an
optimal solution.

Worker/Jobs

11

10

11

12

14

10

(Ans.Rs.13,000)
5.

A company is faced with the problem of assigning five


different machines to five different jobs with a view to
minimize total cost. The costs (in thousands) are
estimated and shown in the following table:
Jobs

Machines

2.5

II

1.5

III

6.5

IV

3.5

4.5

(Ans. Rs.20,000)

6. A trip from Delhi to Ajmer takes six hours by bus. The
manager has designed the time table of the bus service both
ways, which is as under:

Departure    Route   Arrival    Arrival    Route   Departure
from Delhi   No.     at Ajmer   at Delhi   No.     from Ajmer
06.00        I       12.00      11.30      a       05.30
07.30        II      13.30      15.00      b       09.00
11.30        III     17.30      21.00      c       15.00
19.00        IV      01.00      00.30      d       18.30
00.30        V       06.30      06.30      e       00.00

The cost of providing service by the travel agency
depends upon the time spent by the bus crew away from
their places in addition to service time. There is a
constraint that every crew should be provided with at
least 4 hours of rest before the return trip and at the
most it can stay for 24 hours before it goes for the
return trip. The crews reside at rest houses hired
by the travel agency in Delhi as well as Ajmer. The manager
wants to minimize the waiting time of crews. Determine
the optimal assignment for the crews for various routes.
(There are five crews and five routes)
(Ans. 33.5 hours, IV-a, V-b, I-c, II-d, III-e)

Objectives

After reading this unit you will be able to understand:

The Game theory and its assumptions.

The types, limitations and rules of game theory.

The value of game using various methods.

Professor John Von Neumann and Oskar Morgenstern
published their book entitled Theory of Games and
Economic Behavior, wherein they provided a new approach
to many problems involving conflicting situations. The theory
of games attempts to provide the rational decision in
conflicting situations. The term game represents a conflict
between two or more individuals or groups or organizations.
It is the science of conflict. It is applicable to those
competitive situations which are technically known as
competitive games. The objective of game theory is to
determine the rules of rational behaviour in situations
wherein the outcome resulting from a decision made by one
individual depends not only on that individual's choice but
also on the course of action taken by other interested
individuals.

I. The players act rationally and intelligently.
II. The players attempt to maximize gains and minimize
losses.
III. All relevant information is known to each player.
IV. Each player has available to him a finite set of possible
courses of action.
V. The players make individual decisions without direct
communication.
VI. The players simultaneously select their respective
courses of action.
VII. The payoff is fixed in advance.

I. Two-person games and n-person games
II. Zero sum and non-zero sum games
III. Games of perfect information and imperfect information
IV. Games with finite number of moves
V. Cooperative and non-cooperative games
VI. 2 x 2 two-person games, 2 x m and m x 2 games
VII. 3 x 3 and larger games
VIII. Constant sum games

Businessmen do not have the knowledge of game theory
and all the alternative strategies available to them or
their competitors.

The business environment is turbulent and there is a
lot of uncertainty. Thus game theory may not give
accurate results in such cases, as the outcome of a strategy
may not be known with certainty.

Game theory may not be suitable for oligopoly
situations where there are a number of companies/firms
involved.

Larger size games are very complicated and cannot be
solved manually.

The assumption in game theory that one player tries to
maximize the gains and the other tries to minimize the
losses may not be true in the case of today's dynamic
businessman.

Strategy: The strategy for a player is the list of all possible
courses of action that he will take for every pay-off that
might arise.
Pure Strategy: It is the decision rule which is always
followed by the player to select the particular course of
action.
Mixed Strategy: When the player has alternative courses
of action and he has to select a combination of these with some
fixed probabilities.

Row Dominance: When each element in a row is less than
or equal to the corresponding element in another row, this
row is dominated and hence can be deleted from the payoff
matrix.
Column Dominance: When each element in a column is
greater than or equal to the corresponding element in another
column, this column is dominated and hence can be deleted
from the payoff matrix.
Average Dominance: A strategy can also be dominated
when it is inferior to an average of two or more pure
strategies.
Value of a game: It is the average payoff per play of the game
over an extended period of time.
Saddle Point: In a payoff matrix, the saddle point is the element which
is the smallest value in its row and the largest value in its
column. In other words, it is the point where maximin is equal
to minimax.
Example 1: Determine the value of the following payoff
matrix:
                 Player B
Player A      [2    3]
              [5    4]

Solution:
                 Player B
Player A      [2    3]    Row min. 2
              [5    4]    Row min. 4
Col. max.      5    4

Maximin = 4 (the largest of the row minima)
Minimax = 4 (the smallest of the column maxima)

Thus in this game, the saddle point is 4. This is the value of the game.
Such a game is known as a pure strategy game.
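
The maximin and minimax check of this example can be sketched in Python. The function name saddle_point is chosen for illustration; payoffs are assumed to be gains to the row player (Player A).

```python
def saddle_point(payoff):
    """Return (row, column, value) of a saddle point, or None if there is none."""
    row_min = [min(r) for r in payoff]
    col_max = [max(c) for c in zip(*payoff)]
    maximin, minimax = max(row_min), min(col_max)
    if maximin != minimax:
        return None                      # no saddle point: mixed strategies needed
    # the cell in the maximin row and minimax column carries the game value
    return row_min.index(maximin), col_max.index(minimax), maximin


print(saddle_point([[2, 3], [5, 4]]))    # the matrix of Example 1
```

Rows and columns are numbered from zero in the result, so (1, 1, 4) refers to the second strategy of each player and a game value of 4.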

Flow chart for solving a game:
1. Formulate the pay-off matrix.
2. Apply the maximin or minimax principle. Is there a saddle
point? If yes, identify the value of the game and write the optimal
strategy of the players. If no, go to step 3.
3. Is it a 2 x 2 pay-off matrix game? If yes, solve by using the
algebraic or matrix method for mixed strategy games. If no, go to step 4.
4. Use the dominance rule to reduce the size of the pay-off matrix
to either 2 x 2, 2 x n or m x 2 size (order).
5. Is the pay-off matrix reduced to a 2 x 2 size? If yes, solve by
using the algebraic or matrix method. If no, go to step 6.
6. Is the pay-off matrix reduced to a 2 x n or m x 2 size? If yes,
use the graphical method to solve the problem. If no, formulate and
solve as an LP problem.

(i) Short cut method
(ii) Graphical method
(iii) Algebraic method
(iv) Linear Programming method
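
For a 2 x 2 game without a saddle point, the algebraic method reduces to closed-form expressions. The sketch below uses the standard formulas for the pay-off matrix [[a, b], [c, d]] with payoffs as gains to the row player; the function name solve_2x2_game and the sample matrix are hypothetical and are not taken from the exercises.

```python
from fractions import Fraction


def solve_2x2_game(a, b, c, d):
    """Mixed-strategy solution of the 2 x 2 game [[a, b], [c, d]] (no saddle point).

    Returns (p, q, value): p is the probability that the row player uses
    row 1, q the probability that the column player uses column 1.
    """
    denom = Fraction(a + d - b - c)
    if denom == 0:
        raise ValueError("formula not applicable; check for a saddle point first")
    p = (d - c) / denom
    q = (d - b) / denom
    value = (a * d - b * c) / denom
    return p, q, value


p, q, value = solve_2x2_game(3, 1, 2, 4)     # hypothetical pay-off matrix
print(p, q, value)
```

With these probabilities the row player's expected gain is the same whichever column the opponent chooses, which is the defining property of the optimal mixed strategy.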

1. Define game and its types.
2. Define the following:
a) Saddle Point
b) Pure and Mixed Strategies
c) Zero sum game
3. What are the assumptions made in the theory of games?
4. Describe rules of dominance with examples.
5. What are the various methods for solving a game
and their suitability?
6. Find out the value of the game. The payoff (in rupees) is
given in the following matrix:

Player A / Player B

Strategy I

Strategy II

Strategy I

10

14

Strategy II

12

7. The conditional gains to the workers' association (in
thousands) against management strategies are given in
the following payoff table:

Association

Management Strategies

Strategies

M1

M2

M3

M4

A1

20

15

12

35

A2

25

14

10

A3

40

10

A4

-5

11

8. Two petroleum companies X and Y are competing for
business. The matrix shows the gains to company X
(assume the game is zero sum).

Company X's Gain

X\Y               Sales Promotion   Advertising   Exhibition
Sales Promotion   60                50            40
Advertising       70                70            50
Exhibition        80                60            75

Determine the optimal strategies and value of the game.

9.

Determine the value of the following game:


Player Y
I

II

III

IV

II

III

IV

Player X

10. The payoff matrix is given below. Determine the


optimum strategies and value of the game. (Solve
graphically)
Player B
3

-1

-3

-4

-1

Player A

11. Determine the value of the game and optimum strategies


for the following matrix:
Player B
Player A

-3

-4

-6

-2

-7

-1

-9


Objectives

After reading this unit you will be able to understand:

The properties and applications of the Markovian model.

The types of Markov process and the various states involved.

How to determine the future market share and the market share in steady state condition.

Markov models are used to analyse the states of a stochastic system and to describe its position at any instant of time.

(i) It is a stochastic (probabilistic) process.
(ii) A Markov process is a sequence of experiments in which each experiment has certain possible outcomes.
(iii) There is a finite set of states.
(iv) The process can be in only one state at a given time.
(v) The probability of moving from one state to another, or of remaining in the same state, in a single time period is called the transition probability (Pij). It remains constant over time and 0 <= Pij <= 1.
(vi) The probability of transition from a given state to a future state depends on the present state.
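The properties above can be checked mechanically for any candidate transition matrix. The following is a minimal sketch in Python, assuming the matrix is stored as a list of rows with "from" states as rows and "to" states as columns; the function name `is_transition_matrix` and the numbers are illustrative assumptions. Besides 0 <= Pij <= 1, it also checks that each row sums to 1, since the process must end the period in exactly one state.

```python
def is_transition_matrix(p, tol=1e-9):
    """Check that p is square, every entry lies in [0, 1] and every row sums to 1."""
    n = len(p)
    for row in p:
        if len(row) != n:
            return False            # not a square matrix
        if any(x < 0 or x > 1 for x in row):
            return False            # a probability outside [0, 1]
        if abs(sum(row) - 1.0) > tol:
            return False            # the row does not cover all outcomes
    return True


# Illustrative two-state matrices (values chosen for the example only).
valid = [[0.8, 0.2],
         [0.5, 0.5]]
invalid = [[0.8, 0.3],
           [0.5, 0.5]]

print(is_transition_matrix(valid))
print(is_transition_matrix(invalid))
```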

(i) It is a technique applied to solve various management problems. For example, it can be used to determine the future market share of various products/brands. It can also be used in inventory management to decide the order size.

(ii) It is very useful for studying the buying pattern of consumers or organizations, particularly in terms of brand loyalty and switching patterns.

(i) First order Markov process

(ii) Second order Markov process

(iii) Higher order Markov process

A steady state is reached when the state probabilities become constant from one period to the next. The system is in steady state condition if the following conditions are satisfied:
(i) The transition matrix elements remain positive from one period to the next. This property is known as the regular property of a Markov chain.
(ii) It is possible to move from one state to another state in a finite number of steps, irrespective of the present state.
Absorbing State: A state is said to be an absorbing (trapping) state if the process, once in that state, does not leave it. It occurs when any transition probability on the retention diagonal (from upper left to lower right) is equal to one (1).
Transient State: A state is said to be transient if it is not possible to move to that state from any other state except itself.
Cycling Process: A cycling (periodic) process is one in which the transition matrix contains all zero elements in the retention cells (diagonal elements) and all other elements are either 0 or 1.
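The definitions of absorbing state and cycling process translate directly into checks on the diagonal of the transition matrix. The sketch below is one possible way to express them in Python; the function names and both matrices are hypothetical and serve only to illustrate the definitions.

```python
def absorbing_states(p, tol=1e-12):
    """Indices of states whose retention (diagonal) probability equals 1."""
    return [i for i in range(len(p)) if abs(p[i][i] - 1.0) <= tol]


def is_cycling(p, tol=1e-12):
    """True if every diagonal element is 0 and every element is either 0 or 1."""
    diagonal_zero = all(abs(p[i][i]) <= tol for i in range(len(p)))
    zero_or_one = all(
        abs(x) <= tol or abs(x - 1.0) <= tol for row in p for x in row
    )
    return diagonal_zero and zero_or_one


# Hypothetical three-state chain in which states 0 and 2 are absorbing.
with_traps = [
    [1.0, 0.0, 0.0],
    [0.3, 0.5, 0.2],
    [0.0, 0.0, 1.0],
]

# Hypothetical two-state chain that alternates between its states every period.
alternating = [
    [0.0, 1.0],
    [1.0, 0.0],
]

print(absorbing_states(with_traps))
print(is_cycling(alternating), is_cycling(with_traps))
```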

Transition Probability Matrix

Decision Tree Diagram


1. There are two brands of oil engine, A and B. Both have exactly equal market share in the town presently. The market size is also fixed. The transition matrix is given below:

               To
            A       B
From   A   [0.8     0.2]
       B   [0.5     0.5]

Determine their future market share for the next year and market share in the steady state.
2. (a) M/s. Manoj Kumar Kamal Lal stocks three brands of lubes at its various petrol pumps. Calculate the equilibrium market share for the three brands of lubricants; the transition matrix is as follows:

                     To
From          Castrol    Elf      Mak
Castrol      [0.80       0.10     0.10]
Elf          [0.05       0.85     0.10]
Mak          [0.10       0.06     0.84]

(b) If the present market share of these three brands is 40%, 30% and 30% respectively, determine their market share for the year 2008.
3. Three brands of toothbrush are available in a provision store. It has been observed that 50% of customers buy brand Oral-B, 30% brand Pepsodent and 20% brand Ajanta. The owner of the provision store, Sajjan Khatri, found that each quarter the customers change their preference. Of those who bought Oral-B last quarter, 50% buy it again, but 15% change to brand Pepsodent and 35% to brand Ajanta. Of those who bought brand Pepsodent, 70% buy it again, 10% switched to Oral-B and 20% to Ajanta. Of those who bought brand Ajanta, 80% buy it again, 5% switched to Oral-B and 15% to Pepsodent. Construct the transition matrix and determine their share in this situation.
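The market share calculations asked for in these exercises follow one pattern: multiply the current share vector by the transition matrix to get the next period, and repeat until the shares stop changing to reach the steady state. The Python sketch below illustrates that pattern under stated assumptions: rows are "from" states and columns are "to" states, the helper names `next_share` and `steady_state` are inventions of this example, and repeated multiplication is used as one simple way to approximate the steady state (solving the balance equations algebraically is the usual hand method).

```python
def next_share(share, p):
    """One period ahead: new_share[j] = sum over i of share[i] * p[i][j]."""
    n = len(p)
    return [sum(share[i] * p[i][j] for i in range(n)) for j in range(n)]


def steady_state(p, tol=1e-12, max_iter=10_000):
    """Approximate the steady state by repeated multiplication from equal shares."""
    n = len(p)
    share = [1.0 / n] * n
    for _ in range(max_iter):
        new = next_share(share, p)
        if max(abs(a - b) for a, b in zip(new, share)) < tol:
            return new
        share = new
    return share


# Exercise 1: oil engine brands A and B with equal present shares.
engines = [[0.8, 0.2],
           [0.5, 0.5]]
engines_next = next_share([0.5, 0.5], engines)   # shares after one year
engines_steady = steady_state(engines)           # long-run shares

# Exercise 3: transition matrix built from the stated quarterly switching
# percentages (row and column order: Oral-B, Pepsodent, Ajanta).
brushes = [[0.50, 0.15, 0.35],
           [0.10, 0.70, 0.20],
           [0.05, 0.15, 0.80]]
brushes_next = next_share([0.5, 0.3, 0.2], brushes)

print([round(x, 4) for x in engines_next])
print([round(x, 4) for x in engines_steady])
print([round(x, 4) for x in brushes_next])
```

For the two-brand case the steady state can be verified by hand: the share of A must satisfy A = 0.8 A + 0.5 B with A + B = 1, which gives A = 5/7 and B = 2/7.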


Objectives


After reading this unit you will be able to:

Understand the basic concept of data.

Understand the basic problems in getting the right data.

Understand various methods of data collection.

Understand various ways of data presentation.

To solve any managerial problem that you face in the organization you need relevant information. This information has to meet the tests of sufficiency and accuracy to be useful in solving the problem in hand. Information is the processed form of data; data refers to a collection of numbers, letters, or symbols, maintained or produced for the management when required.
For the numbers that you have collected to be called data, the following characteristics must be present:
(i) It should be an aggregate of facts. For example, single unconnected figures cannot be called data, as they cannot be used to study the characteristics of any event or the operation of any industry or organization.

(ii) There should be a reasonable standard of accuracy, as required for the problem in hand. For example, in the measurement of length one may measure correctly up to 0.01 cm, 1 cm or 1 m as required; similarly, the quality of a product can be reasonably estimated by certain tests of small samples drawn from big lots of products.
(iii) It should be collected in a systematic manner for a predetermined objective.

(iv) The elements of the data must be related to one another. The base used should be the same for data of different times or firms to arrive at any meaningful decision. For example, you cannot compare the figures of two companies if one reports in rupees and the other in dollars without first converting them to the same base currency.

(v) It must be numerically expressed in measurable units.

Data, when processed and presented in proper context, becomes information which controls the activity of the organization. Data is one of the major resources of the organization, developed over a period of time, and therefore needs to be properly managed and safeguarded. It can be treated as inventory because it may be procured, stored and supplied when needed. Just like any other physical stock, it also suffers from deterioration and obsolescence. Data may be interpreted differently if it is not properly defined, so a proper definition is very important. Data also has a time dimension: its use and value change with time, and obsolete data is not very relevant for information needs. It is therefore important to understand the methods of data collection, so that the most relevant data can be collected and used as soon as possible for the effective management of the organization.

All the managerial processes require information in some form or the other and, therefore, accurate and relevant data is required to accomplish almost all the tasks a manager has to perform. The following tasks are only indicative:

To set the objectives of the company, organization, industry, Government or any other business entity.

To formulate major strategies and policies to meet specific objectives.

To report the result of operations of the business to the shareholders.

To inform others of selected policies of the company.

To keep abreast of current operations of the business.

To inform employees on various matters.

To prepare long range plans.

To explore new opportunities.

To allocate capital resources.

To exercise necessary control over day-to-day operations.

To determine the costs underlying various activities of the firm.

To provide for proper co-ordination and control of business activities.

All organizations, whether social, political, religious or economic, are designed to achieve certain objectives. Notwithstanding the differences in the nature of their activities, the underlying management processes are common. The management must plan for and control the usage of various organizational resources, namely manpower, materials, production facilities and capital, in the most effective manner to achieve the organization's objectives. This involves decision making, which is dependent on the data and its quality. Data management is, therefore, being increasingly recognized as the fifth organizational resource, which needs to be managed just like the other four traditional resources of man, machine, material and money.

The major problems and limitations of data collection can be classified under the following broad headings:
(a) Lack of identification of data needs: The tasks performed by various levels of management are different. Therefore, the data needs of managers vary with the level at which they are operating and the function within which they are working. Very few organizations have made a conscious and deliberate effort to identify the specific data needs of various managerial positions in the organization.

(b) Response time: This is one of the major data processing problems. The data is not collected and processed fast enough to allow managers to react in time. There is a gap between the supply of data and the requirement for it, i.e., data is supplied with a lead time. In many cases data is no longer of any value when it is made available to the managers.

(c) Inaccessibility of data: Useful and necessary data is available but is often in a form or location that makes it uneconomical and infeasible to retrieve.

(d) Differing and conflicting data: Due to the different sources used for collection, data about the same item may differ and conflict. For example, two market research agencies may give you different sizes of the prospective market for your products.
(e) Duplication of efforts: Identical data is maintained and similar reports are generated at several points in the organization, thereby wasting both time and manpower resources.

(f) Lack of training: The lack of scientific training in the methodology of data collection is a great handicap in most organizations.

(g) Absence of code of conduct: There is no code of conduct for the use of data, and managers often mould the data to suit their needs without caring for its accuracy.
(h) Inaccurate and unreliable data: The sheer volume of data and human intervention make it humanly impossible to be consistently accurate and reliable.

Data may be classified as:

(i) Primary

(ii) Secondary

Primary data represents those items that are collected for the first time and first hand. The data is recorded as observed or encountered. Essentially, this data is the raw material and may be combined or structured in any form. The point to be noted here is that the data has not been statistically processed. For example, data obtained by counting the number of bad pieces and good pieces in the production is primary, unprocessed data. After this the data can be statistically processed to yield the required information.
The main advantages of collecting primary data are the following:
(i) They are accurate and reliable as they are collected from the original source.
(ii) They provide detailed information according to the requirements of the users.
(iii) It is more reliable and less prone to error.
(iv) Definitions and meaning of terms used in data are explained to make it understandable and the process transparent.
(v) Method of collection, its limitations and other aspects are generally highlighted.
Where there are roses there will also be thorns. Following are the main limitations of the primary data:
(i) Cost: It is expensive to collect primary data.
(ii) Time: It is a time consuming method of data collection.
(iii) Training: It requires experts/trained personnel to collect data.

Secondary data is also known as published data. Data which is not originally collected but rather obtained from published sources, and which is normally statistically processed, is known as secondary data. Examples are data published by the Reserve Bank of India, the Ministry of Economic Affairs and the Commerce Ministry, as well as by international bodies such as the World Bank, the Asian Development Bank, etc.
As is the case with primary data, there are advantages and disadvantages associated with secondary data also. The advantages are:
(i) Cost: It is more economical than primary data, since data is already available.
(ii) Time: It is faster to collect and process as time has already been spent to collect the data.
(iii) Information insight: It provides a base on which further information can be collected to update it and finally use it. It provides valuable insights and contextual familiarity with the subject matter.
The limitations of secondary data are as follows:
(i) It may not be too relevant for the problem in hand as it was originally collected for some other context.
(ii) It could be outdated and hence not of much use in a dynamically changing environment.
(iii) The accuracy of secondary data as well as its reliability would depend on its source, as the assumptions made during the data collection are not specified.
(iv) Locating an appropriate source and finally getting access to the data could be time consuming.
(v) The data available might be too extensive and a lot of time and money may be spent going through it.
Table 10.1: Distinction between Primary Data and Secondary Data

Parameter | Primary Data | Secondary Data
Source of Data | Original source | Secondary source
Method of Data Collection | Observation method, Questionnaire method, etc. | Published data from various sources
Statistical Processing | Not processed | Usually processed
Originality of Data | Original. First time collected by user | Not original. Data collected by some other agency
Use of Data | Data is compiled for specific purpose | There may not be a specific purpose
Terms and Definitions of Units | Incorporated | May not be incorporated
Copy of the Schedule | Included | Excluded
Method of Data Collection | Given | May not be given
Description of Sample Selection | Given | May not be given
Time Required | More | Less
Cost to the Organization | Expensive | Comparatively cheaper
Efforts Spent | More | Less
Accuracy of Data | More accurate | Less accurate
Training | Experts/trained people required | Less trained personnel required

On closer investigation, it will be noticed that the distinction between primary and secondary data in many cases is of degree only. Data which would be secondary in the hands of one could be primary for others. For example, to a bank the details of the customer are primary data, but to a reader of the report of the bank these details are secondary.

While selecting the subject for primary data collection, the following considerations should be kept in mind:
(i) Economic Considerations:
(a) Data collection efforts cost money. The value of the anticipated results must be commensurate with the efforts put in.
(b) Short-term data collection studies that can yield appreciable dividends quickly should be preferred to long term studies whose benefits may be difficult to foresee.

(ii) Technical Considerations:
(a) It should be made sure that adequate technical knowledge is available to carry out the right process of data collection.
(b) Where a large problem throws up a number of subjects which are independent of each other, it is better to have small individual data collected on each subject.
(c) Where a problem brings to light two or more subjects which are interrelated, independent studies on each might be carried out in the preliminary stages, but they should later be continuously integrated by coordinating the recording of the different teams working on each subject. The critical examination has to be of the completeness of the data and it has to be carried out by the team as a whole.
(d) The scope and magnitude of the problem would determine the data required.

(iii) Human Considerations:
Where resistance to change or an adverse reaction is likely, the data collection should not be proceeded with until acceptance has been gained.

(iv) Other Limitations and Constraints:
(a) Time Limit: Data collection must be completed within the time frame specified so as to be of maximum use.
(b) Cost Considerations: Data must be collected within the cost framework.
(c) Accuracy: Reasonable accuracy, as is required for the problem, should be ensured.

The following four methods of primary data collection are most widely used:
1. Observation method
2. Personal interview
3. Questionnaire method
4. Case study method

Let us look at each method, one by one:
Observation is the most commonly used method of data collection, especially in studies relating to production management and behavioural sciences. Accurate watching and noting down of phenomena, as they occur in nature or on the shopfloor, with regard to cause and effect, is called the observation method of data collection.

Differentiating characteristics of the observation method are as follows:

(i) Direct Method: Direct contact of the sensory organs, particularly eyes and ears, is involved in gathering and recording the data.
(ii) Observe and Record: The observer first observes the phenomenon carefully and then records data.
(iii) Selective and Purposeful Collection: The observations are made with a definite purpose in mind and only relevant data is collected.
(iv) Cause and Effect Relationship: The observation method leads to the development of a cause and effect relationship.
Table 10.2: Observation Method

Merits

This method is common to all disciplines of research and is simple to use.

It is realistic as it is based on actual and first hand experience.

The conclusions are more accurate, reliable and dependable.

This method is used for formulation of hypothesis.

This method is successfully used for verification of hypothesis.

It is useful when in-depth study is required.

Limitations

Some events cannot be observed without bias. For example, it is not possible to observe emotions and sentimental factors, such as likes and dislikes, without bias about the degree of emotion.

It sometimes results in illusory observations.

Being a long drawn process, the techniques of observation are expensive and time consuming.

Sometimes the atmosphere tends to become artificial and this leads to a sense of self-consciousness among the individuals who are being observed. This defeats the purpose of observation.

The slowness of observation methods leads to disheartening and disinterest among both the observer and the observed.

The final results of observation depend upon the interpretation and understanding of the observer; the defects of subjectivity in the explanation creep into the description of the observed and the deductions from it.

As the purpose of the observation is known to the observer, it is his own wish to record or view a particular thing.

The control can be of two forms. The observer could be a participant or a silent observer. In group discussions he is normally a silent observer, but in interview techniques he becomes a part of the interview and hence his lack of objectivity may hamper the quality of his observations.
Controlled and uncontrolled observation methods are the two sub-methods used to watch and understand the phenomenon.
(i) Controlled Observation: This is a systematic observation based on logic and reasoning. It is done on a pre-conceived plan and a deliberate effort is made to control the phenomenon.
(ii) Uncontrolled Observation: In this method observations are made in natural surroundings. There is no planning, no control and no deliberate effort to change the working of the phenomenon.
Table 10.3: Distinction between Controlled and Uncontrolled Observation

Parameter | Controlled Observation | Uncontrolled Observation
Control Dimensions | Control over the phenomenon, conditions of light, temperature, humidity, etc. Control over the observer or observed | No control. Observations under natural conditions
Techniques of Control Used | Planning of observation situations; use of mechanical appliances such as recorders, watch, etc.; maps and sociometric scales; hypothesis; detailed notes; group discussions | No need to use control techniques
Degree of Bias | Subjective study and bias comes in during study | This is an objective study and keeps the observations bias free
Cause and Effect Relationship | Well established | Difficult to establish
Degree of Reliability of Data | High | Low

The observation method is used most effectively in field observations where the presence of the observer does not make a difference to the observed. For example, if you want to know how many people enter the New Delhi railway station from the Paharganj side, you just have to stand at the gate and count. Whether you are present or not does not matter to the people who are being observed.

Steps in Organization of Field Observation

Following are the main steps generally followed in the organization of field observations:
(i) Determination of nature and limits of observation: Depending upon the nature of research and hypothesis, an outline of the research is prepared. This guides the observer on what should be observed and what should be left out.
(ii) Determination of time, place and subject of study: A project can be of short or long duration, and it may be studied under laboratory conditions or in the open. It should also be decided whether we shall observe the behaviour of the phenomenon as a whole or of the individual items in relation to the total.
(iii) Determination of the investigators: Depending on the nature, work content and objectives, the need for an individual investigator or for a team is to be identified.
(iv) Provision of mechanical appliances needed: The mechanical appliances required for recording, such as tape recorder, movie camera, etc., should be identified in the beginning and used when needed.
When you take care of these basic steps, your data will be useful and relevant to the problem in hand.

Under the personal interview method of collecting data there is face to face contact with the persons from whom the information is to be obtained (known as informants). The interviewer asks them questions pertaining to the survey and collects the desired information. For example, if a person wants to collect data about the working conditions of the workers of Hindustan Lever Ltd., Mumbai, he would go to the HLL factory at Mumbai, contact the workers and obtain the required information. The information obtained is direct and original. This is the most suitable method of data collection for business and economic problems.
Table 10.4: Personal Interview Method

Merits

In this method, direct contact between researcher and informants is established and effective communication is built, which helps in getting direct information about paradigms, inner feelings, emotions and sentiments.

Fine tuning of the responses can be done, so as to get the best possible response, by rephrasing the questions and probing deeper wherever required.

An interview gives us knowledge of facts which are inaccessible to observation. The emotional attitude, secret motivation and incentives governing human life come to the surface in an interview though these are unobservable. Therefore, the interview has a quality which may be called supra-observational.

Through this method it is possible to verify the information that has been collected from other sources.

Demerits

There are certain matters which can be written in privacy but about which one does not wish to speak before others. If these matters are the subjects of the interview, the likelihood is that only a disguised version of these will be presented.

If an interviewee is of a low level of intelligence he is usually unfit to give correct information. The same goes for the interviewer, as interviewing is an art rather than a science and the art has to be mastered.

If the interviewer is unable to suppress his prejudices, his understanding and interpretation of the data given in the interview will be defective.

In the interview, certain aspects of human behaviour get overemphasized at the expense of others. There is a tendency to give too much importance to personal factors and to minimize the role of environmental factors. This has to be guarded against.
Interviews can be classified according to their basic characteristics.

(i) According to Formalness:
(a) Formal Interview: In formal interviews, the interviewer presents a set of well defined questions and notes down the answers of informants in accordance with the prescribed rules. Here, emphasis is given to the order and sequence of questions.
(b) Informal Interview: Here the interviewer has the freedom to alter the questions to suit a particular situation. He may revise, re-order or re-phrase the questions to suit the needs of the respondents. The emphasis is on the situation, and questioning generally depends on the situation and the individual.

(ii) According to Number:
(a) Personal Interview: In a personal interview only a single person is interviewed at one time. Detailed knowledge about intimate and personal aspects of the individual can be obtained as it is a face-to-face talk.
(b) Group Interview: In this method two or more persons are interviewed at the same time. The group interview is, therefore, more suited for gathering routine information rather than personal information.

(iii) According to Purpose:
(a) Diagnostic Interview: In this type of interview, the interviewer tries to understand the cause or causes because of which a particular fact or incident happened. For example, diagnostic interviews are held with the operators with a purpose to grasp the cause and nature of failure of machines and not to ascertain whether failure has occurred.
(b) Research Interview: These interviews are held to gather information pertaining to certain problems but may not be as specific as diagnostic interviews. The questions to be asked to gather the desired information are predetermined. Inasmuch as this data is gathered for the purpose of research into a problem, this is called a research interview.
(iv) On the basis of Function and Methodology:
(a) Nondirected Interview (Non Directional Interview): This is also known as a free or unstructured interview. This is a type of interview in which the interviewer exercises no control, provides no direction and has no brief or predetermined set of questions to ask. The interviewer merely engages the interviewee in talk and encourages him to tell about his experiences and feelings. This type of interview is suitable when the researcher wishes to assess the amount of awareness a person has about certain problems and the manner in which he views them.
(b) Focussed Interview: This method is employed for studying the socio-psychological effects of mass media like radio, television, cinema, etc. The specialty of the focussed interview is that by its means the personal reactions, emotions and intellectual orientation of the persons interviewed towards specific issues can be studied.

(v) Classification according to Subject Matter:
(a) Qualitative Interview: Qualitative interviews are about complex and non-quantifiable subject matter. For example, interviews held for case studies of a specific problem are qualitative, because the interviewer has to cover past, present and future to know a case. In this, a qualitative analysis associated with a situation is performed. The subjective opinion of the interviewer is sought.
(b) Quantitative Interview: Quantitative interviews are those in which a certain set of facts is gathered about a large number of cases. The census interviews are an example of this type.
Many combinations of these types can be made to suit a particular situation.
The main concern of the researcher employing the method


of interview is to get correct and to the point answers to the
topic of research. A research can be less expensive and
economical only if deviations from the main line of approach
are kept under control. Normally, the accuracy of the
responses depends upon the skill and tactful approach of
the interviewer and no rules can be framed in this connection.
Still the following points can be kept at the back of yourmind:

Prior to start the business with the interviewee,


interviewer must develop rapport with the interviewer,
so that he feels comfortable with him.

For allowing maximum opportunity of self-expression


to the interviewee, he should be allowed to narrate his
experience in the story form.

The interviewee and interviewer should be free and
frank. The interviewee should be allowed to describe
whatever he thinks worthwhile. Even if some irrelevant
facts are being described the interviewee need not be
checked. He should not be discouraged. Though
maximum freedom of self-expression is desirable, this
can only be within the scope of the problem being
discussed. This requires alertness and direction at the
suitable occasion. Good humor is the essence of
successful direction.

The interviewer must hear the interviewee with full
interest. Nobody should be able to guess from his
expression that he is bored or his mind is elsewhere.

If an interviewer can convince the interviewee that he
appreciates his cooperation and greatly values the
information given by him, this word of encouragement
has a salutary effect on the interviewee, who then gives
more focussed responses.

The information given by the interviewee, if suspected,
can be tested through cross-examination of the
interviewee. Moreover, the emotional expression
accompanying the responses gives a clue to the
interviewer about the veracity or otherwise of the
answer being given.


The following are the main causes that render an interview
unusable. These should be taken care of when interviewing:
(i) Often interviewees, under emotional spells, exaggerate
the facts in order to satisfy their vanities and create
an impression. Such statements should be taken with a pinch of salt.

(ii) Sometimes there is a communication gap between the
interviewer and interviewee, with the result that
interviewees say one thing and the interviewer
understands something else.
(iii) Some interviewees deliberately try to mislead the
interviewer and make a fool of him. The interviewer
must be mature and experienced enough to detect and
rebuff such fake interviewees.
(iv) Sometimes an inexperienced interviewer is offended by
the behaviour of interviewees and in a revengeful mood
distorts the facts in his report.
(v) Interviewer should critically examine those aspects of
the interview in which the relationship of cause and
effect seems to hold. This helps to determine whether
the causes are always present or not when certain effects
appear.

Under the questionnaire method, a formal list of questions pertaining to
the survey (known as a questionnaire) is prepared and sent to
the various informants. The questionnaire contains the questions
and provides space for the answers. A request is made to the
respondents through a covering letter to fill in the
questionnaire and send it back within a specified time.
The questionnaires could be structured or unstructured.
Structured questionnaires are those that pose definite,
concrete and pre-ordained questions with fixed response
categories. In unstructured questionnaires, questions are not
necessarily presented to the respondents in the same
wording and do not have fixed responses. Respondents are
free to answer the questions the way they like in their own
wording and style. Questionnaires could be a mix of the two
types also leaving the field wide open to the designer of the
questionnaire.

Dichotomous Questions: When the reply to a question is
in the form of one out of two alternatives given, one
answer being negative and the other positive, it is
called a dichotomous question. The negative and
positive answers together form the whole
range of answers. For example: Is the
respondent educated? Yes/No.

Multiple Choice Questions: In these questions
normally three to five alternative answers are given.
These alternatives are quite comprehensive and the
respondent has to select one of them. In framing these
types of questions, the framer has to be cautious enough
that all the possible alternatives are included in it and
they are mutually exclusive.

Ranking Item Questions: A variation on multiple
choice questions, these questions are so designed as to
record the preferences of the respondent. In ranking
item questions there may be several preferences
arranged item wise.

Open-ended Questions: Questions which are of a
descriptive type and allow the respondent to cite his
experiences are known as open-ended questions.

Leading Questions: These are suggestive questions. In
these types of questions the reply is suggested in a
particular direction. They should be avoided as far as possible.


Ambiguous Questions: The questions that lack clarity
and are so worded that the meaning is not clear are
known as ambiguous questions. Such questions normally
should not be included in the questionnaire as they are
likely to confuse the respondent. The meaning of such
questions is not uniformly conveyed to all the
respondents.

Table 10.5: Questionnaire Method

Merits

(i) In comparison to other methods, the questionnaire method is both cheaper and quicker.
(ii) It requires less skill to administer than other methods.
(iii) If the informants or the respondents are scattered over large geographical areas, this is the most suitable method.
(iv) Besides saving money, time is also saved, as hundreds of persons can be approached simultaneously.
(v) It is more reliable in special cases, although in most cases its reliability is suspected.
(vi) The respondent is free from external influences, such as the researcher, and therefore provides reliable, valid and meaningful information.
(vii) Chances of errors are low because the respondent supplies the information himself.
(viii) The informants are directly involved in the supply of information, so the method is more original.
(ix) The impersonal nature of the questionnaire ensures uniformity from one measurement situation to another.

Demerits

(i) Lack of interest on the part of respondents lowers the number of responses, making the study unreliable.
(ii) Incomplete and illegible responses render the whole response bad.
(iii) If a problem requires deep and long study, it cannot be studied through this method.
(iv) This method is very rigid, since no alteration or rephrasing of questions is possible.
(v) Prejudices and biases of the researcher influence the framing of the questions.
(vi) Sometimes the questionnaire is itself incomplete and leaves out certain critical questions, which are unearthed later, rendering the whole exercise fruitless.
(vii) There is no provision in this method for coming face to face with the respondent. This may result in manipulation of replies by the respondents.

A questionnaire is always framed with the help of certain
background material and the problem statement. The first
requirement always is the design of the problem statement
and this is the area where most of the questionnaires go
wrong. If your problem statement is faulty, your questions
are not going to point to the required direction and you are
bound to get wrong inferences. One should spend the
maximum time on it, since it will be well spent.


After the problem statement comes the issue of the
respondents, as their intellectual level has to be kept in mind
while designing questions. If the questions, language and
wordings are not in accordance with the intellectual level of
respondents then it would not be possible for them to furnish
correct replies. In such a situation the purpose of the research
would not be fulfilled. The outcome of past experiences
enables the researcher to know the shortcomings
beforehand, enabling him to remove these deficiencies so as
to improve the response rate.
Other factors to be taken into account in the construction of
a questionnaire:

Appeal: Each questionnaire should carry an
appeal in which the aim and purpose of the
questionnaire is set forth and the sincere co-operation
of the respondents is requested. The appeal may be
made more effective by giving appropriate incentives
in the form of money or books, and with a promise to give
a copy of the report to the respondents.

Instructions for filling up the questionnaire: The
questionnaire must carry a list of instructions for filling
it up and dispatching it. The respondent must not have
to pay for return postage, unless you are promising a
prize for responses. If the questionnaire is time bound,
the last date of receiving completed response and the
address should be clearly written.

Clarity of questions: For the desired response it is of
utmost importance to formulate questions that are
direct, clear and precise.

Order of questions: The questions should be broken
up into sections and each section should have a number
of questions which are mutually interrelated. Questions
about personal details should be avoided or should be
asked at the end.

Pre-testing of Response in Questionnaire

The basic thing that has to be kept in mind is that ambiguity
should be avoided in collecting data through the questionnaire
method. For this, it is necessary that the questionnaire
be tested before it is actually used in a business research
study. Pre-testing is nothing but testing of the questionnaire
before it is actually used. If testing is to be done the right
way, the following steps are required:


Testing the validity in a representative sample: The
questionnaire should be tested in every respect, before
it is actually mailed to the target segment. This testing
can be done on a limited number of people through
sampling method but while testing it within the sample,
it should not be forgotten that the sample should be
perfectly representative of the target segment.

Pre-testing to check whether the results are in tune
with objectives: The questionnaire should meet the
objective of research study. It means that it should help
in getting maximum possible relevant responses. It is,
therefore, necessary that it should be made suitable to
objectives of study even if it requires testing more than
once.

Poor response requires modification of the
questionnaire: The questionnaire is mailed to the
informants who are required to fill it and send it back.
If the response of the informant is poor and very few
questionnaires are returned, it means that there is
something wrong with the form and style of the
questionnaire and it requires modifications/change and
reframing. Furthermore, if the questionnaires returned
are incomplete or the replies are not satisfactory and
up to the mark, it should be presumed that the
questionnaire is defective and it requires modification.
After modification the questionnaire should again be
subjected to pre-testing.


When a questionnaire draws a poor response, one
of the following factors is usually responsible for it:

Importance of the problem to the respondents: It is
generally seen that those who are concerned with the
problem give better response than those who are not.


Characteristics of the respondents and prestige of
the sponsoring body: It is seen that educated people
with social consciousness are more responsive as
compared to people belonging to lower economic groups.
If the research study is sponsored by a well-known
organization, it is likely to have a better response.

Form and nature of the questionnaire and
arrangement of the questions: The questionnaire also
plays its part in the matter of response. If the
questionnaire is short and attractively printed,
its layout is neat, and the
arrangement of questions is scientifically planned, it
is likely to invite a better response.

To get a better response, inducement is needed. Inducement
may be classified under two heads: monetary and non-monetary. Monetary inducement is generally given to people
who are economically weak or likely to be influenced by
money. This money is given in advance or after receiving the
filled questionnaire. Non-monetary inducement may be in
the form of a reward. It may be a letter of appreciation or
a mention of the name in the report of the study, and so on. The
suitability of the inducement to the study and the
respondents' expectations have to be kept in mind when
deciding which inducement to use.
To take care of poor-response situations, companies
normally get students to help find respondents and
fill in the questionnaires. An excellent way for you to make
money while studying!
Schedule
A schedule is a variation of the questionnaire and can be defined
as a proforma that contains a set of questions which are asked
and filled in by the interviewer himself in a face-to-face situation
with the interviewee. Unlike a questionnaire, the schedule acts
as a guideline to the interviewer trying to get the required
response from the interviewee. A schedule is a standardized
device or tool of observation to collect data in an objective
manner. The same guidelines as mentioned for the questionnaire
are to be kept in mind while making these schedules.


The case study method may be defined as a small, inclusive and
intensive study of a situation in which the investigator uses all
his skills and methods for the systematic gathering of enough
information about a situation to understand the problem and
its solution. The case study is a form of qualitative cum
quantitative analysis involving the very careful and complete
observation of a person, situation or institution.

Table 10.6: Case Study Method

Merits

(i) Intensive and deep study of the problem is possible.
(ii) Study of subjective aspects of the problem is possible and more elaborate than in other methods.
(iii) Comparison of possible problem statements is easier.
(iv) Valid hypotheses can be formulated and tested while the case is in development.
(v) It is very useful when you have to study processes and not isolated incidents.
(vi) It is very useful in situations where more of qualitative rather than quantitative decision making is involved.

Demerits

(i) Several unrealistic assumptions may be made when structuring the case, making it difficult to relax them later on.
(ii) It is expensive in terms of money, time and energy.
(iii) If there is improper understanding between the developer and the respondents, the data and hence the inferences could be false and misleading.
(iv) Prejudices and biases come in more easily as the study is more subjective.
(v) It is not possible to apply sampling methods, and generalization often leads to false conclusions.

Although primary data is required for most of the internal
business situations, many of the strategic decisions depend
upon the information that is external to the organization.
The criticality of the decision and the time factor involved
would decide whether secondary data is to be used or the
situation calls for primary data.
If the situation calls for secondary data, this data would
normally be either published or unpublished. Unpublished
records, although dealing with the matters of public interest,
are not available to people in published form. It means that
everybody cannot have access to these records. Proceedings
of the meetings, notings on the files, private research, etc.,
form the category of unpublished records. Normally these
records are very reliable: since there is no fear of their being
made public, the writers give out their views clearly.
Published records are available to people for investigation,
perusal and further use; survey reports, magazine articles,
published studies, etc., fall under this category. The data
contained in these documents can be considered reliable or
unreliable depending upon the agency that is collecting the
data and the sources it had used for collecting this data. Most
of the information that is now available to people and
researchers in regard to business environment are to be
found in the form of reports. The reports published by
governments are considered more dependable on one hand
and on the other hand some people think that the reports
that are published by certain private individuals and agencies
are more dependable and reliable.
There are so many sources of published data that it is
impossible to name them all here. In spite of so many sources,
the published data usually suffers from the following
drawbacks:
(a) Data about all the aspects of business and economic
activity are not collected.
(b) Even the Government of India does not have up-to-date
data about many socio-economic aspects
as well as the business environment, although it is now
working towards it.
(c) Data lacks homogeneity and continuity.
(d) The data collected by Government agencies is not
beyond doubt. This is due to the approach of the
administration and also to the method of data
collection. The resources put at the disposal of
the machinery entrusted with the task of collecting
data are very meagre.
(e) Data collected by private agencies runs the risk of their
biases coming into the picture; their own aims and
objectives could also make them present the data in an
improper way, rendering it unusable for you.


Therefore, before using secondary data, it is essential
that the investigator should satisfy himself that the data is:
(a) Reliable, (b) Suitable, (c) Adequate and (d) Timely.
Reliability of data can be established by asking yourself the
following questions: Who collected the data and from which
sources? Are the methods used in collecting it standard
and reliable? Are both the compiler and the source
dependable?
The secondary data should also be suitable for the purpose of
the enquiry: the purpose for which the data were
originally collected should be in tune with the purpose for
which you are going to use it. Even if the data is reliable
it should not be used if it is found to be unsuitable for
the enquiry. For checking the suitability of data one should
ask: What was the object of the enquiry? (The definitions of
various items and units of collection must be carefully
scrutinized.) What was the accuracy aimed at? At what
time was the data collected? Can it be regarded as a
normal time? Is the data homogeneous?
The secondary data may be reliable and suitable but the same
may be inadequate for the purpose of investigation. The data
collected earlier may refer to a problem area which could be
narrower or wider than the area required for the present
enquiry and if it is such, the data should be carefully
scrutinized to test whether it meets the requirements or not.
If it does not meet the requirements of the scope or the time
frame of study, do not use the data just because it is the only
data that is there. Although knowledge of the matter under
consideration and proper use of the statistical methods is
presupposed, great care is necessary in dealing with
published statistics because of the limitations or inaccuracies
that may be present.


10.1 Collect data on the financial performance of IOC, HPCL, BPCL and ONGC
from different sources like the
internet, newspapers, magazines, etc. Are there any
differences between them? What inferences do you
draw about the objectives of each type of media
when it presents data?
10.2 Try and collect as much data as possible from different
sources about the health levels of the people residing in
your area. What problems come in while collecting this
data?
10.3 If you have to use the sampling method in question 10.2,
what method would you use and how would you reduce
sampling and non-sampling errors in your
sample?
10.4 Associated Chamber of Commerce and Industry
(ASSOCHAM) is very much concerned about the
employment of youths and their payrolls in small oil
industries, with special reference to ancillary parts
manufacturing, transport for hire, taxis, dealers of new
and old vehicles, petrol stations and automobile repair
garages. The chamber has employed you to collect the
data regarding employment and payroll as on 31st March,
2000 and present it suitably through a diagram so that it
can be included in the final memorandum to be submitted
to the Minister for Industries.
The data that you have collected is as follows:

Industry                               Employment on    Avg. earnings per
                                       31-3-2000        employee per year (Rs)
1. Parts manufacturers                 4,34,856         56,540
2. Transport for hire                  15,26,897        26,348
3. Taxis                               11,32,560        42,685
4. Dealers of new and used vehicles    1,09,805         13,684
5. Retail filling stations             22,25,960        15,008
6. Automobile repair garages           12,35,200        12,048


Present the data using suitable diagram(s) so as to bring
out the finer points.
10.5 Mount Shivalik Distilleries is a progressive
manufacturer of Wasp brand export quality rum. It
follows the modern practices of presentation of data in
various board meetings. The data collected by its Finance
Director over a period of 3 years pertaining to its
operations is shown below. The Finance Director desires
that the data should be presented diagrammatically.
Please help him in presenting the data.

Particulars                        1997-1998    1998-1999    1999-2000
(a) Raw material cost
    per bottle of rum                  9           15           21
(b) Other costs                        6           10           14
(c) Packing and distribution           3            5            7
Sales proceeds per bottle
    (excluding excise)                20           30           40
Profit / (Loss)                        2            0           (2)

Hint: Assume the sales price per bottle of rum equals 100
and express the other figures as percentages, as shown below:

Particulars                        1997-98      1998-99      1999-2000
Raw material cost                   45.0         50.0         52.5
Other costs                         30.0         33.3         35.0
Packaging and distribution          15.0         16.7         17.5
Total cost                          90.0        100.0        105.0
Sales                              100.0        100.0        100.0
Profit/loss                        +10.0          0.0         -5.0
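The conversion suggested in the hint is simple arithmetic: each figure is divided by the sales proceeds of the same year and multiplied by 100. The short Python sketch below is only an illustration of that step; the function name percent_of_sales is our own choice, and it uses the 1998-1999 and 1999-2000 figures for raw material cost, other costs and sales proceeds from the data above.

```python
def percent_of_sales(amount, sales):
    """Express a rupee amount as a percentage of the sales proceeds (1 decimal)."""
    return round(100 * amount / sales, 1)

# Per-bottle figures taken from the exercise data.
sales = {"1998-1999": 30, "1999-2000": 40}
raw_material = {"1998-1999": 15, "1999-2000": 21}
other_costs = {"1998-1999": 10, "1999-2000": 14}

for year in sales:
    rm = percent_of_sales(raw_material[year], sales[year])
    oc = percent_of_sales(other_costs[year], sales[year])
    print(f"{year}: raw material {rm}%, other costs {oc}%")
```

The results agree with the corresponding entries of the hint table (50.0 and 33.3 for 1998-1999, 52.5 and 35.0 for 1999-2000), which is what makes a percentage bar diagram comparable across years.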

10.6 Ansal Builders is engaged in the construction of a
multistory building for setting up a lube factory. It has
recently conducted a cost audit. The manager (cost
accounting) has collected the figures of total cost and
its major constituents. The information collected as
percentage of expenditure is shown below. Represent
the data with the help of a suitable diagram.

Item                       Expenditure %
Wages                      25
Bricks                     15
Cement                     20
Steel                      15
Wood                       10
Supervision and Misc.      15

10.7 Chand Contractors supplies contract labour to various
industrial units for carrying out their various production
activities in and around Bhilai. Mr. R.B. Tripathi is the
chief consultant and is responsible for managing the
continuous supply of contract labour on a weekly basis. The
daily wages of contract labour vary from Rs 25 to 95 per
day depending on the skill, experience and the nature
of work in the industry utilizing the services of contract
labourers. The daily wages and number of workers data
have been compiled by Shri Tripathi for estimating the
number of workers demanded and their average wages.
Draw a suitable diagram of the data to enable the chief
executive of Chand Contractors to understand the
relation between wages and number of workers.
Find out the number of workers getting wages lower
than Rs 57 and more than Rs 77 using ogive graphs.

Daily Wages (Rs)    No. of Workers
20-25               21
25-30               29
30-35               19
35-40               39
40-45               43
45-50               94
50-55               73
55-60               68
60-65               36
65-70               45
70-75               27
75-80               48
80-85               21
85-90               12
90-95

10.8 IBM Computers (I) Ltd. has been entrusted with the
responsibility of developing a relationship between
number of employees and salary structure in Arian
Pharmaceuticals Ltd. The statistics manager, Mr. Ayyar,
has collected the following data. Draw the frequency
distribution and superimpose a frequency polygon and
frequency curve on it.


Salary        No. of Employees
300-400       20
400-500       30
500-600       60
600-700       75
700-800       115
800-900       100
900-1000      60
1000-1200     40


Objectives

After reading this unit you will be able to:

Understand the basic concept of sampling.

Understand various methods of sampling.

If we want to make a study which involves the total
population, there are two methods of doing it. In the first, we talk
to each and every member of the population; in the second, we take
a representation of the whole population and
do the study on it. It is obvious that the second method is
less costly, faster and easier for us to use. The only problem
that can crop up is that the representation is not a true
representation of the whole population, or is not a
representation at all. To guard against this, certain special
statistical techniques are used which help in checking that the
representation actually and truly represents the
population. These techniques are called sampling techniques
and the representation is called the sample.
Sampling can be defined as the selection of some part of an
aggregate or totality on the basis of which a judgment or
inference about the aggregate or totality is made. Thus, only
after studying a part of the whole population, inference is
drawn on the whole population. The whole population (or
the desired group that we want to study from which a sample
is drawn) is called the universe. Here onwards we will only
use the term population.
Population could be finite or infinite depending upon the
number of elements in it. For example, the number of books
that one publisher sells is finite and can be known, but the
same cannot be said about the number of people who
have gone through these books, so that is treated as an infinite
population. Population can also be divided into real and
hypothetical. Real population refers to hard facts, which in
the above case are the number of books published. A hypothetical
or imaginary population is, for example, the number and types
of emotions that you displayed in the last one hour! These can
only be projected or imagined, but you can never be
sure.


Characteristics and elementary units are the other two terms
that you need to know about here. Characteristics refer to
the attributes (non-quantified qualities) which are the objects
of the study. Elementary units refer to those units which
possess the characteristics of the population. The total of
such elementary units is called the population.
A sampling design is a definite plan for obtaining a sample
from the sampling frame. It refers to the technique or
procedure of selecting some sampling units from which
inferences about the population are drawn.
Sampling Errors: Since in a sample survey only a small part
of the universe is studied, there is every possibility
that its result will differ to some extent from that of the
universe. Even if two or three samples of the same universe
are taken, their results will differ from each other. These
differences constitute the errors due to sampling and are
known as sampling errors. Errors due to calculations or
improper recording of observations are called non-sampling
errors.
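The idea of sampling error can be seen in a small simulation. The sketch below is an illustration only: the universe of 10,000 values, its mean and spread, and the sample size of 100 are all assumed figures, not data from this unit. It draws three random samples from the same universe and prints how far each sample mean lies from the universe mean.

```python
import random
import statistics

random.seed(42)  # fixed seed so repeated runs draw the same samples

# Assumed universe: 10,000 values (say, monthly household spending in Rs thousand).
universe = [random.gauss(50, 12) for _ in range(10_000)]
universe_mean = statistics.mean(universe)

def sampling_error(sample_size):
    """Return (sample mean - universe mean) for one simple random sample."""
    sample = random.sample(universe, sample_size)
    return statistics.mean(sample) - universe_mean

# Three samples of the same size from the same universe.
errors = [sampling_error(100) for _ in range(3)]
for number, error in enumerate(errors, start=1):
    print(f"sample {number}: sampling error = {error:+.2f}")
```

Each sample gives a slightly different error even though nothing was miscalculated or misrecorded; that difference is due to sampling alone.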
Sampling Distribution: If we take a certain number of
samples and for each sample compute various statistical
measures such as mean, standard deviation, etc., then we
find that each sample may give its own value for the statistic
under consideration. All such values of a particular statistic,
say the mean, together with their relative frequencies,
constitute the sampling distribution.
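A sampling distribution can be built up by brute force. In the sketch below the universe (5,000 scores between 1 and 6), the sample size of 10 and the 2,000 repetitions are assumptions chosen for illustration. The function returns each observed value of the sample mean together with its relative frequency.

```python
import random
from collections import Counter

random.seed(7)  # fixed seed so the run is repeatable

# Assumed universe: 5,000 scores, each between 1 and 6.
universe = [random.randint(1, 6) for _ in range(5_000)]

def sampling_distribution_of_mean(population, sample_size, n_samples):
    """Map each observed sample mean to its relative frequency."""
    means = []
    for _ in range(n_samples):
        sample = random.sample(population, sample_size)
        means.append(round(sum(sample) / sample_size, 1))
    counts = Counter(means)
    return {value: count / n_samples for value, count in sorted(counts.items())}

distribution = sampling_distribution_of_mean(universe, sample_size=10, n_samples=2_000)
for value, relative_frequency in distribution.items():
    print(f"mean = {value:.1f}  relative frequency = {relative_frequency:.3f}")
```

The relative frequencies add up to 1, and the means cluster around the universe mean, with values far from it occurring rarely.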
The confidence level or reliability of the sample is found from
the sampling distribution. It is the expected percentage of
times that the actual value will fall within the stated
precision limits. Thus, if we take a confidence level of 95%, we
mean that there are 95 chances in 100 that the sample results
represent the true condition of the universe. The significance
level has the opposite interpretation to that of the confidence
level and indicates the likelihood that the answer will fall
outside the range. You should always remember that if the
confidence level is 95% then the significance level will be
5%, which essentially means that there is a five per cent
chance that the sample will not represent the true condition
of the universe. The sample you have selected must follow
the Law of Statistical Regularity, described below.
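The meaning of a 95% confidence level can be checked by repeated sampling. The sketch below rests on assumptions that are not part of this unit: the universe is simulated, and the precision limits are taken as the sample mean plus or minus 1.96 standard errors, a common normal-approximation rule. It counts how often those limits contain the true mean of the universe.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Assumed universe of 20,000 values with a known true mean.
universe = [random.gauss(100, 15) for _ in range(20_000)]
true_mean = statistics.mean(universe)

def limits_contain_true_mean(sample_size):
    """Draw one sample and test whether mean +/- 1.96 standard errors covers the true mean."""
    sample = random.sample(universe, sample_size)
    centre = statistics.mean(sample)
    margin = 1.96 * statistics.stdev(sample) / math.sqrt(sample_size)
    return centre - margin <= true_mean <= centre + margin

trials = 1_000
hits = sum(limits_contain_true_mean(100) for _ in range(trials))
coverage = hits / trials
print(f"samples whose limits contain the true mean: {coverage:.1%}")
```

The share of samples that succeed should come out close to 95%, and the remaining share corresponds to the 5% significance level.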


The Law of Statistical Regularity states that, if a moderately large number of items
are selected at random from a given universe, the
characteristics of those items will reflect, to a fairly accurate
degree, the characteristics of the entire universe. For
example, if 500 leaves are picked from a tree at random and
the average length is found out, the result will be nearly the
same if all the leaves of the tree are picked up and measured.
The reliability in the Law of Statistical Regularity depends
on two factors:
(i) The larger the sample, the more reliable are its
indications for the population. The reliability of a sample
is proportional to the square root of the number of items
it contains, and the larger the sample, the more
representative and stable it will be.
(ii) The sample must be chosen at random.
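Both factors can be illustrated with the leaf example. In the sketch below the tree of 20,000 leaves and their lengths are simulated, so all figures are assumptions. It first compares the average of 500 randomly picked leaves with the average of all leaves, and then measures how much sample averages vary for samples of 100 and of 400 leaves. If reliability grows with the square root of the sample size, a sample four times as large should vary about half as much.

```python
import random
import statistics

random.seed(3)  # fixed seed so the run is repeatable

# Assumed tree: 20,000 leaves with lengths in cm.
leaves = [random.gauss(8.0, 1.5) for _ in range(20_000)]
full_average = statistics.mean(leaves)
sample_average = statistics.mean(random.sample(leaves, 500))
print(f"all leaves: {full_average:.2f} cm, 500 random leaves: {sample_average:.2f} cm")

def spread_of_sample_averages(sample_size, repeats=300):
    """Standard deviation of the averages of many random samples of one size."""
    averages = [
        statistics.mean(random.sample(leaves, sample_size)) for _ in range(repeats)
    ]
    return statistics.stdev(averages)

spread_100 = spread_of_sample_averages(100)
spread_400 = spread_of_sample_averages(400)
print(f"spread with 100 leaves: {spread_100:.3f}")
print(f"spread with 400 leaves: {spread_400:.3f}")
print(f"ratio: {spread_100 / spread_400:.2f}")
```

The ratio printed at the end should be near 2, which is the square root of 4.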


With the use of this law we can say that a part of the population
can represent the population. When census is not possible
due to paucity of time, money and/or labour, then with the
help of this law and random sampling, investigations can be
made about the properties of the population. This is possible
because the selection is made at random and by this law all
types of units, whether good, bad or average, have equal
chance of being selected.
However, there are certain precautions which we should keep
in mind. The selection of units should be unbiased, i.e.,
random. The two characteristics of randomization. One can

210
Notes
__________________
__________________
__________________
__________________
__________________
__________________
__________________
__________________
__________________
__________________

not fit any relationship between occurred values. Secondly,


the probability of occurrence of every item should be the same.
The inferences drawn from this are applicable on an all units
of the population so the sample should be identical to
universe. By collecting information from smaller units, we
cannot apply the results drawn from it to the whole universe.
The Law of Inertia of Large Numbers is a corollary of the Law
of Statistical Regularity. It lays down that in large masses of
data, abnormalities will occur, but in all probability,
exceptional items will offset each other, leaving the average
unchanged, except where the element of time enters
as a general trend in the data. The Law of Inertia of Large
Numbers asserts that large aggregates are relatively more
stable than small ones. The movements of an aggregate are
the result of the movements of its separate parts and it is
improbable that the latter will all be moving in the same direction
at the same time. Consequently, their movements will tend
to compensate one another and the larger the numbers
involved, the more complete will this compensation be. Thus,
the law states that the larger the number of items we take
out of a given universe, the greater is the probability of
accuracy.
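The stabilizing effect of large numbers can be seen in a small simulation. The sketch below is illustrative only: the universe is a hypothetical stream of uniform random values, the seed is fixed so that the run is repeatable, and the function name spread_of_averages is an assumption made for this example.

```python
import random
import statistics

def spread_of_averages(sample_size: int, repeats: int, rng: random.Random) -> float:
    """Standard deviation of the averages of `repeats` random samples."""
    averages = []
    for _ in range(repeats):
        sample = [rng.random() for _ in range(sample_size)]
        averages.append(statistics.fmean(sample))
    return statistics.pstdev(averages)

rng = random.Random(42)  # fixed seed so the run is repeatable
small = spread_of_averages(10, 200, rng)
large = spread_of_averages(1000, 200, rng)
print(f"spread of averages with 10 items:   {small:.4f}")
print(f"spread of averages with 1000 items: {large:.4f}")
```

The averages of the large samples vary far less from one sample to the next, because the exceptional items inside each large sample tend to offset one another.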

The theory of sampling is a study of the relationships existing
between a universe and a sample drawn from the universe. It
is applicable only to random sampling.
The theory of sampling is concerned with estimating the
properties of the universe from those of the sample and also with
gauging the precision of the estimate. Sampling theory deals
with the following aspects:
(a) Statistical Estimation: Sampling theory helps in
estimating unknown population parameters from the
knowledge of statistical measures based on sample
studies. The estimate can be a point estimate or it may
be an interval estimate. A point estimate is a single
estimate expressed in the form of a single figure, and
an interval estimate has two limits, viz., the upper limit and
the lower limit, within which the parameter value may lie.
For example, we can say that the number of defective
parts in 100 pieces is 10, based on a sample of 10 in
which 1 defective part was found (a point estimate), or we
can say that the number of defective parts could be between
8 and 12, based on many samples (an interval estimate).

(b) Testing of Hypothesis: The second objective of
sampling theory is to accept or reject a hypothesis. It
helps in determining whether observed differences are
actually due to chance or whether they are really
significant.

(c) Statistical Inferences: Sampling theory helps in
making generalizations about the universe from
studies based on a sample drawn from it. It also helps in
determining the accuracy of such generalizations.
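The difference between a point estimate and an interval estimate can be sketched in a few lines. The example below uses the normal approximation for a proportion, which is an assumption added here and is only reasonable for fairly large samples; the sample counts are hypothetical and the function name proportion_estimates is illustrative.

```python
import math
from statistics import NormalDist

def proportion_estimates(defective: int, sample_size: int, confidence: float = 0.95):
    """Return (point estimate, lower limit, upper limit) for a proportion.

    Uses the normal approximation, an assumption made for this sketch.
    """
    p = defective / sample_size
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    return p, max(0.0, p - margin), min(1.0, p + margin)

# Hypothetical sample: 100 defective parts found among 1000 inspected.
point, lower, upper = proportion_estimates(100, 1000)
print(f"point estimate: {point * 100:.0f} defective per 100")
print(f"interval estimate: {lower * 100:.1f} to {upper * 100:.1f} per 100")
```

With these hypothetical figures the point estimate is 10 defective parts per 100, while the 95% interval estimate runs from roughly 8 to roughly 12 per 100.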

In quantitative research, the sampling technique is used
extensively, and in no field of research can its importance
and value be belittled. In research in the educational,
economic, commercial and scientific domains, the sampling
technique is used and considered most apt.
The sampling technique also has a very high value in day to day
activities. In making our daily purchases of foodstuff,
vegetables, fruits, etc., it is not considered necessary to
examine each and every piece of the commodity. Only a
handful of goods are examined and an idea about the whole
lot is formed, and this usually proves justified. For example,
physicians make inferences about a patient's blood through
examination of a single drop.
Sampling technique has the following features, which
highlight its importance:
(a) Economy: The sampling technique is less expensive and
less time consuming than the census technique.
(b) Reliability: If the choice of sample units is made
with due care and the matter under survey is not
heterogeneous, the conclusion of the sample survey can
have almost the same reliability as those of census
survey.

(c) Detailed study: Since the number of sample units is
fairly small, they can be studied intensively and elaborately,
and examined from multiple points of view.
(d) Scientific base: This is a scientific technique because
the conclusions are verifiable from other units. By taking
random samples we can determine the amount of
deviation from the norm.
(e) Greater suitability in most situations: Most
surveys are made by the technique of sample survey,
because if the matter is homogeneous, the examination of
a few units suffices. This is the case in the majority of
situations.
The question that arises is whether we can use sampling
techniques in any situation. No, sampling is useful only when:
(i) Data is vast: When the number of units is very large,
the sampling technique must be used as it economizes money,
time and effort.
(ii) When utmost accuracy is not required: The sampling
technique is very suitable in those situations where hundred
per cent accuracy is not required; otherwise the census
technique is unavoidable, because 100% accuracy is
achievable only by this means.
(iii) Where census is not feasible: If we want to know the
mineral wealth in the country we cannot dig all the
mines to discover and count it; we have to use the sampling
technique.
(iv) Homogeneity: If all the units of a domain are alike,
the sampling technique is easier to use and is much more
accurate.
The point to remember is that if due care is not taken in the
selection of samples or if they are arbitrarily selected, the
conclusions derived from them about the universe will be
misleading, if not totally wrong. For example, in assessing
the monthly expenditure of university students, if we select
for our sample only the students who come in cars, our results
will be highly erroneous if extended to all students.

You should remember that the sampling technique can be
successful only if a competent and able investigator makes
the selection. If the sampling is done by an average
investigator, the selection may be prone to error.

Sampling methods can be grouped under two broad
categories:
(i) Probability sampling methods or random sampling
methods.
(ii) Non-probability sampling methods or non-random
sampling methods.

Probability sampling methods are those in which every item
in the universe has a known chance or probability of being
included in the sample. This implies that the selection of
items for the sample is independent of the person making the
study and the items will be chosen strictly at random.

Probability sampling can be divided into four types. We will
take a look at each one of them, one by one.
(i) Simple random sampling.
(ii) Stratified sampling.
(iii) Systematic sampling.
(iv) Cluster sampling.
(i) Simple Random Sampling: A procedure of sampling
is called simple random sampling where the individual
items (units) constituting the sample are selected at
random. Random sampling is the form applied when the
method of selection assures each individual element or
unit in the universe an equal chance of being
chosen. In other words, if for a sample of size n all the
possible combinations of n items have the same
probability of being included, it is called simple random
sampling. It can be performed with replacement of the
element taken out or without replacement of the element
taken out.
Selecting a Random Sample

A random sample can generally be selected by the following
four methods:
(a) Lottery method.
(b) Tippett's numbers method.
(c) Selection from sequential list.
(d) Grid system.
A brief description of the above methods is given below:
(a) Lottery method: In this method, a unit is drawn by
writing the numbers or the names of various units
and putting them in a container. They are thoroughly
mixed and certain numbers are picked up from the
container, and those picked up are taken up for
sampling.
(b) Tippett's numbers method: It is called Tippett's
numbers method because it was evolved by L.H.C.
Tippett, who constructed a list of 10,400 four digit
numbers written at random on the pages of a table. From
those numbers it is not very difficult to draw samples
at random. For example, if 50 persons are to be
selected for study out of a total number of 500, then
we can open any page of Tippett's numbers and select
the first 50 numbers that do not exceed 500 and take them
up for study.
On the basis of the experiments carried out through
this technique, it has been found that the results that
are drawn on the basis of this method of random
sampling are quite reliable.
(c) Selection from sequential list: In this method, the
names are arranged serially according to a particular
order. The order may be alphabetical, geographical
or only serial. Then out of the list any number may
be taken up. The selection may begin
from anywhere.
For example, if we want to select 10 persons, we can
start right from the 10th, and select the 10th, 20th, 30th,
40th and so on.
(d) Grid System: This method is generally used for
selecting a sample of an area. In this method,
a map of the entire area is drawn. After that a screen
marked with squares is placed upon the map and some
of the squares are selected at random. The areas falling within
the selected squares are taken as samples.
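The lottery method and the random-number method can be sketched as follows. This is an illustration under stated assumptions: a seeded pseudo-random generator stands in both for the container of slips and for the printed page of four digit numbers (it is not Tippett's actual table), and the function names lottery_sample and random_number_sample are hypothetical.

```python
import random

def lottery_sample(units, n, rng, replacement=False):
    """Draw n units as in the lottery method.

    Without replacement a unit can be drawn only once; with replacement
    the drawn slip goes back into the container and may be drawn again.
    """
    if replacement:
        return rng.choices(units, k=n)
    return rng.sample(units, n)

def random_number_sample(population_size, n, rng):
    """Mimic reading a page of four digit random numbers.

    Numbers are read in order; those from 1 to population_size that have
    not been used before are kept until n units are selected.
    """
    chosen = []
    while len(chosen) < n:
        number = rng.randint(0, 9999)
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
    return chosen

rng = random.Random(7)                  # fixed seed for a repeatable run
persons = list(range(1, 501))           # 500 persons numbered 1 to 500
print(lottery_sample(persons, 5, rng))
print(lottery_sample(persons, 5, rng, replacement=True))
print(random_number_sample(500, 50, rng)[:10])
```

Only the draw with replacement can return the same person twice; the other two procedures always return distinct persons.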
(ii) Stratified Sampling Method
In this method, the entire population is divided into a
number of groups called strata. Then a number of items
are taken from each group at random. This means that
a stratified sample is equivalent to a set of random
samples on a number of sub-populations. It can be
performed with replacement of the taken out element
or without replacement of the taken out element.
As you would have guessed by now, in this method much
depends on the process of stratification. For taking the
right strata the following precautions need to be taken:
(a) Each stratum in the population should be large
enough in size so that selection of items may be done
on random basis.
(b) There should be a perfect homogeneity in different
units of any one stratum.
(c) Stratification should be well defined and clear cut.
By this we mean that each unit or stratum should be
free from influence of the other.
Types of Stratified Sampling
Stratified sampling is of three kinds:
(i) Proportionate stratified sampling: In this method the
number of units drawn from each stratum is in exact
proportion to the share of that stratum in the population.


(ii) Disproportionate stratified sampling: In this type of
stratified sampling an equal number of cases are
taken from each stratum without any consideration
of the size of the stratum in proportion to the population. It is
also called controlled sampling, as a quota is decided
for each stratum.
(iii) Stratified weight sampling: In this method, an equal
number of units are selected from each stratum and
averages are drawn from each stratum, but in doing
so they are given weight in proportion to the size of
stratum in relation to the whole population.
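Proportionate allocation and the weighting used in stratified weight sampling can be sketched as below. The strata, their sizes and the stratum averages are hypothetical figures chosen for illustration, and the function names are assumptions made for this example.

```python
def proportionate_allocation(strata_sizes: dict, sample_size: int) -> dict:
    """Units to draw from each stratum, in proportion to stratum size.

    Rounding can make the total differ slightly from sample_size for
    awkward proportions; the example below divides evenly.
    """
    total = sum(strata_sizes.values())
    return {name: round(sample_size * size / total)
            for name, size in strata_sizes.items()}

def weighted_mean(strata_sizes: dict, stratum_means: dict) -> float:
    """Combine stratum averages, weighting each by its share of the population."""
    total = sum(strata_sizes.values())
    return sum(strata_sizes[name] * stratum_means[name]
               for name in strata_sizes) / total

# Hypothetical universe of 1000 employees in three strata.
sizes = {"workers": 600, "supervisors": 300, "managers": 100}
print(proportionate_allocation(sizes, 50))

# Hypothetical averages obtained from an equal-sized sample in each stratum.
means = {"workers": 20.0, "supervisors": 30.0, "managers": 50.0}
print(weighted_mean(sizes, means))
```

A plain average of the three stratum means would treat the 100 managers as equal in importance to the 600 workers; weighting by stratum size corrects for the equal sample sizes.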
(iii) Systematic Sampling: Systematic sampling is a
variation of simple random sampling. It requires that
the universe or a list of its units may be ordered in such
a way that each element of the universe can be uniquely
identified by its order. A voters' list, a telephone
directory or a card index system would all generally satisfy
this condition. Suppose there are 5000 cases (and hence
5000 units of the population) and we want a sample of
500. We can select a number between (and including) 1
and 10 at random, say 8. Then we can select the units
whose cases are in the following positions: 8, 18, 28, 38, ...,
1008, ..., 4998. This would be a systematic random
sample, commonly known as systematic sampling.
Systematic selection implies that the sample units are
picked out in a definite sequence, at equal intervals from
one another. Reduction or increase in the variability of
estimates yielded by systematic sampling depends on
the way population is arranged. If the population is
thoroughly mixed with respect to the characteristics
under study, the variability of the estimates will be
affected.
In practice, it is essential to use systematic sampling
only when we are sufficiently acquainted with the data
to be able to demonstrate that periodicities do not exist,
or that the interval between the elements of the sample
is not multiple or submultiple of the period.
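The selection rule of systematic sampling is short enough to write out. The sketch below assumes units numbered from 1, a fixed sampling interval and a random start within the first interval; the function name systematic_sample is illustrative.

```python
import random

def systematic_sample(population_size: int, interval: int, start: int) -> list:
    """Positions start, start + interval, ... up to population_size."""
    if not 1 <= start <= interval:
        raise ValueError("start must lie between 1 and the interval")
    return list(range(start, population_size + 1, interval))

interval = 10
start = random.Random(0).randint(1, interval)   # random start in 1..10
print(systematic_sample(5000, interval, start)[:5])

# With a random start of 8 and an interval of 10 over 5000 units:
positions = systematic_sample(5000, 10, 8)
print(positions[:4], positions[-1], len(positions))
```

An interval of 10 over 5000 units yields 500 selected units; a sample of only 50 from the same list would need an interval of 100.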

(iv) Cluster Sampling: Cluster sampling is also called
multistage sampling or sub-sampling as it uses various
stages to reach or make samples. This method is
generally used in selecting a sample from a very large
population. The original units into which the population
is divided are known as primary units. Each primary
unit that falls into the sample is subdivided into
secondary units in preparation for the second stage of
sampling. In three stage sampling, there will be primary,
secondary and tertiary units. Sometimes four stages are
also used.
Let us take an example to understand the procedure as it is
slightly tedious. Let us say that we want to take sample from
the universe of professors/lecturers associated with Delhi
University. The list consists of 100 pages with approximately
25 names per page. These pages are numbered and constitute
the sampling units. All names are numbered and arranged
alphabetically to constitute the ultimate sampling units. Let
us suppose we want a sample of 100 professors/lecturers.
The sample may be selected as follows: we may decide to
select 5 professors/lecturers each from 20 pages. Select a
number from 1 to 5 at random, say 3. Select pages 3, 8, 13, 18,
... and so on to 98. Then by the use of random numbers select
5 names from each of 20 pages. This is a combination of
systematic and simple random sampling.
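The two-stage procedure in this example can be sketched as follows. It is a minimal illustration assuming exactly 25 names on every page (the text says approximately 25); names are represented by (page, position) pairs and the function name two_stage_sample is hypothetical.

```python
import random

def two_stage_sample(pages, names_per_page, page_interval, per_page, rng):
    """Pick pages systematically, then names on each page at random."""
    first_page = rng.randint(1, page_interval)            # random start
    chosen_pages = range(first_page, pages + 1, page_interval)
    sample = []
    for page in chosen_pages:
        # Second stage: simple random sampling without replacement.
        positions = rng.sample(range(1, names_per_page + 1), per_page)
        sample.extend((page, position) for position in sorted(positions))
    return sample

rng = random.Random(3)                   # fixed seed for a repeatable run
sample = two_stage_sample(pages=100, names_per_page=25,
                          page_interval=5, per_page=5, rng=rng)
print(len(sample))
print(sorted({page for page, _ in sample}))
```

Whatever the random start, 20 pages are selected and 5 names are taken from each, giving the required sample of 100.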
The point to remember is that you should use a form of
random sampling in each of the sampling stages where there
are two or more than two stages.
The variability of estimates yielded by multistage sampling
may be greater than that of estimates yielded by simple
random sampling for equal size. The variability of estimates
in multistage sampling depends on the composition of
primary units. There are three reasons for hesitating to
recommend this method.
(i) The cost of the travel would be too high.
(ii) Control of the non-sampling errors would be difficult;
and
(iii) A probability sample of small units drawn at one stage
requires a form which lists all the small units and such
a plan becomes costly by comparison.

Sampling methods which do not provide every element in
the universe a known chance of being included in the sample
are collectively known as non-probability sampling methods.
Here the selection process is partially subjective and does
not use randomization. In other words, we use judgements
based on convenience and other considerations rather than
probability considerations.
Non-probability sampling methods can be divided into
basically three groups:
(1) Judgement or purposive sampling,
(2) Convenience sampling, and
(3) Quota sampling.

In judgement or purposive sampling the investigator has
complete freedom in
choosing his sample according to his wishes and desires.
Although he will try his best to get a sample which is
representative of the population, his judgement plays the major
part in determining which is the best sample and no other
considerations are used.
When only a small number of sampling units are in the
population, simple random selection may miss the more
important elements, where judgement selection would
certainly include them in a sample. For example, when we
want to know the effectiveness of HR policies of the company
and we randomly choose the sample of 5 from a company of
50 people, it is possible that we only get marketing people
and not a representation of other functions in the sample.
This is bound to give us improper ideas about the
effectiveness of HR policies if the marketing people do not
like the HR people in the organization for whatever reasons.
This personal selection can also become a disadvantage when
not used properly. Another disadvantage that is associated
with this method is that there is no objective way of evaluating
the reliability of sample results.
Still, when we want to study some unknown traits of a
population, some of whose characteristics are known, we may
stratify the population according to these known
properties and select sampling units from each stratum on
the basis of judgement. This method will then result in a
more representative sample.

As the name suggests, in convenience sampling the sample of the
population being investigated is selected neither by
probability nor by judgement but by convenience of reach. A
sample obtained from readily available lists such as
automobile registrations, telephone directories, etc., is a
convenience sample and not a random sample even if the
sample is drawn at random from these lists. For example, if
you do a survey on the internet about the social issues, your
sample is a convenience sample and not representative of
all the people to whom these issues concern. Therefore, the
results obtained by convenience sampling methods are
generally biased and unsatisfactory.
So convenience sampling is normally suitable for doing pilot
studies and in cases where the population is not well defined
or sample units cannot be clearly defined or when the
complete data about the population is not available.

Quota sampling is a special form of stratified sampling. In
this method, first the population is classified into
various strata in terms of properties known or
assumed to be pertinent to the characteristics being studied.
Then the proportion of the population falling into each stratum
is defined on the basis of the known or estimated composition
of the population. After that the quotas for each
interviewer or investigator are determined so that the total
sample interviewed contains the right proportion of each stratum
and all investigators study all the strata, thereby doing
a complete study of the population in a mini form.
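How quotas are filled in the field can be sketched as below. This is an illustration only: the respondents, strata and quota sizes are hypothetical, and the function name fill_quotas is an assumption made for this example.

```python
def fill_quotas(respondents, quotas):
    """Accept respondents in the order met until every quota is full.

    respondents: iterable of (name, stratum) pairs in interview order.
    quotas: dict mapping stratum -> number of interviews wanted.
    Selection within a stratum is NOT random, which is what separates
    quota sampling from stratified random sampling.
    """
    remaining = dict(quotas)
    accepted = []
    for name, stratum in respondents:
        if remaining.get(stratum, 0) > 0:
            accepted.append((name, stratum))
            remaining[stratum] -= 1
        if not any(remaining.values()):
            break
    return accepted

# Hypothetical street interviews with a quota of 2 men and 2 women.
people = [("A", "men"), ("B", "men"), ("C", "men"),
          ("D", "women"), ("E", "men"), ("F", "women")]
print(fill_quotas(people, {"men": 2, "women": 2}))
```

Respondents C and E are turned away because the quota for men is already full, so who ends up in the sample depends on the order in which the interviewer happens to meet people.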

The advantage of using this method is that items which are
close to each other are clubbed together, thereby saving costs
and introducing some stratification effect. The disadvantage
is that the bias of the investigator is introduced in the classification
of subjects and in the non-random selection within the various strata.
Another disadvantage is that since random sampling is not
involved at any stage, the errors of the method cannot be
estimated by statistical procedures.

This method is mostly used in marketing surveys and election
polls and is fairly successful there.

Sometimes it is economical and organizationally convenient
to collect certain items of information from all the
units of a sample and other items of information from some
of these units only, these latter units being selected so as to
form a sub-sample of the original sample. This may be termed
two phase sampling or double sampling. If necessary,
another phase may be added.
Multiphase or sequential sampling is of great use when the
desired accuracy of different items is widely different, either
owing to the fact that the variability of the associated variates
is different or because the desired accuracy is different.
The major advantage is that it is possible to select one sample
from a universe, analyze it and use the inferences in designing
a second sample from the same universe. But this method
can only be used where a small sample can represent the
universe very well and where the number of observations
can be increased easily at any stage of enquiry.
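Two phase sampling amounts to drawing a sub-sample from a sample already drawn. The sketch below is illustrative: the universe of 1000 numbered households, the phase sizes and the function name two_phase_sample are all assumptions made for this example.

```python
import random

def two_phase_sample(units, first_phase_size, second_phase_size, rng):
    """Return (first-phase sample, sub-sample drawn from it)."""
    first = rng.sample(units, first_phase_size)
    second = rng.sample(first, second_phase_size)
    return first, second

rng = random.Random(11)                  # fixed seed for a repeatable run
households = list(range(1, 1001))        # hypothetical universe
first, second = two_phase_sample(households, 200, 40, rng)

# Basic items would be collected from all 200 households,
# costlier items only from the 40 in the sub-sample.
print(len(first), len(second), set(second) <= set(first))
```

Every unit in the second phase also belongs to the first phase, so the information collected from all units can be used to interpret the more detailed information collected from the sub-sample.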

In order to be useful, the sample studied has to be representative in
character. If it does not possess all the characteristics of the
universe it will not be representative enough and thus
will not be able to fulfill the objective of the study. In order
to keep away from
biased samples, the investigator has to take the following precautions:

If the universe is subject to change, an enquiry carried
out on a single occasion, howsoever accurate, cannot
by itself give any information on the nature or the rate
of such change. In such cases, provision must be made
for studying the samples at successive intervals if up to
date information is desired. This successive study also
gives some idea about the nature and rate of change.

The size of the sample should not be too small as
compared to the universe. The size of the sample should
be large enough so that its representative character may
not be lost and selection on a random basis may be possible.
Apart from this, the sampling should not be done
purposively. In such an event, the sample generally gets
biased.

If the sampling is done through the stratified method, it
should be governed by the principle of perfect
stratification. Elements of unsuitability, overlapping or
lack of proportion have no place in sampling. When these
elements are there, the samples become biased.

Lack of a source list or an incomplete source list makes the
sample biased.

If the cases which were originally selected for the study
are lost or not available for enquiry, they are replaced
by new ones. In such a situation, there is a danger of
bias influencing the selection of samples.

When field workers are given the liberty to select
samples according to their wishes and no specific
guidelines are given to them, they are likely to select
samples according to their convenience. In that event
prejudices and bias are likely to influence the sample.

If the method of drawing samples is inadequate or not
suitable to the project, the samples drawn may be biased.
Sometimes the nature of the phenomenon makes the
selection of representative samples extremely difficult.
This generally happens in the case of complex, heterogeneous
and widespread cases.

The investigator has to safeguard against bias and
try to find perfectly representative samples. In this,
the investigator and his skills play a vital role.

If the investigator is well equipped with knowledge of
the universe, knows the importance and the nature of the study
and makes efforts to collect a representative sample, he
will be successful in selecting representative samples and
in taking precautions to remove bias. Pre-testing is very
helpful in determining whether a particular sample is truly
representative or not.

That the sample should be reliable and free from biases goes
without saying, but how is that to be tested? The size of
the sample, its relevance and suitability to the problem, its
representation of the universe, etc., are some of the factors
that determine the reliability of the sample. Reliability may
be tested on the following parameters:

Size of the sample: The size of the sample for study
very much determines not only its representativeness
but also its utility for study. The investigator must test
that the size is adequate for scientific and convenient
study of the problem.

By testing the representativeness of the sample: The
representativeness of the sample should also be tested.
It means that the sample selected should be
representative and possess the characteristics of the other
units.

By drawing a parallel sample: It means that apart
from the samples that have been drawn, another sample
may be drawn from the same universe for testing. On
the basis of these tests, the reliability of the sample
primarily selected may be tested. The comparison of
the two sample values gives you a better understanding of
the sample.

By testing the homogeneity of the samples: Samples
should be homogeneous. They should possess all the
characteristics that are present in the population.

Through comparison of the measurements of the
sample with those of the population: Sometimes,
different measurements about the universe are also
known. In order to test the reliability of the sample,
the investigator may compare the sample measurements
with these known values.

Unbiased selection: The selection of the sample should be
done through a method that is free from bias and
prejudices.

By taking a sub-sample from a main sample: This is
a method of sampling within sampling. Out of the
universe, we draw a sample, but in order to test the
reliability of the sample, we draw a sub-sample, from
main sample and study it intensively and compare the
findings of the study of the sub-sample with the findings
of the study of the main sample. This helps the
investigator to detect any error that might have crept in.
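Two of these checks, the parallel sample and the sub-sample, can be sketched by comparing sample means. The universe below is hypothetical (monthly expenditure figures generated from a normal distribution purely for illustration), and the function name compare_samples is an assumption made for this example.

```python
import random
import statistics

def compare_samples(universe, sample_size, sub_size, rng):
    """Means of a main sample, a parallel sample and a sub-sample."""
    main = rng.sample(universe, sample_size)
    parallel = rng.sample(universe, sample_size)   # second, independent draw
    sub = rng.sample(main, sub_size)               # sample within the sample
    return (statistics.fmean(main),
            statistics.fmean(parallel),
            statistics.fmean(sub))

rng = random.Random(5)                   # fixed seed for a repeatable run
# Hypothetical universe: monthly expenditure of 2000 students.
universe = [rng.gauss(5000, 800) for _ in range(2000)]
main_mean, parallel_mean, sub_mean = compare_samples(universe, 200, 50, rng)
print(round(main_mean), round(parallel_mean), round(sub_mean))
```

If the three means lie close together, the main sample is behaving consistently; a large gap would suggest that some error has crept into the selection or the measurement.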

Errors in any statistical investigation, i.e., in collection,
processing and analysis of the data, may be broadly classified
as: (i) Sampling errors and (ii) Non-sampling errors.

In a sample survey, only a small part of the universe is
studied. As such there is every possibility that its results
will differ to some extent from those of the universe. Even
if two or more samples of the same universe are taken, the
results will differ to some extent from those of the universe,
as well as from each other. The difference will always be
present even if the sample is drawn at random. These
differences constitute the errors due to sampling and are
known as sampling errors. This is the error which is the
result of the sample or the sampling procedure, and it always exists
in some quantity.
Sampling errors creep in because of the following reasons:
(i) Faulty selection of the sample: Purposive selection
of a sample results in bias, as in this case the
investigator deliberately selects a sample he considers
representative. If, however, the selection of the sample is
haphazard, the chances of bias errors are great.
(ii) Incomplete investigation (or non-response): If all the
items to be included in the sample are not covered there
will be bias. This occurs frequently in case of sample
collected through questionnaire method. All the
questions in the questionnaires are not responded
properly. Again if the selected person is not interviewed
at any time to collect information, bias may arise.
(iii) Faulty collection of data: During the process of
collection of data certain errors and mistakes may creep
in due to the following reasons:
(a) Negligence or prejudice of enumerator in putting
questions or recording answers.
(b) Lack of knowledge on the part of person furnishing
information.
(c) Poorly designed questionnaire.
(d) Unorganized method of collection of data.
(iv) Substitution: Due to non-availability of the selected
person (or item) the investigator may interview another
person instead. The second person may
not have the same characteristics as the original one.
This will introduce substitution bias in the sample
and as such distort the result.
(v) Faulty Analysis: A faulty method of analysis of data may
also introduce sampling error.
Enough has been written about the biases of the investigator
or the respondents. There could also be unbiased errors that
creep in due to accident or by natural course of events without
any bias by enumerator or informant. They occur due to
the chance inclusion or exclusion of a member of the universe
in the sample selection. Further, this type of error
occurs when only a partial observation of universe is made
and is equal to the difference between sample statistic and
parameter of universe.

If the over-estimated and under-estimated values of
observations are nearly equal, then errors in one direction
will compensate the errors in the other direction. Therefore, the
unbiased errors are usually known as compensatory errors
as they tend to offset each other and leave little effect on the
general results.
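The idea that a sampling error is the difference between a sample statistic and the universe parameter, and that unbiased errors tend to compensate one another, can be sketched with a simulation. The universe of 5000 uniform random values is hypothetical, and the function name sampling_error is an assumption made for this example.

```python
import random
import statistics

rng = random.Random(21)                  # fixed seed for a repeatable run
# Hypothetical universe of 5000 values; its mean is the parameter.
universe = [rng.uniform(0, 100) for _ in range(5000)]
parameter = statistics.fmean(universe)

def sampling_error(sample_size: int) -> float:
    """Sample mean minus universe mean for one random sample."""
    sample = rng.sample(universe, sample_size)
    return statistics.fmean(sample) - parameter

errors = [sampling_error(100) for _ in range(500)]
typical = statistics.fmean([abs(e) for e in errors])
net = statistics.fmean(errors)
print(f"typical size of one error: {typical:.2f}")
print(f"average of all errors:     {net:.2f}")
```

Individual samples over-estimate or under-estimate the universe mean, but because the errors fall in both directions their average is much closer to zero than a typical single error.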


The above discussion about sampling errors seems to
imply that studies of the entire population are free from any
errors. Nothing could be farther from the truth. Errors may
occur at any stage of enquiry, i.e., planning, collection,
processing and analysis of data. Apart from sampling errors,
errors arise due to following reasons:

Faulty planning resulting in improper definition of the


problem statement.

Vague or incomplete definition of universe.

Imperfect questionnaire which might result in


incomplete or wrong information.

Defective data collection.

Acceptance of exaggerated, irrelevant or wrong
answers to the questions that satisfy the pride or
self-interests of the respondents.

Personal bias of the user of the report.

Improper understanding and definition of the variables.

Improper use of averages to replace the actual figures.

Application of wrong methods.

Defect in measuring instruments.

And this list is by no means exhaustive!


The magnitude of the errors that can creep in indicates
that sampling is a technique which must be used selectively
and objectively. Sampling and non-sampling
errors can be reduced by keeping in mind the above
mentioned points and not losing sight of the objectives
of the study at any stage. Still, some errors will invariably
creep in.

Objectives

After reading this unit you will be able to:

Understand the concept of frequency distribution

Learn how to measure mean, mode and median

Understand the various types of averages that are used and their
applications in business

Learn the basics of probability distributions

Collecting and collating data is one serious matter, as we saw
in the last chapter; getting a meaningful analysis out of it is
another. In the next few chapters we will focus on
understanding the tools that we require to analyze this data.
The data collected could be in terms of qualitative variables
or in terms of quantitative variables. Examples of qualitative
variables include items termed as defective or effective,
persons classified as rich or poor, etc. Care must be taken
before we quantify these qualitative variables, for this is one
of the major sources of errors. Quantitative variables may
be discrete, continuous or a combination of the two. Discrete
variables take on only whole number values, for example,
number of defective parts in a sample, number of married
people in a city, etc. Continuous variables can be measured
to any arbitrary degree of accuracy, for example, the weight
of a person can be measured to the nearest kilogram, gram,
milligram, etc. The result may or may not be a whole
number. The accuracy desired should be such that the
relevance of the data is not lost and it is not too difficult to
get the desired data.


The data that you have collected till now, either through
sampling or otherwise, is called raw data. Now this data can
be arranged in an array. For example, if you collected data
on electricity consumption for one day of 1000 households,
you would get an array with 1000 rows and two columns: one
row for each house, and the two columns for house number
and electricity consumption respectively. As it is very
difficult to draw inferences from this raw data, we can process
it so as to group together the houses which are using
electricity within a particular range. This table of
electricity consumption ranges and number of houses is
shown below:
Table 12.1: Frequency Distribution of Electricity Consumption

Electricity Consumption (kilowatts)    Number of houses
0-9                                      1
10-19                                    3
20-29                                    5
30-39                                   10
40-49                                   20
50-59                                   35
60-69                                   50
70-79                                   70
80-89                                  100
90-99                                  130
100-109                                130
110-119                                100
120-129                                 70
130-139                                 50
140-149                                 35
150-159                                 20
160-169                                 10
170-179                                  5
180-189                                  3
190-199                                  1


Note that we are not using the house numbers, as the
information is irrelevant for the purpose of determining how
many houses use how much electricity per day. Also note
that this is only one day's utilization and is not indicative of
the average utilization of electricity by that household. If the
day happens to be a Sunday, the overall average could be lower
than what you would infer from this data. This kind of
over-generalization is very common and is a frequent source of
errors in real life situations. It is important that a frequency
distribution should have a suitable number of class intervals.
Class intervals mean the ranges into which we classified the
items. In the above case 10 is the class interval
used for electricity consumption. If too few classes are used,
the original data would be so compressed that little
information will be available. If too many classes are used,
there will be too few items in the classes, and the frequency
polygon would be irregular in appearance.
There are basically three precautions that must be kept in
mind when determining the class intervals. First, we must
select the class interval so that the mid-values of the classes
will coincide, as far as possible, with the concentration of
items that may be present. Second, we should avoid
open-ended classes. Third, the class intervals should usually be
uniform.
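
The grouping of raw readings into uniform class intervals can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_distribution` and the eight household readings are assumptions made for the example and do not come from the survey described above.

```python
from collections import Counter

def frequency_distribution(values, width=10):
    """Count how many values fall in each uniform class of the given width.

    Classes are labelled by their lower and upper limits, e.g. (90, 99).
    """
    # Map every reading to the lower limit of its class, then count.
    counts = Counter((int(v) // width) * width for v in values)
    return {(low, low + width - 1): counts[low] for low in sorted(counts)}

# Hypothetical one-day readings for eight households (not from the text).
readings = [95, 102, 88, 97, 104, 91, 110, 99]
table = frequency_distribution(readings)

for (low, high), houses in table.items():
    print(f"{low}-{high}: {houses}")
```

Changing `width` shows the trade-off discussed above: a very large width compresses the data into one or two classes, while a very small width leaves most classes with a single item.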

Frequency distributions may differ in average value,
dispersion, shape or any combination of the three.
Mathematicians have named almost all kinds of shapes which
combine the properties of these three variations. They do
not add too much value to your learning and therefore are
not mentioned here.
We have discussed averages, dispersion and skewness
graphically and you must be thinking that there must be a
way to measure these quantities mathematically. There is,
and that is what we are going to discuss next before going on
to probability distributions.


In a series of statistical data, the parameter which reflects a
central value of the series is called the central tendency.
Central tendency refers to the middle point of a statistical
distribution and is also known as an average.


An average can be defined as a central value around which
other values of the series tend to cluster. An average is
computed to give a concise picture of a large group: by the use
of averages, complex groups and large numbers are presented in
a few significant words or figures. Averages help in obtaining
a picture of the universe with the help of a sample. Although
the sample and the universe differ in size, their averages
may still be very nearly identical.
Averages give a mathematical concept to the relationship
between different groups. For example, the trees in one forest
may be taller than in another forest, but in order to find any
definite ratio of heights it is essential to resort to averages.
But is an average representative? Yes, essentially because
of three reasons:
i) Ordinarily most of the values of a series cluster in the
middle,

ii) At the extreme ends the number of items is usually very
small, and

iii) Ordinarily items with values less than the average cancel
out the items whose values are greater than the average.
The average of 4, 5, 6 is 5. The average 5 is more than 4
by one and less than 6 by one. Thus, the two deviations
-1 and +1 cancel each other.
An average should be affected as little as possible by
sampling fluctuations, i.e., for different samples of the same
population the variation in the average should be very little.
An average should be capable of algebraic treatment so that it
can be used for further mathematical manipulation.

Averages may be classified into three broad types:

i) Mathematical Averages:
a) Arithmetic mean
b) Geometric mean
c) Harmonic average

ii) Positional Averages:
a) Mode
b) Median

iii) Commercial Averages:
a) Moving average
b) Progressive average
c) Quadratic average
Mathematical averages are those which utilize a mathematical
formula for the calculation of their values. Positional
averages do not use such calculations but give you
an indication about the positional characteristics of certain
items. Commercial averages are the applications of averages
in commercial situations.
With so many varieties of averages available, the question that
arises is which one to use. As we go ahead we would see that
each type has a specific application and should be used only
in that case.

Arithmetic Mean
Most of the time when we refer to the average we are talking
about arithmetic mean. This is true in cases like average
winter temperature in Delhi; average life of a flash light
battery, average working hours of an executive, etc. The
arithmetic mean (or simply mean) is the quantity obtained
by dividing the sum of the values of items (X) in a variable
by their number (n), i.e., number of items.

Symbolically:

$$ \bar{X} = \frac{\sum X}{n} $$

The mean of 3, 4, 5, 6, 7 is (3 + 4 + 5 + 6 + 7)/5 = 25/5 = 5.

Looking at the formula above we can say that the algebraic
sum of the deviations of the individual items from the
arithmetic mean is zero. It also follows that the sum of the
squares of the deviations is smaller when taken from the
arithmetic mean than when taken from any other value.
This means that if any one or more items in the group are
replaced by new items, the new arithmetic mean would be
changed by the net change divided by the number of items. For
example, if the values 3 and 4 in the above example change
to 8 and 9 (total change of 10) then the mean can be calculated
in either of the two ways mentioned below:

$$ \bar{X} = \frac{8 + 9 + 5 + 6 + 7}{5} = \frac{35}{5} = 7 $$

$$ \bar{X}_{new} = 5 + \frac{10}{5} = 7 $$

Although in this example it would have been faster to do it
the original way, the alternate method assumes more and
more significance as the number of items goes up.
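
These properties of the arithmetic mean can be checked numerically. The short Python sketch below uses the five values of the example; the helper name `sum_of_squares_about` is an assumption introduced only for this illustration.

```python
values = [3, 4, 5, 6, 7]
mean = sum(values) / len(values)

# Property 1: the deviations from the mean add up to zero.
deviations = [x - mean for x in values]
deviation_sum = sum(deviations)

# Property 2: the sum of squared deviations is smallest about the mean.
def sum_of_squares_about(point):
    return sum((x - point) ** 2 for x in values)

# Updating the mean when 3 and 4 are replaced by 8 and 9.
net_change = (8 + 9) - (3 + 4)
updated_mean = mean + net_change / len(values)
recomputed_mean = sum([8, 9, 5, 6, 7]) / 5

print(mean, deviation_sum, updated_mean, recomputed_mean)
```

Both routes give the same new mean, which is the point of the shortcut: only the net change has to be known, not the whole series.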
When a frequency distribution is given, as in Table 12.1,
the mean is calculated using a variation on the above formula:

$$ \bar{X} = \frac{\sum fx}{n} $$

Here f stands for frequency of the class, x stands for
mid-value of the class and n stands for total of all frequencies
in all classes.
Table 12.1, extended with the mid-values and the products fx,
is reproduced below as Table 12.2.

Table 12.2: Frequency Distribution of Electricity Consumption

Electricity Consumption    Mid-value (x)    Number of     fx
(kilowatts)                                 houses (f)
0-9                          5                 1             5
10-19                       15                 3            45
20-29                       25                 5           125
30-39                       35                10           350
40-49                       45                20           900
50-59                       55                35          1925
60-69                       65                50          3250
70-79                       75                70          5250
80-89                       85               100          8500
90-99                       95               130         12350
100-109                    105               130         13650
110-119                    115               100         11500
120-129                    125                70          8750
130-139                    135                50          6750
140-149                    145                35          5075
150-159                    155                20          3100
160-169                    165                10          1650
170-179                    175                 5           875
180-189                    185                 3           555
190-199                    195                 1           195
Total Value                                  848         84800

Applying this formula to the table we get:

$$ \bar{X} = \frac{84800}{848} = 100 $$

In the case where a cumulative percentage distribution is
given, the grouped frequency distribution is derived from the
cumulative percentage distribution and then the usual
procedure is applied for computing the mean.
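
The grouped-mean calculation can be reproduced as a short Python sketch. The mid-values follow Table 12.2; the frequencies of the three lowest and three highest classes are not legible in every copy of the table, so the values used for them here are inferred from the fx column and the totals (848 and 84800) and should be read as an assumption.

```python
# Mid-values of the twenty classes: 5, 15, ..., 195.
mid_values = list(range(5, 200, 10))

# Number of houses per class (end classes inferred from fx and the totals).
frequencies = [1, 3, 5, 10, 20, 35, 50, 70, 100, 130,
               130, 100, 70, 50, 35, 20, 10, 5, 3, 1]

n = sum(frequencies)                                          # total frequency
total_fx = sum(f * x for f, x in zip(frequencies, mid_values))
grouped_mean = total_fx / n

print(n, total_fx, grouped_mean)
```

The result agrees with the totals of Table 12.2, giving a mean consumption of 100 units.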
If two or more groups contain respectively N1, N2, ...
observations with means $\bar{X}_1$, $\bar{X}_2$, ... respectively,
then the combined mean ($\bar{X}$) of the composite group is given
by the relation:

$$ \bar{X} = \frac{N_1 \bar{X}_1 + N_2 \bar{X}_2 + \dots}{N} $$

Here N stands for the denominator, i.e., the sum (N1+N2+N3+...)
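
As a minimal sketch of the combined mean, the following Python function weights each group mean by its group size. The function name `combined_mean` and the two groups in the example are hypothetical and serve only to illustrate the relation.

```python
def combined_mean(sizes, means):
    """Combined mean of several groups, given each group's size and mean."""
    total = sum(sizes)  # N = N1 + N2 + ...
    return sum(n * m for n, m in zip(sizes, means)) / total

# Hypothetical example: 40 items with mean 50 and 60 items with mean 60.
combined = combined_mean([40, 60], [50, 60])
print(combined)  # (40*50 + 60*60) / 100
```

Note that the combined mean lies closer to the mean of the larger group, which a plain average of the two group means (55) would miss.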

Table 12.3: Critical Evaluation of Arithmetic Mean

Merits

It is rigidly defined and is definite
Its calculation is easy and generally understood
The data needs very little preparation, e.g., it need not be arrayed
It utilises all the data in the groups
It is suitable for arithmetic and algebraic manipulation.

Demerits

When distribution is highly skewed on either side, arithmetic mean
loses its representativeness
Its calculation requires information about all units, either
individually or collectively. Therefore, it cannot be safely used
in open end tables
It is not suitable for non-homogeneous series
Can't be applied for extremely large values on either side

In calculating the simple arithmetic mean it is assumed that all
items are equal in importance. This may not always be the
case. When items vary in importance they should be
assigned weights in order of their relative importance. For
calculating the weighted arithmetic mean the value of each
item is multiplied by its weight, the products are summed and
divided by the total of weights and not by the number of items.
The result is the weighted arithmetic average. Symbolically:

$$ \bar{X}_w = \frac{w_1 X_1 + w_2 X_2 + w_3 X_3 + \dots}{w_1 + w_2 + w_3 + \dots} $$

Here w1, w2, w3, ... stand for the respective weights of each
of the items.
Weighted averages have important applications in trend
analysis and forecasting. But they should be used when any of
the following conditions holds true:

i) When the importance of all the items in a series is not
equal.

ii) When the items falling into different grades of the
classes of the same group show considerable variation
and it is desired to obtain an average which would be
representative of the whole group, weighted average is
the only proper average to be used. In other words, when
the classes of the same group contain widely varying
frequencies.

iii) When the percentages, rates or ratios are being
averaged.

iv) When there is a change either in the proportion of
frequencies of items or in the proportion of their values.

It would not be improper to remind you that simple mean
and weighted mean are two of the most widely used means.
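
The difference between the simple and the weighted mean can be seen in a small Python sketch. The marks and weights below are hypothetical, and the function name `weighted_mean` is an assumption made for the illustration.

```python
def weighted_mean(values, weights):
    """Sum of (weight x value) divided by the total of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Hypothetical example: marks in three papers carrying weights 1, 2 and 3.
marks = [60, 70, 80]
weights = [1, 2, 3]

simple = sum(marks) / len(marks)
weighted = weighted_mean(marks, weights)

print(simple, round(weighted, 2))
```

Because the heaviest weight sits on the highest mark, the weighted mean comes out above the simple mean; with equal weights the two would coincide.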


The most important application of geometric mean is in the
construction of index numbers, i.e., averaging rates of change.
For example, if you are investing in the stock markets, and
your money grows from Rs 1,00,000 to Rs 2,50,000 in three
years and you want to know the average percentage
gain you are making per year over the three years, you can use
this mean. If r is the average annual rate of gain, then

$$ 100000 \times (1 + r)^3 = 250000 $$

This could simply be written as:

$$ 1 + r = \sqrt[3]{2.5} \approx 1.357 $$

i.e., an average gain of about 35.7 per cent per year.

This leads us to the general formula for geometric mean

$$ GM = \sqrt[n]{X_1 \times X_2 \times \dots \times X_n} $$

Mathematically speaking, the geometric mean is the nth root
of the product of n items of a series.
The only problem with using this formula is that you cannot
do it on a simple calculator, and this is its biggest drawback.
Geometric mean is also useful in skewed distributions and
averaging ratios.
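
The investment example can be worked out with a few lines of Python, which removes the difficulty of taking an nth root by hand. This is a sketch only: the function name `geometric_mean` is an assumption, and the root is taken through logarithms, which is one common way of computing it rather than a method prescribed by the text.

```python
import math

def geometric_mean(values):
    """nth root of the product of n positive values, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Money grows from Rs 1,00,000 to Rs 2,50,000 in three years.
growth_factor = (250000 / 100000) ** (1 / 3)   # average yearly growth factor
annual_gain_percent = (growth_factor - 1) * 100

print(round(growth_factor, 4), round(annual_gain_percent, 2))
print(geometric_mean([2, 8]))  # nth root of 2 x 8, i.e. about 4
```

Applying the yearly growth factor three times in succession reproduces the overall growth of 2.5 times, which is why the geometric mean, and not the arithmetic mean, is the right average for rates of change.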

Table 12.4: Critical Evaluation of G.M.

Merits

It gives comparatively little weight to extreme values.
It is suitable for arithmetic and algebraic manipulation
It is reversible both ways and therefore, suitable for ratios and
percentages

Demerits

It cannot be used when any of the quantities is zero or negative
It is difficult to compute and requires more time in computation
It is difficult to understand
It gives less weight to large items which sometimes may be a
limitation, viz., computing average cost per unit.

The harmonic mean is defined as the reciprocal of the arithmetic
mean of the reciprocals. Thus, for a simple harmonic mean:

$$ HM = \frac{n}{\sum \frac{1}{X}} $$

For a weighted harmonic mean, the above equation is
rewritten as:

$$ HM_w = \frac{\sum w}{\sum \frac{w}{X}} $$

Although harmonic mean is of limited use, it is less affected
by extremely large observations than any other average. It
is properly used to average rates where the weights are the
numerators of the fractions used to compute the rates.
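
A minimal Python sketch of the simple and weighted harmonic means follows. The function names and the speed example are assumptions made for illustration; the example averages two speeds over equal distances, a typical case where the numerator of the rate (distance) is the same for every item.

```python
def harmonic_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    return len(values) / sum(1 / v for v in values)

def weighted_harmonic_mean(values, weights):
    """Total weight divided by the sum of weight/value."""
    return sum(weights) / sum(w / v for v, w in zip(values, weights))

# Hypothetical example: 40 km/h going out and 60 km/h coming back
# over the same distance.
speeds = [40, 60]
average_speed = harmonic_mean(speeds)

print(round(average_speed, 2))
```

The harmonic mean gives about 48 km/h, lower than the arithmetic mean of 50 km/h, because more time is spent travelling at the slower speed.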

Mode
Mode is that value which has the maximum frequency (i.e.
occurs most often) in a given set of values. Thus the mode of
a set of data is simply the value that is repeated most often.
It is the most typical value and, therefore, the clearest
example of a measure of central tendency.

For example, if you leave for your office every day in the
morning and you recorded the following times for two weeks:

8.30 8.25 8.35 8.29 8.31 8.30 8.32 8.31 8.31 8.31

One thing is obvious, you are quite punctual! Anyway, if you
arrange the data in increasing order:

8.25 8.29 8.30 8.30 8.31 8.31 8.31 8.31 8.32 8.35

Here 8.31 occurs most frequently and is therefore the mode
of the given data.
Since there will rarely be two items of exactly the same size
for a continuous variable (if measurements are made with
sufficient precision), it is apparent that our definition of
the mode is somewhat vague. For this we group the data and
then use this simple equation:

$$ \text{Mode} = l_1 + \frac{d_1}{d_1 + d_2} \times C $$

where
l1 = lower boundary of the class containing the largest
frequency
d1 = difference between the largest frequency and the frequency
of the preceding class
d2 = difference between the largest frequency and the frequency
of the next class
C = class interval
The main advantage of mode is that its value is not
affected by the extreme values of the series. Even if there is
more than one mode in the series, it is not difficult to
determine them, and the mode can also be located graphically
very easily.
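
Both ways of finding the mode can be sketched in Python. The office-leaving times are those of the example above; the grouped example (a modal class starting at 20 with frequency 12, flanked by classes with frequencies 8 and 6) is hypothetical, and the function name `grouped_mode` is an assumption.

```python
from collections import Counter

# Times recorded in the example, kept as text so that 8.30 stays 8.30.
times = ["8.30", "8.25", "8.35", "8.29", "8.31",
         "8.30", "8.32", "8.31", "8.31", "8.31"]
mode_time, occurrences = Counter(times).most_common(1)[0]

def grouped_mode(lower, f_prev, f_modal, f_next, width):
    """Mode of grouped data: lower boundary + d1 / (d1 + d2) x class interval."""
    d1 = f_modal - f_prev   # largest frequency minus preceding class frequency
    d2 = f_modal - f_next   # largest frequency minus next class frequency
    return lower + d1 / (d1 + d2) * width

# Hypothetical modal class 20-30 with frequency 12; neighbours have 8 and 6.
example_mode = grouped_mode(20, 8, 12, 6, 10)

print(mode_time, occurrences, example_mode)
```

The grouped formula places the mode nearer to the neighbouring class with the higher frequency, here the preceding one.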
Table 12.5: Critical Evaluation of Mode

Merits

It is very easy to locate. In many cases it can be obtained by
inspection.
It is not influenced by the presence of a small number of extreme
items.
It may be ascertained even when the details of extreme items are
not available.
It is easily understandable.
It may be determined with considerable accuracy from a well
selected sample data.

Demerits

It is frequently ill defined
It sometimes is indeterminable without modifying the data as in
the case with Table 7.1
It cannot be calculated by simple arithmetic process
It is unsuitable for arithmetic and algebraic manipulation

Median is the value of that item in a series which divides
the series into two equal parts, one part consisting of all
values less and the other all values greater than it. Defined
in another way, median is that value of the central tendency
which divides the total frequency into two halves.

Table 12.6: Critical Evaluation of Median

Merits

It is easy to understand and calculate
It eliminates the effect of extreme items
In many cases it may be obtained by inspection
It is easy to locate, subject to the actual number of items being
known
Its position is more definite than that of mode
It is clearly and rigidly defined.

Demerits

It cannot be calculated by mathematical methods and therefore is
not suitable for algebraic treatment
Median is usually affected by fluctuations of sampling. Data must
be arranged before calculation of median
It is unsuitable when greater importance to large or small values
is to be given
It may be difficult to locate and when located it may not be
representative in case the items in a series are not situated
closely together
A correct total cannot be obtained by multiplying the median by
the number of items.

Calculation of median from a simple series is very simple. If
the data set contains an odd number of items, the middle
item of the array after arrangement is the median. If there
is an even number of items the median is the average of the
two middle items after arrangement.
Calculation of median from a simple frequency distribution
(ungrouped) is also easy. The cumulative frequency (less than
type) corresponding to each distinct value of the variable is
calculated. If the total frequency is N, the value of the
variable corresponding to cumulative frequency (N + 1)/2 gives
the median.
Calculation of median from a simple frequency distribution
(grouped) is slightly more complex. The following formula
is used:

$$ \text{Median} = L + \frac{\frac{N}{2} - F}{F_m} \times C $$

where,
L = Lower boundary of the median class
N = Total frequency
F = Cumulative frequency up to and including the class
immediately preceding the median class
Fm = Frequency of the median class
C = Class interval or width of the median class.
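
The median calculations can be sketched in Python as follows. The simple-series cases use the standard library; the grouped case applies the formula above to the electricity data of Table 12.1. Two assumptions are made here: the lower limits 0, 10, 20, ... are treated as the class boundaries (consistent with the mid-values 5, 15, 25, ... used in Table 12.2), and the frequencies of the end classes are inferred from the fx column of Table 12.2. The function name `grouped_median` is hypothetical.

```python
import statistics

# Simple series: odd and even number of items.
odd_median = statistics.median([7, 3, 5, 4, 6])    # middle item after sorting
even_median = statistics.median([6, 3, 5, 4])      # average of two middle items

def grouped_median(lower_bounds, frequencies, width):
    """Median = L + ((N/2 - F) / Fm) x C for a grouped frequency distribution."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("empty distribution")
    half = n / 2
    cumulative = 0                       # F: cumulative frequency so far
    for lower, f in zip(lower_bounds, frequencies):
        if cumulative + f >= half:       # this is the median class
            return lower + (half - cumulative) / f * width
        cumulative += f

lower_bounds = list(range(0, 200, 10))
frequencies = [1, 3, 5, 10, 20, 35, 50, 70, 100, 130,
               130, 100, 70, 50, 35, 20, 10, 5, 3, 1]
median_consumption = grouped_median(lower_bounds, frequencies, 10)

print(odd_median, even_median, median_consumption)
```

For this symmetrical distribution the median coincides with the mean of 100 obtained from Table 12.2.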


Moving Average
The moving average is an arithmetic average of data over a
period and is updated regularly by replacing the first item
in the average by the new item as it comes in. It is useful in
eliminating the irregularity of time series and is generally
computed to study the trend.
Suppose the prices for 12 months are given and a three-monthly
average is to be computed. Then the first item in
the 3-month moving average would be the average
[(a1+a2+a3)/3], the second item would be the average of the
next three months [(a2+a3+a4)/3] and so on. The last item would
be the average [(a10+a11+a12)/3]. When the next month comes in,
a10 would be dropped and a13 would be added in, giving
[(a11+a12+a13)/3], and so on.
Progressive Average
Progressive average is also calculated with the help of simple
arithmetic mean. It is a cumulative average. In computation
of progressive average, figures of all previous years are added
and divided by the number of items. As the number of items
goes up and reaches a desired number, we switch to moving
average.
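
The two commercial averages described above can be sketched in Python as follows. The function names and the six monthly prices are hypothetical and are used only to show the mechanics.

```python
def moving_average(series, window=3):
    """Average of each run of `window` consecutive items."""
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]

def progressive_average(series):
    """Cumulative average: all figures so far divided by their number."""
    averages = []
    total = 0
    for count, value in enumerate(series, start=1):
        total += value
        averages.append(total / count)
    return averages

# Hypothetical prices for six months.
prices = [10, 12, 14, 13, 15, 17]
three_month = moving_average(prices)
progressive = progressive_average(prices)

print(three_month)
print(progressive)
```

Each new moving-average figure drops the oldest month and takes in the newest, whereas the progressive average keeps every earlier figure and therefore reacts more and more slowly as the series grows.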
Quadratic Average
The quadratic mean or average is estimated by taking the
square root of the average of the squares of the items of a
series. Symbolically,

$$ Q_m = \sqrt{\frac{a^2 + b^2 + c^2 + \dots + n^2}{N}} $$

where Qm = Quadratic Mean
a2, b2, c2 .....n2 = squares of the different values
N = number of items
Quadratic average is useful when some items have negative
values and others positive values, because in such cases the
mean is not very representative. It is also used in averaging
deviations, rather than original values, when the standard
deviation is computed.
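
A short Python sketch shows why the quadratic mean is preferred when positive and negative values are mixed. The four values are hypothetical, and the function name `quadratic_mean` is an assumption.

```python
import math

def quadratic_mean(values):
    """Square root of the average of the squares of the values."""
    return math.sqrt(sum(v * v for v in values) / len(values))

# Hypothetical series in which positive and negative values cancel out.
values = [-5, 5, -5, 5]
arithmetic = sum(values) / len(values)
quadratic = quadratic_mean(values)

print(arithmetic, quadratic)
```

The arithmetic mean of this series is zero and says nothing about the size of the items, while the quadratic mean recovers their typical magnitude.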

A comparison of the mean, median and mode is necessary
for you to understand their positive and negative
characteristics.

Parameter of Comparison: Comparison

Average: Mean is a calculated average. Median and Mode are
averages of positions. If all the items in a variable are the
same, then only AM=GM=HM, otherwise AM > GM > HM.

Calculation: Mean is the sum of the values of the items divided by
the number of items in the series. Median is the middle value
which divides a series into two equal parts. Mode is the
value around which the items of a series tend to concentrate in
density, i.e., the value with the highest frequency.

Treatment: Mean is capable of mathematical treatment. Median and
Mode are not capable of such treatment.

Items: All the items in a series are taken into account in the
calculation of Mean. Median and Mode calculations do not consider
all the items in a series.

Extreme Values: Mean is affected by extreme values of the items in
a series but it is not so in case of Median and Mode.

Calculation in Open-ended distributions: Calculation of the Mean
of a frequency distribution with open-ended class intervals at
both ends is not possible. Median and mode of such a distribution
can be easily calculated.

Reliability: Mean is considered to be a more reliable measure of
central tendency than Median and Mode.

Result: In a series of distribution of data, there is only one
value of mean or median. But there could be more than one mode or
no mode at all. Mean is simple to understand and to calculate.

Use: Mean is widely used. Median and Mode have limited use.

It is only because of variability that we compute averages.
But if there is too much variability among the data, an
average is so unreliable that it is almost useless. Usually, a
high degree of uniformity (i.e. a small amount of dispersion)
is a desirable quality. Mass production would usually be
uneconomical if there was a large amount of variability in
materials or manufactured parts; for standardization and
interchangeability of parts it is essential to have low
variability.

Therefore, we will consider several measures of dispersion,
but there are only two that are used much: the range and the
standard deviation.

The range is the difference between the largest value and
the smallest value.

$$ R = x_n - x_1 $$

For the example on leaving for office times, the range was 10
minutes (8.35 - 8.25).
The problem with using range is that it considers the extreme
values only and does not use all the data in the sample. It is
therefore less reliable than some other measures of dispersion
and varies too much from sample to sample to be of effective
use. It is also very sensitive to the size of the sample: it
usually increases with the increase in the sample size,
although not proportionately.
In spite of these shortcomings, there are special situations
where the range is useful. When the sample is from a 'normal'
universe with a small sample size, the range is nearly as
reliable as the more laboriously calculated standard
deviation. There are also certain types of data and certain
purposes for which the use of range is appropriate, e.g., the
range of temperature in Delhi.

Also known as average deviation, mean deviation is the mean
of the absolute amounts by which the individual items deviate
from the mean. The following procedure is usually applied:

1. Calculate the absolute deviation from the mean,
removing any negative signs.

2. Sum all the deviations.

3. Divide the sum of the deviations by the total number of
items.

Symbolically, these steps may be summarized as follows:

$$ MD = \frac{\sum |X - \bar{X}|}{n} $$

Item value (X)    Deviation from mean    Absolute deviation from mean
 10                 10                     10
 20                 20                     20
 30                 30                     30
-10                -10                     10
-20                -20                     20
-30                -30                     30
-25                -25                     25
 25                 25                     25
Total                                     170

Mean deviation is simple and easy to understand and, unlike
R, it is affected by the value of each item. But it is unreliable
because it varies from sample to sample taken from the same
universe. Also, it is a biased estimator of the population
dispersion. Therefore standard deviation, discussed below, is
the most often used measure of population dispersion.
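
The three steps of the mean deviation procedure can be traced in a short Python sketch. It uses the eight values shown in the table above, on the assumption that the first column of the table holds the item values themselves (their mean is zero, so each deviation equals the value).

```python
values = [10, 20, 30, -10, -20, -30, -25, 25]
mean = sum(values) / len(values)

# Step 1: absolute deviation of every item from the mean.
absolute_deviations = [abs(x - mean) for x in values]

# Step 2: sum of the absolute deviations.
total_deviation = sum(absolute_deviations)

# Step 3: divide by the number of items.
mean_deviation = total_deviation / len(values)

print(mean, total_deviation, mean_deviation)
```

The total of 170 matches the table, giving a mean deviation of 21.25.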

The standard deviation of a sample (SD) is similar to the mean
deviation in that it considers the deviation of each X value
from the mean. However, instead of using the absolute values
of the deviations, it uses the squares of the deviations. These
are summed, divided by n, and the square root extracted.
The formula for standard deviation (SD, or $\sigma$ as it is
usually represented) is:

$$ \sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{n}} $$

Variance is the square of SD and is represented by:

$$ \sigma^2 = \frac{\sum (X - \bar{X})^2}{n} $$

The detailed calculation is shown below:

Item value (X)    Deviation from mean    Square of Deviation
 10                 10                     100
 20                 20                     400
 30                 30                     900
-10                -10                     100
-20                -20                     400
-30                -30                     900
-25                -25                     625
 25                 25                     625
Total                                     4050

The concept of using sums of squares of deviations about the
arithmetic mean of a distribution is very important and we
would use it extensively in the chapters that follow.
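
The standard deviation and variance of the tabulated example can be computed with the following Python sketch. It assumes, as in the table, that the eight item values have a mean of zero and that the sum of squares is divided by n (not n - 1).

```python
import math

values = [10, 20, 30, -10, -20, -30, -25, 25]
n = len(values)
mean = sum(values) / n

# Squares of the deviations from the mean, then their total.
squared_deviations = [(x - mean) ** 2 for x in values]
sum_of_squares = sum(squared_deviations)

variance = sum_of_squares / n              # square of the SD
standard_deviation = math.sqrt(variance)

print(sum_of_squares, variance, standard_deviation)
```

The sum of squares agrees with the table total of 4050, so the variance is 506.25 and the standard deviation is 22.5.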

To get an indication of the variation that is related to the
mean, we divide the standard deviation by the mean to get
the coefficient of variation. This enables us to compare two
groups, which have different standard deviations and means,
more easily.

$$ \text{Coefficient of Variation} = \frac{\sigma}{\bar{X}} \times 100 $$
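
As a small illustration, the coefficient can be computed in Python by dividing the standard deviation by the mean and multiplying by 100. The two groups below are hypothetical and share the same mean, so the comparison isolates their relative variability; the function name `coefficient_of_variation` is an assumption.

```python
import statistics

def coefficient_of_variation(values):
    """Standard deviation (divisor n) as a percentage of the mean."""
    return statistics.pstdev(values) / statistics.mean(values) * 100

# Two hypothetical groups with the same mean (50) but different spread.
group_a = [48, 50, 52]
group_b = [30, 50, 70]

cv_a = coefficient_of_variation(group_a)
cv_b = coefficient_of_variation(group_b)

print(round(cv_a, 2), round(cv_b, 2))
```

Group B is ten times as variable as group A relative to the common mean. The measure is undefined when the mean is zero, as it is for the eight values used in the standard deviation table.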

Skewness may be defined as the lack of symmetry or degree
of distortion from symmetry exhibited by a probability
distribution. Any measure of skewness indicates the
difference between the manner in which the items are
distributed in a particular distribution compared to a normal
distribution.
The most useful measure of skewness is Karl Pearson's
coefficient of skewness.

$$ Sk = \frac{\text{Mean} - \text{Mode}}{\sigma} $$

When the mode is not clear or where there are two or three
modes, the following formula is used:

$$ Sk = \frac{3(\text{Mean} - \text{Median})}{\sigma} $$
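
Karl Pearson's coefficient of skewness is conventionally defined as (Mean - Mode) / SD, with 3 x (Mean - Median) / SD as the alternative when the mode is unclear. Both can be sketched in Python as below; the seven data values and the function names are hypothetical, chosen so that a few large items pull the mean above the mode and median.

```python
import statistics

def pearson_skewness_mode(values):
    """(Mean - Mode) / SD, for series with one clear mode."""
    sd = statistics.pstdev(values)
    return (statistics.mean(values) - statistics.mode(values)) / sd

def pearson_skewness_median(values):
    """3 x (Mean - Median) / SD, for series where the mode is unclear."""
    sd = statistics.pstdev(values)
    return 3 * (statistics.mean(values) - statistics.median(values)) / sd

# Hypothetical series with a long tail towards the larger values.
data = [2, 3, 3, 3, 4, 5, 8]
sk_mode = pearson_skewness_mode(data)
sk_median = pearson_skewness_median(data)

print(round(sk_mode, 3), round(sk_median, 3))
```

Both coefficients are positive, indicating a distribution skewed to the right; a symmetrical series would give zero.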

Objectives

After reading this unit you will be able to:

Understand the mathematical model of forecasting

Learn and apply correlation analysis

Understand how regression analysis works.

Understand non-linear and multiple regression analysis and their
applications in business.

The future has always held a great fascination for mankind.
Perhaps this is biologically determined. Man and the higher
apes seem to have brains that are equipped to engage in
actions for which a future reward is anticipated. In extreme
situations the reward is anticipated not in this life but in
the next life.
There are two methodologies to anticipate the future. They are
called qualitative and quantitative. But both start with the
same premise, that an understanding of the future is
predicated on an understanding of the past and present
environment. In this chapter we will mainly deal with
quantitative methods. We will also distinguish between
forecast and prediction. We use the word forecast when some
logical method is used.
The quantitative decision maker always considers himself
or herself accountable for a forecast, within reason. Let us
look at the conceptual model first and then the mathematical
model and algorithms in turn which are used for making a
forecast.


The qualitative school has generated many philosophical,
religious or political conceptual models according to which
the ideology and dogma are structured and forecasts prepared.
Quantitative decision making, defined here as anything that
is not based on underlying belief, offers three conceptual
models. They range from the nearly qualitative to the highly
technical. They are the guesstimate, fundamental and technical
models.
In the guesstimate conceptual model the forecast is based
on expert opinion. It is almost like qualitative decision
making except that the bias of many is pooled. This method
of forecasting basically revolves around the Delphi Method.
This conceptual model for forecasts should not be used when
ample databases are available. It is also known as option
methodology. The Delphi method consists of a panel of
experts and a series of rounds during which forecasts are
made via questionnaire. Whether expertise or ignorance is
pooled in each round, the result is the same: a forecast is
born. But in the absence of sufficient data, it may be
preferable to develop heuristics first rather than to rely
initially on guesstimates.
The second conceptual model stresses the fundamentals that
impinge upon the environment at any given time. In this case
the forecaster tries to ascertain the functional relationships
among variables defining the environment. In addition,
attention is paid to changes in the magnitude of the variables
that make up the environment. This conceptual model is
superior because it is based on logical considerations and
not merely on expert opinion.
The reason why not all forecasters wholeheartedly embrace
the fundamental conceptual model is that it takes a pretty
good mind to understand the variables that represent the
environment and their interrelationships. It takes constant
study, constant learning, constant testing and then the
intellectual ability to synthesize it all. To cite an example,
it does not take much to come up with a fundamental
conceptual model to forecast a nation's economic activity.
We know that Gross National Product (GNP) is a function of
consumption (C), investment (I), government spending (G)
and net exports (E). In equation form it appears as
GNP = C + I + G + E. Each variable, i.e., consumption,
investment, etc., can then be carefully quantified. For
example, do we consume more goods or services? More hard
or soft goods? And so on. Beautiful econometric models have
been generated on the basis of this conceptual model.
Beautiful forecasts have also been presented.


The third conceptual model is called technical. It is used
by forecasters who call themselves technocrats. Whenever a
predetermined parameter that the technocrats follow reaches
a certain magnitude, they forecast a change in the
environment irrespective of the behaviour of other variables.
Sometimes this model gives accurate results and sometimes
it does not.

Mathematical models play a very important role in
forecasting. The quantitative analysis that underlies a
forecast is based on the type of conceptual model that has
been chosen. The guesstimate and technical models typically
result in mathematical models that are less rigorous than
those of the fundamental model, although the decision tools
of the former may be applied in the case of the latter as
well. In this chapter only those decision tools are discussed
that are considered efficient and are widely used. The
mathematical model that is considered here should be used in
conjunction with the fundamental conceptual model.
The logic used in the mathematical model is twofold. First,
it is based on the idea that the past and present
environments may be used to extrapolate the future, that is,
the forecasted environment. Second, the environment itself
comes about because of a functional relationship that exists
between the variable whose value is to be forecasted and one
or more other variables that determine the forecasted
variable's magnitude. The environment can be an economy, the
market for a product, the productivity of a shift of assembly
line workers, the track performance of 1500 meter runners and
so on. It is clear that in forecasting there are always two
types of variables: the one that is being forecasted and one
or more from which the forecast is made. The first is known
as the dependent variable, the latter as the independent
variable(s). The functional relationship between the two can
be visualized within a system of coordinates where the
dependent variable is shown on the y-axis and the
independent variable(s) on the x-axis. Since both types of
variables usually have positive values, the entire
environment (past, present and future) is shown in the first
quadrant. Those variables that may affect the dependent
variable's magnitude but are not considered in the decision
space of the forecast, either because of oversight or because
their effect is deemed negligible, are known as intervening
variables.

Such a system may be called a simple forecasting system. If
there is more than one independent variable, a multiple
forecasting system results. Visualize each independent
variable as representing a dimension in the decision space.
For the case y = f(x1, x2), the space may still be shown on a
plane (a two-dimensional piece of paper). If there are more
than two independent variables, then imagine a different
dimension for each variable, all starting at the origin (O).
Now the question arises as to how the relationship between
the dependent and independent variables may be calculated
mathematically so that a forecast can be made. Let us stay
with the simple system (only one independent variable x) for
illustration purposes and argue that we can forecast gross
national product on the basis of consumption alone, that is,
GNP = f(Consumption). In a developed economy we would not be
too far off the mark. Past and present GNP and corresponding
consumption data may then be obtained and plotted. The
resulting cluster of coordinates is known as a plot or
scattergram.
Now suppose that someone asked you to forecast GNP, given
a certain consumption value. Such a value may be obtained
by polling a sample of consumers about the amounts that
they are planning to spend, calculating the sample mean and
constructing a confidence interval. An average line or curve
may then be drawn through the plot and the y-value obtained
by extrapolation. The immediately apparent problem is the
mathematically proper selection of the line or curve, because
there are many possible ways of drawing such an average and
hence many possible forecasted GNP values.
There are two basic methods to solve this problem. They are
known as regression and smoothing. Each one has spawned
a number of offspring. In this book only those are discussed
that minimize the possibility of injecting the decision
maker's bias into the forecast. Furthermore, the decision
tools that are illustrated represent the strongest
mathematical link in the chain.
Both methods use the arithmetic mean, which represents the
forecasted value. When an average line is fitted by the
freehand process, points A and C fall above while points B
and D fall below this average line. The distances between
the points A, B, C, D and the line are shown as α, β, γ, δ.
The line has the familiar equation yc = a + bx, where yc is
any estimated (forecasted) y-value given a certain x-value,
a is the y-intercept and b the slope of the line. A line for
which the distances sum to 0 is balanced, since the positive
distances (above the line) and negative distances (below the
line) offset each other, but many lines satisfy that
condition. Mathematically, the tightest fit is therefore
obtained by the method of least squares,
α² + β² + γ² + δ² = Minimum
This condition holds when the normal equations are used
for calculating the a and b values, or, for the straight line,
$$\sum y = na + b\sum x$$
$$\sum xy = a\sum x + b\sum x^2$$
where all values were previously defined. The method is
known as regression analysis and was developed by Francis
Galton. You may read a book on statistics for a deeper
understanding of regression and correlation analysis.
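As a brief sketch of where the normal equations come from, the standard calculus argument can be written out as follows. It is not part of the original text, and the symbol S for the sum of squared distances is notation assumed here for illustration. Setting the partial derivatives of S with respect to a and b equal to zero yields the two normal equations for the straight line.

```latex
\begin{align*}
S(a,b) &= \sum (y - a - bx)^2 \\
\frac{\partial S}{\partial a} &= -2\sum (y - a - bx) = 0
  \;\Longrightarrow\; \sum y = na + b\sum x \\
\frac{\partial S}{\partial b} &= -2\sum x\,(y - a - bx) = 0
  \;\Longrightarrow\; \sum xy = a\sum x + b\sum x^2
\end{align*}
```

For the four points A, B, C and D, the quantity S is exactly the sum of the four squared distances from the line.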
It should be noted that the known environment extends
between points A and D. Given this environment and the
line that has been fitted to it, a forecast yc may be made
given x, as shown in Figure 13.1. However, we are now
extrapolating into an unknown environment, that is, its real
x and y values
are not known. Therefore, we must make a crucial
assumption, namely, that for our forecasted yc value to be
correct, no material alterations in the functional
relationship y = f(x) must have taken place. If, for example,
a much larger or smaller real y value corresponds to each
real x value in this unknown environment, or if a new
important variable has entered the decision space, the
forecasted yc value is probably wrong.


Figure 13.1: Fitting a line yc = a+bx, to four points A, B, C, D

As we will see shortly, fitting lines and curves to given data
sets is simple. In fact the computer usually does it for us.
Technocrats do this very well. But the good forecaster knows
his environment. The skill lies in understanding the
interrelationships in a multi-variable decision space and in
knowing when a change can be expected in the functional
relationships or when new variables (until then intervening
variables) must be considered an integral part of that
decision space. It is in this aspect of forecasting that a
solid fundamental conceptual model pays its dividends. A good
forecaster never loses sight of the fundamentals. The master
forecaster already knows the fundamentals of the unknown
environment. This is a skill that cannot be taught. It can
only happen to the individual who is willing to immerse
himself completely in a veritable flood of data and
information blocks that may in any conceivable way have some
bearing on the forecast. If he ever comes up again, it has
happened: one of the few master forecasters has joined the
ranks.
A distinction is made between linear and curvilinear
regression analysis. Linear analysis fits straight lines to
data sets. Curvilinear or nonlinear analysis does the same
with curves. Furthermore, there is simple regression analysis
with only one independent variable, i.e., y = f(x), and
multiple regression with more than one independent variable,
or y = f(x1, x2, ..., xn). A special type of regression
analysis uses time as the independent variable. Sales
forecasts are examples where Sales = f(Time). This type of
regression analysis is known as time series analysis.
Finally, before turning to the algorithms, a word may be
added about smoothing, the other major forecasting method.
As the name implies, observed values are smoothed and a
weighted average is obtained which represents the behaviour
of the variable under consideration. It is to be noted that
the smoothing method is mathematically much less rigorous
than regression analysis. Also, the method is limited in its
scope of application.
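To make the idea of a smoothed, weighted average concrete, here is a minimal sketch in Python. The text does not say which smoothing algorithm it has in mind at this point, so simple exponential smoothing is an assumption made here. The function name, the weight alpha and the data are all hypothetical.

```python
def exponential_smoothing(values, alpha):
    """Smooth a series: each output is a weighted average of the newest
    observation (weight alpha) and the previous smoothed value."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    smoothed = [values[0]]  # seed with the first observation
    for observed in values[1:]:
        smoothed.append(alpha * observed + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical observations, for illustration only
observed_values = [20.0, 24.0, 22.0, 30.0]
smoothed_values = exponential_smoothing(observed_values, alpha=0.5)
print(smoothed_values)
```

A larger alpha makes the smoothed series follow the latest observations more closely. A smaller alpha gives more weight to the past.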

Each of the two forecasting methods, regression and
smoothing, has its distinct algorithm. They will be
discussed in turn. In order to set the mental stage for this
discussion, it is helpful to rethink the approach taken in
the particular type of forecasting analysis that is examined
here. It is not a forecast out of the blue. Rather, it is
based on functional relationships among variables for which
there is a stipulated logic. In other words, there is a
fundamental conceptual model upon which the quantitative
analysis is based. Furthermore, it is assumed that the past
and present environments, that is, the joint behaviour of the
dependent and independent variables, are indicators of the
future environment. Mathematically speaking, the forecast is
nothing but an extrapolation of the past and present. As
previously pointed out, if no fundamental changes take place
with respect to the magnitude of the functional relationships
among the variables and no new variables enter the decision
space, this logic has been found to work well when applied.
It has stood the test of time. But the above assumption also
means that the direction of the forecast, that is, an increase
or decrease in the magnitude of the forecasted behaviour of
the variable, must be clearly visible and mathematically
substantiated in the past and present environments.


It may be recalled that, depending upon the forecasting
problem, regression analysis may take several forms. There
is linear and non-linear regression. There is simple and
multiple analysis. And there is time series analysis. For
each, however, the same algorithm or solution methodology
may be used. This algorithm has the following steps.
(1) Prepare a plot.
(2) Fit a line or curve to the plot and define it
mathematically by the method of least squares.
(3) Test the significance of the slope. Sometimes people skip
this test.
(4) Construct a confidence interval for the forecasted yc
value.
(5) Estimate the quantitative effect of the independent
variable(s) x on the behaviour of the dependent variable
y. This is known as correlation analysis.
(6) Test the significance of the correlation. This, too, is
sometimes ignored.
After you have done this, what have you got? You have got
yourself a forecast, and you may be 95% confident (95% being
the standard level in forecasting studies) that your
forecast will be on the mark when that mark ultimately
becomes known, assuming that the logic of your conceptual
model is sound. This is a big assumption, because anybody
can come up with a functional relationship y = f(x), which
may not necessarily be sound.

Let us proceed in our examination of the decision tools from
the simple to the more complex ones and explain each step
of the algorithm in detail as we go along. We begin
with simple linear regression and correlation analysis,
yc = f(x), then switch to linear and non-linear time series
analysis, yc = f(time), which is only a special case as may
be recalled, and finally look at multiple regression and
correlation analysis, yc = f(x1, x2, ..., xk).


A time-honoured functional relationship exists between the
amounts spent on advertising and the sales generated by these
amounts. It is a popular belief that this is a positive
relationship in the sense that each rupee spent on
advertising generates so many additional rupees (say, about
10 for consumer goods) in sales revenue. Indeed, the
relationship appears to be linear within meaningful ranges
of the advertising budget. Very little money spent on
advertising may have very little effect on sales or none at
all. Similarly, extremely large sums that are spent on
advertising will not generate that much more in sales
revenue. But a meaningful advertising budget (meaningful
in terms of market constraints, demand factors, customer
income, etc.) usually shows a linear relationship to sales
revenue.
In order to perform regression analysis, data sets of between
n=15 and n=25 should be used as a minimum sample size. After
all, the past and present environment is being described and
too small a sample of observations will not do. For
illustration purposes we will use only n=5 so that the
manual calculations do not detract from an understanding
of the procedures. On the job, regression analysis should
be performed on the computer exclusively once the
method has been understood. For easier referencing,
all calculations for each step of the regression algorithm
are shown in one worksheet with the column number
identified for each step. Let us get to work on an assignment
that has just been received from the controller of our
company. Given advertising expenditures of Rs 8 million for
the next fiscal year, how much sales revenue can be expected?
This is the question that the company wants answered.


Let us try to answer this question. First, the conceptual
model: what, if any, relationship exists between sales and
advertising? This has been taken care of. We know that
Sales = f(Advertising). Secondly, how does this relationship
look in the case of our particular firm? We need data. We
get the data from our accounting department. They tell us
that in the past, five advertising expenditures (budgets)
resulted in certain corresponding sales figures. Remember
that we use n=5 in order to keep the calculations simple.
Notice that they gave us a random sample of observations
that is deemed representative of the environment. The data
are recorded in ascending order (not a requirement) in the
worksheet below. Sales are the dependent variable and
advertising the independent variable. Already at this point,
begin to think about the intervening variables that are
likely to be operative in this decision situation. We will
have to come back to them later on. And now let us activate
the algorithm.

The data, as they appear in columns 1 and 2 of the worksheet,
are placed into a system of coordinates. This can be done
manually or by computer. When making a forecast about a
certain phase of your operations, you usually have a pretty
good idea of how the result ought to look. This mental
picture is the result of your collective experience with the
operations. The plot allows you to check the mental picture
against reality. You can see, for example, if there is a
positive, negative or zero relationship between the variables.
You can see whether the relationship is consistent (small
variance), when the data cluster closely, or inconsistent
(large variance), when they are all over the quadrant. These
observations help you to shape the conceptual model.
Also, you can see whether, by the freehand method, you would
fit a line or a curve to the data, as previously mentioned.
Furthermore, you can see whether linearity or non-linearity
governs the entire data set or if there are combinations.
These observations help you to decide on the mathematical
model for your forecast. Obviously your choice of a linear
versus non-linear analysis has great impact on the forecasted
value and the implications of this choice will be discussed
in detail

Table 13.1: Worksheet for a Simple Linear Regression Analysis

when performing the non-linear analysis later on. A good
forecaster will think about the plot and inspect it again and
again for a considerable period of time before fixing in his
mind the conceptual and mathematical models. Once the
models are determined, the rest of the quantitative analysis
is routine. The work can be turned over to the computer.
But relating a mental picture to reality and vice versa goes a
long way in making a valid forecast. The plot helps in this
endeavour.


Our plot is shown in Figure 13.2. Disregard for the moment
everything except the connected original data points shown
as circles. These data are taken from the worksheet. You
can readily see that if you had to fit an average to this data
set by the freehand method, you would use a straight line.
The line that represents the best fit is calculated next in
Step 2 of the algorithm.

Figure 13.2: Plot of Original Values and Line of Best Fit

The mathematically best fit of the line yc = a + bx to a data
set is obtained by the method of least squares as previously
discussed. The calculations for the normal equations
$$\sum y = na + b\sum x$$
$$\sum xy = a\sum x + b\sum x^2$$
are shown in the worksheet, resulting in
245 = 5a + 25b
1327 = 25a + 136.98b
which may now be solved simultaneously by eliminating a,
solving for b and substituting b in the first equation in
order to find a. Thus,
[245 = 5a + 25b] × 5
1327 = 25a + 136.98b
1225 = 25a + 125b
102 = 11.98b
b = 8.5142
and, substituting in the first equation,
a = [245 - 25(8.5142)] / 5 = 6.429
Therefore, the best fitting line has been defined as
yc = 6.429 + 8.5142x.
Rather than solving the numerical equations by elimination,
the normal equations may be solved for a and b in general.
After simplification
$$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} \qquad a = \frac{\sum y - b\sum x}{n}$$
is obtained. Using these direct formulae, which are more
efficient,
$$b = \frac{5(1327) - (25)(245)}{5(136.98) - (25)^2} = \frac{510}{59.9} = 8.5142$$
and
$$a = \frac{245 - (8.5142)(25)}{5} = 6.429$$
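The arithmetic of Step 2 can be checked with a few lines of Python. This is a minimal sketch that uses only the column totals quoted in the normal equations above (n = 5, Σx = 25, Σy = 245, Σxy = 1327, Σx² = 136.98). The function name fit_line is a hypothetical helper introduced here, not something from the text.

```python
# Column totals quoted in the text for the n = 5 observations
n = 5
sum_x, sum_y = 25.0, 245.0
sum_xy, sum_x2 = 1327.0, 136.98


def fit_line(n, sum_x, sum_y, sum_xy, sum_x2):
    """Solve the two normal equations of the straight line
    for the intercept a and the slope b."""
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


a, b = fit_line(n, sum_x, sum_y, sum_xy, sum_x2)
print(f"a = {a:.3f}, b = {b:.4f}")  # a = 6.429, b = 8.5142
```

Working from the totals rather than the raw observations mirrors the worksheet: once the column sums are known, the individual data points are no longer needed to define the line.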


The line yc = 6.429 + 8.5142x is called the forecasting
equation. It is the decision tool that allows us to answer
the controller's question. You remember that she stipulated
an advertising budget of Rs 8 million, or x=8.

We can now estimate sales revenue by
yc = 6.429 + (8.5142)(8)
= Rs 74.5426 million
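The forecasting equation can be wrapped in a small function so that any advertising budget can be tried. This is only a sketch: the intercept and slope are the values from the text, while the function name forecast_sales is an assumption made for illustration.

```python
INTERCEPT = 6.429  # a, from the text (Rs million)
SLOPE = 8.5142     # b, from the text


def forecast_sales(advertising):
    """Point forecast yc = a + b*x, with x and yc in Rs million."""
    return INTERCEPT + SLOPE * advertising


print(f"{forecast_sales(8):.4f}")  # forecast for the Rs 8 million budget
print(f"{forecast_sales(0):.3f}")  # at x = 0 the forecast is the intercept
```

The result is a point forecast only. It says nothing yet about how far the actual sales may stray from it.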


This result is shown graphically in Figure 13.2. You know
that a straight line is defined by two points. Given the
forecasting equation, pick any two (simple) x-values. Let us
say that we use x=0, because this is the y-intercept, and
x=5.5. The estimated y-values are then found by
yc = 6.429 + (8.5142)(0)
= 6.429
and
yc = 6.429 + (8.5142)(5.5)
= 53.2571
These values are shown in Figure 13.2 as squares and the
regression line has been constructed. As you read up to the
line at x=8, you find the forecasted yc-value. Note the
managerial meaning of the y-intercept a and the slope b. At
zero advertising expenditures, sales amount to a = Rs 6.429
million. This tells you that there is not exactly a perfect
relationship between sales and advertising. If there were,
you would see zero sales. You are still thinking about the
intervening variables? Always do. Now, the slope b = 8.5142
tells you that, as the controller authorizes the advertising
budget, each Rs 1 million will result in Rs 8.5142 million
in sales. You realize, of course, from the preceding
chapters, that since only sample data are used, this value
must be seen as the midpoint of a confidence interval. So
don't call the controller and say that Rs 1 million in
advertising results in exactly Rs 8.5142 million in sales.
This is stochastic decision making after all. Find the
standard error of the slope, look up the proper t-value and
put everything into the 95% confidence interval for the
regression coefficient b (note the new term), which takes
the form
b ± sb (t.025)
where sb is the standard error of the slope.


While you are at it, you suddenly remember (from the
preceding chapter) that the forecasted sales value yc = Rs
74.5426 million should be communicated to the controller in
confidence interval form as well. We are still talking sample
statistics and always will be. We never know the population
regression line Y = a + bX. All we know is the sample
regression line yc = a + bx. So, better hold that call and let
us deal with b first. Needless to say, you could also
calculate the interval for a, if the urge ever struck you.
Should it strike you? Look at it this way. At the y-intercept
(a), x=0. But you postulated y = f(x)! If you want to set x=0,
you don't have much of a relationship. Do you? You want to
study the effect x has on y, and therefore you are interested
in the slope b. Leave a to the statisticians. They have the
most peculiar urges at times.

So you are interested in the slope b. The question is at what
angle to the x-axis a slope becomes significant. Only a
significant slope means that y = f(x) is real in the
stochastic decision making sense. In the significant case
there is a forecasting tool. In the insignificant case there
is nothing except a waste paper basket for the study, or
smoothing. Regression analysis is based on the assumption
that the y-variable is normally distributed. Figure 13.3
shows a pictorial presentation. Then, it will be recalled
from the preceding chapter, the proper significance test for
b is the one-sample t-test.
The algorithm for that test is as follows:
Step I:
H0: B = 0
H1: B > 0 or B ≠ 0
where B is the population regression coefficient that the
sample slope b estimates.

It is left to the discretion of the forecaster whether to use
a one-sided or two-sided alternative hypothesis. In this
particular case the plot clearly shows a positive slope.
Therefore, to test B < 0 for the population regression
coefficient B does not make much sense. This test, however,
is included in B ≠ 0. So why use the two-sided alternative?
The answer is that it is always easier to reject H0 when
using a one-tailed test. Remember the mathematically
expected values (MEV) of the normal distribution at P = .05
for the one-tailed and two-tailed tests? They are 1.64 and
1.96, respectively. You can see that a smaller experimentally
obtained value (EOV) leads to the rejection of H0 in the
one-tailed case. Take, for example, EOV = 1.75. With a
one-tailed test you reject H0, but not with the two-tailed
test. Rejecting H0 in the regression problem means that the
regression coefficient b is significant, which means in turn
that the regression equation may be used as a valid
forecasting tool. A conservative forecaster may argue that
the selection of such a tool should be as severely
constrained as possible. Hence the two-sided alternative
hypothesis.
Step II: P = .05
The 95% confidence level is selected as standard operating
procedure in forecasting studies involving regression
analysis for reasons previously stated.

Step III:
The one-sample mean problem formula for the t-test may be
recalled. It reads
$$t = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$$
Remember that each point on the regression line with slope
b represents a mean. Thus, we are talking, in the regression
case, simply about a many-sample mean problem where
each datum consists of two values, x and y. So you can see
the similarities in the two t configurations: b takes the
place of $\bar{x}$ and 0 the place of $\mu$, because we want
to test whether the population regression coefficient equals
zero (this is the null hypothesis in all regression problems),
and finally the standard error of the regression coefficient
b, $\sigma_b$, is substituted for the standard error of the
mean, $\sigma_{\bar{x}}$, so that
$$t = \frac{b - 0}{\sigma_b}$$
You recall that $\sigma_{\bar{x}}$ is unknown and must be
estimated by
$$s_{\bar{x}} = \frac{s}{\sqrt{n}}$$


Similarly, $\sigma_b$ must be estimated, but this estimation
is somewhat more complex because of the broader scope of the
problem. It is
$$s_b = \sqrt{\frac{s_{yx}^2}{\sum (x - \bar{x})^2}}$$


where $s_{yx}$ is the standard error of the estimated
regression equation of the y values on x. This standard error
is defined by
$$s_{yx} = \sqrt{\frac{\sum (y - y_c)^2}{n - m}}$$
where n-m are the degrees of freedom, with m the number of
regression coefficients. In the case of the straight line, m=2
because there are two quantities, a and b. The calculations
for Step III of the algorithm may now be performed as shown
in Table 13.1, columns 5 through 7. Note the distinction
between the observed y value (column 1) and the estimated
yc value (column 5). The latter is obtained by solving the
regression equation yc = 6.429 + 8.5142x for every observed x
value (2.8, 4.2, 5.0, 5.5, 7.5) as shown in column 2. In other
words, an estimated yc value is obtained for every observed
x value. Then, sum the squared differences between y and yc
as shown in Table 13.1. The sum of squares (SS) of the x
values (column 7), $\sum (x - \bar{x})^2$, is calculated as
in the preceding chapter. Note that $\bar{x} = 5$, as shown
in column 2. The t-test may now be performed as follows,
$$s_{yx} = \sqrt{\frac{\sum (y - y_c)^2}{5 - 2}}$$

and
$$s_b = \frac{s_{yx}}{\sqrt{11.98}} = 1.1977$$
If still interested, you may now construct the confidence
interval for the slope b. Finally
$$t = \frac{b - 0}{s_b} = \frac{8.5142}{1.1977} = 7.11$$
Step IV: EOV > MEV, or 7.11 > 2.353 (one-tailed)
or 7.11 > 3.182 (two-tailed)
The null hypothesis is rejected: the slope is not zero.
The regression equation may be used as a valid forecasting
tool.
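The four steps of the slope test can be retraced in a short Python sketch. The slope, its standard error and the two table values are the figures quoted in the text. The function name t_statistic is an assumption made for illustration. The sketch also works out the interval b ± sb (t.025) for the slope, which the text leaves to the reader.

```python
b = 8.5142            # sample slope, from the text
s_b = 1.1977          # standard error of the slope, from the text
T_ONE_TAILED = 2.353  # t-table value quoted in the text (3 d.f., one tail)
T_TWO_TAILED = 3.182  # t-table value quoted in the text (3 d.f., two tails)


def t_statistic(estimate, std_error, hypothesised=0.0):
    """EOV for H0: the population coefficient equals `hypothesised`."""
    return (estimate - hypothesised) / std_error


eov = t_statistic(b, s_b)
print(f"EOV = {eov:.2f}")
print("reject H0, one-tailed:", eov > T_ONE_TAILED)
print("reject H0, two-tailed:", eov > T_TWO_TAILED)

# 95% confidence interval for the slope: b +/- t * s_b
half_width = T_TWO_TAILED * s_b
slope_interval = (b - half_width, b + half_width)
print(f"slope interval: ({slope_interval[0]:.3f}, {slope_interval[1]:.3f})")
```

Because the interval for the slope does not contain zero, it leads to the same conclusion as the two-tailed test.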

We are now ready to calculate a proper stochastic answer to
the forecasting problem at hand. Remember that since sample
data are used, the answer always takes interval form. The
95% regression interval is
yc ± Sd (t.025)
where Sd is the standard error for a certain yc value given
its corresponding x value.
This standard error is defined by
$$S_d = s_{yx} \sqrt{\frac{1}{n} + \frac{(x_g - \bar{x})^2}{\sum (x - \bar{x})^2}}$$
All values are known. Note the $x_g$ value in the numerator,
which is the given x value corresponding to the yc value. In
other words, for each yc a new interval must be constructed.
As shown by Figure 13.3, each point on the regression line
represents the mean of a normally distributed sample of y
values. The upper and lower confidence limits of these
samples are not parallel to the regression line but are
curvilinear, as shown in somewhat exaggerated form.


Figure 13.3: Normal Distribution of y-values

Now, our controller wants to know the forecasted sales
revenue (yc) given an advertising budget of Rs 8 million
(x=8). We calculated yc = Rs 74.5426 million. Calculating
first the standard error
$$S_d^2 = s_{yx}^2 \left[\frac{1}{5} + \frac{(8 - 5)^2}{11.98}\right] = 16.3464, \qquad S_d = \sqrt{16.3464} = 4.0431$$
the interval is
74.542 ± (4.0431)(3.182) = 74.542 ± 12.8650
or
(61.677, 87.407)
Remember that t is distributed with (n-m) or 3 degrees of
freedom and that, since an interval is involved with upper
and lower limits, the two-tailed MEV must be employed.
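The interval calculation can be sketched in Python as follows. The worksheet value of the standard error of estimate is not reproduced in the text, so the sketch back-calculates its square from the quoted standard error of the slope (1.1977) and the sum of squares of x (11.98). That back-calculation is an assumption, and small rounding differences from the printed figures are to be expected. The function name is hypothetical.

```python
import math

n = 5                             # observations
x_bar = 25.0 / n                  # mean of x, from the column total
ss_x = 136.98 - 25.0 ** 2 / n     # sum of squares of x (11.98 in the text)
s_b = 1.1977                      # standard error of the slope, from the text
s_yx_sq = s_b ** 2 * ss_x         # ASSUMPTION: back-calculated, not quoted
T_TWO_TAILED = 3.182              # t-table value, 3 d.f., from the text


def regression_interval(y_c, x_g):
    """95% interval y_c +/- S_d * t for the forecast at the given x_g."""
    s_d = math.sqrt(s_yx_sq * (1 / n + (x_g - x_bar) ** 2 / ss_x))
    half_width = T_TWO_TAILED * s_d
    return y_c - half_width, y_c + half_width


low, high = regression_interval(74.5426, 8)
print(f"({low:.2f}, {high:.2f})")  # roughly (61.68, 87.41), Rs million
```

Trying other values of x_g shows the interval widening as x_g moves away from the mean of x, which is why the confidence limits are curved rather than parallel to the regression line.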
And now our final answer to the company: we tell the
controller that we are 95% certain that with an advertising
budget of Rs 8 million, and assuming no major changes in the
market environment, between Rs 61.677 million and Rs 87.407
million in sales revenue may be expected.
But the sales revenue may be the result of some factor other
than advertising.


A correlation analysis, which measures the closeness of fit
of the regression line while assuming that both x and y are
normally distributed (bivariate normal distribution), is
necessary to set the above doubt at rest. Hence a parametric
analysis is performed. Correlation analysis may also be seen
as a measure of the mutuality of x and y. Indeed, this is the
more prevalent approach and computer programs are usually
based on it. Of course, the results are the same. Let us start
with closeness of fit because we have most of the calculations
completed and the reasoning process of fitting a line is still
fresh in our minds. But first a general word about this new
decision tool.
The degree of correlation between normally distributed
dependent and independent variables is signified by the
correlation coefficient r. The symbol r was used in order to
honour Francis Galton's work. You recall that he gave us
regression analysis and coined the term regression. Perhaps
it would have made more sense to call the regression
coefficient, rather than the correlation coefficient, r. But
the slope of the line was already labelled b (b is generally
known as the regression coefficient although there is a as
well), so the correlation coefficient r it had to be. The
coefficient r is a pure number. It is constrained by ±1 and
defined by
$$r = \pm\sqrt{1 - \frac{s_{yx}^2}{s_y^2}}$$
where $s_y^2$ is the variance of the y values. But since
these y values are part of a regression problem, n-m degrees
of freedom apply. Therefore, in the case of the straight line,
m=2 as previously discussed, and
$$s_y^2 = \frac{\sum (y - \bar{y})^2}{n - 2}$$
as shown in Table 13.1, columns 8 and 9. Then
$$r = \sqrt{1 - \frac{s_{yx}^2}{s_y^2}} = +0.9716$$

As pointed out, the correlation coefficient r is a pure number.
Its sign is positive in this case because the slope of the
regression line is positive. If b were negative, r would be
negative as well. This fact is mentioned because it is not
readily apparent, since the radical has both the positive and
negative sign.
Various types of correlation and their respective plots are
shown in Figure 13.4.

Figure 13.4


Perfect positive correlation is r = +1.00, but because r is a pure number, we cannot specify an exact quantitative meaning of the value that we obtained, r = +0.9716. We can only say in general terms that our findings show a high positive correlation between sales and advertising. In order to establish a specific quantitative relationship, we must calculate r² = 0.9440. This is the coefficient of determination, which indicates the percentage of the variation in the y variable that is caused by the x variable.


For our purposes let us say rather loosely (the specifics are discussed in the section about the significance of r) that 94.40% of the increases or decreases in the sales revenue are caused by advertising expenditures and only 5.6% are caused by other intervening variables. These findings should pacify our controller, because our product sales seem to be very sensitive to advertising expenditures. More is to be said about the significance of correlation, so let us hold this judgement for a while.
First, though, let us take a look at the other method of correlation analysis, the mutuality of x and y. It was mentioned previously that parametric correlation analysis rests on the assumption of a bivariate normal distribution. If this is the case, the distinction between independent and dependent variable may be dropped. We may perform a regression of y on x values, as we did in the forecasting study, or regress x on y values. In the first case we obtain our familiar equation $y_c = a + bx$. In the second case we interchange the variable titles for the given data sets in columns 1 and 2 of Figure 8.6 and obtain $x_c = a' + b'y$. You may do this right now and plot the two regression lines. Obviously, with perfect correlation the two lines would be superimposed. Less than perfect correlation results in two intersecting lines at ever wider angles as the correlation decreases. Obviously, the positions of the lines are caused by the slopes b and b'. And by now you probably sense already something interesting, namely, that there is a direct relationship between the correlation coefficient and the two regression coefficients b and b'. Indeed, the correlation coefficient is the geometric mean of the regression coefficients, or

$$r = \sqrt{b \, b'}$$

Normally, this book does not discuss the derivation of the decision tools. But in this case you can get a little flavour of that too, because a knowledge of higher mathematics is not necessary. You recall that we solved the normal equations for b, yielding

$$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - \left(\sum x\right)^2}$$

This was simplified and in a form that uses totals rather than deviations. The matter was discussed in the preceding chapter under the formula for the variance. Remember? Now let us take one step back and show the formulae for b and b' in deviation form.

$$b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$$

and

$$b' = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (y - \bar{y})^2}$$

then

$$r^2 = b \, b' = \frac{\left[\sum (x - \bar{x})(y - \bar{y})\right]^2}{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}$$

and

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$$

or in simplified totals form

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

This formula is known as the Product-Moment Formula or Pearson Formula. It is identified as such in the computer libraries. Solving for our sales and advertising illustration, using the calculations in Table 13.1, we obtain r = 0.9716, which is the same value as obtained previously.
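The link between the totals form of the product-moment formula and the geometric mean of the two regression coefficients can be checked numerically. The following is a minimal sketch in Python; the data values and the function names `product_moment_r` and `slopes` are hypothetical illustrations, not the Table 13.1 figures.

```python
import math

# Hypothetical paired observations (NOT the Table 13.1 data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def product_moment_r(x, y):
    """Correlation coefficient from the totals form of the product-moment formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    sum_y2 = sum(yi * yi for yi in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


def slopes(x, y):
    """Return (b, b_prime): slope of y on x and slope of x on y, in deviation form."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cross = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    return cross / ss_x, cross / ss_y


r = product_moment_r(x, y)
b, b_prime = slopes(x, y)
# The radical alone is unsigned, so r takes the sign of the slope b.
r_geo = math.copysign(math.sqrt(b * b_prime), b)
print(round(r, 4), round(r_geo, 4))
```

Both routes give the same r, which is why it does not matter which variable is treated as dependent.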


A concluding comment about parametric correlation analysis may sharpen further our understanding of the subject. Remember that the idea behind correlation analysis is to specify the variation in the dependent variable that is caused by the independent variable, or variables in the multiple correlation case as discussed later. We may show this idea in basic equation form like this:

Total Variation = Unexplained Variation + Explained Variation

where the explained variation is the portion that has been explained by the regression of y on x, or 94.40% in our illustration, and the unexplained variation, equal to 5.6%, is due to the influence of intervening variables and statistical error. In symbolic form this equation may be written as

$$\sum (y - \bar{y})^2 = \sum (y - y_c)^2 + \sum (y_c - \bar{y})^2$$

The coefficient of determination is the ratio

$$r^2 = \frac{\text{explained variation}}{\text{total variation}} = \frac{\sum (y_c - \bar{y})^2}{\sum (y - \bar{y})^2}$$

and perfect correlation would exist, r² = 1.00, if the unexplained variation equalled zero. Rewriting the symbolic form of the basic equation with the degrees of freedom and previously introduced symbols we obtain

$$s_y^2 = s_{y \cdot x}^2 + s_{y_c}^2$$

where $s_{y_c}^2$ is the variance of the $y_c$ values and has not been calculated, but it is known by subtraction if $s_y^2$ (total variance) and $s_{y \cdot x}^2$ (unexplained variance or standard error) are known. This is indeed the case and r² may be solved in the closeness of fit approach.

$$r^2 = \frac{s_{y_c}^2}{s_y^2} = \frac{s_y^2 - s_{y \cdot x}^2}{s_y^2} = 1 - \frac{s_{y \cdot x}^2}{s_y^2}$$

and, as shown before, $r = \sqrt{0.9440} = 0.9716$.
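The split of total variation into an unexplained and an explained part can be illustrated with a short Python sketch. The data are hypothetical (not the Table 13.1 figures) and the variable names are chosen only for this illustration; the point is that both routes to r² agree.

```python
# Hypothetical paired observations (NOT the Table 13.1 data).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares straight line y_c = a + b x.
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
a = mean_y - b * mean_x
y_c = [a + b * xi for xi in x]

# Total = unexplained + explained variation (sums of squares).
total = sum((yi - mean_y) ** 2 for yi in y)
unexplained = sum((yi - yci) ** 2 for yi, yci in zip(y, y_c))
explained = sum((yci - mean_y) ** 2 for yci in y_c)

# Route 1: ratio of explained to total variation.
r2_ratio = explained / total

# Route 2: closeness of fit, with n - 2 degrees of freedom for both variances.
s2_y = total / (n - 2)
s2_yx = unexplained / (n - 2)
r2_fit = 1 - s2_yx / s2_y

print(round(r2_ratio, 4), round(r2_fit, 4))
```

Because the same n − 2 divisor is applied to both variances, it cancels in the ratio and the two values coincide.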

When the bivariate normal distribution cannot be assumed or the observed data represent rankings, a non-parametric correlation analysis must be performed. The reasoning that underlies this technique is the same as discussed in the previous chapter. The appropriate decision tool is known as the Spearman Rank Order correlation coefficient.

$$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$

where $r_s$ stands for the Spearman r, d represents the differences between ranks as already encountered in the previous chapter, and n is the number of ranked pairs. In order to illustrate the application of this tool, let us use the problem on supermarket stools. Ten cashiers were tested for productivity increase after a stool had been installed at the supermarket checkout. The data for the "with stool" and "without stool" conditions are ranked, and the squared rank differences obtained and summed.

The worksheet in Table 13.2 shows the computations.


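Table 13.2: Worksheet

| Cashier | With Stool | Without Stool | Rank With | Rank Without | d | d² |
|---|---|---|---|---|---|---|
| 1 | 21 | 21 | 2.0 | 3.0 | -1.0 | 1.00 |
| 2 | 24 | 23 | 7.0 | 6.5 | 0.5 | 0.25 |
| 3 | 28 | 27 | 10.0 | 10.0 | 0.0 | 0.00 |
| 4 | 22 | 24 | 4.5 | 8.0 | -3.5 | 12.25 |
| 5 | 22 | 21 | 4.5 | 3.0 | 1.5 | 2.25 |
| 6 | 25 | 25 | 8.5 | 9.0 | -0.5 | 0.25 |
| 7 | 21 | 20 | 2.0 | 1.0 | 1.0 | 1.00 |
| 8 | 21 | 23 | 2.0 | 6.5 | -4.5 | 20.25 |
| 9 | 23 | 21 | 6.0 | 3.0 | 3.0 | 9.00 |
| 10 | 25 | 22 | 8.5 | 5.0 | 3.5 | 12.25 |
| Total | | | | | | 58.5 |

Therefore,

$$r_s = 1 - \frac{6(58.5)}{10(10^2 - 1)} = 1 - \frac{351}{990} = 0.6455$$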
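The worksheet can be reproduced with a few lines of Python. This is a minimal sketch: the helper `average_ranks` is a hypothetical name introduced here, tied values receive the mean of the ranks they occupy (as in the worksheet), and the simple formula is applied without any further correction for ties.

```python
def average_ranks(values):
    """Rank from smallest (1) to largest; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


# Productivity of the ten cashiers, from the worksheet.
with_stool = [21, 24, 28, 22, 22, 25, 21, 21, 23, 25]
without_stool = [21, 23, 27, 24, 21, 25, 20, 23, 21, 22]

rank_with = average_ranks(with_stool)
rank_without = average_ranks(without_stool)

d_squared = [(rw - ro) ** 2 for rw, ro in zip(rank_with, rank_without)]
sum_d2 = sum(d_squared)
n = len(d_squared)

r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
print(sum_d2, round(r_s, 4))
```

The ranks computed this way also show that cashier 2, with 23 units in the "without stool" condition, shares rank 6.5 with cashier 8.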


It may be recalled that when this problem was phrased as a non-parametric dependent two-sample mean comparison test, the Wilcoxon test proved to be insignificant. What do you think is the significance of $r_s$ in this case? When the introduction of a stool did not make a difference in productivity, would it not make sense to forecast the "with stool" condition by the "without stool" condition, that is, With-Stool-Productivity = f (Without-Stool-Productivity)? Of course it would. Now, what does this mean concerning the significance of $r_s$? You'll find out in the next section. Meanwhile ponder the problem and note that $r_s^2$ = 0.4167, which means that 41.67% of the variation in with-stool-productivity is caused by without-stool-productivity.

Correlation analysis examines the functional relationship between dependent and independent variables. The question arises, after an r or $r_s$ has been calculated, whether the relationship that is expressed by it happens to be meaningful or not. The question is: what is meaningful? Meaningfulness may be defined in a statistical sense and an operational sense. And so it is in any correlation analysis. The decision maker has to look at significance (meaningfulness) from two perspectives. Let us look at the easier one, statistical significance, first.
As in the case of the regression coefficient b, the correlation coefficient r must be tested for significance; for, if it turns out to be insignificant, only a chance relationship exists between the dependent and independent variables. In those correlation analyses that assume a bivariate normal distribution (parametric correlation), the t test may be used as before in the case of b. For the Spearman Rank Order Correlation (non-parametric case), significance values developed by E. Olds must be used. Let us run the test for the parametric illustration first. The steps in the algorithm are as follows:

I    H0: the correlation coefficient is zero. H1: the correlation coefficient is not zero.

II   P = .05

III  $$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}} = \frac{0.9716\sqrt{3}}{\sqrt{1 - 0.9440}} = 7.11$$

Note that this t-value is exactly equal to the t-value that was obtained before when testing the significance of b. Minor rounding errors may be expected.

IV   EOV > MEV or 7.11 > 2.353 (one-tailed)
     or 7.11 > 3.182 (two-tailed)

V    The correlation coefficient is not zero. There is a statistically significant correlation between advertising expenditures and sales revenues.
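The t test for r is short enough to check by hand or in a few lines of Python. In this minimal sketch, r and the two critical values (MEV) are taken from the text; n = 5 is an assumption inferred from the critical values, which correspond to three degrees of freedom.

```python
import math

r = 0.9716          # correlation coefficient from the sales/advertising illustration
n = 5               # ASSUMPTION: implied by critical t values with n - 2 = 3 degrees of freedom

# Test statistic (EOV) for H0: correlation coefficient is zero.
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Critical values (MEV) quoted in the text for P = .05.
mev_one_tailed = 2.353
mev_two_tailed = 3.182

significant_one_tailed = t > mev_one_tailed
significant_two_tailed = t > mev_two_tailed

print(round(t, 2), significant_one_tailed, significant_two_tailed)
```

With a different sample size, the critical values would have to be looked up again for n − 2 degrees of freedom.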
The direct relationship between r and b was noted. Therefore the equality of the t value for both r and b in Step III of the algorithm does not come as a complete surprise. Obviously, if b is known to be significant, r is also significant and vice versa. You notice that the significance test for r is much easier to perform manually than the test for b. It may pay therefore to test the significance of r if you want to find out the significance of b. Similarly, as seen in the multiple regression case, many computer programs provide only the significance for b. If you happen to be interested only in a correlation analysis, run the regression equation to obtain b, test its significance and you know whether the r in which you are interested is significant or insignificant.

Let us now turn to the nonparametric correlation illustration and the application of the Olds tables. Our experimentally obtained value was $r_s$ = 0.6455. Remember that we have an answer outstanding from you. Using the two-sided alternative hypothesis that was used in the corresponding means comparison test, we find in the table MEV = 0.564 for n = 10. Since EOV > MEV, we have obtained a statistically significant correlation. Was that your outstanding answer a few pages back? If so, you are getting the hang of it. And, by the way, it makes good sense. If the means difference had been significant, all we could expect was a chance relationship for With-Stool-Productivity = f (Without-Stool-Productivity).
Now let us turn to the much more difficult answer concerning the operational significance of a correlation. To give a comprehensive answer, we consider three types of correlations, all of which may be statistically significant but which have vastly different operational implications for the decision. The first one is known as causal correlation. Here one variable causes the behaviour of the other. Of course, this is the hoped-for status in any regression study, e.g., advertising causes sales, or sales = f (advertising). Could the causality be reversed? Nothing says that it can't, given the assumption of a bivariate normal distribution. Indeed, we saw in the Product-Moment formula that it did not make any difference which variable was dependent or independent. Then sales cause advertising. Why not? It is the old chicken and egg story. But note that you and the controller would approach the decision itself (there are Rs 8 million involved, after all) from a completely different perspective.
The second type of correlation may come about because of co-dependence. In this case the behaviour of the dependent and independent variables is caused by an intervening variable or variables. In the sales advertising study, sales = f (advertising) may be statistically and operationally significant but only because our customers have sufficient disposable income to buy. You can see that it takes much more to be a good forecaster than to throw around a few formulae. It takes a sound conceptual model.


The third type of correlation is due to coincidence. There is no logical tie between the variables, but there may be a beautiful statistical significance nevertheless. An example is the correlation between the growth of a child and the growth of a plant. Such correlations need not be considered in decision making.

Time has strange, fascinating and little understood properties. Virtually every process on earth is determined by a time variable. One of the most frequently encountered managerial decision situations involving forecasting is to measure the effect that time has on the sales of a product, the market price of a security, the output of individuals, work shifts, companies, industries, societies and so on. A fundamental conceptual model in all of these situations is the product life cycle concept, which goes through four stages: introduction, growth, maturity and decline. Let us look at this concept in greater detail before we apply it.

Figure 13.5: Product Life-Cycle (stage labels: Growth, Maturity, Decline)

The sales performance of this product goes through the four stages: introduction, growth, maturity and decline. Data have been plotted and regression lines fitted to each of the four environments. Thus, when a sales forecast is made and the target horizon falls within the same stage, the linear fit yields valid results. If, however, the target horizon falls into a future stage, a linear forecast may be erroneous. In this case a curve should be fitted as shown. It is usually highly speculative to select a forecasting horizon that spans more than two stages.
Another point of interest is the behaviour of the sales variable over the short run. It fluctuates between a succession of peaks and troughs. How do these come about? In order to answer this question, the time series must be decomposed. Then four independent motors for this behaviour become visible. First, there is a long-term or secular trend (T) which is primarily noticeable within each stage of the cycle and over the entire cycle. Secondly, cyclical variations (C), which are caused by an economy's business cycles, affect product sales. Such cycles, whose origins are little understood, exist for all economies. Thirdly, the product's sales may be influenced by the seasonality (S) of the item, and finally, there may be the irregular (I) effects of events such as inclement weather, strikes and so forth. In equation form the decomposed time series appears as TS = T + C + S + I.
This creates a complex situation in time series analysis. Each factor must be quantified and its effect upon product sales ascertained. Let us see how this is done. The long-term trend effect T is reflected in the slope b of the regression equation. We already know how b is calculated, even though minor modifications of the decision formulae will be encountered soon. The quantification of the cyclical component C is beyond the scope of this book. However, since business cycles always proceed from peak to trough to new peak and so on, their positive and negative effects upon a product's sales cancel out in the long run. Hence in managerial, as opposed to economic, decision making, the sum effect of the business cycles may be set equal to zero. This eliminates the C factor from the equation. Seasonality, if present, is something that must be taken into consideration because it is a product-inherent variable and therefore it is under the immediate control of the decision maker. We will quantify the S component and keep it in the equation. Finally, there are the irregular variations. Do we know in July whether the weather will be sunny and mild during the four weeks before Diwali? We don't, but we know that if this happens, Diwali sales will be severely impacted. Can we forecast such horrible weather conditions? Not really. We cannot forecast them because they cannot be quantified, a rather unpleasant characteristic they share with all other types of irregular variations like strikes, earthquakes, power failures, etc. Yet, something strange usually happens after such an irregular variation from normal has occurred. Whatever people did not do because of it, like not buying a product, they attempt to catch up with quickly. Therefore, the I factor effect may also be assumed to cancel out over time and it may be dropped from the equation, which then appears to the manager as TS = T + S.

We will construct again the best fitting regression line by the method of least squares. In order to illustrate the procedure, let us use a data set from Case Study 13.1 at the end of the chapter. It involves the dividend payments per share of Smart, a well-known discount store chain, for the years 1991 through 1999. Suppose that a potential investor would like to know the dividend payment for 2001. The data are recorded in the worksheet (Table 13.3) that appears below. First, however, turn your attention to Figure 13.6 which shows the plot for this problem.

Figure 13.6: Plot of Dividend Values

Think for a moment about the qualitative nature of the time variable. It is expressed in years in this case but could be quarters, months, days, hours, minutes or any other time measurement unit. How does it differ from advertising expenditures, the independent variable that we examined in the preceding section? Is there a difference in the effect that a unit of each has on the dependent variable, Rs 1 million in one case and 1 year in the other? Time, as you can readily see, is constant. One year has the same effect as any other. This is not true for advertising expenditures, especially when you leave the linear environment and enter the non-linear environments. Then there may be a qualitative difference in the sales impact as advertising expenditures are increased or decreased by one unit.
Since time is constant in its effect, we may code the variable rather than use the actual years or other time units as x values. This code assigns a 1 to the first time period in the series and continues in unit distances to the nth period. Do not start with a zero as this may cause some computer programs to reject the input. The code is based on the fact that the unit periods are constant, and therefore, their sum may be set equal to zero. See what effect this has on the normal equations for the straight line.

$$\sum y = na + b\sum x$$
$$\sum xy = a\sum x + b\sum x^2$$

With $\sum x = 0$ the equations reduce to

$$\sum y = na$$
$$\sum xy = b\sum x^2$$

which allow the direct solution for a and b as follows

$$a = \frac{\sum y}{n} \qquad b = \frac{\sum xy}{\sum x^2}$$

This form simplifies the calculations substantially compared to the previous formulae. The code that allows us to set Σx = 0, however, must preserve the integrity of a unit distance series. Thus, if the series is odd-numbered, the midpoint is set equal to zero and the code completed by negative and positive unit distances of x = 1, where each x unit stands for one year or other time period. If the series is even-numbered, let us say it ran from 1990 to 1999, the two midpoints (1994/1995) are set equal to -1 and +1, respectively. Since there is now a distance of x = 2 between -1 and +1 (-1, 0, +1), the code continues by negative and positive unit distances of x = 2, where each x unit stands for one-half year or other time period.
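The coding rule for odd and even series can be written as a small helper. This is a minimal Python sketch; the function name `time_codes` is a hypothetical label introduced for illustration.

```python
def time_codes(n_periods):
    """Return time codes that sum to zero.

    Odd series: midpoint is 0 and codes move in steps of 1.
    Even series: the two midpoints are -1 and +1 and codes move in steps of 2.
    """
    if n_periods % 2 == 1:
        half = n_periods // 2
        return list(range(-half, half + 1))
    return list(range(-(n_periods - 1), n_periods, 2))


odd_codes = time_codes(9)    # e.g. 1991 through 1999
even_codes = time_codes(10)  # e.g. 1990 through 1999

print(odd_codes)
print(even_codes)
```

In both cases the codes sum to zero, which is what allows the direct solution for a and b.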
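The worksheet is in Table 13.3 and calculations are as follows:

Table 13.3: Worksheet

| Year | Code for an Even Series X | Code for an Odd Series X | Dividend payments in Rs Y | XY | X² |
|---|---|---|---|---|---|
| 1990 | -9 | | | | |
| 1991 | -7 | -4 | 2.2 | -8.8 | 16 |
| 1992 | -5 | -3 | 2.4 | -7.2 | 9 |
| 1993 | -3 | -2 | 3.0 | -6.0 | 4 |
| 1994 | -1 | -1 | 5.0 | -5.0 | 1 |
| 1995 | 1 | 0 | 6.8 | 0.0 | 0 |
| 1996 | 3 | 1 | 8.1 | 8.1 | 1 |
| 1997 | 5 | 2 | 9.0 | 18.0 | 4 |
| 1998 | 7 | 3 | 9.5 | 28.5 | 9 |
| 1999 | 9 | 4 | 9.9 | 39.6 | 16 |
| Total | | 0 | 55.9 | 67.2 | 60 |

$$a = \frac{\sum y}{n} = \frac{55.9}{9} = 6.211 \qquad b = \frac{\sum xy}{\sum x^2} = \frac{67.2}{60} = 1.120$$

$$y_c = 6.211 + 1.120x$$

Origin: 1995 (x = 0); x in 1-year units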

The regression equation is plotted in Figure 8.12. Note that in the case of time series analysis, the origin of the code and the x units must be defined as part of the regression equation. In our problem the investor would like to obtain a dividend forecast for 2001. Since the origin is 1995 (x = 0) and x is in 1-year units, the code for 2001 is x = 6. Therefore the forecast is $y_c$ = 6.211 + 1.12 (6) = Rs 12.9. If the time series had been even-numbered, let us say that dividend payments for 1990 had been included in the forecasting study, the definition under the regression equation would have read

Origin: 1994/1995 (x = 0); x in one-half-year units

Thus, we know that for 1995, x = 1; and since we must use x = 2 units for each year, the code value for 2001 would be x = 13. Once the $y_c$ value has been obtained, b is tested for significance and the 95% confidence interval constructed as previously shown.
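The dividend forecast can be retraced in Python from the worksheet figures for the odd-numbered series (1991 through 1999). This is a minimal sketch using the direct solution for a and b; the variable names are chosen for illustration only.

```python
# Dividend payments in Rs for 1991 through 1999, from the worksheet.
years = list(range(1991, 2000))
dividends = [2.2, 2.4, 3.0, 5.0, 6.8, 8.1, 9.0, 9.5, 9.9]

# Odd-numbered series: the midpoint year is the origin (x = 0), x in 1-year units.
origin = years[len(years) // 2]
codes = [year - origin for year in years]

# Because the codes sum to zero, a and b follow directly.
a = sum(dividends) / len(dividends)
b = sum(x * y for x, y in zip(codes, dividends)) / sum(x * x for x in codes)

# Forecast for 2001: the code is the distance from the origin in years.
target_code = 2001 - origin
forecast_2001 = a + b * target_code

print(origin, round(a, 3), round(b, 3), round(forecast_2001, 1))
```

The code for the target year must always be expressed in the same units and from the same origin as the fitted equation.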
Time series analysis is a long-term forecasting tool. Hence, it addresses itself to the trend component T in our time series equation TS = T + S. In the dividend forecast, b = 1.120 was calculated, which means that in the environment that is reflected in the data set, Smart increased the dividend payments on average by Rs 1.12 per year. Let us now turn our attention to the seasonal variation component that may be present in a time series. A product's seasonality is shown by the regularly recurring increases or decreases in sales or production that are caused by seasonal influences. In the case of some products, their seasonality is quite apparent. As an obvious example, virtually all non-animal agricultural commodities may be cited. Seasonality of other products may be more difficult to detect. Take hogs, in order to stay on the farm. Are they seasonal? They are lusty breeders and could not care less about seasonal influences. Yet, there is a season induced by the corn harvest. If corn is plentiful and cheap, farmers raise more hogs. This is known as the corn-hog cycle. Or take automobiles. Indian manufacturers used to introduce major design or technological changes once every generation. This season has now been shortened somewhat. How about computers? There the season even has a special name. It is called a generation and, prior to increased competitive pressures within the industry, it used to be about seven years long. Our stock market investor knows that stock trades on the Stock Exchanges are seasonal. The daily season is V-shaped, starting the trading with a relatively high volume which tapers off toward the lunch hour to pick up again in the afternoon. And so it goes with many other products not ordinarily thought of as being seasonal.

Let us quantify this seasonality and illustrate how it may be used in a decision situation. There are, as is often the case, a number of decision tools that may be applied. The reader may be familiar with the term ratio-to-moving-average. It is a widely used method for constructing a seasonal index and programs are available in larger computer libraries. Usually, the method assumes a 12-period season like the twelve months of the year. There is a more efficient method which yields good statistical results. It is especially helpful in manual calculations of the seasonal index and when the number of seasonal periods is small, like the four quarters of a year, the six hours of a stock exchange trading day or the five days of a work week. This method is known as simple average and will be used for illustration purposes.
To stay with the investment environment of this chapter section, let us calculate a seasonal index for shares traded on the Stock Exchange from July 2 through July 7, 1999. This period includes the July 4 week-end. Volume of shares (data) for each trading day (season) is given in thousands of shares per hour. The individual steps of the analysis (operations) are discussed in detail for each column of the worksheet below.
Table 13.4: Worksheet

| Hour | (2) Total Variation (TS) | (3) Trend Variation (T) | (4) Seasonal Variation TS-T | (5) Seasonal Index | (6) 7/2 | (7) 7/3 | (8) 7/6 | (9) 7/7 | (10) Avg. |
|---|---|---|---|---|---|---|---|---|---|
| 10-11 | 0.965 | 0.000 | 0.965 | 110.6 | 12.00 | 12.25 | 15.44 | 16.72 | 14.10 |
| 11-12 | 0.245 | 0.159 | 0.086 | 103.7 | 10.40 | 11.75 | 15.04 | 16.32 | 13.38 |
| 12-13 | -0.885 | 0.318 | -1.117 | 94.2 | 10.55 | 10.06 | 12.95 | 15.44 | 12.25 |
| 13-14 | -1.555 | 0.477 | -2.032 | 87.1 | 9.55 | 9.46 | 12.05 | 15.24 | 11.58 |
| 14-15 | 0.395 | 0.636 | -0.241 | 101.1 | 11.02 | 11.55 | 14.82 | 16.73 | 13.53 |
| 15-16 | 0.835 | 0.795 | 0.040 | 103.3 | 11.58 | 12.25 | 15.38 | 16.69 | 13.97 |
| Average | | | -0.383 | 600 (total) | 10.85 | 11.22 | 14.28 | 16.19 | 13.135 |

As you inspect the data columns, you notice the V-shaped season for each trading day. You also notice in the total daily volume that there is an increase in shares traded. Hence, you can expect a positive slope of the regression line. The hourly mean number of shares is also indicated. This is the more important value because we are interested in quantifying a season by the hour for each trading day. Now turn to the operations. In the last column, the hourly trading activity for the four days has been summed and averaged. In this average all time series factors are assumed to be incorporated. You will recall that the positive or negative cyclical and irregular component effect is assumed to cancel out over time. Hence averaging the trading volume over a long-term data set eliminates both components, yielding TS = T + S. You may ask, are four days a sufficiently long time span? The answer is no. In a real study you would probably use 15 to 25 yearly averages for each trading hour. In an on-the-job application of this tool, you will have to know the specific time horizon in order to effectively eliminate cyclical and irregular variations. But by and large, what is a long or short time span depends upon the situation.
In order to isolate the trend component (T) so that it may be subtracted from column (2) in Table 13.4, yielding seasonal variation, the slope (b) of the regression line must be calculated. (Remember: b is T.) The necessary calculations are performed below using the mean hourly trading volume for each day. But since we are interested in an index by the hour, the calculated daily b value must be apportioned to each hour. This is accomplished by a further division by six, the number of trading hours. The result is entered in column (3). Note that the origin of a time series is always zero, and that it is always the first period of the season. In our case this is the 10-11 trading hour. Therefore the first entry in column (3) is always zero, to be followed by the equal (since this is a linear analysis) summed increments of the apportioned b-value.

Table 13.5

| Day | Code x | Average Hourly Trading Volume Per Day y | xy | x² |
|---|---|---|---|---|
| 7/2 | -3 | 10.85 | -32.55 | 9 |
| 7/3 | -1 | 11.22 | -11.22 | 1 |
| 7/6 | 1 | 14.28 | 14.28 | 1 |
| 7/7 | 3 | 16.19 | 48.57 | 9 |
| Total | 0 | 52.54 | 19.08 | 20 |

$$b = \frac{\sum xy}{\sum x^2} = \frac{19.08}{20} = 0.954$$

and the apportioned b-value is

$$\frac{0.954}{6} = 0.159$$

It is not necessary to calculate the y-intercept (a) in this analysis unless, of course, you wish to combine it with a long-term forecast of daily trading volume. Then, just to review the calculations, you would find:

$$a = \frac{\sum y}{n} = \frac{52.54}{4} = 13.135$$

In column (4), TS - T = S is performed. Column (4) is already a measure of seasonal variation. But in order to standardize the answer so that it may be compared with other stock exchanges, for example, it is customary to convert the values in column (4) to a seasonal index. Every index has a base of 100, and the values above or below the base indicate percentages of above or below normal activity, hence the season. Since the base of column (5) is 100, the mean of the column should be 100 and the total 600, since there are 6 trading hours. In order to convert the obtained values of column (4) to index numbers, each of its entries is added to the total mean, then divided by the sum of the column mean and the total mean, and multiplied by 100, yielding the corresponding entry in column (5). For the first hour, for example, (0.965 + 13.135) / (-0.383 + 13.135) × 100 = 110.6. It is customary to show index numbers with one decimal place.
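The simple average method described above can be retraced step by step in Python. This is a minimal sketch built from the hourly volumes in the worksheet; the variable names are illustrative, and because the script works with unrounded intermediate values, individual entries computed this way may not match every printed worksheet entry exactly.

```python
# Hourly trading volume (thousands of shares) for the four trading days.
volume = {
    "10-11": [12.00, 12.25, 15.44, 16.72],
    "11-12": [10.40, 11.75, 15.04, 16.32],
    "12-13": [10.55, 10.06, 12.95, 15.44],
    "13-14": [9.55, 9.46, 12.05, 15.24],
    "14-15": [11.02, 11.55, 14.82, 16.73],
    "15-16": [11.58, 12.25, 15.38, 16.69],
}
hours = list(volume)
day_codes = [-3, -1, 1, 3]  # even-numbered series of four days
n_days = len(day_codes)

# Averages per hour (across days), per day (across hours), and overall.
hourly_avg = [sum(volume[h]) / n_days for h in hours]
daily_avg = [sum(volume[h][d] for h in hours) / len(hours) for d in range(n_days)]
grand_mean = sum(hourly_avg) / len(hours)

# Trend: slope of the daily averages, apportioned to the six trading hours.
b = sum(x * y for x, y in zip(day_codes, daily_avg)) / sum(x * x for x in day_codes)
b_hourly = b / len(hours)

# TS = deviation of each hourly average from the grand mean; T starts at zero.
total_variation = [avg - grand_mean for avg in hourly_avg]
trend = [i * b_hourly for i in range(len(hours))]
seasonal = [ts - t for ts, t in zip(total_variation, trend)]

# Convert seasonal variation to an index with base 100.
mean_seasonal = sum(seasonal) / len(seasonal)
index = [(s + grand_mean) / (mean_seasonal + grand_mean) * 100 for s in seasonal]

for hour, value in zip(hours, index):
    print(hour, round(value, 1))
```

By construction the index numbers total 600, so their mean is the base of 100.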
Column (5) shows the seasonal effect of this decision variable, share trading on the Stock Exchange. Regardless of heavy or light daily volume, the first hour volume is the heaviest by far. It is 7.4% above what may be considered average trading volume for any given day. Keep in mind that a very limited data set was used in this analysis and while the season, reaching its low point between 1 and 2 p.m., is generally correctly depicted, individual index numbers may be exaggerated. What managerial action programs would result from analysis such as this? Would traders go out for tea and samosas between 10-11? How about lunch between 1-2? When would brokers call clients with hot or luke-warm tips? Assuming that a decrease in volume means a decrease in prices in general during the trading day, when would a savvy trader buy? When would he sell? Think of some other intervening variables and you have yourself a nice little bull session in one of Dalal Street's watering holes. If, in addition, you make money for yourself or your firm, then you have got it.

Any number of different curves may be fitted to a data set. The most widely used program in computer libraries, known as CURFIT, offers a minimum of 5 curves plus the straight line. The curves may differ from program to program. So, which ones are the best ones? There is no answer. Every forecaster has to decide individually about his pet forecasting tools. We will discuss and apply three curves in this section. They appear to be promising decision tools, especially in problem situations that in some way incorporate the life cycle concept, and the range of such problems is vast indeed.
If you take a look again at Figure 13.5, you see that three
curves have been plotted. As we know from many empirical
studies, achievement is usually normally distributed. Growth,
on the other hand, seems to be exponentially distributed.
The same holds true for decline. As the life cycle moves from
growth to maturity, a parabolic trend may often be used as
the forecasting tool. These are two of the curves that will be
considered. The third one is related to the exponential curve.
As you look at the growth stage and mentally extrapolate
the trend, your eyes will run off the page. Now, we know
again from all sorts of empirical evidence that trees don't
grow into the high heavens. Even the most spectacular
growth must come to an end. Therefore, when using the
exponential forecast, care must be taken that the eventual
ceiling or floor (in the case of a decline) is not overlooked.
The modified exponential trend has the ceiling or floor built
in. It is the third curve to be discussed.
One final piece of advice before we start fitting curves. If
you can do it by straight line, do it. For obvious reasons (just
look at Figure 13.5), any possible error, and there is always
a built-in five percent chance, is worse when a curve is fitted.
By extending the planning and forecasting horizon over a
reasonably short period rather than a spectacular but
dangerous longer period, the straight line can serve as a useful
prediction tool.
The Parabola Fit
The parabola is defined by

$$y_c = a + bx + cx^2$$

where a, b and c are constants. The constants a and b have
been dealt with already; c can be treated as acceleration. The
normal equations (method of least squares) are

$$\sum y = na + b\sum x + c\sum x^2$$
$$\sum xy = a\sum x + b\sum x^2 + c\sum x^3$$
$$\sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4$$

Setting $\sum x = 0$, as previously explained, $\sum x^3$ will
also be zero, and the normal equations reduce to

$$\sum y = na + c\sum x^2$$
$$\sum xy = b\sum x^2$$
$$\sum x^2 y = a\sum x^2 + c\sum x^4$$

so that $b = \sum xy / \sum x^2$.
There are direct formulae for a and c as well, but because of
the possible compounding of arithmetic error in manual
calculations, it is safer to solve a and c algebraically in this
case.
To illustrate the parabolic trend let us forecast earnings per
share in dollars for Storage Technology Corporation for the
years 2000 and 2001. Storage Technology manufactures
computer data storage equipment, printers, DVD-ROMS and
telecommunication products. The company was founded in
1969 and after going through a period of explosive growth
seems to be moving into the maturity stage. Data, code and
calculations are shown below in the usual worksheet format.
Year    Code (x)   Earnings Per Share (y)      xy    x^2    x^2 y    x^4
1993       -3              0.39             -1.17      9     3.51     81
1994       -2              0.54             -1.08      4     2.16     16
1995       -1              1.13             -1.13      1     1.13      1
1996        0              1.58              0.00      0     0.00      0
1997        1              1.72              1.72      1     1.72      1
1998        2              2.50              5.00      4    10.00     16
1999        3              1.84              5.52      9    16.56     81
Total       0              9.70              8.86     28    35.08    196

Then

$$9.70 = 7a + 28c$$
$$8.86 = 28b$$
$$35.08 = 28a + 196c$$

and solving simultaneously

$$a = 1.5629, \quad b = 0.3164, \quad c = -0.0443$$

Therefore,

$$y_c = 1.5629 + 0.3164x - 0.0443x^2$$

and specifically, for 2000 (x = 4) and 2001 (x = 5),

$$y_{2000} = 1.5629 + 0.3164(4) - 0.0443(16) = 2.12$$
$$y_{2001} = 1.5629 + 0.3164(5) - 0.0443(25) = 2.04$$

Remember that the data set is small. Quarterly earnings per
share figures for the period might have been better because of
the larger sample size. The significance test and construction
of the confidence interval are performed as previously shown.
Furthermore, as soon as new earnings per share figures
become available, the regression line should be re-calculated,
because there is always the chance that there may be a change
in the environment.
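
The worksheet arithmetic for the parabola can be checked with a short script. What follows is a minimal Python sketch, not part of the original text. It assumes the coded years x = -3, ..., 3 and the earnings per share figures of the worksheet, and it uses the reduced normal equations that hold when the sums of x and x^3 are zero. The helper name `forecast` is purely illustrative.

```python
# Parabola fit y_c = a + b*x + c*x^2 with coded years (sum of x = 0).
x = [-3, -2, -1, 0, 1, 2, 3]                      # 1993..1999, 1996 coded as 0
y = [0.39, 0.54, 1.13, 1.58, 1.72, 2.50, 1.84]    # earnings per share

n = len(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)
sum_x2y = sum(xi ** 2 * yi for xi, yi in zip(x, y))
sum_x4 = sum(xi ** 4 for xi in x)

# Reduced normal equations:
#   sum_y   = n*a      + c*sum_x2
#   sum_xy  = b*sum_x2
#   sum_x2y = a*sum_x2 + c*sum_x4
b = sum_xy / sum_x2
det = n * sum_x4 - sum_x2 ** 2
a = (sum_y * sum_x4 - sum_x2 * sum_x2y) / det
c = (n * sum_x2y - sum_x2 * sum_y) / det


def forecast(code):
    """Trend value for a coded year (2000 -> 4, 2001 -> 5)."""
    return a + b * code + c * code ** 2


print("a, b, c:", round(a, 4), round(b, 4), round(c, 4))
print("2000:", round(forecast(4), 2), " 2001:", round(forecast(5), 2))
```

Because c is negative, the fitted parabola bends downward, which is why the 2001 figure comes out below the 2000 figure.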

This illustrative forecasting study is performed for the Acme
Company Ltd., which manufactures toy rubber ducks to be used
in bathtubs. Over the past few quarters, the company has
become a major defence contractor. The Navy is buying an
ever increasing number of the ducks as part of its rearmament
program. Shipment figures are kept secret to confuse the
enemy, and the media. Therefore, the data in the
accompanying table are hypothetical. We may fit an
exponential trend which takes the form

$$y_c = ab^x$$

As previously mentioned, exponential trends are difficult to
plot, because you run very quickly off the top of the page.
However, when using semi-log paper (the y-axis is in
logarithmic scale), the trend appears as a straight line. This
phenomenon may be used to good advantage when
calculating a and b.
Thus using the logarithmic form of the exponential trend

$$\log y_c = \log a + x \log b$$

the straight line equations may be used, or,

$$\log a = \frac{\sum \log y}{n}, \qquad \log b = \frac{\sum x \log y}{\sum x^2}$$

when $\sum x = 0$. The data set and calculations appear in the
worksheet below. Logarithms are obtained from a pocket
calculator or any standard table.

Quarter Since Initial   Code   Shipments in Thousands
Navy Contract           (x)    of Units (y)               log y      x log y    x^2
        1                -5          2                  0.301030    -1.505150    25
        2                -3          4                  0.602060    -1.806180     9
        3                -1          9                  0.954243    -0.954243     1
        4                 1         20                  1.301030     1.301030     1
        5                 3         55                  1.740363     5.221089     9
        6                 5        110                  2.041393    10.206965    25
Total                     0                             6.940119    12.463511    70

The regression equation is

$$\log y_c = 1.156687 + 0.178050x$$

when expressed in logarithmic form.
The equation may be used in this form for forecasting
purposes. Suppose that Acme would like to have a forecast
for the next two quarters, which carry the codes x = 7 and x = 9.
The forecasts are

$$\log y_c = 1.156687 + 0.178050(7) = 2.403037, \quad y_c \approx 253 \text{ thousand units}$$
$$\log y_c = 1.156687 + 0.178050(9) = 2.759137, \quad y_c \approx 574 \text{ thousand units}$$

If you transform the logarithmic form of the regression
equation, there is something interesting to be seen if you
remember the compound interest formula. It works like this:

$$y_c = 14.34(1.5068)^x$$

which may be re-written

$$y_c = 14.34(1 + 0.5068)^x$$

And you recognize that it takes the form of the compound
interest formula where the rate is 0.51 or 51%. This is Acme's
average increase in its defence business per unit of x. Since the
code x advances two units per quarter, the average quarterly
increase is (1.5068)^2 - 1, or about 127%.
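
The logarithmic fit can be reproduced in a few lines. The following Python sketch is an illustration only, not part of the original text. It assumes the shipment figures 2, 4, 9, 20, 55 and 110 (thousand units) implied by the logarithms in the worksheet, together with the codes x = -5, -3, ..., 5. The name `forecast` is hypothetical.

```python
import math

# Exponential trend y_c = a * b**x, fitted as a straight line in log10(y).
x = [-5, -3, -1, 1, 3, 5]          # coded quarters, sum of x = 0
y = [2, 4, 9, 20, 55, 110]         # shipments in thousands of units

log_y = [math.log10(v) for v in y]

# With sum(x) = 0 the straight-line formulae apply directly to log y.
log_a = sum(log_y) / len(x)
log_b = sum(xi * li for xi, li in zip(x, log_y)) / sum(xi ** 2 for xi in x)

a = 10 ** log_a
b = 10 ** log_b


def forecast(code):
    """Trend value (thousand units) for a coded quarter."""
    return a * b ** code


print("log a =", round(log_a, 6), " log b =", round(log_b, 6))
print("a =", round(a, 2), " b =", round(b, 4))
print("next two quarters:", round(forecast(7), 1), round(forecast(9), 1))
```

Note that the code advances by two units per quarter, so b is the growth factor per unit of x and b squared is the growth factor per quarter.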

In any case other than military procurement, except in those
countries that have bled themselves dry because of it and
now have money for neither military extravaganza nor
civilian necessities and amenities, trees don't grow into the high
heavens. Given this profound observation, there must be a
decision tool that places a cap or ceiling on overly exorbitant
growth forecasts. Forecasters may also want to consider the other
option: exponential economic declines do not always result
in a merciful state of down and out but gradually approach a
floor which may be called subsistence, making do, squeezing
by or other nice and flowery euphemisms. At any rate, the
modified exponential curve,

$$y_c = k + ab^x$$

where k is the asymptote, provides us with such a tool. There
are four cases.

Figure 8.16

A least squares fit is not efficient in this case. Rather, a
solution method is discussed that is based on the theorem
that the ratio of successive first differences between points
on the exponential curve is constant and equal to the slope b.
The decision tool is known as the method of semi-averages.
It is based on the calculation of three sums of successive
points of the time series. Therein lies the limitation of this
technique, because the number of data points must be
divisible by three. Thus, a minimum of six points is necessary
and if the time series consists of n = 20 data points, the two
earliest ones (to preserve the most relevant environment)
must be eliminated. The formulae for a, b and k are generated
as follows from six general y-values starting with the origin
of the series.

$$\Sigma_1 = y_0 + y_1, \quad \Sigma_2 = y_2 + y_3, \quad \Sigma_3 = y_4 + y_5$$
$$b^2 = \frac{\Sigma_3 - \Sigma_2}{\Sigma_2 - \Sigma_1}$$
$$a = (\Sigma_2 - \Sigma_1)\frac{b - 1}{(b^2 - 1)(b^2 - 1)}$$
$$k = \frac{1}{2}\left[\Sigma_1 - a\,\frac{b^2 - 1}{b - 1}\right]$$

or, in the general case involving a time series of n data points
where n is divisible by three and the three sums each cover one third of the series,

$$b^{n/3} = \frac{\Sigma_3 - \Sigma_2}{\Sigma_2 - \Sigma_1}$$
$$a = (\Sigma_2 - \Sigma_1)\frac{b - 1}{(b^{n/3} - 1)(b^{n/3} - 1)}$$
$$k = \frac{3}{n}\left[\Sigma_1 - a\,\frac{b^{n/3} - 1}{b - 1}\right]$$

Suppose a set consists of the following data points.

Year    Code    Sales Units
1995      0        100
1996      1        160
1997      2        200
1998      3        230
1999      4        245
2000      5        250

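$$\Sigma_1 = 100 + 160 = 260$$
$$\Sigma_2 = 200 + 230 = 430$$
$$\Sigma_3 = 245 + 250 = 495$$

Then

$$b^2 = \frac{495 - 430}{430 - 260} = \frac{65}{170} = 0.3824, \quad b = 0.6183$$
$$a = (430 - 260)\frac{b - 1}{(b^2 - 1)(b^2 - 1)} \approx -170.07$$

and

$$k = \frac{1}{2}\left[260 - a\,\frac{b^2 - 1}{b - 1}\right] \approx 267.62$$

which makes it a Case 4 curve with a sales ceiling of 267.62.
The forecast is made in the usual manner. For 2001 (x = 6) it is

$$y_c = 267.62 - 170.07(0.6183)^6 \approx 258.1$$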
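
The three-sums procedure lends itself to a quick numerical check. The Python sketch below is offered as an illustration under stated assumptions, not as the text's own program: it takes the six sales figures of the example, codes them 0 to 5 from the origin of the series, and applies the general formulae with group size m = n/3. The names `m`, `s1`, `s2`, `s3` and `forecast` are hypothetical.

```python
# Modified exponential trend y_c = k + a * b**x fitted by three partial sums.
sales = [100, 160, 200, 230, 245, 250]   # 1995..2000, coded x = 0..5

n = len(sales)
if n % 3 != 0:
    raise ValueError("the number of data points must be divisible by three")
m = n // 3                                # points per group

s1 = sum(sales[:m])
s2 = sum(sales[m:2 * m])
s3 = sum(sales[2 * m:])

# Ratio of successive differences of the sums gives b**m.
ratio = (s3 - s2) / (s2 - s1)
if ratio <= 0:
    raise ValueError("the sums do not follow a modified exponential pattern")
b = ratio ** (1 / m)
a = (s2 - s1) * (b - 1) / (b ** m - 1) ** 2
k = (s1 - a * (b ** m - 1) / (b - 1)) / m


def forecast(code):
    """Trend value for a coded year (2001 -> 6)."""
    return k + a * b ** code


print("sums:", s1, s2, s3)
print("b =", round(b, 4), " a =", round(a, 2), " k =", round(k, 2))
print("2001 forecast:", round(forecast(6), 1))
```

With a negative and b between 0 and 1, the term a * b**x shrinks toward zero, so the trend rises toward the ceiling k without crossing it.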


At this point we have a fairly good understanding about how
to approach a forecasting problem with one independent
variable. We know how to fit a straight line or three widely
applicable curves to a data set. But even as we generated
the functional relationship of our first problem, Sales = f
(Advertising), the thought must have occurred to us whether
advertising was the only predictor variable in this case. If
more predictors are admitted, as you can sense, the decision
situation will be considerably broadened. It is not only a matter
of incorporating more variables into the decision space, or,
mathematically speaking, the move from y = f(x) to
y = f(x1, x2, ..., xn) and, in the form of the regression equation, from
y = a + bx to yc = a + b1 x1 + b2 x2 + ... + bn xn; rather, by
testing an additional variable's effect and then by substituting
another one, we are able to simulate various environments.
In this sense multiple regression analysis may not only be a
more sophisticated forecasting tool, but it can nicely serve
to sharpen the decision maker's understanding of the
forecasting environment and thus serve as a managerial
training device.
To get us started in this new light of somewhat more
sophistication, a few background review comments may be
in order. The relationships between independent and
dependent variables in the multiple regression problems are
assumed to be linear. The methodology for non-linear
multiple regression has not really advanced beyond the
research stage. This does not mean that inherently non-linear
variables cannot be accommodated within the analysis
system, but they must be transformed. The growth situation
of the exponential trend problem comes to mind. Remember
that we then introduced linearity back into the picture by
using the logarithm of the variable's data, y in that case.
We transformed the data to make them appear linear. Techniques
other than logarithmic transformation are available.
The fitting technique for multiple regressions is again the
least squares method, which is based on the Gauss-Markov
Theorem. It holds that when the population variables are
fixed and the Y-variable random, the variances of the
sub-populations of y's for corresponding x's are equal, and the
probability of the system being in a certain state
(environment) initially is the same probability throughout
the system, the best estimation of the population regression
y = a + bx by the sample regression yc = a + bx is the method
of least squares. In Figure 8.3 we noted the deviation y - yc,
which is known as stochastic disturbance or simply error
(e). You remember that the method of least squares minimizes
the sum of the squared errors. Further, it is assumed that the
error is normally distributed, hence y is normally distributed
and therefore we can test a and b by t-test after estimating
sa and sb. We did only the latter, but soon you will see
computer printouts that include the former as well.
Finally a few comments may be made about operational
aspects. The normal equations are as follows for the multiple
regression problem

$$\sum y = na + b_1\sum x_1 + b_2\sum x_2 + \dots$$
$$\sum x_1 y = a\sum x_1 + b_1\sum x_1^2 + b_2\sum x_1 x_2 + \dots$$
$$\sum x_2 y = a\sum x_2 + b_1\sum x_1 x_2 + b_2\sum x_2^2 + \dots$$

with one further equation, built the same way, for each additional variable.

This is the general system of equations and we will consider
an actual problem shortly. By comparing the normal
equations for the straight line with the ones above, you can
readily see how they were developed and you can easily
generate the equations for 10 or 15 variable problems as
frequently encountered in econometrics. What is not so easy,
indeed it may be impossible, is to solve this many equations
simultaneously. Here matrix algebra helps. But even then
manual calculations are forbidding. Therefore, multiple
regression analysis must be performed by computer and we
will not even bother with a simplified manual illustration as
before. However, so that we may have an inkling about the
computational procedures, let us take a look at the multiple
correlation formula for a problem with two independent
variables

$$R_{1.23} = \sqrt{\frac{r_{12}^2 + r_{13}^2 - 2r_{12}r_{13}r_{23}}{1 - r_{23}^2}}$$


where R and r are the multiple and simple correlation
coefficients, respectively, and the subscripts stand for y = 1,
x1 = 2 and x2 = 3. As you can see, the procedures rest on
performing iterative simple correlation analysis. This, of
course, is what computers love to do. Managers don't.
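
To make the formula concrete, here is a small Python sketch. It is an illustration only: the function name `multiple_r` and the three simple correlation values are hypothetical and were chosen for convenience, not taken from any data set in this chapter.

```python
import math


def multiple_r(r12, r13, r23):
    """Multiple correlation of variable 1 (y) on variables 2 and 3 (x1, x2),
    built from the three simple correlation coefficients."""
    r_squared = (r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2)
    return math.sqrt(r_squared)


# Hypothetical simple correlations: y with x1, y with x2, and x1 with x2.
r_y_x1, r_y_x2, r_x1_x2 = 0.9, 0.8, 0.7
big_r = multiple_r(r_y_x1, r_y_x2, r_x1_x2)
print("R =", round(big_r, 4), " R squared =", round(big_r ** 2, 4))

# If the two predictors are uncorrelated, R squared is simply the sum
# of the two squared simple correlations.
print(round(multiple_r(0.6, 0.5, 0.0) ** 2, 4))
```

The multiple R can never fall below the larger of the two simple correlations with y, which is one quick sanity check on a printout.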
Having solved this operational problem, let us look at the
second, and trickier, one. It has to do with a relaxation of the
underlying assumptions of regression analysis. Such
relaxation may be necessary because of aspects that exist in
the environment. First, it is assumed that the error (e) terms
a, b, g, d are independent, or r_e = 0. In trend analysis,
however, a long term increase or decrease virtually assures
dependency, or, r_e ≠ 0. This is known as serial or auto
correlation. Tests and significance tables have been
prepared. The Durbin-Watson statistic, soon to be
encountered in our computer analysis, allows the
identification of serial correlation. Secondly, it is assumed
that the independent variables x1, x2, ..., xn are mutually
independent. If this assumption is relaxed, sb tends to become
large, resulting in an insignificant b. Small potatoes, as you
recall. This problem is known as multi-collinearity, and
techniques are available to test for its existence. Remember,
however, that the logic of a well-designed conceptual model
should identify the problem already at the stage of
forecasting. If it exists, there are only two ways out: get
additional data, or more typically, get other variables.
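
The Durbin-Watson statistic itself is simple to compute once the residuals e = y - yc are at hand. The sketch below uses the standard definition, the sum of squared successive differences of the residuals divided by the sum of squared residuals. The two residual series are hypothetical and serve only to show how the statistic behaves.

```python
def durbin_watson(residuals):
    """Durbin-Watson d: sum of squared successive differences of the
    residuals divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals that drift steadily: positive serial correlation.
trending = [-2, -1, 0, 1, 2]
# Hypothetical residuals that flip sign every period: negative serial correlation.
alternating = [1, -1, 1, -1, 1]

print("trending   :", durbin_watson(trending))
print("alternating:", durbin_watson(alternating))
```

The statistic ranges from 0 to 4. Values near 2 suggest no first-order serial correlation, values toward 0 point to positive and values toward 4 to negative serial correlation; the published significance tables give the exact cut-off points.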
There are three more, though minor, problems that may arise
because of the underlying assumptions. The first one was
mentioned already and is only repeated here. The assumption
is made of linearity or constant variance s^2. It has the name
of homoscedasticity. We already know that the behaviour of
some variables tends to result in nonlinearity and that some
linear variables as shown in Figure 8.5 may display
nonlinearity at certain magnitudes. It was suggested to use
transformations in those cases. This is sound advice as long
as the forecaster knows which way s^2 varies. If unknown,
the transformation cannot be performed. The second problem
involves lagged variables such as in
yc = a + b1 xt + b2 xt-1
where xt is the x value of time period t and xt-1 the x value of
the previous time period. We will encounter lagged variables
again in the later discussion about smoothing. Indeed, since
lagged variables may contain both multicollinearity and serial
correlation, perhaps the initial forecasting method should
rely on smoothing. There are at present no meaningful
methods to cope with this problem in the regression
framework.


Finally, there is the general assumption governing the whole
body of quantitative analysis, namely, the variables must be
quantifiable. We said before rather nonchalantly that this
would not pose a problem. Maybe not, but, as you may have
noticed, our discussion is moving away from the conceptually
simple decision problems into the more complex ones. In the
case of the first type, once the problem has been defined and
a solution method algorithm explained, the situation is clear.
The second type calls for extensive model building abilities
and training. Here the problem and algorithm are clear but
not necessarily how to quantify complex variables and to show
their interrelationships.
There are three qualitatively different types of variables in
any decision situation. They may be called concrete, abstract
and nonsense. Concrete variables can be measured easily.
Take heat, we obtain a physical measurement in centigrade.
Take advertising expenditure, we obtain a dollar or rupee
measurement. Now take beauty which, as anyone knows, is
in the eye of the beholder. This is an abstract variable. But it
is easily quantifiable by rankings as any beauty contest
demonstrates. The same holds true for a nonsense variable
concept as found in many product names. But now suppose
that one of the variables in a sales forecasting study is gender.
Let us say that it is known that women buy more of the
product than men. In this case, data collection procedures
assign a 1 when women respond and a 0 for the response of
men. Such variables are known as dummy variables and, as
of now, this term has been removed from your catalog of
insults. Make it part of the model building knowledge. Here
is how it works.

Take
yc = a + b1 x1 + b2 x2
where yc is sales revenue, x1 advertising expenditures and
x2 gender. In this equation yc and x1 are quantitative and x2
qualitative in nature. Now setting
x2 = 1 for a woman's response and x2 = 0 for all other responses
we obtain the following equations
yc = (a + b2) + b1 x1    (woman's response)
yc = a + b1 x1           (all other responses)
You notice that the slope of both regression lines is b1, but
that the y-intercept differs. In the case of a woman's
response, the line shifts upward, parallel to the "all other
responses" line. Similarly, regression equations may be solved
when all independent variables, or the dependent variable, are
qualitative in nature. The disadvantage of the dummy
variable, however, is the fact that it can take only the 0 or 1
value. There are methods that allow qualitative rankings
with more than 2 response choices like agree, don't know,
disagree. Such a variable is known as polychotomous and
just from the term you know that it is beyond the scope of
this book. But if you ever meet such a variable in some dark
decision situation, brush up on multivariate analysis. That
allows you to handle this stranger.
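
The parallel shift produced by a dummy variable is easy to see in a small numerical experiment. The Python sketch below is illustrative only: the eight observations are hypothetical and were constructed so that women's responses lie exactly 5 units above the others, and the helper names `solve` and `least_squares` are assumptions of this sketch rather than routines from any program library mentioned in the text.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (aug[r][n] - tail) / aug[r][r]
    return sol


def least_squares(columns, y):
    """Return [a, b1, b2, ...] from the normal equations."""
    cols = [[1.0] * len(y)] + columns
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)


# Hypothetical data: x1 = advertising, x2 = 1 for a woman's response, else 0.
advertising = [1, 2, 3, 4, 1, 2, 3, 4]
woman = [0, 0, 0, 0, 1, 1, 1, 1]
sales = [12, 14, 16, 18, 17, 19, 21, 23]

a, b1, b2 = least_squares([advertising, woman], sales)
print("all other responses: yc = %.2f + %.2f x1" % (a, b1))
print("woman's response   : yc = %.2f + %.2f x1" % (a + b2, b1))
```

Both lines share the slope b1; the coefficient b2 of the dummy variable is nothing more than the vertical distance between them.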
Now let us look at an old friend in new clothes and
demonstrate how multiple regression problems are actually solved.
From now on it is strictly computer work and every computer
library has one or more MULREG programs, as they are often
called. (To use one of the most widely used programs, we will use
Regression Analysis in the Tools section of Microsoft Excel
97.) The problem is the same that we solved for our controller
via simple linear regression except that we change it from
Sales = f (Advertising) to Sales = f (Advertising, Percent
spendable income allocated to product type by our customers
as shown by a market research study, Time). Perhaps we
thought that an analysis like this would put us on the
promotional fast-track of the company. We shall see. First
the data set.

        Time Code   Sales      Advertising   Per cent
Year      (x4)      Rs lacs    Rs lacs       Allocated
                    (y)        (x2)          (x3)
1995       1          30         2.8           13.8
1996       2          40         4.2           12.2
1997       3          55         5.0           15.0
1998       4          50         5.5           16.0
1999       5          70         7.5           18.0


Let us mention the caveat again: this data set lands us in the
dog house rather than on the fast-track because of its paucity.
But we are interested only in an illustration of the
procedures. Note that the dependent and three independent
variables are called y and x respectively and the x variables are
numbered consecutively. This is for data input purposes. Thus
the multiple regression equation is

$$y_c = a + b_2 x_2 + b_3 x_3 + b_4 x_4$$

Each computer system calls for slightly different instructions
and data input for the MULREG program. By looking carefully
at the 'Microsoft Excel 97' procedures below, you can readily
infer how your system works.
Before we look at them let us explain the reasons for this
analysis. Remember that multiple regression analysis is as
much a forecasting tool as it is a managerial training device.
Obviously we emphasize the training aspect here. First we
want to perform the regression analysis by using
'Regression Analysis' in 'Microsoft Excel 97' and you will see
how quickly we can verify the results against the manual
calculations that we suffered through a number of pages ago.
So the first regression that we perform is simple and reads

$$y = a + b_2 x_2$$

Next we want to do the whole schmeer. You get

$$y = a + b_2 x_2 + b_3 x_3 + b_4 x_4$$

and you see how nice and easy life is for the quantitative
decision maker.

The third regression is a time series analysis, or,

$$y = a + b_4 x_4$$

and the fourth is

$$y = a + b_3 x_3 + b_2 x_2$$

As you realize by now, in this little training exercise we have
incorporated all the regression options that have been
discussed so far except the dummy variable. Its inclusion
(let us say that the product was seasonal and we had
differentiated between high season and all others) would
pose no problem, except that then we would have had to
change the sample size and could not have used the original
problem. It works like this. In this multiple regression
analysis

$$n > k$$

or, to rewrite this assumption in terms of the degrees of freedom,

$$df = n - k > 0$$

where n is the sample size and k is the number of constants,
or variables, in the equation.
Otherwise a degenerate model results and that's why the
dummy variable was not included. You may remember this
point when you run your own multiple regression problem
with 15 variables.
In order to perform these five regressions, the instructions
about the 'Regression' dialog box are shown in Exhibit 13.1.
Input Y Range
Enter the reference for the range of dependent data. The range must consist of a
single column of data. Here the sales data is the Y range.
Input X Range
Enter the reference for the range of independent data. Microsoft Excel orders
independent variables from this range in ascending order from left to right. The
maximum number of independent variables is 16. In this problem the inputs have to
be specified separately for each regression calculation.
Labels
Select if the first row or column of your input range or ranges contains labels. Clear
if your input has no labels; Microsoft Excel generates appropriate data labels for the
output table.
Confidence Level
Select to include an additional level in the summary output table. In the box, enter the
confidence level you want applied in addition to the default 95 per cent level.
Constant is Zero
Select to force the regression line to pass through the origin.
Output Range
Enter the reference for the upper-left cell of the output table. Allow at least seven
columns for the summary output table, which includes an ANOVA table, coefficients,
standard error of y estimate, r2 values, number of observations and standard error
of coefficients.
New Worksheet Ply
Click to insert a new worksheet in the current workbook and paste the results starting at
cell A1 of the new worksheet. To name the new worksheet, type a name in the box.
New Workbook
Click to create a new workbook and paste the results on a new worksheet in the new
workbook.
Residuals
Select to include residuals in the residuals output table.
Standardized Residuals
Select to include standardized residuals in the residuals output table.
Residual Plots
Select to generate a chart for each independent variable versus the residual.
Line Fit Plots
Select to generate a chart for predicted values versus the observed values.
Normal Probability Plots
Select to generate a chart that plots normal probability.

Exhibit 13.1: About the Regression dialog box

After we have discussed each regression, it is suggested that
you run this very problem with your computer program
unless you are already familiar with it.
Regression 1 (y = a + b2 x2). The print-out for this simple
regression problem is shown in Exhibit 13.2. It was
previously solved manually.

Exhibit 13.2: Summary Output of Regression of Sales (y)
and Advertising (x2)

Let us proceed step-by-step through the print-out. The
correlation coefficient value shows a minor rounding error
between the manual and computer solutions (r = 0.9716 and
r = 0.9715784, respectively). The ANOVA matrix is of no concern
to us here. This matrix is followed by the values of the
regression equation, where the Intercept coefficient gives the
y-intercept (a) value and the X Variable 1 coefficient the b
value that corresponds to the number of the independent
variable. Thus, yc = 6.429 + 8.514x, as previously obtained.
Note the Standard Error and T-STAT. The latter is, rounded,
t = 7.11 for b, as previously calculated.
Let us now discuss the main regression statistics, the adjusted
R-squared and R values. These are the adjusted values for
the multiple coefficients of determination and correlation,
respectively. These coefficients carry a positive bias in
unadjusted form, that is, an unadjusted R2 obtained by the
formula results in an overestimate of the true R2. In order to
find the true R2 the following adjustment formula is used

$$R^2_{ADJ} = 1 - (1 - R^2)\frac{n - 1}{n - k}$$

where n is the sample size and k the number of constants in the
regression equation (here a and b, so k = 2).
Since in this case of a simple correlation problem R2 =
r2 = (0.9715784)2 = 0.943965, then

$$R^2_{ADJ} = 1 - (1 - 0.943965)\frac{5 - 1}{5 - 2} = 0.9253$$

while the printout shows 0.925286105. It should be noted that
the relatively large difference between the adjusted
and unadjusted R2 values is due to the small sample size.
With a meaningful sample size, the difference between R2
and R2ADJ is usually negligible.
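
For readers without the spreadsheet at hand, the main figures of Regression 1 can be reproduced with a few lines of code. The following Python sketch is an illustration, not the Excel procedure itself. It uses the sales and advertising columns of the data set, and the variable names are assumptions of this sketch.

```python
import math

# Regression 1: Sales (y) on Advertising (x2), five annual observations.
advertising = [2.8, 4.2, 5.0, 5.5, 7.5]   # Rs lacs
sales = [30, 40, 55, 50, 70]              # Rs lacs

n = len(sales)
mean_x = sum(advertising) / n
mean_y = sum(sales) / n

s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(advertising, sales))
s_xx = sum((x - mean_x) ** 2 for x in advertising)
s_yy = sum((y - mean_y) ** 2 for y in sales)

b = s_xy / s_xx                      # slope
a = mean_y - b * mean_x              # y-intercept
r = s_xy / math.sqrt(s_xx * s_yy)    # simple correlation coefficient

k = 2                                # constants in the equation: a and b
r2_adj = 1 - (1 - r ** 2) * (n - 1) / (n - k)

sse = s_yy - b * s_xy                # unexplained sum of squares
s_b = math.sqrt(sse / (n - k) / s_xx)
t_b = b / s_b                        # t statistic for the slope

print("yc = %.3f + %.3f x" % (a, b))
print("r = %.7f  adjusted R2 = %.9f  t = %.2f" % (r, r2_adj, t_b))
```

The printed values can be compared line by line with the summary output: intercept, slope, correlation coefficient, adjusted R squared and the t statistic of b.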
Regression 2 (y = a + b2 x2 + b3 x3 + b4 x4)
Now we are entering the realm of the multiple.
Notice the regression equation
yc = -8.3958 - 3.9583 x4 + 0.9375 x3 + 11.0416 x2
and the T-STAT that is insignificant in every case. R2ADJ has
dropped from 92.5% of the variation explained (Regression
1) to 80.4%. Obviously, the additional variables did not give
us a better forecasting tool. With an analysis like this, we
won't make the fast-track.
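
The coefficients of Regression 2 can be verified outside the spreadsheet by building and solving the normal equations directly. The sketch below is a plain Python illustration, not the Excel routine: it assumes the five-year data set with time codes 1 to 5, and the helper names `solve` and `least_squares` are hypothetical.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (aug[r][n] - tail) / aug[r][r]
    return sol


def least_squares(columns, y):
    """Return [a, b, ...] from the normal equations of y on the columns."""
    cols = [[1.0] * len(y)] + columns
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    return solve(xtx, xty)


sales = [30, 40, 55, 50, 70]                  # y, Rs lacs
advertising = [2.8, 4.2, 5.0, 5.5, 7.5]       # x2, Rs lacs
pct_allocated = [13.8, 12.2, 15.0, 16.0, 18.0]  # x3
time_code = [1, 2, 3, 4, 5]                   # x4

a, b2, b3, b4 = least_squares([advertising, pct_allocated, time_code], sales)

fitted = [a + b2 * x2 + b3 * x3 + b4 * x4
          for x2, x3, x4 in zip(advertising, pct_allocated, time_code)]
n, k = len(sales), 4                          # k constants: a, b2, b3, b4
mean_y = sum(sales) / n
sse = sum((y - f) ** 2 for y, f in zip(sales, fitted))
sst = sum((y - mean_y) ** 2 for y in sales)
r2_adj = 1 - (sse / sst) * (n - 1) / (n - k)

print("yc = %.4f + %.4f x2 + %.4f x3 + %.4f x4" % (a, b2, b3, b4))
print("adjusted R2 = %.3f with df = %d" % (r2_adj, n - k))
```

With five observations and four constants only one degree of freedom remains, which is why the adjustment bites so hard here.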
Regression 3 (y = a + b4 x4).
The b is significant again and R2ADJ has improved. Overall it
does not look as good as Regression 1 as a forecasting tool.
Unless, after elimination of the serial correlation, its R2ADJ
represents an improvement over the value in Regression 1,
we might as well stay with Regression 1.
Regression 4 (y = a + b3 x3 + b2 x2)
Well, there is a flicker of hope in Table 13.6.
R2ADJ has improved to 89% from 84% in Regression 3. But
look at the T-STAT of b3, which indicates an insignificant value,
and b2 just makes it by a one-tailed test with df = 2, or MEV =
2.92 according to the t table. Conclusion: no improvement
over Regression 1.
The exercise shows you very nicely the interface between the conceptual
and mathematical models, as you can test any linear variable
or combination of variables keeping in mind the time and
cost constraints that apply to data collection. It also teaches
you to differentiate between statistical and operational
significance, which helps you to sharpen the conceptual model
building skills, the most important asset of the forecaster.
Table 13.6: Summary Output of Regression of Sales (y) with
Advertising (x2) and Percent Allocated (x3)

If you have understood the basic aspects of forecasting up to
this point and are reasonably certain that you can use the
regression algorithm, you may not want to take this final
section of the chapter too seriously. Read it, nevertheless,
to know how the simple minded folk of forecasters spends
its days. Indeed, many authors feel compelled to point out
when introducing the subject of smoothing that it is used
when (a) real forecasting would be prohibitively expensive,
(b) real forecasting talent has not found its way into an
organization's staff, (c) real forecasting talent is not
necessary because in the case of the operations or firms
involved, it wouldn't amount to much anyway, and so forth.
You remember the comment about how simple minded
decision making folk knows how to phrase its utterances to
make them sound scientific? Well, this is it. Final
introductory advice: Beware of smooth smoothers.
What do such types do anyhow? For some psychological
reason they do not like the charming and revealing little
wiggles and waggles (scientific: peaks and troughs) of time
series and smooth them out. They throw away the good stuff
and eat chaff. How do they do it? They average them out.
But that is all. No further sophistication is visible. Remember,
they are simple minded folk. Here is how it works. Suppose
that you have the stock closing prices of Joy Manufacturing
Co. for the month of April, 1999. During this month there
were 21 trading days with their inevitable wiggles and,
unfortunately, waggles. By calculating a moving average, they
may disappear. The moving averages are known as the
smoothed values and are obtained by

$$S_t = \frac{1}{n}\sum_{i=t-m}^{t+m} x_i$$

where St is the smoothed value for time period t, m a
movement specification that is to be determined, x the data,
and n the number of adjacent data points that are to be
averaged. Furthermore m is defined by m = (n-1)/2. Nobody
will argue that this does not look very scientific, indeed.
Let us do it for a five-day moving average. Then m = (5-1)/2 = 2,
which means that four data points are lost. That is, two daily
stock prices at the beginning of the series and two at the
end. Hence

$$S_3 = \frac{x_1 + x_2 + x_3 + x_4 + x_5}{5}, \qquad S_4 = \frac{x_2 + x_3 + x_4 + x_5 + x_6}{5}$$
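
A centred moving average of this kind takes only a few lines of code. The Python sketch below is an illustration under stated assumptions: the 21 closing prices are hypothetical, since the actual Joy Manufacturing prices are not reproduced here, and the function name `centered_moving_average` is likewise an invention of this sketch.

```python
def centered_moving_average(data, n):
    """Average each point with its m = (n - 1) / 2 neighbours on either side.
    The first m and the last m points have no smoothed value."""
    if n % 2 == 0 or n < 1:
        raise ValueError("n must be a positive odd number")
    m = (n - 1) // 2
    return [sum(data[t - m:t + m + 1]) / n for t in range(m, len(data) - m)]


# Hypothetical closing prices for 21 trading days.
prices = [31.0, 31.5, 30.8, 31.2, 32.0, 32.4, 31.9, 32.6, 33.0, 32.7, 33.4,
          33.1, 33.8, 34.2, 33.6, 34.0, 34.5, 34.1, 34.8, 35.0, 34.6]

smoothed = centered_moving_average(prices, 5)
print(len(prices), "prices give", len(smoothed), "smoothed values")
print("first smoothed value (day 3):", round(smoothed[0], 2))
print("last smoothed value (day 19):", round(smoothed[-1], 2))
```

With n = 5 the series shrinks from 21 to 17 values, the four lost points being the two at each end.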


and so on. The lagging is necessary because of S t-1 which


obviously does not exist at X t-1. As you notice by doing a few
more St values manually, exponential smoothing uses the
weighted average of past time series values in order to
compute the smoothed values and it assigns greater weight
to the most recent time series values. The method assumes
no significant long-term trend or seasonal variation in the
data. So what kind of a forecasting tool is this? As pointed
out before, if smoothing is to be used at all, it should be used
over the very short term only. Perhaps a trader, as opposed
to investor, may base stock purchases or sales on exponential
smoothing. But check the effectiveness of this decision tool
yourself in the cases of the given two common stocks.
Disregarding intra-day trading highs and lows, how often
would our trader have been successful in buying or selling
the stocks of the companies if he had bid the estimated price
for each of the trading days? Then calculate his profits or
losses. The proof is in the pudding.
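
To try a few smoothed values without doing them by hand, a minimal sketch follows. It assumes the common recursive form $S_t = \alpha x_t + (1-\alpha) S_{t-1}$, which may differ in detail from the variant used in this unit. The smoothing constant alpha, the seeding of the first value with the first observation, and the prices are all assumptions made for illustration.

```python
def exponential_smoothing(x, alpha):
    """Return exponentially smoothed values for the sequence x.

    Assumed form: S_t = alpha * x_t + (1 - alpha) * S_{t-1}.
    The first smoothed value has no predecessor, so it is seeded
    with the first observation (an assumption, not the only choice).
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    smoothed = [x[0]]
    for value in x[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical prices; a larger alpha puts more weight on recent data.
prices = [10.0, 20.0, 30.0]
print(exponential_smoothing(prices, 0.5))
```

Because each smoothed value carries forward a fraction (1 - alpha) of the previous one, the weight on an observation shrinks geometrically with its age.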

Let us conclude on this profound heuristic: the proof is in
the pudding. Forecasting is management's most important
task, but not many managers in the public or private sectors
are good forecasters. Oh, they may use beautiful mathematical
models and algorithms, but often they burn the pudding or,
if lucky, end up with a pie or roast. This feat is then smoothly
explained to the astonished multitudes as something that
had been expected all along, like steel companies suddenly
having (claimed only) terrific oil potential or oil companies
having (claimed only) terrific merchandising potential for
clothes. But how about the kid who wants his pudding? He
will say something that his more sedate elders won't say:
whether it is burnt or ends up as pie or roast, the fact of the
matter is that there is no pudding. And this brings us to the
recipe. We called it the conceptual model. Unless the recipe
is sound, let's forget about the pudding. And further, if
management suddenly fails to stir the pot (an unrecognized
variable has entered the decision space), there won't be
much of a pudding either. And that's what forecasting is all
about. Next you will find a few case studies. See if you can at
least stir the pot. Better yet, take a few of the newly acquired
decision tools out of your little black bag, and show them
what a pudding is.

There are 2117 Smart stores at petrol stations in India
(the chain is building up). At present Smart has reached an
upgrading phase like so many discounters before.

Given the data below, perform the indicated analysis.

YEAR    EARNINGS     DIVIDENDS    PRE-TAX
        PER SHARE    PER SHARE    MARGIN
1999      19.0          9.9         2.1
1998      17.5          9.5         2.0
1997      20.7          9.0         3.1
1996      28.4          8.1         4.9
1995      27.4          6.8         5.4
1994      23.9          5.0         5.7
1993      21.1          3.0         5.8
1992      16.1          2.4         5.8
1991       8.5          2.2         3.3
1990      11.1          1.9         5.3

(1) To what extent does the Board of Directors regard
dividend payments as a function of earnings? Test
whether there is a significant relationship between the
variables. Use a parametric analysis.
(2) Find the linear forecasting equation that would allow
you to predict dividend payments based on earnings and
test the significance of the slope.
(3) Is there a significant difference in pre-tax margin when
comparing the periods 1995-1999 and 1990-1994? Perform
a non-parametric analysis. Explain the managerial
implications of your findings.

Employment figures in thousands for Neo-Classical City and
suburbs are given below. Perform the required analysis.
(1) Using linear forecasts, predict the year in which
employment will be the same for the two locations.
(2) Construct the NCC confidence interval for that year.
(3) Correlate the employment figures for the two areas
using both parametric and non-parametric methods and
test the significance of the correlation coefficients.

(4) Fit a modified exponential trend to SUB data and discuss
the results in terms of your findings in (1) above.
(5) Are NCC employment figures uniformly distributed over
the period 1994 through 2000?

YEAR    NCC     SUB
1994    64.1    20.7
1995    60.2    21.4
1996    59.2    22.1
1997    59.0    23.8
1998    57.6    24.5
1999    54.4    26.3
2000    50.9    26.5
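
One way to approach question (1) is to fit a least-squares line to each series and solve for the point where the two lines meet. The sketch below is an illustration under stated assumptions: time is coded t = 0 for 1994, the helper name `linear_trend` is hypothetical, and the answer extrapolates well beyond the observed years, so it should be read with caution.

```python
def linear_trend(y):
    """Least-squares line y = a + b*t for t = 0, 1, ..., len(y) - 1."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    b = sxy / sxx
    a = y_mean - b * t_mean
    return a, b


# Employment in thousands, 1994 through 2000, as given in the table.
ncc = [64.1, 60.2, 59.2, 59.0, 57.6, 54.4, 50.9]
sub = [20.7, 21.4, 22.1, 23.8, 24.5, 26.3, 26.5]

a_ncc, b_ncc = linear_trend(ncc)
a_sub, b_sub = linear_trend(sub)

# Set a_ncc + b_ncc * t equal to a_sub + b_sub * t and solve for t.
t_equal = (a_sub - a_ncc) / (b_ncc - b_sub)
crossing_year = 1994 + t_equal
print(round(b_ncc, 3), round(b_sub, 3), round(crossing_year, 1))
```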

Shown below are data sets that have been compiled by the
Reserve Bank of India and The Department of Commerce.
All amounts are in billions of rupees. Perform the following
analysis:
(1) Fit a modified exponential trend to Other Checkable
Deposits such as NOW accounts and predict the 2002
value. Compare this value to the actual one.
(2) Is there a significant difference in the Percent Cash
Purchases when comparing the first half of the series
against the second half?
(3) Predict Personal Consumption on the basis of Consumer
Credit in the amount of Rs 500 billion and test the
significance of the slope b.
(4) Predict Demand Deposits for 2002 by linear trend.
Year    Demand      Other Checkable    Consumer    Personal Consumption    Percent Cash
        Deposits    Deposits           Credit      Expenditures            Purchases
        (Rs bn)     (Rs bn)            (Rs bn)     (Rs bn)                 (%)
1990      53.6         0.4              187.1          634.1                 70.5
1991      58.6         0.5              215.8          692.6                 68.8
1992      65.4         0.6              240.8          767.0                 68.6
1993      70.1         0.8              269.0          834.3                 67.8
1994      73.3         0.9              269.4          914.1                 70.5
1995      78.0         1.6              280.7         1016.9                 72.4
1996      82.6         3.2              318.2         1127.9                 71.8
1997      91.0         4.8              373.5         1254.5                 70.2
1998      97.4         7.8              424.2         1416.6                 70.1
1999      99.2        17.7              465.8         1582.3                 70.6
2000     102.4        27.4              449.3         1751.0                 74.3
2001      86.6        74.4              477.2         1909.5                 74.1
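
For question (1), a modified exponential trend has the form $y = k + a b^x$. A minimal sketch follows, assuming the method of three partial totals (the series is split into three equal groups and their sums are used to solve for k, a and b). The function name is hypothetical, x is coded 0 for 1990, and other fitting methods would give somewhat different constants.

```python
def fit_modified_exponential(y):
    """Fit y = k + a * b**x (x = 0, 1, ...) by three partial totals.

    Assumes len(y) is a multiple of 3 and that the differences of the
    group sums have the same sign, so that b is a real positive number.
    """
    n = len(y)
    if n % 3 != 0:
        raise ValueError("series length must be a multiple of 3")
    m = n // 3
    s1, s2, s3 = sum(y[:m]), sum(y[m:2 * m]), sum(y[2 * m:])
    ratio = (s3 - s2) / (s2 - s1)          # equals b**m
    if ratio <= 0:
        raise ValueError("partial totals do not support this trend form")
    b = ratio ** (1 / m)
    a = (s2 - s1) * (b - 1) / (ratio - 1) ** 2
    k = (s1 - a * (ratio - 1) / (b - 1)) / m
    return k, a, b


# Other Checkable Deposits (Rs bn), 1990 through 2001, from the table.
deposits = [0.4, 0.5, 0.6, 0.8, 0.9, 1.6, 3.2, 4.8, 7.8, 17.7, 27.4, 74.4]
k, a, b = fit_modified_exponential(deposits)
forecast_2002 = k + a * b ** 12            # 2002 corresponds to x = 12
print(round(k, 3), round(a, 4), round(b, 4), round(forecast_2002, 1))
```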

After three months on the job at Exxon Oil Marketing Ltd.,
Nidhi Batra, a recent MBA graduate, was called into the
office of Mr Sunil, Vice President and General Manager. Mr
Sunil said that Nidhi's initial progress at the firm had been
quite satisfactory and that, because she had acquired certain
quantitative and computer skills at the University and on
the job, she was to be put in charge of a sales and production
study. The study itself was rather complex, but the technical
problems boiled down to the following.

In order to get a better feel for a certain target market,
average annual family income in that market had to be
estimated. Before a sample could be drawn and income
figures obtained, Nidhi had to determine the proper sample
size. She set the following criteria: (a) maximum sampling
error not more than 10,000 above or below the true
population mean, (b) confidence level should be 99%, and
(c) standard deviation of the population, based on a previous
study, was known to be 2000. What was the sample size?
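
The sample-size question rests on the usual formula for estimating a mean with known standard deviation, $n = (z\sigma/E)^2$, rounded up. The sketch below is an illustration only: the function and parameter names are hypothetical, and the three criteria are taken exactly as stated in the case.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, max_error, confidence):
    """Smallest n such that z * sigma / sqrt(n) <= max_error.

    z is the two-sided standard normal value for the given
    confidence level (for example about 2.576 for 0.99).
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z * sigma / max_error) ** 2)


# Criteria as stated in the case: E = 10,000, 99% confidence, sigma = 2000.
n = sample_size_for_mean(sigma=2000, max_error=10_000, confidence=0.99)
print(n)
```

Note how strongly n depends on the ratio of the standard deviation to the permitted error: halving the permitted error quadruples the required sample.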
Another aspect of the sales portion of the study called for
the determination of a seasonal index for trend. Nidhi
decided to use the method of simple averages and obtained
the following input data for the firm's major sales outlet in
the target market.
Quarterly Sales in lacs
Year

Q1

Q2

Q3

Q4

1981

10

14

16

1982

18

16

22

24

1983

20

30

27

35

What was the seasonal index adjusted for trend for each
quarter?

Here two technical problems were encountered. The first
consisted of a production forecast for 2001 based on the
known estimating equation $\log y_c = 1.082 + 0.013x$ with origin
at July 1, 1994, x in one-year units and y in thousand-ton
units. What was the forecasted tonnage?
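
The arithmetic behind such a forecast can be sketched as follows. Two assumptions are made here that the case does not state: the logarithm is taken to base 10, and the 2001 forecast refers to mid-2001, so that x = 2001 - 1994 = 7. The function name is hypothetical.

```python
def forecast_tonnage(year, origin_year=1994, intercept=1.082, slope=0.013):
    """Forecast in thousand tons from log10(y) = intercept + slope * x."""
    x = year - origin_year            # years elapsed since the origin
    log_yc = intercept + slope * x    # value of the estimating equation
    return 10 ** log_yc               # undo the base-10 logarithm


tonnage_2001 = forecast_tonnage(2001)
print(round(tonnage_2001, 2), "thousand tons")
```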
Finally, in order to study Astra's training program
effectiveness, Nidhi obtained pertinent data for five workers
and calculated a coefficient of determination. What was $r^2$?
Worker

Hours of Training

Units Produced

Mindful that Mr Sunil would want to have some idea about
the implications, that is, validity or managerial explanation,
of the four value sets calculated, Nidhi wrote a brief
statement concerning each. What would you have advised
her to cover in each statement for the four value sets?

We do not know precisely why formerly great nations like
the Italian city states, Spain, France and England, in that order,
declined in economic and political importance during the
modern era. Exact data are not available. It is safe to assume,
however, that fiscal mismanagement was a major contributing
factor in each case. Today we do have data like the ones for
India. Get this from any good source. Perhaps one day
historians trained in statistics will perform appropriate
analysis and know precisely the reasons for this nation's
decline.
Given your own understanding of our economic and political
situation today, briefly explain your findings after solving
the problems below.
(1) Perform ANOVA and Kruskal-Wallis comparing the
Surplus or deficit figures for the decades of the 1970s,
1980s and 1990s. What do the answers mean? Which test
is more appropriate?
(2) Fit a linear trend to the Surplus or deficit figures of the
1980s and predict the 2004 deficit figure within a 95%
confidence interval.

(3) Fit an exponential trend to the surplus or deficit figures
of the 1990s and predict the 2004 deficit figure. Compare
the forecasted value to the interval obtained in (2). What
conclusions do you draw?
(4) Calculate the parametric correlation between Receipts
and Outlays for the 1970s and test the significance.
(5) Calculate the non-parametric correlation between
Receipts and Outlays for the 1990s and test the
significance. Compare your answers for (4) and (5). What
does remain in each case and is it operationally
meaningful?

Graph is a petrol additives manufacturer. Perform the analysis
specified below and briefly discuss your findings in terms of
the managerial if not national implications.
(1) Quarterly sales figures in thousands for car petrol
additives are given below
(a) Construct a seasonal index.
(b) Predict sales for the first and second quarters of
2001.
(c) Construct a confidence interval for the 2001 Q2
forecast.
Q1

Q2

Q3

Q4

1998

1999

10

2000

10

10

12

(2) While car petrol additives have had difficulties getting
off the ground, Graph successfully introduced car diesel
additives. Fit a modified exponential trend and predict
2001 Q3 sales.
1998 Q1 Q2 Q3 Q4 1999 Q1 Q2 Q3 Q4 2000 Q1 Q2 Q3 Q4
2

20

40

70 150 200

250 400 750 50

(3) Given data referring to sales of three-wheeler fuel
additives and two-wheeler fuel additives, calculate the
correlation coefficient by (a) parametric and (b) non-parametric
methods and test the significance.

3-wheeler    20    40    60    30    20    70
2-wheeler

Glass company is headquartered in Indore, M.P. It
manufactures glass and plastic storage jars for the petrochemical
industry.
Given the data below, perform the indicated analysis.
YEAR    EARNINGS     DIVIDENDS    PRE-TAX
        PER SHARE    PER SHARE    MARGIN%
2000      1.25         1.36          1.1
1999      2.89         1.36          4.6
1998      2.81         1.28          5.2
1997      3.09         1.20          5.5
1996      3.52         1.07          8.9
1995      2.97         1.00          7.2
1994      2.71         0.90         10.1
1993      2.13         0.80          7.9
1992      1.59         0.72          7.3
1991      1.77         0.72          9.7

(1) To what extent does the Board of Directors regard
dividend payments as a function of earnings? Test
whether there is a significant relationship between the
variables. Use a parametric analysis.
(2) Find the linear forecasting equation that would allow
you to predict dividend payments based on earnings and
test the significance of the slope.
(3) Is there a significant difference in pre-tax margin when
comparing the periods 1996-2000 and 1991-1995?
Perform a parametric analysis.
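
Questions (1) and (2) can be explored with a simple least-squares fit. The sketch below is only an illustration: the function name is hypothetical, the data are the earnings and dividends per share as read from the table above, and the test statistic used is the standard $t = r\sqrt{(n-2)/(1-r^2)}$ with n - 2 degrees of freedom. The result would be compared with a tabulated critical value (for 8 degrees of freedom and a two-tailed 5% test this is about 2.306).

```python
import math


def slope_test(x, y):
    """Return intercept a, slope b, correlation r and the t statistic
    for testing whether the slope differs from zero (n - 2 d.f.).

    Assumes the points are not perfectly collinear, otherwise the
    t statistic is undefined.
    """
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_mean - b * x_mean
    r = sxy / math.sqrt(sxx * syy)
    t = r * math.sqrt((n - 2) / (1 - r ** 2))
    return a, b, r, t


# Earnings and dividends per share, 2000 back to 1991, from the table.
eps = [1.25, 2.89, 2.81, 3.09, 3.52, 2.97, 2.71, 2.13, 1.59, 1.77]
div = [1.36, 1.36, 1.28, 1.20, 1.07, 1.00, 0.90, 0.80, 0.72, 0.72]
a, b, r, t = slope_test(eps, div)
print(round(a, 3), round(b, 3), round(r, 3), round(t, 3))
```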

33865  04131  78302  22688  79034  01358  61724  98286  97086  21376
09356  09387  52825  93134  21731  93956  85324  68767  49490  11449
98243  37636  64825  43091  24906  13545  90172  31265  81457  93108
99052  61857  33938  86339  63531  77146  33252  81388  28302  18960
00713  24413  36920  03841  48047  04207  50930  84723  07400  81109
34819  80011  17751  03275  92511  70071  08183  72805  94618  46084
20611  34975  96712  32402  90182  94070  94711  94233  06619  34162
64972  86061  04685  53042  82685  45992  19829  45265  85589  83440
15857  73681  24790  20515  01232  25302  30785  95288  79341  54313
80276  67053  99022  36888  58643  96111  77292  03441  52856  95035
30548  51156  63914  64139  14596  35541  70324  20789  29139  66973
53530  79354  75099  89593  36449  66618  32346  37526  20084  52492
77012  18480  61852  82765  29602  10032  78925  71953  21661  95254
04304  40763  24847  07724  99223  77838  09547  47714  13302  17121
76953  39588  90708  67618  45671  19671  92674  22841  84231  59446
34479  85938  26363  12025  70315  58971  28991  35990  23542  74794
28421  16347  66638  25578  70404  67367  14730  37662  64669  16752
58160  17725  97075  99789  24304  63100  22123  83692  92997  58699
96701  73743  82979  69917  34993  36495  47023  48869  50611  61534
55600  61672  99136  73925  30250  12533  46280  03865  88049  13080
55850  38966  46303  37073  42347  36157  44357  52065  66913  06284
47089  83871  51231  32522  41543  22675  89316  38451  78694  01767
26035  86173  11115  22083  12083  43374  66542  23518  05372  33892
74920  35946  21149  70861  13235  02729  57485  23895  80607  11299
44498  00498  31354  39787  65919  61889  17690  10176  94138  95650
80045  71846  17840  23670  77769  84062  52850  20241  06073  20083
15828  95852  12124  95053  09924  91562  09419  27747  84732  81927
04100  75759  37926  70040  80884  48939  65228  60075  45056  56399
69257  48373  58911  78549  63693  43727  81058  53301  85945  54890
33915  26034  08166  59242  03881  88690  92298  48628  02698  94249
83497  62761  68609  85811  40695  08342  67386  63470  85643  68568
46466  15977  69989  90106  01432  59700  13163  56521  96687  41390
03573  87778  27696  35147  54639  20489  03688  72254  28402  98954
02046  44774  31500  30232  27434  14925  65901  34521  94104  54935
68736  12912  02579  34719  09568  21571  91111  81307  97866  76483
10817  35729  44825  67304  40180  51054  06745  35539  82764  44618
36715  32588  87768  70033  79187  69967  26494  01600  32800  03147
39125  18169  75335  47246  79137  87167  59804  25724  83782  55780
75285  49456  79438  45855  07117  62301  42452  12294  43591  83547

Table A. Future Value of Rs 1: (1 + r)^n
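
Any entry of such a table can be recomputed directly from the formula in the caption. A minimal sketch, with a hypothetical function name and example rate:

```python
def future_value_factor(r, n):
    """Future value of Rs 1 after n periods at rate r per period."""
    return (1 + r) ** n


# Example: Rs 1 invested at 10% per period for 2 periods.
print(round(future_value_factor(0.10, 2), 4))
```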

Table C. Present Value of an Annuity of Rs 1 for n Periods:
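
Assuming the table uses the standard ordinary-annuity factor $[1 - (1+r)^{-n}]/r$, with payments at the end of each period, its entries can be recomputed as in the sketch below. The function name and the example rate are hypothetical.

```python
def annuity_present_value_factor(r, n):
    """Present value of Rs 1 received at the end of each of n periods.

    Assumes an ordinary annuity discounted at rate r per period.
    With r = 0 nothing is discounted, so the factor is simply n.
    """
    if r == 0:
        return float(n)
    return (1 - (1 + r) ** -n) / r


# Example: Rs 1 per period for 2 periods at 10%, i.e. 1/1.1 + 1/1.21.
print(round(annuity_present_value_factor(0.10, 2), 4))
```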

z      0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
0.0   0.0000  0.0040  0.0080  0.0120  0.0160  0.0199  0.0239  0.0279  0.0319  0.0359
0.1   0.0398  0.0438  0.0478  0.0517  0.0557  0.0596  0.0636  0.0675  0.0714  0.0753
0.2   0.0793  0.0832  0.0871  0.0910  0.0948  0.0987  0.1026  0.1064  0.1103  0.1141
0.3   0.1179  0.1217  0.1255  0.1293  0.1331  0.1368  0.1406  0.1443  0.1480  0.1517
0.4   0.1554  0.1591  0.1628  0.1664  0.1700  0.1736  0.1772  0.1808  0.1844  0.1879
0.5   0.1915  0.1950  0.1985  0.2019  0.2054  0.2088  0.2123  0.2157  0.2190  0.2224
0.6   0.2257  0.2291  0.2324  0.2357  0.2389  0.2422  0.2454  0.2486  0.2517  0.2549
0.7   0.2580  0.2611  0.2642  0.2673  0.2704  0.2734  0.2764  0.2794  0.2823  0.2852
0.8   0.2881  0.2910  0.2939  0.2967  0.2995  0.3023  0.3051  0.3078  0.3106  0.3133
0.9   0.3159  0.3186  0.3212  0.3238  0.3264  0.3289  0.3315  0.3340  0.3365  0.3389
1.0   0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577  0.3599  0.3621
1.1   0.3643  0.3665  0.3686  0.3708  0.3729  0.3749  0.3770  0.3790  0.3810  0.3830
1.2   0.3849  0.3869  0.3888  0.3907  0.3925  0.3944  0.3962  0.3980  0.3997  0.4015
1.3   0.4032  0.4049  0.4066  0.4082  0.4099  0.4115  0.4131  0.4147  0.4162  0.4177
1.4   0.4192  0.4207  0.4222  0.4236  0.4251  0.4265  0.4279  0.4292  0.4306  0.4319
1.5   0.4332  0.4345  0.4357  0.4370  0.4382  0.4394  0.4406  0.4418  0.4429  0.4441
1.6   0.4452  0.4463  0.4474  0.4484  0.4495  0.4505  0.4515  0.4525  0.4535  0.4545
1.7   0.4554  0.4564  0.4573  0.4582  0.4591  0.4599  0.4608  0.4616  0.4625  0.4633
1.8   0.4641  0.4649  0.4656  0.4664  0.4671  0.4678  0.4686  0.4693  0.4699  0.4706
1.9   0.4713  0.4719  0.4726  0.4732  0.4738  0.4744  0.4750  0.4756  0.4761  0.4767
2.0   0.4772  0.4778  0.4783  0.4788  0.4793  0.4798  0.4803  0.4808  0.4812  0.4817
2.1   0.4821  0.4826  0.4830  0.4834  0.4838  0.4842  0.4846  0.4850  0.4854  0.4857
2.2   0.4861  0.4864  0.4868  0.4871  0.4875  0.4878  0.4881  0.4884  0.4887  0.4890
2.3   0.4893  0.4896  0.4898  0.4901  0.4904  0.4906  0.4909  0.4911  0.4913  0.4916
2.4   0.4918  0.4920  0.4922  0.4925  0.4927  0.4929  0.4931  0.4932  0.4934  0.4936
2.5   0.4938  0.4940  0.4941  0.4943  0.4945  0.4946  0.4948  0.4949  0.4951  0.4952
2.6   0.4953  0.4955  0.4956  0.4957  0.4959  0.4960  0.4961  0.4962  0.4963  0.4964
2.7   0.4965  0.4966  0.4967  0.4968  0.4969  0.4970  0.4971  0.4972  0.4973  0.4974
2.8   0.4974  0.4975  0.4976  0.4977  0.4977  0.4978  0.4979  0.4979  0.4980  0.4981
2.9   0.4981  0.4982  0.4982  0.4983  0.4984  0.4984  0.4985  0.4985  0.4986  0.4986
3.0   0.4987  0.4987  0.4987  0.4988  0.4988  0.4989  0.4989  0.4989  0.4990  0.4990

To find the area under the curve between the mean and point 2.25 on the right, look at the row
for z of 2.2 and then the column for 0.05. The value at the intersection of these two is
0.4878. If we want to know the total area under the curve to the left of this point then we add this
value to 0.5 (giving 0.9878), and if we want to know the area on the right we subtract it from 0.5
(giving 0.0122). The method is the same when the value of Z is negative. Therefore if Z = -2.25
then the value would be located in the same way to get 0.4878, but the (-) sign would also come in
to give you the value -0.4878. The area to the left would still be found by adding 0.5 to it to give
you 0.0122 and to the right by subtracting it from 0.5 to give 0.9878.
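
The table lookups can be cross-checked in Python. This is a small sketch using the standard library's `statistics.NormalDist`; the three helper names are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def area_mean_to_z(z):
    """Area between the mean and z, the quantity tabulated above."""
    return abs(STANDARD_NORMAL.cdf(z) - 0.5)


def area_left_of(z):
    """Total area under the curve to the left of z."""
    return STANDARD_NORMAL.cdf(z)


def area_right_of(z):
    """Total area under the curve to the right of z."""
    return 1 - STANDARD_NORMAL.cdf(z)


for z in (2.25, -2.25):
    print(z, round(area_mean_to_z(z), 4),
          round(area_left_of(z), 4), round(area_right_of(z), 4))
```

For any z the left and right areas sum to 1, which is a quick check on a manual lookup.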
