

“Statistics may be defined as the collection, presentation, analysis and interpretation of numerical data.”


1. To collect and present facts in a systematic manner.

2. Helps in formulation and testing of hypothesis.

3. Helps in facilitating the comparison of data.

4. Helps in predicting future trends.

5. Helps to find the relationship between variables.

6. Simplifies the mass of complex data.

7. Helps to formulate policies.

8. Helps Government to take decisions.


A measure of central tendency is a single value that attempts to describe a set of data by identifying
the central position within that set of data.


1. It should be rigidly defined.
2. It should be simple to understand and easy to calculate.
3. It should be based upon all values of the given data.
4. It should be capable of further mathematical treatment.
5. It should have sampling stability.
6. It should not be unduly affected by extreme values.

Methods of central tendency.

1. Mean

2. Mode
3. Median

4. Geometric Mean
5. Harmonic Mean


Arithmetic Mean: It is the most common measure of central tendency. It is obtained by dividing the sum of all observations in a series by the total number of observations. The mean is the arithmetic average of all the observations in the data.

Merits of Arithmetic Mean:

1. Easy to calculate
2. Based on all observations

3. Capable of further mathematical calculations.

4. It is rigidly defined.
5. It is easy to understand & easy to calculate.

6. It is based upon all values of the given data.

7. It is capable of further mathematical treatment.
8. It is not much affected by sampling fluctuations.


Demerits of Arithmetic Mean:

1. It is affected by extreme values.
2. It cannot be calculated in open-end series.
3. It cannot be determined graphically.
4. It sometimes gives misleading or absurd results.
5. It cannot be calculated if any observations are missing.
6. It cannot be calculated for data with open-end classes.
7. It may be a number which is not present in the data.
8. It cannot be calculated for data representing qualitative characteristics.

Median: The point or value which divides the data into two equal parts when the data are arranged in numerical order. The median is the middle value of an ordered set of data.

Merits of Median
1. It is rigidly defined.

2. It is easy to understand & easy to calculate.

3. It is not affected by extreme values.

4. Even if extreme values are not known median can be calculated.

5. It can be located just by inspection in many cases.

6. It can be located graphically.

7. It is not much affected by sampling fluctuations.

8. It can be calculated for data based on ordinal scale


Demerits of Median

1. It is not based on all observations.
2. It requires arrangement of the data.
3. It is not capable of further algebraic treatment.

Mode: The mode is simply the most frequently occurring observation (score) in a distribution, that is, the most frequently occurring value in a set of values.
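
The three measures defined above can be compared on a small data set. The following is a minimal sketch using Python's standard `statistics` module; the scores are hypothetical and chosen only for illustration.

```python
from statistics import mean, median, multimode

# Hypothetical scores, used only to illustrate the definitions.
scores = [2, 3, 3, 5, 7, 10]

arithmetic_mean = mean(scores)     # sum of all observations / number of observations
middle_value = median(scores)      # middle of the ordered data (average of the two middle values here)
most_frequent = multimode(scores)  # list of the most frequently occurring value(s)

print(arithmetic_mean, middle_value, most_frequent)
```

With an even number of observations the median is taken as the average of the two middle values, which is why it need not be a value present in the data.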

The values of the variate which divide the total frequency into four equal parts are called quartiles.


The values of the variate which divide the total frequency into ten equal parts are called deciles.
The values of the variate which divide the total frequency into a hundred equal parts are called percentiles.


Define measure of dispersion

Measures of Dispersion
Dispersion in statistics is a way of describing how spread out a set of data is. When the dispersion of a data set is large, the values in the set are widely scattered; when it is small, the items in the set are tightly clustered. A measure of dispersion shows the deviation or scattering of the data. It tells how the observations vary from one another and gives a clear idea about the distribution of the data. It also shows the homogeneity or heterogeneity of the distribution of the observations.


Methods of dispersion
1. Range and Mean Deviation

2. Quartiles, Quartile Deviation and Coefficient of Quartile Deviation

3. Standard deviation and Coefficient of Variation

List the Characteristics of Measures of Dispersion

1. A measure of dispersion should be rigidly defined
2. It must be easy to calculate and understand

3. Not affected much by the fluctuations of observations

4. Based on all observations


Standard Deviation: The standard deviation (S.D.) is a measure of the dispersion of a set of data from its mean. It is the positive square root of the arithmetic mean of the squared deviations of the values from their arithmetic mean M, that is, the positive square root of the variance.

Variance is the expectation of the squared deviation of a random variable from its mean.
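
As an illustration of these definitions, the sketch below computes the range, variance and standard deviation of a small hypothetical data set. It assumes the population form (dividing by the number of observations), which matches the "arithmetic mean of the squared deviations" wording above; a sample estimate would divide by n - 1 instead.

```python
from statistics import mean, pstdev, pvariance

# Hypothetical observations, used only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

m = mean(data)                            # arithmetic mean M
data_range = max(data) - min(data)        # range: largest minus smallest value
squared_deviations = [(x - m) ** 2 for x in data]
variance_by_hand = sum(squared_deviations) / len(data)

variance = pvariance(data)                # population variance from the library
standard_deviation = pstdev(data)         # positive square root of the variance

print(data_range, variance_by_hand, variance, standard_deviation)
```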

Define probability

It refers to “the chances of occurrence of an event among a large number of possibilities”.

Define Random Experiment:

An experiment or trial that can be repeated under the same conditions any number of times, and for which all possible outcomes are known in advance although the outcome of a particular trial cannot be predicted, is called a “Random Experiment”.

Define Sample Space:

The set of all possible outcomes of a random experiment is known as the “Sample Space” and is denoted by the set S.

Define Event:
An ‘event’ is an outcome of a trial meeting a specified set of conditions.

Define Exhaustive Events:

The total number of all possible elementary outcomes in a random experiment is known as ‘exhaustive events’.

Define Mutually Exclusive Events:

Events are said to be ‘mutually exclusive’ if the occurrence of an event totally prevents occurrence of
all other events in a trial.

Define Equally likely or Equi-probable Events:

Outcomes are said to be ‘equally likely’ if there is no reason to expect one outcome to occur in preference to another, i.e., among all exhaustive outcomes, each has an equal chance of occurrence.

Define Independent Events:

Two or more events are said to be ‘independent’ in a series of trials if the outcome of one event does not affect the outcome of the other, or vice versa.
Example: When a coin is tossed twice, the result of the second toss is in no way affected by the result of the first toss.
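
The coin example can be checked by listing the sample space. The sketch below assumes a fair coin (equally likely outcomes) and uses the product rule P(A and B) = P(A) x P(B) as the test for independence; that rule is a standard result rather than something stated in these notes.

```python
from fractions import Fraction
from itertools import product

# Sample space S for two tosses of a coin: HH, HT, TH, TT.
sample_space = list(product("HT", repeat=2))

def probability(event):
    """Favourable outcomes divided by exhaustive outcomes (equally likely case)."""
    favourable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favourable), len(sample_space))

p_first_head = probability(lambda o: o[0] == "H")
p_second_head = probability(lambda o: o[1] == "H")
p_both_heads = probability(lambda o: o[0] == "H" and o[1] == "H")

# Independent events satisfy P(A and B) = P(A) * P(B).
independent = p_both_heads == p_first_head * p_second_head
print(p_first_head, p_second_head, p_both_heads, independent)
```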

Explain Bayes’ theorem.

Bayes’ theorem is a way of finding a probability when we know certain other probabilities. In other words, it is used to calculate the probability of an event based on its association with another event.

For example, if we know that it is cloudy, then we can more easily judge the chance of rain.

For example, if the probability that someone has cancer is related to their age, using Bayes’ theorem

the age can be used to more accurately judge the probability of cancer than can be done without
knowledge of the age.
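
The notes do not state the formula; in its usual form Bayes’ theorem reads P(A|B) = P(B|A) x P(A) / P(B). The sketch below applies it to a weather illustration. All three input probabilities are hypothetical figures assumed for the example, not data from the notes.

```python
from fractions import Fraction

# Hypothetical figures, assumed only for illustration.
p_rain = Fraction(1, 5)                # P(rain)
p_cloudy_given_rain = Fraction(9, 10)  # P(cloudy | rain)
p_cloudy_given_dry = Fraction(3, 10)   # P(cloudy | no rain)

# Total probability of a cloudy day.
p_cloudy = p_cloudy_given_rain * p_rain + p_cloudy_given_dry * (1 - p_rain)

# Bayes' theorem: P(rain | cloudy) = P(cloudy | rain) * P(rain) / P(cloudy).
p_rain_given_cloudy = p_cloudy_given_rain * p_rain / p_cloudy

print(p_cloudy, p_rain_given_cloudy)
```

Knowing that it is cloudy raises the assumed chance of rain from 1/5 to 3/7 in this illustration.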
Definition: correlation

Correlation is the degree of inter-relatedness (relationship) between two or more variables.

Correlation analysis is a process to find out the degree of relationship between two or more variables by applying various statistical tools and techniques.

It describes how closely two variables are related to each other.

In a distribution, if a change in one variable effects a change in the other variable, the variables are said to be correlated (or there is a correlation between the variables).

Explain Types of Correlation:

The important ways of classifying the correlation are:

1. Positive and Negative

2. Simple , Partial and Multiple

3. Linear and non-Linear.


Positive correlation: Two related variables are positively correlated if, when one increases (decreases), the other also increases (decreases) on average.

Negative correlation: Two variables are negatively correlated if, when one increases (decreases), the other decreases (increases) on average.

Simple correlation:
Correlation is said to be simple when only two variables are analyzed.

For example:
The correlation between demand and supply, or between income and expenditure, is simple.

Partial correlation:
When three or more variables are considered for analysis but only two influencing variables are studied and the remaining influencing variables are kept constant.

For example:
Correlation analysis is done with demand, supply and income, where income is kept constant.

Multiple correlation:
In case of multiple correlation, three or more variables are studied simultaneously.

For example:
Rainfall, production of rice and price of rice studied simultaneously is known as multiple correlation.

Linear correlation:
If a change in one variable tends to produce a change in the other variable in a constant ratio, the correlation is said to be linear.

Non-linear correlation:
If a change in one variable tends to produce a change in the other variable but not in a constant ratio, the correlation is said to be non-linear.
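
The degree and direction of a simple linear correlation can be put into a number. The sketch below uses Karl Pearson's coefficient r, which the notes do not name but which is one common choice; the income, expenditure, price and demand figures are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's coefficient: co-variation of x and y divided by the
    square root of the product of their own variations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    co_variation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variation_x = sum((a - mean_x) ** 2 for a in x)
    variation_y = sum((b - mean_y) ** 2 for b in y)
    return co_variation / sqrt(variation_x * variation_y)

# Hypothetical data, used only for illustration.
income = [10, 20, 30, 40, 50]
expenditure = [8, 15, 22, 33, 41]   # rises with income: positive correlation
price = [1, 2, 3, 4, 5]
demand = [50, 40, 30, 20, 10]       # falls as price rises: negative correlation

r_positive = pearson_r(income, expenditure)
r_negative = pearson_r(price, demand)
print(round(r_positive, 3), round(r_negative, 3))
```

A value of r near +1 or -1 indicates a strong linear relationship; the price and demand figures change in a constant ratio, so r is -1 there.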
Define Time series analysis.

The arrangement of data in accordance with their time of occurrence is a time series. It is the chronological arrangement of data. Here, time is just a way in which one can relate the entire phenomenon to suitable reference points. Time can be hours, days, months or years.
Ex: Values taken by a variable over time (such as daily sales revenue, weekly orders, monthly overheads, yearly income) and tabulated or plotted as chronologically ordered numbers or data points.

List out uses Of Studying Time Series Analysis

1. It helps us to predict the future behaviour of the variable based on past experience.
2. It is helpful for business planning, as it helps in comparing the actual current performance with the expected one.
3. It helps us to study the past performance and behaviour of the phenomenon or the variable under consideration.
4. We can compare the changes in the values of different variables at different times or places, etc.


The four categories of the components of time series are

• Trend
• Seasonal Variations
• Cyclic Variations
• Random or Irregular movements


The trend shows the general tendency of the data to increase or decrease during a long period of time. A trend is a smooth, general, long-term, average tendency. It is not always necessary that the increase or decrease is in the same direction throughout the given period of time. The tendencies may increase, decrease or remain stable in different sections of time, but the overall trend must be upward, downward or stable. Population, agricultural production, items manufactured, number of births and deaths, number of industries or factories, and number of schools or colleges are some examples showing some kind of tendency of movement.
Seasonal Variations: The variations in a time series which operate over a span of less than one year are the seasonal variations. These are rhythmic forces which operate in a regular and periodic manner and have the same or almost the same pattern during a period of 12 months. This variation will be present in a time series if the data are recorded hourly, daily, weekly, monthly or quarterly.

For example, the consumption of ice-cream is generally high during summer, so an ice-cream dealer’s sales are higher in the summer months and relatively lower during the winter months. Employment, output, exports, etc., are subject to change due to variations in weather. Similarly, the sale of garments, umbrellas, greeting cards and fireworks is subject to large variations during festivals like Valentine’s Day, Eid, Christmas, New Year’s, etc. These types of variations in a time series are isolated only when the series is provided biannually, quarterly or monthly. These variations come into play because of either natural forces or man-made conventions. The various seasons or climatic conditions play an important role in seasonal variations: the production of crops depends on the season, the sale of umbrellas and raincoats rises in the rainy season, and the sale of electric fans and air conditioners shoots up in summer.

Cyclic Variations: The variations in a time series which operate over a span of more than one year are the cyclic variations. This oscillatory movement has a period of oscillation of more than a year. One complete period is a cycle. This cyclic movement is sometimes called the ‘Business Cycle’. It is a four-phase cycle comprising the phases of prosperity, recession, depression and recovery. The cyclic variations may be regular but are not strictly periodic. The upswings and the downswings in business depend upon the joint nature of the economic forces and the interaction between them.

Random or Irregular Movements: There is another factor which causes variation in the variable under study. These variations are not regular; they are purely random or irregular. The fluctuations are unforeseen, uncontrollable, unpredictable and erratic. Such forces include earthquakes, wars, floods, famines and other disasters.
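
One simple way to bring out the trend is to average away the shorter-term movements. The sketch below uses a moving average, a technique these notes do not describe; it is shown only as an assumed illustration, and the quarterly sales figures are hypothetical.

```python
def moving_average(series, window):
    """Average of each run of `window` consecutive values in the series."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]

# Hypothetical quarterly sales for two years: a seasonal peak in the third
# quarter of each year sits on top of a slow upward movement.
quarterly_sales = [10, 14, 18, 12, 12, 16, 20, 14]

# A window of 4 quarters spans one full year, so the seasonal swings cancel.
trend = moving_average(quarterly_sales, 4)
print(trend)
```

The averaged values rise steadily, which is the long-term tendency hidden behind the seasonal pattern in the raw figures.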



“A network is, then, a graphical representation of a project plan, showing the inter-relationship of the various activities.”


A project is an interrelated set of activities that has a definite starting and ending point and that results in a unique product or service.


Project management is a scientific way of planning, implementing, monitoring & controlling the
various aspects of a project such as time, money, materials, manpower & other resources.




An activity which does not consume any kind of resource or time but merely shows the technological dependence is called a dummy activity.


When more than one activity comes and joins an event, such an event is known as a merge event.


When more than one activity leaves an event, such an event is known as a burst event.


Slack time for an activity is the difference between its earliest (Ei) and latest (Li) start times, or between its earliest and latest finish times.

Critical path is the sequence of activities between a project’s start and finish that takes the longest time to complete. “A path in a project network is called critical if it is the longest path. The activities lying on the critical path are called the critical activities.”

1. Optimistic time (to) – It is the shortest time in which the activity can be completed.

2. Most likely time (tm) – It is the probable time required to perform the activity.
3. Pessimistic time (tp) – It is the longest estimated time required to perform an activity.
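
The three estimates are usually combined into a single expected time. The notes do not give the formula; the sketch below assumes the standard PERT weighting te = (to + 4tm + tp) / 6 with variance ((tp - to) / 6)^2, and the activity figures are hypothetical.

```python
def pert_estimate(optimistic, most_likely, pessimistic):
    """Expected activity time and variance under the usual PERT weighting."""
    expected = (optimistic + 4 * most_likely + pessimistic) / 6
    variance = ((pessimistic - optimistic) / 6) ** 2
    return expected, variance

# Hypothetical activity: to = 2 days, tm = 5 days, tp = 14 days.
expected_time, time_variance = pert_estimate(2, 5, 14)
print(expected_time, time_variance)
```

The most likely time carries four times the weight of either extreme, so a long pessimistic tail pulls the expected time above tm without dominating it.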






CPM                                        PERT

Non-research projects like civil           Research projects
construction, ship building, etc.

Crashing concept applicable                Crashing concept not applicable

Related with activities of certain time    Related with activities of uncertain time

List out advantages and limitations of PERT/CPM

Advantages:

• Simple to understand and use.
• It provides a graphical display of project activities that helps the users understand the relationships among the activities.
• Shows whether the project is on schedule, or behind or ahead of the schedule.
• Identifies the activities that need closer attention (critical).
• Determines the flexibility available with activities.
• Shows potential risk with activities (PERT).
• Provides good documentation of the project activities.
• Helps to set priorities among activities and to allocate resources as per priority.


Limitations:

• Uncertainty about the estimates of time and resources.
• It is not suitable for relatively simple and repetitive processes, such as assembly-line work, which are fixed-sequence jobs.
• It is also difficult to estimate the activity completion time in a multidimensional project.
• Overemphasis on the critical path.
• Activity time estimates are subjective.
• The allocation of resources cannot be properly monitored.
• The project managers have to spend a lot of time to calculate it carefully.
• The cost of crashing an activity may not be linear.
• It requires a lot of information as input to generate an effective plan. This may prove too



• Looping
• Dangling
• Redundancy

Looping: A looping error is also called a cycling error in a network diagram. Creating an endless loop in a network is known as the error of looping.

Dangling: Whenever an activity is disconnected from the network, it is called a dangling error. Disconnecting an activity before the completion of all activities in a network diagram is known as dangling.

Redundancy: When a dummy activity is introduced although it is not required, it is called a redundancy error. Unnecessarily inserting a dummy activity in the network logic is known as the error of redundancy.

DEFINE ACTIVITY: “An activity is any portion of a project which consumes time or resources and has a definable beginning and ending.”


Project scheduling by PERT/CPM consists of four main steps: planning, scheduling, allocation of resources and controlling.




The planning phase is started by splitting the total project into small projects. These smaller projects are in turn divided into activities and are analyzed by the department or section. The relationship of each activity with respect to other activities is defined and established, and the corresponding responsibilities and authority are also stated. Thus the possibility of overlooking any task necessary for the completion of the project is reduced substantially.


The ultimate objective of the scheduling phase is to prepare a time chart showing the start and finish times for each activity as well as its relationship to other activities of the project. Moreover, the schedule must pinpoint the critical path activities which require special attention if the project is to be completed in time. For non-critical activities, the schedule must show the amount of slack or float time which can be used advantageously when such activities are delayed or when limited resources are to be utilized effectively.

Allocation of resources is performed to achieve the desired objective. A resource is a physical variable such as labour, finance, equipment and space which will impose a limitation on time for the project.

• When resources are limited and conflicting demands are made for the same type of resources, a systematic method for allocation of resources becomes essential.
• Resource allocation usually incurs a compromise, and the choice of this compromise depends on the judgment of managers.


The final phase in project management is controlling. Critical path methods facilitate the application of the principle of management by exception to identify areas that are critical to the completion of the project.

• By having progress reports from time to time and updating the network continuously, a better financial as well as technical control over the project is exercised.
• Arrow diagrams and time charts are used for making periodic progress reports. If required, a new course of action is determined for the remaining portion of the project.


“The slack time or slack of an event in a network is the difference between the latest event time and the earliest event time.”

DEFINE TOTAL FLOAT: “The total activity float is equal to the difference between the earliest and latest allowable start or finish times for the activity in question.” Thus, for an activity (i-j), the total float is given by

Total float (i-j) = (Lj - Ei) - tij,

where Ei is the earliest time of the tail event i, Lj is the latest time of the head event j and tij is the duration of the activity.

DEFINE INDEPENDENT FLOAT: It is computed by subtracting the tail event slack from the free float of an activity.
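
The event times, floats and critical path defined above can be worked out with a forward and a backward pass over the network. The sketch below is a minimal illustration on a hypothetical five-event network; it assumes events are numbered so that every activity runs from a lower to a higher number, takes free float as Ej - Ei - tij (not defined in these notes), and follows the common convention of treating a negative independent float as zero.

```python
# Hypothetical network: (tail event i, head event j) -> duration tij.
activities = {(1, 2): 3, (1, 3): 2, (2, 4): 4, (3, 4): 3, (4, 5): 2}
events = sorted({event for arc in activities for event in arc})

# Forward pass: earliest event time Ej = max(Ei + tij) over incoming activities.
earliest = {e: 0 for e in events}
for (i, j), t in sorted(activities.items()):
    earliest[j] = max(earliest[j], earliest[i] + t)

# Backward pass: latest event time Li = min(Lj - tij) over outgoing activities.
project_duration = earliest[events[-1]]
latest = {e: project_duration for e in events}
for (i, j), t in sorted(activities.items(), reverse=True):
    latest[i] = min(latest[i], latest[j] - t)

event_slack = {e: latest[e] - earliest[e] for e in events}
total_float = {(i, j): latest[j] - earliest[i] - t for (i, j), t in activities.items()}
free_float = {(i, j): earliest[j] - earliest[i] - t for (i, j), t in activities.items()}
# Independent float: free float minus tail event slack, floored at zero.
independent_float = {
    (i, j): max(0, free_float[(i, j)] - event_slack[i]) for (i, j) in activities
}

# Critical activities are those with zero total float.
critical_path = [arc for arc, f in total_float.items() if f == 0]
print(project_duration, critical_path)
```

Activities (1, 3) and (3, 4) each show a total float of 2, so either may slip by up to two time units without delaying the project, whereas any delay on the critical activities lengthens it.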

Define Basic feasible solution:

A feasible solution to a transportation problem is said to be a basic feasible solution if it contains no

more than m + n – 1 non – negative allocations, where m is the number of rows and n is the number
of columns of the transportation problem.

Define Optimal solution:

A feasible solution (not necessarily basic) that minimizes (maximizes) the transportation cost (profit) is
called an optimal solution.

Define Non-degenerate basic feasible solution:

A basic feasible solution to an (m x n) transportation problem is said to be non-degenerate if the total number of non-negative allocations is exactly m + n – 1 (i.e., the number of independent constraint equations), and these m + n – 1 allocations are in independent positions.

Define Degenerate basic feasible solution:

A basic feasible solution in which the total number of non-negative allocations is less than m + n – 1

is called degenerate basic feasible solution. In a transportation problem with m origins and n
destinations if a basic feasible solution has less than m + n – 1 allocations (occupied cells), the

problem is said to be a degenerate transportation problem.
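
The m + n – 1 rule lends itself to a quick check. The sketch below only counts occupied cells; it does not test whether the allocations are in independent positions, and both allocation tables are hypothetical.

```python
def classify_solution(allocation):
    """Compare the number of occupied cells with m + n - 1.

    Only the count is checked; independence of positions is not verified.
    """
    m, n = len(allocation), len(allocation[0])
    occupied = sum(1 for row in allocation for cell in row if cell > 0)
    required = m + n - 1
    if occupied == required:
        return "non-degenerate"
    if occupied < required:
        return "degenerate"
    return "not basic"

# Hypothetical 3 x 4 allocation tables (rows = origins, columns = destinations).
full_table = [
    [5, 2, 0, 0],
    [0, 6, 3, 0],
    [0, 0, 4, 14],
]
sparse_table = [
    [5, 0, 0, 0],
    [0, 8, 0, 0],
    [0, 0, 7, 14],
]

full_result = classify_solution(full_table)      # 6 occupied cells, 3 + 4 - 1 = 6
sparse_result = classify_solution(sparse_table)  # 4 occupied cells, fewer than 6
print(full_result, sparse_result)
```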

Define Linear Programming

Linear programming is a mathematical programming or modeling technique which is used to find the best or optimal solution to a problem that requires a decision or set of decisions about how best to use a set of limited resources to achieve a stated goal or objective.

It is a mathematical modeling technique used to determine a level of operational activity in order to achieve an objective.


1. limited resources

2. objective
3. linearity

4. homogeneity
5. divisibility

1. Limited resources: a limited amount of labour, materials, equipment and finance.
2. Objective: refers to the aim to optimize (maximize the profits or minimize the costs).
3. Linearity: an increase in labour input will have a proportionate increase in output.
4. Homogeneity: the products, workers' efficiency and machines are assumed to be identical.
5. Divisibility: it is assumed that resources and products can be divided into fractions. (In case fractions are not possible, like production of one-third of a computer, a modification of linear programming called integer programming can be used.)

Explain the steps in formulation of LPP

Steps in Formulation of LP

• Identify the decision variables;

• Formulate the objective function; and

• Identify and formulate the constraints.

Objective function:
The objective of the problem is identified and converted into a suitable objective function. The objective function represents the aim or goal of the system, expressed in terms of the decision variables, which has to be determined from the problem. Generally, the objective in most cases will be either to maximize resources or profits, or to minimize the cost or time.

Constraints:
When resources are available in surplus, there is no problem in making decisions. But in real life, organizations normally have scarce resources within which the job has to be performed in the most effective way. Therefore, problem situations are within confined limits in which the optimal solution to the problem must be found.

Non-negativity constraint:
Negative values of physical quantities are impossible, like producing a negative number of chairs, tables, etc., so it is necessary to include non-negativity as a constraint.
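
The three formulation steps can be followed on a small two-variable problem. The sketch below is an assumed illustration, not a method given in these notes: the product-mix figures are hypothetical, and the optimum is found by checking the corner points of the feasible region, which works for small problems because a linear objective attains its best value at a corner.

```python
from itertools import combinations

# Step 1, decision variables: x and y, the quantities of two products.
# Step 2, objective function: maximize profit 3x + 5y.
objective = (3, 5)
# Step 3, constraints: each (a, b, c) means a*x + b*y <= c.
constraints = [
    (1, 0, 4),    # x <= 4
    (0, 2, 12),   # 2y <= 12
    (3, 2, 18),   # 3x + 2y <= 18
    (-1, 0, 0),   # non-negativity: x >= 0
    (0, -1, 0),   # non-negativity: y >= 0
]

def corner_points(constraints, tol=1e-9):
    """Intersect every pair of constraint lines and keep the feasible points."""
    points = []
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:
            continue  # parallel lines never intersect
        x = (c1 * b2 - c2 * b1) / det  # Cramer's rule
        y = (a1 * c2 - a2 * c1) / det
        if all(a * x + b * y <= c + tol for a, b, c in constraints):
            points.append((x, y))
    return points

def profit(point):
    return objective[0] * point[0] + objective[1] * point[1]

best_point = max(corner_points(constraints), key=profit)
best_value = profit(best_point)
print(best_point, best_value)
```

For problems with more than a handful of variables, a dedicated solver based on the simplex method or similar would be the practical choice.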

What are the Advantages & disadvantages of LP

Advantages of LP
• It helps decision-makers to use their productive resources effectively.
• The decision-making approach of the user becomes more objective and less subjective.
• It highlights bottlenecks that may occur in a production process.

Disadvantages of LP
1. Linear programming deals with only a single objective, whereas real-life situations may have multiple and conflicting objectives.
2. It is not easily used when there are many decision variables or factors.
3. LP can be used only when the constraints and objective function are linear, i.e., where they can be expressed as equations which represent straight lines.
4. If the constraints or objective function are not linear, this technique cannot be used.
5. Factors such as uncertainty and time are not taken into consideration.
6. Parameters in the model are assumed to be constant, but in real-life situations they are not.
