
Statistics is an art and science that deals with the collection, organization, creative presentation, analysis, and interpretation of data.
Uses of Statistics
• In education, statistics can be used to assess students’ performance and to correlate factors affecting teaching and learning in order to improve the quality of education.
• In psychology, statistics is used to determine attitudinal patterns and the causes and effects of misbehaviour.
• In business and economics, statistics is used to analyze a wide range of data such as sales, outputs, price, indices, revenues, costs, inventories, accounts, etc.
• In research and experimentation, statistics is used to validate or test a claim or inference about a group of people or objects, or a series of events.
• In the field of medicine, statistics is used to collect information about patients and diseases and to make decisions about the use of new drugs or treatments.
• Meteorologists use statistics to find patterns in the weather and make predictions about what the future will be like.
Fields of Statistics
1. Descriptive Statistics is concerned with the methods of collecting, organizing, and presenting data appropriately and creatively to describe or assess group characteristics. This includes measures of location (mean, median, mode, quartiles, deciles, percentiles), measures of variability (range, variance, standard deviation, coefficient of variation), and measures of shape (skewness and kurtosis).
2. Inferential Statistics is concerned with inferring or drawing conclusions about the population based on pre-selected elements of that population. This includes testing for significant differences and independence between two or more variables. The Z-test, T-test, and Chi-Square test are some of the test statistics used in testing hypotheses.
Constants and Variables
• Constants refer to the fundamental quantities that do not change in value.
Fixes is an example of a constant.
• Variables are quantities that may take any one of a specified set of values.

Variables can be classified as:


1. Qualitative Variable (Categorical)
2. Quantitative Variable (Numerical)
Qualitative Variables are non-measurable characteristics that cannot assume a numerical value but can be classified into two or more categories:
1. Dichotomous Variables are variables which may take one of two values.
Example : Gender: Male or Female
2. Trichotomous Variables are variables which may take one of three values.
Example : In an opinion poll: “For”, “Against”, or “Undecided”
3. Multinomous Variables are variables which may take one of several values.
Example : Smoking habits: “Very Often”, “Often”, “Seldom”, “Very Seldom”, or “Never”
The data obtained about a qualitative variable are called qualitative data.
Quantitative Variables are quantities that can be counted, measured, or computed with a mathematical formula. Data involving quantitative variables are called quantitative data.
Quantitative variables are classified as discrete or continuous.
1. Discrete Variables consist of
variates (actual values) usually
obtained by counting. Hence, they are
represented by the counting numbers
or whole numbers.
Example :
a. number of students in Statistics class
b. number of courses enrolled by the
students.
2. Continuous Variables are obtained by
measurements usually with units.
Example :
a. height in meters
b. weight in kilograms
Continuous variables are also obtained by evaluating values using a formula.
Example :
a. IQ
b. Final Grades
Variables also refer to any observable characteristics or attributes of a group of objects, individuals, or events. Variables having a cause-effect relationship are called independent (exogenous) variables and dependent (endogenous) variables.
• Dependent Variables are variables whose values are predicted.
• Independent Variables are used as predictors when the objective is to predict the value of one variable on the basis of the other.
Data and Information
• Data usually refers to facts concerning things such as the status in life of people, the defectiveness of objects, or the effect of an event on society.
• Information is a set of data that has been processed and presented in a form suitable for human interpretation, usually with the purpose of revealing trends or patterns about the population.
Sources of Data
• Primary Sources are those from which first-hand information is obtained, usually by means of personal interview and actual observation.
• Secondary Sources are those from which information is taken from others’ works, such as news reports, readings, books, research papers, and theses.
Scale of Measuring Data
1. Nominal Scale classifies objects or
people’s responses so that all of those
in a single category are equal with
respect to some attributes and then
each category is coded numerically.
Example :
Civil Status:
1 – Single 2 – Married
3 – Separated 4 – Widow (er)
2. Ordinal Scale classifies objects or individuals’ responses according to degree or level; each level is then coded numerically.
Example :
Customer Satisfaction
5 – Excellent 4 – Very Satisfactory
3 – Satisfactory 2 – Fair
1 – Poor
3. Interval Scale refers to quantitative
measurements in which lower and upper
control limits are adapted to classify
relative order and differences of item
numbers or actual scores.
Example :
Gross Income
15,000 – 19,000+; 10,000 – 14,000+; 5,000 – 9,000+
4. Ratio Scale takes into account the
interval size and ratio of two related
quantities, which are usually based on a
standard measurement.
Example :
Weights, time, height, rate of change in
production, and return on investments
are measured with the use of ratio scale.
Methods of Collecting Data
1. Direct or Interview Method is a person-to-person interaction between an interviewer and an interviewee. Tape-recorded or written interviews help the researcher obtain exact information from the interviewee.
2. Indirect or Questionnaire Method is an alternative to the interview method. Written responses are obtained by distributing questionnaires to the respondents by mail or by hand. A questionnaire is a list of questions intended to elicit answers to a given problem; the questions must be given in a logical order and must not be too personal.
3. Registration Method is enforced by
private organizations or government
agencies for recording purposes.
4. Observation Method is a scientific method of investigation that makes possible use of all the senses to measure or obtain outcomes/responses from the object of study.
5. Experimentation is used when the objective is to determine the cause-and-effect relationship of a certain phenomenon under some controlled conditions.
Population and Sample
• Population is a finite or infinite collection of objects, events, or individuals with a specified class or characteristics under consideration.
Examples :
a. All college students, say 2500, of the College of Immaculate Conception
b. Registered tricycle drivers in Cabanatuan, say 450.
• Sample is a finite or limited collection of objects, events, or individuals selected from the population.
Examples :
a. 1000 of the 2500 college students
b. 200 of the 450 tricycle drivers
Slovin’s Formula in Determining
the Sample Size
n = N / (1 + Ne²)
Where n = sample size
N = population size
e = margin of error (1% to 10%)
Example :
1. Find the sample size for a population of 500 with a 5% margin of error.

n = N / (1 + Ne²) = 500 / (1 + 500(0.05²)) = 500 / (1 + 1.25) = 500 / 2.25 = 222.22
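The computation above can be sketched in Python. This is a minimal illustration only: the function name `slovin_sample_size` is a hypothetical choice, and the result is left unrounded (in practice it would be rounded to a whole number of respondents).

```python
def slovin_sample_size(population_size: int, margin_of_error: float) -> float:
    """Slovin's formula: n = N / (1 + N * e**2).

    margin_of_error is given as a decimal, e.g. 0.05 for 5%.
    """
    return population_size / (1 + population_size * margin_of_error ** 2)


# Population of 500 with a 5% margin of error, as in the example.
n = slovin_sample_size(500, 0.05)
print(round(n, 2))  # 222.22
```

Note that the margin of error must be entered as 0.05, not 0.5, for a 5% margin.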

Note : The researcher should be aware of the LAW OF LARGE NUMBERS, which states that:

“The larger the size of the sample, the more certain we can be that the sample mean will be a good estimate of the population mean.”

The larger the size of the sample, the closer its characteristics will be to the characteristics of the entire population.
Census
Complete enumeration, or so-called census taking, is a vital tool if the information to be gathered will be used for administrative purposes and if it is of local or national concern. (A national census is held only once every decade because it requires a large investment of money and manpower.)
Sampling Techniques
There are different techniques for determining the sample of a given population. These could be:
1. Random Sampling Technique (Probability Sampling Technique)
2. Non-Random Sampling Technique (Non-Probability Sampling Technique)
1. Random Sampling is the most commonly used sampling technique, in which each member of the population is given an equal chance of being selected in the sample. It is usually called fair sampling.
2. Non-Random Sampling is a method of collecting a small portion of the population in which not all members of the population are given the chance to be included in the sample. It is usually called biased sampling.
Properties of Random Sampling
1. Equiprobability – means that each
member of the population has an equal
chance of being selected and included
in the sample.
2. Independence – means that the
chance of one member being drawn
does not affect the chance of the other
member.
Types of Random Sampling
1. Restricted Random Sampling – involves certain restrictions intended to improve the validity of the sampling. This design is applicable only when the population being investigated requires homogeneity.
• A study on the effectiveness of a new drug can be conducted on two groups of animals, the control group and the experimental group.
• The animals that belong to the control group will not be treated with the new drug.
• The selection of a sample of paired animals should be restricted according to their degree of illness so that a significant difference between the two groups can be accepted as valid.
2. Unrestricted Random Sampling is considered the best random sampling design because no restrictions are imposed and every member of the population has an equal chance of being included in the sample.
Random Sampling Techniques
1. Lottery or Fishbowl Sampling. This is done by simply writing the names or numbers of all the members of the population on small rolled pieces of paper, which are later placed in a container. The researcher shakes the container thoroughly, then draws n out of N pieces of paper as desired for a sample.
2. Sampling with the use of a Table of Random Numbers. If the population is large, a more practical procedure is to use the Table of Random Numbers, which contains rows and columns of digits randomly ordered by a computer. A sample of size n can be generated by choosing an arbitrary starting point in the Table, for instance by closing your eyes and haphazardly pointing at an entry. Then proceed in any direction (vertically, horizontally, or diagonally) until n distinct numbers are obtained; these numbers represent the numerically coded elements of the population.
3. Systematic Sampling. This method of sampling is done by taking every kth element in the population. It applies to a group of individuals arranged in a waiting line or in some methodical manner. For instance, if the objective is to get the opinion of employees regarding employee-management relations, a sample of size n will be selected from a list of employees arranged alphabetically or according to age, experience, position, or academic rank. By systematic sampling, every kth employee in the listed order will be included in the sample. If N is known, the value of k can be calculated as:

k = N/n where N = population size and n = sample size
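A minimal Python sketch of systematic sampling with k = N/n follows. The random starting position within the first k elements and the truncation of k to a whole number are assumptions added for illustration, and the employee list is hypothetical.

```python
import random


def systematic_sample(population, n, rng=None):
    """Take every k-th element, where k = N // n.

    Assumption: the start is chosen at random among the first k positions.
    """
    rng = rng or random.Random()
    k = len(population) // n      # sampling interval k = N/n (integer part)
    start = rng.randrange(k)      # random starting index in 0..k-1
    return [population[start + i * k] for i in range(n)]


# Hypothetical ordered list of N = 450 employees; we want n = 50, so k = 9.
employees = [f"employee_{i:03d}" for i in range(1, 451)]
sample = systematic_sample(employees, 50, random.Random(1))
print(len(sample))  # 50
```

Because the largest index used is start + (n - 1)k, which never exceeds N - 1, the selection always stays inside the list.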
4. Stratified Random Sampling. When the population can be partitioned into several strata or subgroups, it may be wiser to employ the stratified technique to ensure that each group is represented in the sample. Random samples are selected from each stratum. Selecting a sample with this technique is quite difficult and costly since it requires a complete listing, called a frame, of all elements in the population.
There are two kinds of stratified
random sampling:
a. Simple Stratified Random Sampling.
When the population is grouped into
more or less homogeneous classes, that
is, different groups but with a relatively
common characteristic, then each can
be sampled independently by taking
equal number of elements from each
stratum. This method is called simple
stratified random sampling.
Example : 800 students can be grouped according to year level. Then, using simple random sampling, 50 students can be taken randomly from each of the 4 groups, which comprises a sample size of 200 students.
Population     No. of Students   Sample
Fourth Year    185               50
Third Year     200               50
Second Year    215               50
First Year     200               50
Totals         N = 800           n = 200
b. Stratified Proportional Random Sampling. In some cases, the characteristic of the population is such that the proportions of the subgroups are grossly unequal. The researcher may wish to maintain these proportions in the sample with the use of the stratified proportional technique.
Population     No. of Students   Proportion   Sample
Fourth Year    120               15.0%        30
Third Year     200               25.0%        50
Second Year    220               27.5%        55
First Year     260               32.5%        65
Totals         N = 800           100%         n = 200
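The proportional allocation in the table above can be sketched as follows. The function name `proportional_allocation` is illustrative, and rounding each stratum to the nearest whole number is an assumption; with other figures the rounded counts may not add up exactly to n.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Allocate the sample to each stratum in proportion to its size."""
    total = sum(strata_sizes.values())
    return {
        name: round(sample_size * size / total)  # stratum share of n
        for name, size in strata_sizes.items()
    }


# Year-level sizes from the table above (N = 800), with n = 200.
strata = {"Fourth Year": 120, "Third Year": 200,
          "Second Year": 220, "First Year": 260}
allocation = proportional_allocation(strata, 200)
print(allocation)
```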
Non-Random Sampling
Techniques
1. Judgment or Purposive Sampling. This method is also referred to as non-random or non-probability sampling. The researcher's judgment plays a major role in the selection of a particular item and/or in making decisions in cases of incomplete responses or observations. The selection is usually based on certain criteria laid down by the researcher or his adviser.
2. Quota Sampling. This is a relatively quick and inexpensive method to operate since the choice of the number of persons or elements to be included in a sample is made at the researcher’s own convenience or preference and is not predetermined by some carefully operated randomizing plan.
3. Cluster Sampling. This is sometimes referred to as area sampling because it is usually applied on a geographical basis. The population is grouped into clusters or small units such as blocks or districts in a city or municipality. Area sampling usually requires larger samples of elementary units than those required in simple random sampling. It is not common practice, however, for every individual located in a selected area to be interviewed.
4. Incidental Sampling. This design is
applied to those samples which are
taken because they are the most
available. The investigator simply takes
the nearest individuals as subjects of the
study until it reaches the desired size. In
an interview, for instance, an interviewer
can simply choose to ask those people
around him or in a coffee shop where he
is taking a break.
5. Convenience Sampling. This method
has been widely used in television and
radio programs to find out opinions of TV
viewers and listeners regarding a
controversial issue. While the issue is
being discussed in a talk show, the hosts
will immediately get responses and
comments from those who will call their
telephone operators. This method, of
course, is biased against those without
telephones in their houses.
Organization and Presentation
of Data
The bulk of the data collected from primary and secondary sources is still considered raw data. Raw data require manual tallying and classifying of responses taken from taped interviews, answered questionnaires, furnished registration forms, recorded observations, and results from an experiment. After these manual enumerations, an appropriate form of organization and presentation should be used in order to arrive at a meaningful interpretation of the data.
Forms of Presentation of Data
1. Textual Presentation. This form of
presentation combines text and numerical
facts in a statistical report.
2. Tabular Presentation. This form of
presentation is better than textual form
because it provides numerical facts in a more
concise and systematic manner. Statistical
tables are constructed to facilitate the analysis
of relationships. Each class/subclass is
assigned to a particular row or column and
figures for various classifications are noted in
appropriate cells.
3. Graphical Presentation. This form is the most effective means of organizing and presenting statistical data because the important relationships are brought out more clearly and creatively in visually solid and colourful figures.
Different Kinds of Graphs/Charts
1. Line Graph. It shows relationships between two sets of quantities. This is done by plotting points of the X set of quantities along the horizontal axis against the Y set of quantities along the vertical axis in a Cartesian coordinate plane. The plotted points are then connected by line segments, which form the line graph. It is often used to predict growth trends over a longer period of time.
2. Bar Graph. It consists of bars or rectangles of equal widths, drawn either vertically or horizontally, segmented or non-segmented. This is done by drawing rectangles with lengths proportional to the frequencies of observed items or the magnitudes of the classes under study. Two or more kinds of information can be compared by showing them in multiple bar graphs, each shaded with a different colour to distinguish it from the others. In some cases, bars can be shown in opposite directions above and below a zero line to illustrate profits/earnings (positive) and losses/deficits (negative).
3. Circle Graph or Pie Chart. It
represents relationships of the different
components of a single total as revealed
in the sectors of a circle. The angles or
size of the sectors should be
proportional to the percentage
components of the data which give a
total of 100%. Colors, legends, and
cross hatching will be useful in
identifying each component.
4. Picture Graph or Pictogram. It is a
visual presentation of statistical quantities
by means of drawing pictures or symbols
related to the subject under study. Sizes
and magnitudes of drawn pictures should
be clear enough to depict differences.
Legends are sometimes used to represent
magnitude of a single unit of the picture
then repetitions of this picture are drawn to
indicate differences in quantity.
5. Map Graph or Cartogram. It is one of the best ways to present geographical data. This kind of graph is always accompanied by a legend which tells us the meaning of the lines, colors, or other symbols used and positioned in a map.
6. Scatter Point Diagram. It is a graphical device to show the degree of relationship between two quantitative variables. Unlike the line graph, the plotted points for every pair of X and Y quantities are not connected by line segments but are simply scattered on the Cartesian coordinate plane.
Frequency Distribution
• A frequency distribution is a tabulation or grouping of data into appropriate categories showing the number of observations in each group or category.
Consider the data below, which show the scores of 60 students in a Statistics test.
5 13 8 6 13 10 5 13 15 16
8 12 15 10 12 16 12 9 3 7
11 15 11 7 15 2 13 5 9 12
13 9 12 9 9 14 12 11 19 13
16 18 3 13 18 10 15 14 18 11
10 12 6 9 5 17 9 6 9 18

The numbers shown above are called raw data.


Parts of a Frequency Table
1. Class Limits. Groupings or categories
defined by lower and upper limits.
Examples :
16 – 20
21 – 25
26 – 30
Lower class limits are the smallest numbers that belong to the different classes.
Upper class limits are the highest numbers that belong to the different classes.
2. Class Size. It is the width of each class interval.
Example :
Lower Limit   Upper Limit   Items in the Class    Class Size
16            20            16, 17, 18, 19, 20    5
21            25            21, 22, 23, 24, 25    5
26            30            26, 27, 28, 29, 30    5
3. Class Boundaries. These are numbers used to separate classes without the gaps created by class limits. The number to be added to each upper limit and subtracted from each lower limit is half the difference between the upper limit of one class and the lower limit of the next class.
Example:
Class Intervals Class Boundaries
16 – 20 15.5 – 20.5
21 – 25 20.5 – 25.5
26 – 30 25.5 – 30.5
31 – 35 30.5 – 35.5
4. Class Marks. These are the midpoints of the classes. They can be found by adding the lower and upper limits and then dividing the sum by 2.
Example :
Class Interval Class Marks (x)
16 – 20 18
21 – 25 23
26 – 30 28
31 – 35 33
Steps in Constructing a
Frequency Distribution Table
1. Find the range of the values.
Range = Highest Value – Lowest Value
Example: range = 19 – 2 = 17
2. Determine the class size or class width by dividing the range by the desired number of class intervals.
Class Size = Range / Desired number of class intervals
For instance, with a range of 17 and 6 desired class intervals:
Class Size = 17 / 6 = 2.83 ≈ 3
3. Set up the class limits of each class. Each class is defined by a lower limit and an upper limit. When constructing the classes with the chosen class width, make sure the highest observation is part of the highest class interval.
4. Set up the class boundaries, or true limits, of each class. Each class is defined by a lower class boundary and an upper class boundary.
5. Tally the scores in the appropriate classes, then count the tallies for each class to obtain the frequency.
6. Solve for the class mark or midpoint (x) of each class. This is obtained by adding the lower class limit and the upper class limit, then dividing by 2.
Example of Frequency Distribution Table:

Class Limits   Class Boundaries   Frequency (f)   Class Mark (x)
2 – 4          1.5 – 4.5          3               3
5 – 7          4.5 – 7.5          9               6
8 – 10         7.5 – 10.5         14              9
11 – 13        10.5 – 13.5        18              12
14 – 16        13.5 – 16.5        10              15
17 – 19        16.5 – 19.5        6               18
Total                             60
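The construction steps can be checked with a short Python sketch that rebuilds the table from the 60 raw scores. The function name `frequency_table` and the dictionary keys are illustrative; the sketch assumes whole-number scores, so boundaries lie 0.5 beyond the class limits.

```python
scores = [
    5, 13, 8, 6, 13, 10, 5, 13, 15, 16,
    8, 12, 15, 10, 12, 16, 12, 9, 3, 7,
    11, 15, 11, 7, 15, 2, 13, 5, 9, 12,
    13, 9, 12, 9, 9, 14, 12, 11, 19, 13,
    16, 18, 3, 13, 18, 10, 15, 14, 18, 11,
    10, 12, 6, 9, 5, 17, 9, 6, 9, 18,
]


def frequency_table(data, lowest_limit, class_size, num_classes):
    """Build class limits, boundaries, frequencies and class marks."""
    table = []
    for i in range(num_classes):
        lower = lowest_limit + i * class_size
        upper = lower + class_size - 1
        table.append({
            "limits": (lower, upper),
            "boundaries": (lower - 0.5, upper + 0.5),
            "f": sum(1 for x in data if lower <= x <= upper),  # tally
            "mark": (lower + upper) / 2,
        })
    return table


# Range = 19 - 2 = 17, six classes, class size 3, starting at the lowest score.
table = frequency_table(scores, lowest_limit=min(scores), class_size=3, num_classes=6)
for row in table:
    print(row["limits"], row["boundaries"], row["f"], row["mark"])
```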
Cumulative Frequency Distribution
The less than cumulative frequency distribution (<cf) is obtained by adding the frequencies successively from the lowest to the highest class interval, while the more than cumulative frequency distribution (>cf) is obtained by adding the frequencies from the highest class interval to the lowest class interval.
Example :

Class Interval   f    <cf   >cf
2 – 4            3    3     60
5 – 7            9    12    57
8 – 10           14   26    48
11 – 13          18   44    34
14 – 16          10   54    16
17 – 19          6    60    6
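Both cumulative columns can be computed with running sums, as in this minimal sketch. The variable names are illustrative; the frequencies are those of the six classes in the example.

```python
from itertools import accumulate

frequencies = [3, 9, 14, 18, 10, 6]  # classes 2-4 up to 17-19

# Less-than cumulative frequency: running total from the lowest class upward.
less_than_cf = list(accumulate(frequencies))

# More-than cumulative frequency: running total from the highest class downward,
# then reversed so it lines up with the class order.
more_than_cf = list(accumulate(reversed(frequencies)))[::-1]

print(less_than_cf)  # [3, 12, 26, 44, 54, 60]
print(more_than_cf)  # [60, 57, 48, 34, 16, 6]
```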
Relative Frequency Distribution
The relative frequency of a class is the frequency of the class divided by the total frequency of all classes, and is generally expressed as a percentage.

Relative Frequency = frequency of each class interval / total number of observations
Example :
Class Interval   f    rf      rf (%)   <crf (%)   >crf (%)
2 – 4            3    0.05    5%       5%         100%
5 – 7            9    0.15    15%      20%        95%
8 – 10           14   0.233   23.3%    43.3%      80%
11 – 13          18   0.30    30%      73.3%      56.7%
14 – 16          10   0.167   16.7%    90%        26.7%
17 – 19          6    0.10    10%      100%       10%
Total            60
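The percentage columns can be reproduced with a few lines of Python. This is a sketch with illustrative variable names; percentages are rounded to one decimal place, as in the table.

```python
from itertools import accumulate

frequencies = [3, 9, 14, 18, 10, 6]
total = sum(frequencies)  # 60 observations

# Relative frequency of each class, as a percentage.
rf_percent = [round(100 * f / total, 1) for f in frequencies]

# Less-than and more-than cumulative relative frequencies, as percentages.
less_than = [round(100 * cf / total, 1) for cf in accumulate(frequencies)]
more_than = [round(100 * cf / total, 1)
             for cf in accumulate(reversed(frequencies))][::-1]

print(rf_percent)  # [5.0, 15.0, 23.3, 30.0, 16.7, 10.0]
print(less_than)   # [5.0, 20.0, 43.3, 73.3, 90.0, 100.0]
print(more_than)   # [100.0, 95.0, 80.0, 56.7, 26.7, 10.0]
```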
Graphical Presentation of the
Frequency Distribution
The following graphs can be constructed to present frequency
distribution:
1. Histogram. It consists of a set of rectangles having bases on a horizontal axis, with centers at the class marks, widths equal to the class interval sizes, and heights proportional to the class frequencies.
2. Frequency Polygon. It is constructed by plotting class frequencies against class marks and connecting the consecutive points by straight lines.
3. Ogive. It is obtained by plotting the cumulative frequencies (less than or more than) against the class boundaries and connecting the plotted points.
[Figures: Histogram, Frequency Polygon, Ogives]
Measures of Central Tendency
Descriptive measures that are used to indicate where the center, the middle, or the most typical value of a set of data lies are called measures of central tendency.
The three most important measures of
central tendency:
1. mean
2. median
3. mode
Arithmetic Mean
• The most commonly used measure of central tendency is the arithmetic mean.
• It is also called the mean or the computed average.
• It is defined as the sum of the values of a group of items divided by the number of such items.
• The mean of a sample of scores on a variable X is symbolized by X̅ (x-bar), and the mean of a population is symbolized by µ (mu).
• Most of the time, researchers are forced to estimate µ from X̅, since they cannot measure every item in the population.
Characteristics of the Mean
1. The mean is a reliable and more stable measure to use when sample data are being used to make inferences about populations.
2. It is the point which balances all the values on either side.
3. The mean is sensitive to, or greatly affected by, extreme values, high or low, and this makes it an inappropriate average to use when the distribution is highly skewed.
4. In a highly skewed distribution, it loses its representative quality.
5. The mean cannot be computed when the distribution contains open-ended intervals, in the absence of additional information.
Uses of the Mean
• The mean is the most commonly used, easily understood, easily calculated, and generally recognized average.
• It is the best measure to use when the distribution is symmetrical.
• It is a useful measure for inferential statistics.
• It is also used to obtain an average value of a series of values after each item is weighted. This is referred to as the weighted average.
Mean for Ungrouped Data
For ungrouped data, the mean is computed by simply adding all the values and dividing the sum by the total number of items.
Formula of Sample Mean
X̅ = ΣX / n

Where:
X̅ = Sample mean
Σ = The sum of
ΣX = Sum of all the values (data)
n = Number of items added
Formula for the Population Mean
µ = ΣX / N

Where :
µ = Population mean
Σ = The sum of
ΣX = Sum of all the values (data)
N = Number of items in the population
Examples :
1. Mr. Santiago obtained the following grades: 88, 87, 85, 80, 76, 95, 92, 78, 82, 81 and 89. Find his average grade.

Find ΣX : 88 + 87 + 85 + 80 + 76 + 95 + 92 + 78 + 82 + 81 + 89 = 933

Solve for the average:

X̅ = ΣX / n = 933 / 11 = 84.82
2. A random sample of six cashiers in a NE Emporia shows the following sales at the end of the day: Php 25,123.78; Php 30,943.89; Php 34,657.90; Php 102,003.75; Php 85,756.25 and Php 54,690.45. Find their mean sales.
Find ΣX :
  Php  25,123.78
  Php  30,943.89
  Php  34,657.90
  Php 102,003.75
  Php  85,756.25
+ Php  54,690.45
  Php 333,176.02
Solve for the mean sales:
X̅ = ΣX / n = Php 333,176.02 / 6 = Php 55,529.34
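The sample mean formula translates directly into code. The sketch below reuses the six sales figures from the example; the helper name `sample_mean` is illustrative.

```python
def sample_mean(values):
    """Sample mean: sum of the values divided by the number of values."""
    return sum(values) / len(values)


# Daily sales (Php) of the six cashiers in the example.
sales = [25123.78, 30943.89, 34657.90, 102003.75, 85756.25, 54690.45]
print(round(sample_mean(sales), 2))  # 55529.34
```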
Weighted Mean
The weighted mean of a set of values is the sum of the values multiplied by their corresponding weights, divided by the sum of the weights.
Formula :
X̅ = ΣwX / Σw
Where :
X̅ = weighted mean
Σ = the sum of
w = weight
X = observed value
ΣwX = sum of the observed values multiplied by their corresponding weights
Σw = sum of the weights
Example :
1. Miss Ruiz got the following grades at the end of the semester:

Courses        Grade (X)   Unit (w)
English 101    88          3
Filipino 101   85          3
Nat Sci 101    95          3
CL 101         96          2
PE 101         90          2
Acctg 101      98          6
To solve for the weighted mean:

Courses        Grade (X)   Unit (w)   (w)(X)
English 101    88          3          264
Filipino 101   85          3          255
Nat Sci 101    95          3          285
CL 101         96          2          192
PE 101         90          2          180
Acctg 101      98          6          588
Totals                     Σw = 19    ΣwX = 1,764

X̅ = ΣwX / Σw = 1,764 / 19 = 92.84
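A short Python sketch of the weighted mean, using the grades and units from the example. The function name `weighted_mean` and the (grade, units) pair layout are illustrative choices.

```python
def weighted_mean(pairs):
    """pairs: iterable of (value, weight). Returns sum(w * x) / sum(w)."""
    pairs = list(pairs)
    total_weight = sum(w for _, w in pairs)
    return sum(x * w for x, w in pairs) / total_weight


# (grade, units) for each course in the example.
grades = [(88, 3), (85, 3), (95, 3), (96, 2), (90, 2), (98, 6)]
print(round(weighted_mean(grades), 2))  # 92.84
```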
Mean for Grouped Data
Data which are arranged in a frequency
distribution are called grouped data.
Observations belonging to each class
interval are represented by the class
mark of the interval.
There are two methods to compute the
mean for the grouped data:
1. long method
2. coded method
Long Method Formula
X̅ = ΣfX / n for a sample

µ = ΣfX / N for a population
Where :
X̅ = sample mean
µ = population mean
f = frequency or number of observations in a class
X = class mark or midpoint of a class
n = sample size or the total frequency in the sample
distribution
N = population size or the total frequency in the
population distribution
Example :
1. Compute for the mean height of 50
persons in the following frequency
distribution using the long method.
Height in Inches Frequency (f)
61 – 63 2
64 – 66 5
67 – 69 12
70 – 72 15
73 – 75 8
76 – 78 5
79 – 81 3
Solution :
Height in Inches   Frequency (f)   Class Mark (X)   fX
61 – 63            2               62               124
64 – 66            5               65               325
67 – 69            12              68               816
70 – 72            15              71               1,065
73 – 75            8               74               592
76 – 78            5               77               385
79 – 81            3               80               240
Totals             Σf = n = 50                      ΣfX = 3,547

X̅ = ΣfX / n = 3,547 / 50 = 70.94
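The long method can be sketched in Python as follows. The (lower limit, upper limit, frequency) tuple layout and the function name `grouped_mean` are assumptions made for illustration.

```python
def grouped_mean(classes):
    """Long method: sum of f * class mark divided by the total frequency.

    classes: list of (lower_limit, upper_limit, frequency).
    """
    n = sum(f for _, _, f in classes)
    sum_fx = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return sum_fx / n


# Heights (inches) of 50 persons from the example.
heights = [(61, 63, 2), (64, 66, 5), (67, 69, 12), (70, 72, 15),
           (73, 75, 8), (76, 78, 5), (79, 81, 3)]
print(round(grouped_mean(heights), 2))  # 70.94
```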
Coded Formula
This formula requires coding and is called the coded formula for the mean.
Procedures for Coded Formula
1. Take the class mark of one of the class intervals as an assumed mean and denote it by X0. This X0 is set as the zero origin. (Usually this is the class interval with the highest frequency.)
2. The classes following the class containing the origin are coded +1, +2, +3, . . . as deviations (d) from 0, and the classes prior to the class containing the origin are coded -1, -2, -3, . . .
3. Multiply the coded values by the corresponding frequencies and find the sum.
4. Divide the sum by the total frequency and multiply the result by the size of the class interval.
5. Add the result to the class mark of the assumed mean.
Coded Formula :

X̅ = X0 + (Σfd / n) C
Where :
X̅ = mean
X0 = class mark of the assumed mean
f = frequency of the class interval
d = coded deviation of the class from the assumed mean
n = total frequency
C = class size
Example :
1. Consider again the frequency
distribution of the heights of 50 persons.
Height in Inches Frequency (f)
61 – 63 2
64 – 66 5
67 – 69 12
70 – 72 15
73 – 75 8
76 – 78 5
79 – 81 3
Solution using the Coded Formula
Height in Inches   Frequency (f)   Class Mark (X)   Deviation (d)   fd
61 – 63            2               62               -3              -6
64 – 66            5               65               -2              -10
67 – 69            12              68               -1              -12
70 – 72            15              71               0               0
73 – 75            8               74               +1              8
76 – 78            5               77               +2              10
79 – 81            3               80               +3              9
Totals             n = 50                                           Σfd = -1
Solve the mean :
X̅ = X0 + (Σfd / n) C
= 71 + (-1 / 50)(3)
= 71 + (-0.06)
X̅ = 70.94
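A minimal Python sketch of the coded method, for comparison with the long method. It assumes equal class sizes and takes the class with the highest frequency as the assumed mean; the function name `coded_mean` and the tuple layout are illustrative.

```python
def coded_mean(classes):
    """Coded method: X0 + (sum(f * d) / n) * C, assuming equal class sizes.

    classes: list of (lower_limit, upper_limit, frequency), ascending.
    """
    freqs = [f for _, _, f in classes]
    n = sum(freqs)
    origin = freqs.index(max(freqs))            # class used as assumed mean
    lower, upper, _ = classes[origin]
    x0 = (lower + upper) / 2                    # class mark of the origin
    c = upper - lower + 1                       # class size
    sum_fd = sum(f * (i - origin) for i, f in enumerate(freqs))  # d = i - origin
    return x0 + (sum_fd / n) * c


heights = [(61, 63, 2), (64, 66, 5), (67, 69, 12), (70, 72, 15),
           (73, 75, 8), (76, 78, 5), (79, 81, 3)]
print(round(coded_mean(heights), 2))  # 70.94
```

Because Σfd is negative here (-1), the correction term is subtracted from the assumed mean of 71.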
Median
• The median of a set of data is a measure of central tendency that occupies the middle position in an array of values.
• It is the number that divides the bottom 50% of the data from the top 50%; that is, half the data items fall below the median and half are above it.
• In an array with an odd number of ordered items, the median is simply the middle value.
• In an array with an even number of items, the median is the average of the two middle values in the ordered list.
Characteristics of Median
1. Median is another widely used average,
easy to compute.
2. It cannot be found unless the items are
arranged in an ascending or descending
order.
3. It is the point that divides the frequency
distribution into two halves.
4. The median is not affected by the
extremely high or low values, so it is the
better choice when a distribution is badly
skewed.
5. It may be determined in open-ended
distribution.
Uses of the Median
• The median is used whenever an average of position is desired.
• It is used when open-ended intervals are involved.
• Since the median divides a distribution in half, it is also frequently used as an average in testing general abilities, as in intelligence tests.
Median for Ungrouped Data
The median is computed as follows:
1. Arrange the items in an array.
2. Identify the middle value
Examples :
1. Carlo got the following grades in his subjects : 86, 84, 78, 92 and 90. Find the median.
Solution:
Make an array of these grades in ascending order :
78, 84, 86, 90, 92

Since there are 5 items (odd), the median is the middle (third) value:

Median = Md = 86
2. The average grades of 10 students in Statistics are: 78, 82, 89, 88, 76, 82, 90, 88, 93 and 77. Find the median.
Solution :
Make an array of the average grades:
76, 77, 78, 82, 82, 88, 88, 89, 90, 93

Since the number of items (10) is even, find the two middle values, 82 and 88. Then:

Median = (82 + 88) / 2 = 170 / 2 = 85
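The two-step procedure (arrange, then pick the middle) can be sketched in Python. The function name `median` is illustrative; Python's standard `statistics.median` does the same job.

```python
def median(values):
    """Median of ungrouped data: sort, then take the middle value(s)."""
    ordered = sorted(values)                 # step 1: arrange in an array
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:                # odd count: the middle value
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2  # even: average of two


print(median([78, 84, 86, 90, 92]))                        # 86
print(median([78, 82, 89, 88, 76, 82, 90, 88, 93, 77]))    # 85.0
```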


Median for Grouped Data
The median of a grouped frequency
distribution is essentially the x –
coordinate of the point of intersection of
the less than and greater than ogives of
the distribution.
The formula for the computation of the median is:
Median = Md = Lb + ((n/2 – Cfp) / Fmd) C
Where :
Lb = lower boundary of the median class. The median class is the class interval in which the cumulative frequency n/2 is found.
n = total frequency
Cfp = cumulative frequency of the class preceding (next lower than) the median class
C = size of the median class
Fmd = frequency of the median class
Example :
1. Consider again the frequency
distribution of the heights of 50 persons
below:
Height in Inches Frequency (f)
61 – 63 2
64 – 66 5
67 – 69 12
70 – 72 15
73 – 75 8
76 – 78 5
79 – 81 3
Solution :
Height in Inches   Frequency (f)   <Cumulative Frequency
61 – 63            2               2
64 – 66            5               7
67 – 69            12              19
70 – 72            15              34
73 – 75            8               42
76 – 78            5               47
79 – 81            3               50
Total              n = 50

n / 2 = 50 / 2 = 25, and 25 falls within the <cumulative frequency of 34, so the median class is 70 – 72. The lower boundary of the median class is Lb = 69.5.
Solve the median :
Md = Lb + ((n/2 – Cfp) / Fmd) C
= 69.5 + ((50/2 – 19) / 15)(3)
= 69.5 + ((25 – 19) / 15)(3)
= 69.5 + 1.2
= 70.7 inches
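The grouped median formula can be sketched as below. The sketch assumes whole-number class limits (so the lower boundary is the lower limit minus 0.5); the function name `grouped_median` and the tuple layout are illustrative.

```python
def grouped_median(classes):
    """Md = Lb + ((n/2 - Cfp) / Fmd) * C.

    classes: list of (lower_limit, upper_limit, frequency), ascending.
    """
    n = sum(f for _, _, f in classes)
    cfp = 0                                   # cumulative frequency below
    for lower, upper, f in classes:
        if cfp + f >= n / 2:                  # first class whose <cf reaches n/2
            lb = lower - 0.5                  # lower boundary of median class
            c = upper - lower + 1             # size of median class
            return lb + ((n / 2 - cfp) / f) * c
        cfp += f


heights = [(61, 63, 2), (64, 66, 5), (67, 69, 12), (70, 72, 15),
           (73, 75, 8), (76, 78, 5), (79, 81, 3)]
print(round(grouped_median(heights), 2))  # 70.7
```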
Mode
• The mode is the most commonly occurring value in a series. A series may have one mode, more than one mode, or none at all.
• When there is one mode, the series is called unimodal.
• When there are two modes, it is called bimodal.
• When more than two modes are present, it is called multimodal.
Characteristics of the Mode
1. It is the simplest but least reliable measure of central tendency. It is not affected by extreme values in a distribution.
2. It is not necessary to arrange the items before the mode is known.
3. The mode may not exist in some sets of data, or there may be more than one mode in other data sets.
Uses of the Mode
• It is used when a quick estimate of the average is needed.

For instance, a shoe producer or a clothing manufacturer would want to know the size that will fit the greatest number of people. These manufacturers should seek the modal size. Obviously, the shoe producer or clothing manufacturer will produce more shoes or dresses in the most commonly purchased size than in other sizes. The mode therefore provides information to businessmen and producers that helps them in business planning and decision making.
Mode for Ungrouped Data
For ungrouped data, the most frequent occurring
value is the mode.

Examples:
1. Find the mode of the following values: 2, 2, 3, 5, 6, 7, 8, 8, 8, 9, 10, 10, 10, 10, 15, 15, 15.
The mode = Mo = 10.
2. Find the mode of the following values: 45, 44, 44, 50, 50, 62, 62, 62, 75, 75, 75, 80, 85, 95, 95.
The modes are 62 and 75.
3. Find the mode of the following values: 46, 45, 44, 48, 50, 51, 64, 62, 75, 78, 80, 89, 85, 88, 95, 91.
There is no mode.
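The three cases above (one mode, two modes, no mode) can be reproduced with a frequency count. In this sketch the function name `modes` is illustrative, and returning an empty list when every value occurs only once is an assumption chosen to match the "no mode" convention used here.

```python
from collections import Counter


def modes(values):
    """Return all most frequent values, or [] if no value repeats."""
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:                       # every value occurs once: no mode
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([2, 2, 3, 5, 6, 7, 8, 8, 8, 9, 10, 10, 10, 10, 15, 15, 15]))  # [10]
print(modes([45, 44, 44, 50, 50, 62, 62, 62, 75, 75, 75, 80, 85, 95, 95]))  # [62, 75]
print(modes([46, 45, 44, 48, 50, 51, 64, 62, 75, 78, 80, 89, 85, 88, 95, 91]))  # []
```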
Mode for Grouped Data
For the grouped distributions, the class with the greatest
frequency is called the modal class.
Formula :
Mo = Lb + (d1 / (d1 + d2)) C
Where :
Lb = lower boundary of the modal class
d1 = difference between the frequency of the modal class and the frequency of the class interval lower than the modal class
d2 = difference between the frequency of the modal class and the frequency of the class interval higher than the modal class
C = class size of the modal class
Example :
1. Consider again the frequency
distribution of the heights of 50 persons
below:
Height in Inches Frequency (f)
61 – 63 2
64 – 66 5
67 – 69 12
70 – 72 15
73 – 75 8
76 – 78 5
79 – 81 3
Solution
Identify the modal class (class with the highest frequency) = 70 – 72
Lb = 69.5
d1 = 15 – 12 = 3
d2 = 15 – 8 = 7
C = 3
Solve :
Mo = Lb + (d1 / (d1 + d2)) C
= 69.5 + (3 / (3 + 7))(3)
= 69.5 + 0.9
= 70.4
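A minimal Python sketch of the grouped mode formula. It assumes whole-number class limits and a single modal class, and it treats a missing neighbouring class as having frequency 0; the function name `grouped_mode` and the tuple layout are illustrative.

```python
def grouped_mode(classes):
    """Mo = Lb + (d1 / (d1 + d2)) * C for the class with the highest frequency.

    classes: list of (lower_limit, upper_limit, frequency), ascending.
    """
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))                        # modal class
    lower, upper, f = classes[i]
    below = freqs[i - 1] if i > 0 else 0               # class lower than modal
    above = freqs[i + 1] if i < len(freqs) - 1 else 0  # class higher than modal
    d1, d2 = f - below, f - above
    lb = lower - 0.5                                   # lower boundary
    c = upper - lower + 1                              # class size
    return lb + d1 / (d1 + d2) * c


heights = [(61, 63, 2), (64, 66, 5), (67, 69, 12), (70, 72, 15),
           (73, 75, 8), (76, 78, 5), (79, 81, 3)]
print(round(grouped_mode(heights), 2))  # 70.4
```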