Sep 10, 2019

Business Statistics

Statistics is the study of the methods for describing and interpreting quantitative

information, including techniques for organizing and summarizing data and techniques for making

generalizations and inferences from data. The first of these two broad classes of methods is called

descriptive statistics, and the second is called inferential statistics.

Descriptive statistics refers to the procedures for organizing, summarizing, and describing

quantitative information which is called data. For example, a basketball fan is accustomed to

checking over his favorite player’s shooting average; the sales manager relies on charts showing

the sales distribution of an enterprise.

The second class of statistics, inferential statistics, includes methods for making inferences

about a larger group of individuals on the basis of data collected on a much smaller group.

A knowledge of statistics helps in evaluating information presented in the media and other aspects of everyday life, and it is essential in understanding and

conducting research.

1. Entity. When we make observations about persons, places, and things, we call that which

is being observed an entity, regardless of the type of unit involved.

2. Variable. A characteristic that assumes different values for different entities is called a

variable. By contrast, a characteristic that retains the same value from entity to entity is

called a constant. The different values that one observes (or measures) are called

observations.

3. Quantitative Variable. A quantitative variable is one whose values are expressible as

numerical quantities, such as measurements and counts. A measurement taken on a

quantitative variable conveys information regarding amount.

4. Qualitative variable. A qualitative variable is one that is not measurable or countable.

Many characteristics can only be classified. A measurement taken on a qualitative variable

conveys information regarding attributes.

5. Discrete Variable. A discrete variable is one that can assume only certain values within

an interval. A discrete variable is characterized by interruptions between values that the

variable can assume.

6. Continuous variable. There is a continuum of values that a continuous variable can

assume: all whole numbers and all values in between.

7. Population. The largest collection of values of some variable in which there is interest

constitutes the population of these values.

8. Sample. A sample is a part of a population.

Summarizing Data

An ordered array is a list of the observations in order of magnitude. The order may be

from smallest value to the largest value or from the largest to the smallest.

A frequency distribution is any device, such as a graph or table, that displays the values that

a variable can assume along with the frequency of occurrence of these values, either individually

or as they are grouped into a set of mutually exclusive and exhaustive intervals.

Class intervals are contiguous, nonoverlapping intervals selected in such a way that they

are mutually exclusive and exhaustive. That is, each and every value in the set of data can be

placed in one and only one of the intervals.

2. Determine the number of class intervals (k). Usually between 6 and 15 class intervals are

required. We use the formula $k = 1 + 3.322 \log_{10} n$, where n is the number of

observations. We should not regard the number of class intervals indicated in the formula

as final. The actual number of class intervals may be more or less than k obtained using

the formula.

3. Decide on the class size i = R/k, where R is the range of the data (the largest value minus the smallest). All class intervals should be of the same size, and we should

select a class size that is convenient to work with.

4. Organize the class intervals and proceed constructing your frequency distribution table.
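The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the data are made up, the helper names `sturges_k` and `frequency_table` are hypothetical, and each class is treated as the half-open interval from its lower boundary up to (but not including) its upper boundary.

```python
import math

def sturges_k(n):
    """Suggested number of class intervals: k = 1 + 3.322 * log10(n)."""
    return 1 + 3.322 * math.log10(n)

def frequency_table(data, k, width, start):
    """Count observations in k classes [lower, lower + width), beginning at start."""
    counts = [0] * k
    for x in data:
        idx = int((x - start) // width)
        if not 0 <= idx < k:
            raise ValueError(f"{x} falls outside the class intervals")
        counts[idx] += 1
    return [(start + j * width, start + (j + 1) * width, counts[j]) for j in range(k)]

# Hypothetical data set of 20 observations (an assumption for illustration)
data = [12, 15, 21, 22, 25, 27, 30, 31, 33, 34,
        36, 38, 41, 42, 44, 47, 50, 52, 55, 58]

n = len(data)
R = max(data) - min(data)      # range: 58 - 12 = 46
k = round(sturges_k(n))        # the formula gives about 5.32, rounded to 5
width = 10                     # a convenient class size close to R / k = 9.2
table = frequency_table(data, k, width, start=10)

for lower, upper, f in table:
    print(f"{lower}-{upper}: {f}")
```

The class size is rounded up to 10 and the first class starts at 10 purely for convenience, in line with the advice that the formula's result is a guide rather than a final answer.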

# Also Discuss: True class limits (class boundaries); lower/upper class limits, class marks.

Sometimes one wants a cumulative frequency distribution. The entries in the cumulative

frequency < (cf <) column are obtained by adding the number of observations from the first (smallest) interval

through the interval in question, inclusive. A cf < indicates the number of observations

that fall below a specified upper boundary. Meanwhile, to obtain the entries in the cumulative

frequency > (cf >) column, we add the number of observations from the last (largest) interval back through

the interval in question, inclusive. A cf > indicates the number of observations that fall above a specified lower

boundary.
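A minimal sketch of both cumulative columns, assuming a hypothetical list of class frequencies ordered from the smallest class to the largest:

```python
from itertools import accumulate

# Hypothetical class frequencies, smallest class first (an assumption)
freqs = [2, 4, 6, 4, 4]

# cf <: running total from the first class through each class
cf_less = list(accumulate(freqs))

# cf >: running total from the last class back through each class
cf_greater = list(accumulate(reversed(freqs)))[::-1]

print(cf_less)     # [2, 6, 12, 16, 20]
print(cf_greater)  # [20, 18, 14, 8, 4]
```

The last entry of cf < and the first entry of cf > both equal n, which is a quick check on the arithmetic.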

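A cumulative percentage distribution shows the percentage of observations that

fall above or below a class boundary. Steps:

1. Obtain the cumulative frequencies (cf < or cf >).

2. Divide the cumulative frequencies by n and multiply by 100%.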
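As a small illustration of the division step, the sketch below converts "less than" cumulative frequencies into percentages. The frequencies are hypothetical.

```python
# Hypothetical class frequencies, smallest class first (an assumption)
freqs = [2, 4, 6, 4, 4]
n = sum(freqs)

running = 0
cum_pct_less = []
for f in freqs:
    running += f                            # cumulative frequency so far
    cum_pct_less.append(100 * running / n)  # divide by n, multiply by 100

print(cum_pct_less)  # [10.0, 30.0, 60.0, 80.0, 100.0]
```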

To construct a histogram, we plot the variable under consideration on the horizontal axis and the frequency on

the vertical axis. We locate the class intervals on the horizontal axis and above each we erect a

vertical bar. The height of a bar corresponds to the frequency of observations in the class interval

above which it is erected. We also make the adjacent cells of a histogram contiguous. We may

also use the true class limits to label the horizontal axis of a histogram. However, we may find it

more meaningful to use the lower limits, the upper class limits, or both.

A frequency polygon is another graph of a frequency distribution. To construct this graph, we place a dot above the center (class mark) of each class interval at a height

corresponding to the frequency for that interval. We then connect the dots with straight lines. We

can make the frequency polygon touch the horizontal axis at both ends by extending it to the center

of an imaginary class interval at each end.
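The points of a frequency polygon can be computed before any plotting is done. The sketch below is an assumption-laden illustration: `polygon_points` is a hypothetical helper, the class intervals are taken to be equal in width, and the two imaginary end classes are given a frequency of zero.

```python
def polygon_points(lower_limits, freqs, width):
    """Return (class mark, frequency) pairs, closed at both ends on the axis."""
    marks = [lo + width / 2 for lo in lower_limits]
    # Imaginary empty classes at each end bring the polygon down to the axis
    xs = [marks[0] - width] + marks + [marks[-1] + width]
    ys = [0] + list(freqs) + [0]
    return list(zip(xs, ys))

# Hypothetical classes 10-20, 20-30, ..., 50-60
points = polygon_points([10, 20, 30, 40, 50], [2, 4, 6, 4, 4], 10)
print(points)
```

Connecting these points in order with straight lines gives the polygon described above.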

IV. Descriptive Measures

We can also summarize data by methods that lead to numerical results, called descriptive measures. We will

discuss two types of descriptive measures: measures of central tendency and measures of

dispersion.

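A descriptive measure computed from the data of a sample is called a

statistic. A descriptive measure computed from or used to describe a population is called

a parameter.

A. Measures of Central Tendency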

It is impractical to keep in mind all the values in a data set. What we

need is some single value that we may consider typical of the set of data as a whole. The

need for such a single value is usually met by one of the three measures of central tendency:

the arithmetic mean (commonly known as the average), the median, and the mode.

The Arithmetic Mean is the most popular measure of central tendency. We find it by adding

all the values in a set of data and dividing the total by the number of values that were

summed.

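Ungrouped data: $\bar{x} = \frac{\sum X_i}{n}$

Grouped data: $\bar{x} = \frac{\sum fM}{n}$, where f is the frequency and M the class mark of each class interval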
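Both versions of the mean can be sketched in a few lines. The function names and the data are hypothetical, and M is assumed to be the class mark (midpoint) of each class interval.

```python
def mean_ungrouped(values):
    """Sum of all values divided by the number of values."""
    return sum(values) / len(values)

def mean_grouped(class_marks, freqs):
    """Sum of f * M over all classes, divided by n (the total frequency)."""
    n = sum(freqs)
    return sum(f * m for f, m in zip(freqs, class_marks)) / n

# Hypothetical data (assumptions for illustration)
values = [4, 8, 15, 16, 23, 42]
class_marks = [15, 25, 35, 45, 55]
freqs = [2, 4, 6, 4, 4]

print(mean_ungrouped(values))            # 108 / 6 = 18.0
print(mean_grouped(class_marks, freqs))  # 740 / 20 = 37.0
```

The grouped mean is an approximation, because every observation in a class is treated as if it were equal to the class mark.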

Properties of the Mean:

1. For a given data set, there is one and only one mean.

2. Its meaning is easily understood.

3. Since every value goes into its computation, it is affected by the magnitude of each

value. Because of this property the mean may not be the best measure of central

tendency when one or two extreme values are present in a data set.

4. The mean cannot be obtained by inspection; it is a computed value and can therefore

be manipulated algebraically.

The median is that value above which half the values lie and below which the other half

lie. If the number of items is odd, the median is the value of the middle item of an ordered

array, when the items are arranged in ascending (or descending) order of magnitude. If the

number of items is even, none of the items has an equal number of values above and below

it. In this case, the median is equal to the mean or average of the two middle values.
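The odd and even cases can be sketched as follows. The function name `median` and the sample values are illustrative only.

```python
def median(values):
    """Middle value of the ordered array, or the mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd n: the single middle item
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average the middle two

print(median([7, 3, 9]))     # ordered array 3, 7, 9 -> 7
print(median([7, 3, 9, 1]))  # ordered array 1, 3, 7, 9 -> (3 + 7) / 2 = 5.0
```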

For grouped data, the median is $\text{Median} = L + \frac{j}{f}\, i$,

where:

L = the lower boundary of the class interval in which the median is located.

j = the number of values still needed to reach the median after the lower

limit of the interval containing the median has been reached, that is, j = n/2 − cf <, where cf < is the cumulative frequency of the preceding interval.

i = class size.

f = the frequency in the class interval containing the median.
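A sketch of the grouped-data median, assuming the usual interpolation formula median = L + (j / f) · i, which matches the symbol definitions above. The helper name and the frequency distribution are hypothetical.

```python
def grouped_median(lower_boundaries, freqs, width):
    """Interpolated median: L + (j / f) * i, with j = n/2 - cf< of earlier classes."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("empty frequency distribution")
    cum = 0  # cumulative frequency of the classes below the current one
    for L, f in zip(lower_boundaries, freqs):
        if cum + f >= n / 2:       # the median falls in this class
            j = n / 2 - cum        # values still needed to reach the median
            return L + (j / f) * width
        cum += f
    raise ValueError("frequencies do not reach n / 2")

# Hypothetical classes with lower boundaries 10, 20, ..., 50 and class size 10
md = grouped_median([10, 20, 30, 40, 50], [2, 4, 6, 4, 4], 10)
print(round(md, 2))  # 30 + (4 / 6) * 10, about 36.67
```

Here n/2 = 10 and the first two classes hold 6 values, so the median class is the third one and j = 4.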

Properties of the Median:

1. The median always exists in a set of numerical data. For a given data set, there is only

one median.

2. The median is not often affected by extreme values, whereas the mean is. Because of

this property, the median is frequently the central tendency measure of choice for a data

set that is skewed.

3. The median can be used to characterize qualitative data.

4. The median is easy to calculate unless a large number of values are involved.

5. The median for a data set can be calculated even when the data are incomplete, provided

that the number and the general location of all measurements are known and the exact

information regarding the magnitude of measurements near the center of the data set is

available.

The median for a frequency distribution is that value or point on the horizontal axis of the

histogram of the distribution at which a perpendicular line divides the area of the histogram

into two equal parts.

The mode for ungrouped discrete data is the value that occurs most frequently. If all the

values in a set of data are different, there is no mode. When we want to find the mode of

a frequency distribution, we usually specify the modal class, which is defined as the class

interval containing the largest number of values.
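The mode and the modal class can be sketched as below. The helper names are hypothetical, and returning every tied value when several values share the highest frequency is an assumption, since the text does not say how ties are handled.

```python
from collections import Counter

def modes(values):
    """Most frequent value(s); an empty list when every value is different."""
    if not values:
        return []
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # all values differ, so there is no mode
    return sorted(v for v, c in counts.items() if c == top)

def modal_class(intervals, freqs):
    """Class interval with the largest frequency (the first one, if tied)."""
    return intervals[freqs.index(max(freqs))]

print(modes([2, 3, 3, 5, 7, 7]))  # [3, 7]
print(modes([1, 2, 3]))           # []

# Hypothetical frequency distribution
intervals = [(10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
print(modal_class(intervals, [2, 4, 6, 4, 4]))  # (30, 40)
```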

B. Measures of Dispersion

Once we have computed the mean of a data set, we want to know the extent to which the

values differ from this mean. We use the term dispersion to describe the degree to which

the values in a set vary about their mean. When the values are close to the mean, they exhibit

less dispersion than when some of the values are much larger and/or much smaller than the

mean.

The range is the difference between the largest and the smallest values in a set of data. For

grouped data, the range is simply the difference between the exact upper limit of the largest class

interval and the exact lower limit of the smallest class interval.

The variance uses all the deviations of values from their mean. It is the average of the

squared deviations of the individual values from the mean of the data set.

Grouped: $s^2 = \frac{n\sum fM^2 - \left(\sum fM\right)^2}{n(n-1)}$

The standard deviation is simply the positive square root of the variance.
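The range, variance, and standard deviation can be sketched as follows. Two assumptions are made here: the sample variance divides by n − 1, which is consistent with the n(n − 1) denominator of the grouped formula, and the grouped version uses the computational form with class marks M. Function names and data are hypothetical.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def grouped_sample_variance(class_marks, freqs):
    """Computational form: (n * sum(f M^2) - (sum(f M))^2) / (n (n - 1))."""
    n = sum(freqs)
    s1 = sum(f * m for f, m in zip(freqs, class_marks))
    s2 = sum(f * m * m for f, m in zip(freqs, class_marks))
    return (n * s2 - s1 ** 2) / (n * (n - 1))

# Hypothetical ungrouped data with mean 5
values = [2, 4, 4, 4, 5, 5, 7, 9]
data_range = max(values) - min(values)  # 9 - 2 = 7
variance = sample_variance(values)      # 32 / 7, about 4.571
std_dev = math.sqrt(variance)           # about 2.138

# Hypothetical grouped data: class marks and frequencies
grouped_var = grouped_sample_variance([15, 25, 35, 45, 55], [2, 4, 6, 4, 4])

print(data_range, round(variance, 3), round(std_dev, 3), round(grouped_var, 2))
```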

Sometimes the need arises to compare the variability present in two sets of data. This

usually can be done by comparing the two variances or standard deviations if the data sets

satisfy two conditions: 1) the same unit of measurement is employed in both data sets; 2)

the means of the two data sets are approximately equal. If either of these conditions is not

met, we need a relative measure of dispersion for use in comparing the variability of the

two data sets. One such relative measure of dispersion is the coefficient of variation. The

sample coefficient of variation (CV) is equal to the ratio of the standard deviation to the

mean. That is, $CV = \frac{s}{\bar{x}}$. The CV is frequently multiplied by 100 and expressed as a percent.
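A short sketch of why a relative measure is needed when the units differ. The two data sets are made up for illustration, and `coefficient_of_variation` is a hypothetical helper built on Python's standard `statistics` module.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the sample mean."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical data in different units: weights in kg, heights in cm
weights = [60, 65, 70, 75, 80]
heights = [150, 160, 170, 180, 190]

cv_weights = coefficient_of_variation(weights)
cv_heights = coefficient_of_variation(heights)

# Expressed as percents, as the text suggests
print(f"CV of weights: {100 * cv_weights:.1f}%")
print(f"CV of heights: {100 * cv_heights:.1f}%")
```

In this example the heights have the larger standard deviation, but the weights have the larger CV, so relative to their means the weights are the more variable set.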
