LA8 Handout

CENTRAL TENDENCY CALCULATION IN EXCEL
(You just need to type in the commands below and figure out the range of your data.)

MEAN: =AVERAGE(range), e.g. =AVERAGE(C2:C10)
MEDIAN: =MEDIAN(range), e.g. =MEDIAN(C2:C10)
MODE: =MODE(range), e.g. =MODE(C2:C10)
PERCENTILE: =PERCENTILE(range, desired percentile), e.g. =PERCENTILE(C2:C10,0.75) for the 75th percentile

Terms
Measures of average are also called measures of central tendency and include the mean, median, mode, and midrange. Measures that determine the spread of data values are called measures of variation or measures of dispersion and include the range, variance, and standard deviation. Measures of position tell where a specific data value falls within the data set, or its relative position in comparison with other data values. The most common measures of position are percentiles, deciles, and quartiles.

The measures of central tendency, variation, and position are part of what is called traditional statistics. This type of statistics is typically used to confirm conjectures about the data. Another type of statistics is called exploratory data analysis. Its techniques include the box plot and the five-number summary, and they can be used to explore data to see what they show.

A statistic is a characteristic or measure obtained by using the data values from a sample. The basic rounding rule in statistics is that rounding should not be done during intermediate computations, only once the final answer has been calculated.
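The five-number summary mentioned above (minimum, first quartile, median, third quartile, maximum) can be sketched in Python. This is a minimal illustration, not part of the original handout: the function name and the data are made up, and the "inclusive" quartile method is chosen on the assumption that it corresponds to the interpolation used by Excel's PERCENTILE function.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(values)
    # n=4 splits the data into quartiles; method="inclusive" interpolates
    # between observed data points (assumed here to match Excel's PERCENTILE).
    q1, q2, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return (ordered[0], q1, q2, q3, ordered[-1])

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up data for illustration
print(five_number_summary(scores))
```

Other quartile conventions exist and can give slightly different values for small data sets, so results should be compared only when the same method is used.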
LEVELS OF MEASUREMENT
Measures of Central Tendency

Central Tendency -- a statistical measure that identifies a single score as representative of an entire distribution or set of data.

Three measures of Central Tendency (CT):

a) The Mean
The average of a set of scores. It is the most commonly used measure of CT.
Notation: The mean of a population is symbolized as μ. The mean of a sample is symbolized as x̄ (x with a bar on top).
The mean = the sum of all the scores divided by the number of scores: ΣX/N

b) The Mode
The mode is the most frequent score in a distribution. It is the "typical" value. In a frequency graph you can immediately see what the mode is because it is the tallest value, that is, the score with the highest frequency.
Notation: Mo

c) The Median
The median is the score that divides the distribution exactly in half -- that is, half the scores are below the median and half are above the median. It is the precise midpoint.
Notation: Mdn
The computation of the median depends on whether there is an odd or an even number of observations. It also depends on whether there are duplicate observations (e.g., 1 2 2 2 3 3 4).
If N is odd and there are no duplicates, then the median is the score that falls exactly in the middle of the scores (once you have ordered them).
If N is even and there are no duplicates, then the median is the average of the two middle scores. Again, you need to order them first.
If there are duplicate scores, then you need to use the median formula as discussed in class. Note: duplicates are only a problem if they fall at the median.
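The three definitions above can be sketched as short Python functions. This is an illustration only: the function names and data are made up, and the median function implements just the simple odd/even rule, not the class formula for duplicates that fall at the median.

```python
from collections import Counter

def mean(scores):
    # Sum of all the scores divided by the number of scores.
    return sum(scores) / len(scores)

def median(scores):
    # Simple rule only: order the scores, then take the middle one (odd N)
    # or the average of the two middle ones (even N).
    ordered = sorted(scores)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

def mode(scores):
    # The most frequent score; if several scores tie, the one seen first wins.
    return Counter(scores).most_common(1)[0][0]

odd_scores = [7, 3, 10, 5, 2]        # N = 5, no duplicates
even_scores = [7, 3, 10, 5, 2, 8]    # N = 6, no duplicates
print(mean(odd_scores), median(odd_scores), median(even_scores))
print(mode([1, 2, 2, 2, 3, 3, 4]))   # the duplicate example from the text
```

Python's standard `statistics` module offers `mean`, `median`, and `mode` functions that could be used instead of these hand-written versions.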
Measures of Central Tendency and Scales of Measurement
The mode requires only nominal data, and you can also compute it for ordinal, interval, and ratio data. The Mdn requires ordinal data, and you can also compute it for interval and ratio data. You cannot compute the Mdn for nominal data. The mean requires interval or ratio data; you cannot compute it for either nominal or ordinal data.
Measures of Central Tendency and Skewed Distributions
If the distribution is symmetrical and unimodal (like the normal distribution), the mean will equal the median, which will equal the mode. If the distribution is skewed, these three measures of central tendency will not agree. A skewed distribution is a distribution that has a long tail extending out on one end. This is caused by having a few very extreme values relative to the majority of the scores.
[Figures: positively and negatively skewed distributions; bimodal distributions; uniform distributions]
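A small numeric check of the symmetrical and skewed cases described above, using Python's standard `statistics` module. Both data sets are made up for illustration, and the ordering mode < median < mean in the second one is what this particular positively skewed example produces, not a guarantee for every skewed data set.

```python
import statistics

def summarize(data):
    """Return (mean, median, mode) for a list of scores."""
    return statistics.mean(data), statistics.median(data), statistics.mode(data)

symmetric = [1, 2, 3, 3, 3, 4, 5]                  # symmetrical and unimodal
positively_skewed = [1, 2, 2, 2, 3, 3, 4, 10, 20]  # long tail on the high end

print(summarize(symmetric))          # all three measures agree
print(summarize(positively_skewed))  # the three measures differ
```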
Properties of the different central tendency measures:

* The mean is the standard measure of central tendency in statistics. It is the most frequently used.
* The mean is not necessarily equal to any score in the data set.
* The mean is the most stable measure from sample to sample.
* The mean is strongly influenced by outliers -- that is, by the presence of extreme scores.
* The median is not sensitive to outliers.
* The mean is based on all scores from the sample, but the mode and the median are not.
* The mode is the least stable measure from sample to sample.
* The median is the best measure of central tendency if the distribution is skewed.
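The different sensitivity of the mean and the median to outliers can be seen with a few lines of Python. The scores are made up for illustration; adding one extreme score moves the mean a long way while the median barely changes.

```python
import statistics

scores = [10, 12, 13, 14, 15]   # made-up scores
with_outlier = scores + [100]   # the same scores plus one extreme value

for label, data in (("without outlier", scores), ("with outlier", with_outlier)):
    print(label,
          "mean =", round(statistics.mean(data), 2),
          "median =", statistics.median(data))
```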
Which central tendency measure you can use depends on the level of measurement your scores represent:

Level of Measurement   Nominal   Dichotomies   Ordinal   Interval/Ratio
MODE                   Yes       Yes           Yes       Yes
MEDIAN                 No        No            Yes       Yes
MEAN                   No        Yes           No*       Yes
Level of measurement
From Wikipedia, the free encyclopedia

The level of measurement of a variable in mathematics and statistics is a classification that was proposed in order to describe the nature of information contained within numbers assigned to objects and, therefore, within the variable. The levels were proposed by Stanley Smith Stevens in his 1946 article "On the theory of scales of measurement". Different mathematical operations on variables are possible, depending on the level at which a variable is measured. According to the classification scheme, in statistics the kinds of descriptive statistics and significance tests that are appropriate depend on the level of measurement of the variables concerned. Four levels of measurement were proposed by Stevens:
* nominal,
* ordinal,
* interval and
* ratio.
Nominal measurement
In this classification, names are assigned to objects as labels. These names come from a given small set, and are meant to identify categories used for classifying the data. They are the "variable values". If two entities have the same name associated with them, they belong to the same category, and that is the only significance that they have. For practical data processing the names may be numerals, but in that case the numerical value of these numerals is irrelevant. The only comparisons that can be made between variable values are equality and inequality. There are no "less than" or "greater than" relations among the classifying names, nor operations such as addition or subtraction. Examples include: a country represented by its international telephone access code, the marital status of a person, or the make of a car. The only kind of measure of central tendency is the mode. Statistical dispersion may be measured with a variation ratio, index of qualitative variation, or via information entropy, but no notion of standard deviation exists. Variables that are measured only nominally are also called categorical variables. In social research, variables measured at a nominal level include gender, race, religious affiliation, political party affiliation, college major, and birthplace.
Ordinal measurement
In this classification, the numbers assigned to objects represent the rank order (1st, 2nd, 3rd etc.) of the entities measured. The numbers are called ordinals. The variables are called ordinal variables or rank variables. Comparisons of greater and less can be made, in addition to equality and inequality. However operations such as conventional addition and subtraction are still meaningless. Examples include the Mohs scale of mineral hardness; the results of a horse race, which say only which horses arrived first, second, third, etc. but no time intervals; and most measurements in psychology and other social sciences, for example attitudes like preference, conservatism or prejudice and social class. The central tendency of an ordinally measured variable can be represented by its mode or its median; the latter gives more information.
Interval measurement
The numbers assigned to objects have all the features of ordinal measurements, and in addition equal differences between measurements represent equivalent intervals. That is, differences between arbitrary pairs of measurements can be meaningfully compared. Operations such as addition and subtraction are therefore meaningful. The zero point on the scale is arbitrary; negative values can be used. Ratios between numbers on the scale are not meaningful, so operations such as multiplication and division cannot be carried out directly. But ratios of differences can be expressed; for example, one difference can be twice another. The central tendency of a variable measured at the interval level can be represented by its mode, its median or its arithmetic mean; the mean gives the most information. Variables measured at the interval level are called interval variables, or sometimes scaled variables, though the latter usage is not obvious and is not recommended. Examples of interval measures are the year date in many calendars, and temperature in Celsius scale or Fahrenheit scale. About the only interval measures commonly used in social scientific research are constructed measures such as standardized intelligence tests (IQ).
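A short numeric sketch of why ratios are not meaningful on an interval scale while ratios of differences are. It uses the Celsius and Fahrenheit scales mentioned above; the specific temperatures and the helper function name are made up for illustration.

```python
def celsius_to_fahrenheit(c):
    # Standard conversion between the two interval scales.
    return c * 9 / 5 + 32

a_c, b_c, c_c = 20.0, 10.0, 15.0
a_f, b_f, c_f = (celsius_to_fahrenheit(t) for t in (a_c, b_c, c_c))

# The ratio of two readings depends on the arbitrary zero point of the scale.
ratio_c = a_c / b_c
ratio_f = a_f / b_f

# The ratio of two differences is the same on both scales.
diff_ratio_c = (a_c - b_c) / (c_c - b_c)
diff_ratio_f = (a_f - b_f) / (c_f - b_f)

print(ratio_c, ratio_f)            # the two ratios disagree
print(diff_ratio_c, diff_ratio_f)  # the two ratios of differences agree
```

Saying that 20 degrees Celsius is "twice as hot" as 10 degrees Celsius is therefore not a meaningful statement, but saying that one temperature difference is twice another is.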
Ratio measurement
The numbers assigned to objects have all the features of interval measurement and also have meaningful ratios between arbitrary pairs of numbers. Operations such as multiplication and division are therefore meaningful. The zero value on a ratio scale is non-arbitrary. Variables measured at the ratio level are called ratio variables. Most physical quantities, such as mass, length or energy, are measured on ratio scales; so is temperature measured in kelvins, that is, relative to absolute zero. The central tendency of a variable measured at the ratio level can be represented by its mode, its median, its arithmetic mean, or its geometric mean; however, as with an interval scale, the arithmetic mean gives the most useful information. Social variables of ratio measure include age, length of residence in a given place, number of organisations belonged to, or number of church attendances in a particular period. Interval and/or ratio measurement are sometimes called "true measurement", though it is often argued this usage reflects a lack of understanding of the uses of ordinal measurement. Only ratio or interval scales can correctly be said to have units of measurement.

SUMMARY
Nominal: Nominal data have no order and thus only give names or labels to various categories.
Ordinal: Ordinal data have order, but the interval between measurements is not meaningful.
Interval: Interval data have meaningful intervals between measurements, but there is no true starting point (zero).
Ratio: Ratio data are at the highest level of measurement. Ratios between measurements as well as intervals are meaningful because there is a true starting point (zero).
Measures of Variability
While measures of position describe where the data points are concentrated, measures of variability measure the dispersion (or spread) of the data set.
Range:
The range is the difference between the largest and the smallest observations in the data set. However, this is a limited measure because it depends on only two of the numbers in the data set. Using the above data set again, the range is 149, but that does not provide any information regarding the concentration of the data at the low end of the scale. Another limitation of the range is that it is affected by the number of observations in the data set. Generally, the more observations there are, the more spread out they will be. One use of the range in everyday life is in newspaper stock market summaries, which give the day's high and low numbers.
Variance:
Unlike the range, the variance takes into consideration all the data points in the data set. If all the observations are the same, the variance is zero. The more spread out the observations are, the larger the variance.
Standard Deviation:
Standard deviation is the positive square root of the variance, and is the most common measure of variability. Standard deviation indicates how close to the mean the observations are. The larger the standard deviation, the more variation there is in the data set.
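The three measures of variability above can be computed with Python's standard `statistics` module. The handout does not state a formula, so this sketch assumes the usual definitions: the variance is the average squared deviation from the mean, dividing by N for a population and by N - 1 for a sample. The data set is made up for illustration.

```python
import statistics

observations = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data, mean = 5

# Range: largest observation minus smallest observation.
data_range = max(observations) - min(observations)

# Population versions divide the sum of squared deviations by N.
population_variance = statistics.pvariance(observations)
population_sd = statistics.pstdev(observations)

# Sample versions divide by N - 1 instead.
sample_variance = statistics.variance(observations)
sample_sd = statistics.stdev(observations)

print("range:", data_range)
print("population variance and SD:", population_variance, population_sd)
print("sample variance and SD:", round(sample_variance, 3), round(sample_sd, 3))
```

In each case the standard deviation is the positive square root of the corresponding variance.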
Measures of Skewness
Measures of position and variability tell us where the data are located and how dispersed they are. Measures of skewness are concerned with the shape of the distribution, that is, with whether the data are symmetrically distributed. Most people are familiar with the distribution referred to as the normal, or bell-shaped, curve. Many of the statistics we use assume the data are distributed normally. Unfortunately, this is not always the case.
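The handout does not give a formula for skewness. As a hedged illustration, the sketch below uses one common definition, the moment coefficient of skewness (the third central moment divided by the second central moment raised to the power 1.5). The function name and data are made up; a value of zero indicates symmetry, a positive value a long tail on the high end, and a negative value a long tail on the low end.

```python
def skewness(data):
    """Moment coefficient of skewness: m3 / m2 ** 1.5.

    Assumes the data contain at least two different values,
    otherwise m2 is zero and the ratio is undefined.
    """
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 3, 3, 4, 5]
positively_skewed = [1, 2, 2, 2, 3, 3, 4, 10, 20]
negatively_skewed = [-x for x in positively_skewed]

print(skewness(symmetric))
print(skewness(positively_skewed))
print(skewness(negatively_skewed))
```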
