
45 pages, Sep 24, 2011

© Attribution Non-Commercial (BY-NC)


Central Tendency

Measure of Central Tendency:

A single summary score that best describes the central location of an entire distribution of scores.

The typical score. The center of the distribution.

Three measures: mean, median, and mode.

Must decide which measure is best for a given situation.


Mean

The most commonly used measure of central tendency. When people ask about the average of a group of scores, they usually are referring to the mean. The mean is the sum of all the scores in the distribution divided by the number of scores (the mathematical average). It is the balance point of a distribution.

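Mean (cont)

Population

$$\mu = \frac{\sum X}{N}$$

$\mu$ (mu) is the population mean; $\sum X$ (sigma, the sum of X) means add up all scores; $N$ is the total number of scores in a population.

Sample

$$\bar{X} = \frac{\sum X}{n}$$

$\bar{X}$ (X bar) is the sample mean; $\sum X$ means add up all scores; $n$ is the total number of scores in the sample.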

Mean (cont)

Exam Scores 75 82 72 68 89 91 78 94 88 75

$$\bar{X} = \frac{\sum X}{n} = \frac{812}{10} = 81.2$$
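The calculation above can be reproduced in a few lines. This is a minimal sketch; the names `exam_scores` and `mean` are chosen here for illustration and are not part of the slides.

```python
# Exam scores from the slide above.
exam_scores = [75, 82, 72, 68, 89, 91, 78, 94, 88, 75]


def mean(scores):
    """Return the sum of the scores divided by the number of scores."""
    return sum(scores) / len(scores)


total = sum(exam_scores)       # sum of X
n = len(exam_scores)           # number of scores
exam_mean = mean(exam_scores)  # X bar

print(total, n, exam_mean)
```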

Mean (cont)

Scores: 2, 2, 3, 3, 4, 4, 4, 4, 4, 10

[Figure: frequency histogram, "Performance and Memory Study"]

$$\bar{X} = \frac{\sum X}{n} = \frac{40}{10} = 4$$

Pros

Mathematical center of a distribution. Just as far from scores above it as it is from scores below it. Good for interval and ratio data. Does not ignore any information.

Cons

Influenced by extreme scores. May not exist in the data.

Example: 12,000; 12,000; 12,000; 12,000; 12,000; 12,000; 12,000; 12,000; 12,000; 12,000; 20,000; 390,000. Mean = 44,167, although ten of the twelve values are 12,000.
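To see how strongly one extreme score pulls the mean, the sketch below computes the mean of the twelve values above with and without the largest one. The variable names are illustrative only.

```python
# The twelve values from the slide: ten at 12,000, one at 20,000, one at 390,000.
values = [12_000] * 10 + [20_000, 390_000]

# Mean of all twelve values.
mean_all = sum(values) / len(values)

# Mean after dropping the single extreme value (the last element).
without_extreme = values[:-1]
mean_without_extreme = sum(without_extreme) / len(without_extreme)

print(round(mean_all), round(mean_without_extreme))
```

Dropping a single value out of twelve moves the mean from roughly 44,167 to roughly 12,727, which is the sensitivity the slide warns about.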


Median

The middle score of the distribution when all the scores have been ranked in either ascending or descending order. Represents the exact center or middle of the distribution. Appropriate for variables that are at least at the ordinal level.

Odd number of cases: median = the $\frac{n+1}{2}$th score.

Even number of cases: median = $\frac{(n/2)\text{th score} + (n/2 + 1)\text{th score}}{2}$, that is, average the two middle values together.


What is the median suicide rate for the nine largest U.S. cities?

City            Rate
New York         7.44
Los Angeles     13.38
Chicago         10.00
Houston         14.11
Philadelphia    14.78
San Diego       12.61
Detroit         12.26
Dallas          14.30
Phoenix         18.37
Total (N)        9

Ranked from lowest to highest:

City            Rate
New York         7.44
Chicago         10.00
Detroit         12.26
San Diego       12.61
Los Angeles     13.38
Houston         14.11
Dallas          14.30
Philadelphia    14.78
Phoenix         18.37
Total (N)        9

n is odd, so the median is the (9 + 1) / 2 = 5th case. Now find the 5th case: Los Angeles, 13.38. The median suicide rate for the nine largest U.S. cities is 13.38 (not 5).

Median (cont)

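Scores: 2, 2, 3, 3, 4, 4, 4, 4, 4, 10

$$\text{Median (even number of cases)} = \frac{(n/2)\text{th score} + (n/2 + 1)\text{th score}}{2} = \frac{(10/2)\text{th score} + (10/2 + 1)\text{th score}}{2} = \frac{5\text{th score} + 6\text{th score}}{2} = \frac{4 + 4}{2} = 4$$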
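Both median rules (odd and even number of cases) fit in one small function. The sketch below applies it to the nine city rates and the ten scores from the preceding slides; the function and variable names are assumptions made for this example.

```python
def median(scores):
    """Return the middle score, averaging the two middle scores when n is even."""
    ordered = sorted(scores)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty list is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd n: the (n + 1) / 2 th score, which is index n // 2 when counting from 0.
        return ordered[middle]
    # Even n: average the (n / 2)th and (n / 2 + 1)th scores.
    return (ordered[middle - 1] + ordered[middle]) / 2


suicide_rates = [7.44, 13.38, 10.00, 14.11, 14.78, 12.61, 12.26, 14.30, 18.37]
scores = [2, 2, 3, 3, 4, 4, 4, 4, 4, 10]

print(median(suicide_rates))  # odd number of cases
print(median(scores))         # even number of cases
```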

Responses of 7 Individuals

very dissatisfied
very satisfied
somewhat satisfied
very dissatisfied
somewhat dissatisfied
somewhat satisfied
very satisfied

Total (N): 7

To locate the median, arrange the responses in order from lowest to highest (or highest to lowest):

very dissatisfied
very dissatisfied
somewhat dissatisfied
somewhat satisfied (the middle case = median)
somewhat satisfied
very satisfied
very satisfied
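The same procedure works for ordinal labels once their order is written down. A minimal sketch follows, assuming the four-level ordering shown in `LEVELS` (the slide implies this order but does not list it explicitly) and an odd number of cases, so that a single middle case exists.

```python
# Assumed ordering of the response categories, lowest to highest.
LEVELS = [
    "very dissatisfied",
    "somewhat dissatisfied",
    "somewhat satisfied",
    "very satisfied",
]

# The seven responses in the order given on the slide.
responses = [
    "very dissatisfied",
    "very satisfied",
    "somewhat satisfied",
    "very dissatisfied",
    "somewhat dissatisfied",
    "somewhat satisfied",
    "very satisfied",
]

# Rank the responses by their position in LEVELS.
ordered = sorted(responses, key=LEVELS.index)

# Odd n: take the (n + 1) / 2 th case (subtract 1 for 0-based indexing).
median_response = ordered[(len(ordered) + 1) // 2 - 1]

print(median_response)
```

No arithmetic is done on the labels themselves; only their rank order is used, which is why the median is available for ordinal data while the mean is not.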


Pros

Not influenced by extreme scores or skewed distributions. Good with ordinal data. Easier to compute than the mean.

Cons

May not exist in the data. Doesn't take actual values into account.


Mode

The most frequent score in the distribution. A distribution where a single score is most frequent has one mode and is called unimodal. When there are ties for the most frequent score, the distribution is bimodal if two scores tie or multimodal if more than two scores tie. Applications: printing, manufacturing, etc.

For example, it is important to print more of the most popular books; because printing different books in equal numbers would cause a shortage of some books and an oversupply of others. For example, it is important to manufacture more of the most popular shoes; because manufacturing different shoes in equal numbers would cause a shortage of some shoes and an oversupply of others.


Mode (cont)

2 2 3 3 4 4 4 4 4 10


Mode (cont)

Scores: 72, 72, 73, 76, 78, 81, 83, 85, 85, 86, 87, 88, 90, 91, 92
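The two sets of scores above can be checked with a frequency count. The sketch below returns every score tied for the highest frequency, so it reports one value for a unimodal set and two for a bimodal one; the helper name `modes` is an assumption for this example.

```python
from collections import Counter


def modes(scores):
    """Return all scores that share the highest frequency, in ascending order."""
    counts = Counter(scores)
    highest = max(counts.values())
    return sorted(score for score, count in counts.items() if count == highest)


first_set = [2, 2, 3, 3, 4, 4, 4, 4, 4, 10]
second_set = [72, 72, 73, 76, 78, 81, 83, 85, 85, 86, 87, 88, 90, 91, 92]

print(modes(first_set))   # unimodal
print(modes(second_set))  # bimodal
```

One caveat of this sketch: if every score occurs exactly once, it returns all of them, which matches the point that small samples may not have a meaningful mode.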


Mode (cont)

Mode is the best measure of central tendency when data are not ordered, like the colors of cars in a parking lot.

You can't use the median: there is no order in the colors, no counting up from the bottom to find the middle score. Also, you cannot add them together to find a mean (blue + red + white = ?). Summary: the mode is the place where the greatest number of cases, observations, or scores occur.

Pros

Good for nominal data. Good when there are two typical scores. Easiest to compute and understand. The score comes from the data set.

Cons

Ignores most of the information in a distribution. Small samples may not have a mode.


Which measure to use depends on the scale of measurement and the shape of the distribution.

Scales of Measurement

Nominal scale = mode
Ordinal scale = median
Ratio scale = mean, median, or mode
Interval scale = mean, median, or mode

Skew refers to the general shape of a distribution when it is graphed. Symmetrical = zero skew. Scores clustered on the high or low end of a distribution = skewed distribution.

[Figure: Symmetrical Distribution. Histogram of Frequency against Scores.]

Distributions that are skewed have one side of the distribution where the data frequency tapers off


[Figure: Skewed Distribution, Positive Skew. Histogram of Frequency against Scores.]

[Figure: Skewed Distribution, Negative Skew. Histogram of Frequency against Scores.]

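The mean will either underestimate or overestimate the center of skewed distributions: it is pulled toward the tail, so it overestimates the center under positive skew and underestimates it under negative skew.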
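As a rough illustration, the twelve values listed under the mean's Cons earlier in this deck form a positively skewed set. The sketch below compares their mean and median using Python's standard `statistics` module; the variable names are illustrative.

```python
import statistics

# Ten values at 12,000, one at 20,000, one at 390,000 (positively skewed).
values = [12_000] * 10 + [20_000, 390_000]

mean_value = statistics.mean(values)
median_value = statistics.median(values)

# The long right tail pulls the mean above the median.
print(round(mean_value), median_value)
```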

[Figure: Positive Skew and Negative Skew histograms side by side, Frequency against Scores.]

Dispersion

The spread of a set of scores around some central value.

Why it is important:

It gives us additional information that enables us to judge the reliability of our measure of central tendency.

Two datasets can have the same average but very different variability.

Applications: stock market, quality control.

[Figure: Data set A and Data set B]

Measures of Variability

Range, Interquartile Range, Variance, Standard Deviation

Range

The difference between the highest and lowest score in a distribution.

Range = highest value - lowest value

Las Vegas Hotel Rates: 52, 76, 100, 136, 186, 196, 205, 150, 257, 264, 264, 280, 282, 283, 303, 313, 317, 317, 325, 373, 384, 384, 400, 402, 417, 422, 472, 480, 643, 693, 732, 749, 750, 791, 891

Range: 891 - 52 = 839
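A short sketch of the range calculation on the hotel rates, copied as they appear on the slide. The variable names are assumptions for this example.

```python
# Las Vegas hotel rates, in the order printed on the slide.
hotel_rates = [
    52, 76, 100, 136, 186, 196, 205, 150, 257, 264, 264, 280,
    282, 283, 303, 313, 317, 317, 325, 373, 384, 384, 400, 402,
    417, 422, 472, 480, 643, 693, 732, 749, 750, 791, 891,
]

# Range = highest value - lowest value; only two scores matter.
highest = max(hotel_rates)
lowest = min(hotel_rates)
value_range = highest - lowest

print(highest, lowest, value_range)
```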


Pros

Very easy to compute.

Cons

Value depends only on two scores. Very sensitive to outliers. Influenced by sample size (the larger the sample, the larger the range).


Interquartile Range

Range of the middle half of scores.

IQR = Q3 (third quartile) - Q1 (first quartile)

Las Vegas Hotel Rates: 52, 76, 100, 136, 186, 196, 205, 150, 257, 264, 264, 280, 282, 283, 303, 313, 317, 317, 325, 373, 384, 384, 400, 402, 417, 422, 472, 480, 643, 693, 732, 749, 750, 791, 891

Interquartile Range: Q1 is the (35 + 1)/4 = 9th score, 257. Q3 is the 3(35 + 1)/4 = 27th score, 472. IQR = 472 (Q3) - 257 (Q1) = 215
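The position rule used above, (n + 1)/4 for Q1 and 3(n + 1)/4 for Q3, can be sketched as follows. The slide only covers the case where the positions are whole numbers; the linear interpolation for fractional positions is an assumption added here, and statistical packages often use other quartile conventions that give slightly different answers.

```python
def score_at_position(ordered, position):
    """Return the score at a 1-based position in a sorted list.

    Fractional positions are linearly interpolated between neighbours
    (an assumption of this sketch, not something the slide specifies).
    """
    lower = int(position)
    fraction = position - lower
    if fraction == 0:
        return ordered[lower - 1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


def quartiles_and_iqr(scores):
    """Return (Q1, Q3, IQR) using the (n + 1)/4 position rule."""
    ordered = sorted(scores)
    n = len(ordered)
    if n < 3:
        raise ValueError("this sketch needs at least three scores")
    q1 = score_at_position(ordered, (n + 1) / 4)
    q3 = score_at_position(ordered, 3 * (n + 1) / 4)
    return q1, q3, q3 - q1


hotel_rates = [
    52, 76, 100, 136, 186, 196, 205, 150, 257, 264, 264, 280,
    282, 283, 303, 313, 317, 317, 325, 373, 384, 384, 400, 402,
    417, 422, 472, 480, 643, 693, 732, 749, 750, 791, 891,
]

q1, q3, iqr = quartiles_and_iqr(hotel_rates)
print(q1, q3, iqr)
```

The function sorts its input first, so the one out-of-order value in the printed list (150) does not affect the result.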


Pros

Fairly easy to compute. Scores exist in the data set. Eliminates influence of extreme scores.

Cons

Discards much of the data.


Variance

Mean of all squared deviations from the mean. The average amount that a score deviates from the typical score. Score - Mean = Difference Score. Average of Difference Scores = 0. In order to make this number not 0, square the difference scores (no negatives to cancel out the positives).
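The cancellation described above is easy to verify. The sketch below reuses the ten scores from the earlier mean example (2, 2, 3, 3, 4, 4, 4, 4, 4, 10); the variable names are illustrative.

```python
scores = [2, 2, 3, 3, 4, 4, 4, 4, 4, 10]
mean = sum(scores) / len(scores)

# Score - Mean = Difference Score
deviations = [x - mean for x in scores]

# Positive and negative deviations cancel, so their sum (and average) is 0.
sum_of_deviations = sum(deviations)

# Squaring removes the signs, so nothing cancels any more.
squared_deviations = [d ** 2 for d in deviations]
sum_of_squares = sum(squared_deviations)

print(sum_of_deviations, sum_of_squares)
```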


Variance: Formula

Population

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$

($\sigma$ = sigma)

Sample

$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$

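Scores: 3, 4, 4, 4, 6, 7, 7, 8, 8, 9

$$\bar{X} = \frac{\sum X}{n} = \frac{60}{10} = 6$$

$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$

$$S^2 = \frac{(3-6)^2 + (4-6)^2 + (4-6)^2 + (4-6)^2 + (6-6)^2 + (7-6)^2 + (7-6)^2 + (8-6)^2 + (8-6)^2 + (9-6)^2}{9} = \frac{40}{9} \approx 4.44$$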
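The worked example can be checked in code. This sketch computes the sample variance by hand with the n - 1 denominator and compares it with `statistics.variance` from Python's standard library, which uses the same denominator; `statistics.pvariance` gives the population version (divide by N). The function name `sample_variance` is an assumption for this example.

```python
import statistics

scores = [3, 4, 4, 4, 6, 7, 7, 8, 8, 9]


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


s_squared = sample_variance(scores)              # 40 / 9
library_value = statistics.variance(scores)      # same n - 1 formula
population_value = statistics.pvariance(scores)  # 40 / 10

print(round(s_squared, 2), population_value)
```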


Pros

Takes all data into account. Lends itself to computation of other stable measures (and is a prerequisite for many of them).

Cons

Hard to interpret. Can be influenced by extreme scores.


Standard Deviation

Square root of the average of the squared distances of the observations from the mean. To undo the squaring of difference scores, take the square root of the variance. This returns the measure to the original units rather than squared units.

Population

$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X - \mu)^2}{N}}$$

Sample

$$S = \sqrt{S^2} = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$

Example

$$S = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$

$$S = \sqrt{\frac{(3-6)^2 + (4-6)^2 + (4-6)^2 + (4-6)^2 + (6-6)^2 + (7-6)^2 + (7-6)^2 + (8-6)^2 + (8-6)^2 + (9-6)^2}{9}}$$

$$S = \sqrt{\frac{40}{9}} \approx 2.11$$
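A minimal sketch of the same example in code, taking the square root of the sample variance and comparing the result with `statistics.stdev` from the standard library. The variable names are illustrative.

```python
import math
import statistics

scores = [3, 4, 4, 4, 6, 7, 7, 8, 8, 9]

n = len(scores)
mean = sum(scores) / n

# Sum of squared deviations from the mean (40 for these scores).
sum_of_squares = sum((x - mean) ** 2 for x in scores)

# Sample standard deviation: square root of the sample variance.
s = math.sqrt(sum_of_squares / (n - 1))

# The standard library uses the same n - 1 formula.
library_value = statistics.stdev(scores)

print(round(s, 2))
```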


Pros

Lends itself to computation of other stable measures (and is a prerequisite for many of them). Average of deviations around the mean.

Cons

Influenced by extreme scores.


If $\bar{X}$ = mean, $S$ = standard deviation, and $x$ is a value in the data set, then for an approximately normal (bell-shaped) distribution:

about 68% of the data lie in the interval $\bar{X} - S < x < \bar{X} + S$
about 95% of the data lie in the interval $\bar{X} - 2S < x < \bar{X} + 2S$
about 99.7% of the data lie in the interval $\bar{X} - 3S < x < \bar{X} + 3S$
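These percentages come from the normal distribution. As a sketch, and assuming the data really are normally distributed, the share of values within k standard deviations of the mean equals erf(k / sqrt(2)), which Python's `math.erf` can evaluate directly. The helper name `share_within` is an assumption for this example.

```python
import math


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


shares = {k: share_within(k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.2%}")
```

Real data sets only follow these figures to the extent that they are close to bell-shaped; for strongly skewed data the shares can differ noticeably.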


Coefficient of Variation

It relates the standard deviation and the mean by expressing the standard deviation as a percentage of the mean.

Population

$$CV = \frac{\sigma}{\mu}(100)$$

Sample

$$CV = \frac{S}{\bar{X}}(100)$$

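Example: Which technician shows less variability?

Technician A: Average = 40, Standard deviation = 5
Technician B: Average = 160, Standard deviation = 15

Technician A: $CV = \frac{5}{40}(100) = \frac{500}{40} = 12.5\%$

Technician B: $CV = \frac{15}{160}(100) = \frac{1500}{160} \approx 9.4\%$

Technician B shows less variability relative to the mean, even though B's standard deviation is larger.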
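The comparison can be sketched in a few lines. The function name `coefficient_of_variation` is chosen here for illustration; the inputs are the averages and standard deviations given for the two technicians.

```python
def coefficient_of_variation(standard_deviation, mean):
    """Express the standard deviation as a percentage of the mean."""
    return standard_deviation / mean * 100


cv_a = coefficient_of_variation(5, 40)    # Technician A
cv_b = coefficient_of_variation(15, 160)  # Technician B

# The smaller percentage indicates less variability relative to the mean.
less_variable = "A" if cv_a < cv_b else "B"

print(cv_a, round(cv_b, 1), less_variable)
```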
