
Hypothesis Testing Examples

Each problem below is a one-sample test of a mean with known standard deviation, so the test statistic in every case is Z = (x̄ − μ0)/(σ/√n).

1. Suppose we would like to determine if the typical amount spent per customer for dinner at a new restaurant in town is more than $20.00. A sample of 49 customers over a three-week period was randomly selected and the average amount spent was $22.60. Assume that the standard deviation is known to be $2.50. Using a 0.02 level of significance, would we conclude the typical amount spent per customer is more than $20.00?

H0: μ = 20, Ha: μ > 20
Z = (22.60 − 20)/(2.50/√49) = 7.28
Reject H0 if Z > 2.06. Reject H0: there is sufficient evidence to conclude the typical amount spent per customer is more than $20.00, α = 0.02.

2. Suppose an editor of a publishing company claims that the mean time to write a textbook is at most 15 months. A sample of 16 textbook authors is randomly selected and it is found that the mean time taken by them to write a textbook was 12.5 months. Assume also that the standard deviation is known to be 3.6 months. Assuming the time to write a textbook is normally distributed and using a 0.025 level of significance, would you conclude the editor's claim is true?

H0: μ = 15, Ha: μ < 15
Z = (12.5 − 15)/(3.6/√16) = −2.78
Reject H0 if Z < −1.96. Reject H0: there is sufficient evidence to conclude the editor's claim is true, α = 0.025.

3. Suppose, according to a 1990 demographic report, the average U.S. household spends $90 per day. Suppose you recently took a random sample of 30 households in Huntsville and the results revealed a mean of $84.50. Suppose the standard deviation is known to be $14.50. Using a 0.05 level of significance, can it be concluded that the average amount spent per day by U.S. households has decreased?

H0: μ = 90, Ha: μ < 90
Z = (84.50 − 90)/(14.50/√30) = −2.078
Reject H0 if Z < −1.65. Reject H0: there is sufficient evidence to conclude the average amount spent per day by U.S. households has decreased, α = 0.05.

4. Suppose the mean salary for full professors in the United States is believed to be $61,650. A sample of 36 full professors revealed a mean salary of $69,800. Assuming the standard deviation is $5,000, can it be concluded that the average salary has increased, using a 0.02 level of significance?

H0: μ = 61,650, Ha: μ > 61,650
Z = (69,800 − 61,650)/(5,000/√36) = 9.78
Reject H0 if Z > 2.06. Reject H0: there is sufficient evidence to conclude the average salary has increased, α = 0.02.

5. Historically, evening long-distance calls from a particular city have averaged 15.2 minutes per call. In a random sample of 35 calls, the sample mean time was 14.3 minutes. Assume the standard deviation is known to be 5 minutes. Using a 0.05 level of significance, is there sufficient evidence to conclude that the average evening long-distance call has decreased?

H0: μ = 15.2, Ha: μ < 15.2
Z = (14.3 − 15.2)/(5/√35) = −1.065
Reject H0 if Z < −1.65. Fail to reject H0: there is insufficient evidence to conclude the average evening long-distance call has decreased, α = 0.05.

6. Suppose a production line operates with a mean filling weight of 16 ounces per container. Since over- or under-filling can be dangerous, a quality control inspector samples 30 items to determine whether or not the filling weight has to be adjusted. The sample revealed a mean of 16.32 ounces. From past data, the standard deviation is known to be 0.8 ounces. Using a 0.10 level of significance, can it be concluded that the process is out of control (not equal to 16 ounces)?

H0: μ = 16, Ha: μ ≠ 16
Z = (16.32 − 16)/(0.8/√30) = 2.19
Reject H0 if Z < −1.65 or Z > 1.65. Reject H0: there is sufficient evidence to conclude the process is out of control, α = 0.10.
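All six solutions follow the same mechanical pattern, so a small helper function makes a convenient check of the arithmetic. The sketch below is illustrative only (the function name and structure are mine, not from the original text) and assumes SciPy is available for the normal quantiles.

```python
from math import sqrt
from scipy.stats import norm

def one_sample_z_test(x_bar, mu0, sigma, n, alpha, tail):
    """One-sample z test with known sigma; tail is 'left', 'right', or 'two'."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    if tail == "right":
        reject = z > norm.ppf(1 - alpha)           # upper-tail critical value
    elif tail == "left":
        reject = z < norm.ppf(alpha)               # lower-tail critical value
    else:
        reject = abs(z) > norm.ppf(1 - alpha / 2)  # two-tailed critical values
    return round(z, 3), reject

# Example 1: n = 49, x-bar = 22.60, sigma = 2.50, alpha = 0.02, upper tail
print(one_sample_z_test(22.60, 20.00, 2.50, 49, 0.02, "right"))  # (7.28, True)

# Example 5: n = 35, x-bar = 14.3, sigma = 5, alpha = 0.05, lower tail
print(one_sample_z_test(14.3, 15.2, 5, 35, 0.05, "left"))        # (-1.065, False)
```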

The table found below is a compilation of areas from the standard normal distribution, more commonly known as the bell curve. The table provides the area of the region located under the bell curve and to the left of a given z score. These areas represent probabilities and have numerous applications throughout statistics: anytime that a normal distribution is being used, a table such as this one can be consulted to perform important calculations.

To read the table, begin with the value of your z score. In order to use this particular table, the value should be rounded to the nearest hundredth. Find the appropriate entry in the table by reading down the first column for the ones and tenths places of your number, and along the top row for the hundredths place.

For example, if z = 1.67, then you would split this number into 1.67 = 1.6 + .07. The number located in the 1.6 row and .07 column is .953. Thus 95.3% of the area under the bell curve is to the left of z = 1.67.

The table may also be used to find the area to the left of a negative z score. To do this, drop the negative sign and find the table entry for the corresponding positive z score. Because the curve is symmetric about zero, the area to the left of the negative z score is then 1 minus that table entry.

Standard Normal Distribution Table

The area of the shaded region to the left of z in the diagram is given by the table below.

z     .00   .01   .02   .03   .04   .05   .06   .07   .08   .09
0.0  .500  .504  .508  .512  .516  .520  .524  .528  .532  .536
0.1  .540  .544  .548  .552  .556  .560  .564  .568  .571  .575
0.2  .580  .583  .587  .591  .595  .599  .603  .606  .610  .614
0.3  .618  .622  .626  .630  .633  .637  .641  .644  .648  .652
0.4  .655  .659  .663  .666  .670  .674  .677  .681  .684  .688
0.5  .692  .695  .699  .702  .705  .709  .712  .716  .719  .722
0.6  .726  .729  .732  .736  .740  .742  .745  .749  .752  .755
0.7  .758  .761  .764  .767  .770  .773  .776  .779  .782  .785
0.8  .788  .791  .794  .797  .800  .802  .805  .808  .811  .813
0.9  .816  .819  .821  .824  .826  .829  .832  .834  .837  .839
1.0  .841  .844  .846  .849  .851  .853  .855  .858  .860  .862
1.1  .864  .867  .869  .871  .873  .875  .877  .879  .881  .883
1.2  .885  .887  .889  .891  .893  .894  .896  .898  .900  .902
1.3  .903  .905  .907  .908  .910  .912  .913  .915  .916  .918
1.4  .919  .921  .922  .924  .925  .927  .928  .929  .931  .932
1.5  .933  .935  .936  .937  .938  .939  .941  .942  .943  .944
1.6  .945  .946  .947  .948  .950  .951  .952  .953  .954  .955
1.7  .955  .956  .957  .958  .959  .960  .961  .962  .963  .963
1.8  .964  .965  .966  .966  .967  .968  .969  .969  .970  .971
1.9  .971  .972  .973  .973  .974  .974  .975  .976  .976  .977
2.0  .977  .978  .978  .979  .979  .980  .980  .981  .981  .982
2.1  .982  .983  .983  .983  .984  .984  .985  .985  .985  .986
2.2  .986  .986  .987  .987  .988  .988  .988  .988  .989  .989
2.3  .989  .990  .990  .990  .990  .991  .991  .991  .991  .992
2.4  .992  .992  .992  .993  .993  .993  .993  .993  .993  .994
2.5  .994  .994  .994  .994  .995  .995  .995  .995  .995  .995
2.6  .995  .996  .996  .996  .996  .996  .996  .996  .996  .996
2.7  .997  .997  .997  .997  .997  .997  .997  .997  .997  .997
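If software is at hand, any entry in this table can be checked with the standard normal CDF. A minimal sketch using SciPy's scipy.stats.norm:

```python
from scipy.stats import norm

print(norm.cdf(1.67))    # 0.9525..., which the table rounds to .953
print(norm.cdf(-1.67))   # 0.0475... = 1 - 0.9525, area left of a negative z
print(norm.ppf(0.953))   # ~1.675, the inverse lookup: z for a given area
```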

The Null Hypothesis The null hypothesis reflects that there will be no observed effect for our experiment. In a mathematical formulation of the null hypothesis there will typically be an equal sign. This hypothesis is denoted by H0. The null hypothesis is what we are attempting to overturn by our hypothesis test. We hope to obtain a small enough p-value that we are justified in rejecting the null hypothesis.

If the null hypothesis is not rejected, then we must be careful to say what this means. The thinking on this is similar to a legal verdict: just because a person has been declared "not guilty", it does not mean that he is innocent. In the same way, just because a null hypothesis is not rejected does not mean that the statement is true. For example, we may want to investigate the claim that, despite what convention has told us, the mean adult body temperature is not the accepted value of 98.6 degrees Fahrenheit. The null hypothesis for an experiment to investigate this is "The mean adult body temperature is 98.6 degrees Fahrenheit." If we fail to reject the null hypothesis, then our working hypothesis remains that the average adult has a temperature of 98.6 degrees. If we are studying a new treatment, the null hypothesis is that our treatment will not change our subjects in any meaningful way.

The Alternative Hypothesis

The alternative or experimental hypothesis reflects that there will be an observed effect for our experiment. In a mathematical formulation of the alternative hypothesis there will typically be an inequality, or a not-equal-to symbol. This hypothesis is denoted by either Ha or H1. The alternative hypothesis is what we are attempting to demonstrate in an indirect way by the use of our hypothesis test. If the null hypothesis is rejected, then we accept the alternative hypothesis. If the null hypothesis is not rejected, then we do not accept the alternative hypothesis. Going back to the above example of mean human body temperature, the alternative hypothesis is "The average adult human body temperature is not 98.6 degrees Fahrenheit." If we are studying a new treatment, then the alternative hypothesis is that our treatment does in fact change our subjects in a meaningful and measurable way.

Negation

The following set of negations may help when you are forming your null and alternative hypotheses. Most technical papers rely on just the first formulation, even though you may see some of the others in a statistics textbook.

Null hypothesis: x is equal to y. Alternative hypothesis: x is not equal to y.
Null hypothesis: x is at least y. Alternative hypothesis: x is less than y.
Null hypothesis: x is at most y. Alternative hypothesis: x is greater than y.

How to Conduct a Hypothesis Test

The idea of hypothesis testing is relatively straightforward. In various studies we observe certain events. We must ask: is the event due to chance alone, or is there some cause that we should be looking for? We need to have a way to differentiate between events that easily occur by chance and those that are highly unlikely to occur randomly. Such a method should be streamlined and well defined so that others can replicate our statistical experiments. There are a few different methods used to conduct hypothesis tests. One of these methods is known as the traditional method, and another involves what is known as a p-value. The steps of these two most common methods are identical up to a point, then diverge slightly. Both the traditional method for hypothesis testing and the p-value method are outlined below.

The Traditional Method

The traditional method is as follows:

1. Begin by stating the claim or hypothesis that is being tested. Also form a statement for the case that the hypothesis is false.
2. Express both of the statements from the first step in mathematical symbols. These statements will use symbols such as inequalities and equals signs.
3. Identify which of the two symbolic statements does not have equality in it. This could simply be a "not equals" sign, but could also be an "is less than" sign (<) or an "is greater than" sign (>). The statement containing inequality is called the alternative hypothesis and is denoted H1 or Ha.
4. The statement from the first step asserting that a parameter equals a particular value is called the null hypothesis, denoted H0.
5. Choose the significance level that we want. A significance level is typically denoted by the Greek letter alpha. Here we should consider Type I errors: a Type I error occurs when we reject a null hypothesis that is actually true. If we are very concerned about this possibility occurring, then our value for alpha should be small. There is a bit of a trade-off here: the smaller the alpha, the more costly the experiment. The values 0.05 and 0.01 are common values used for alpha, but any positive number between 0 and 0.50 could be used for a significance level.
6. Determine which statistic and distribution we should use. The type of distribution is dictated by features of the data. Common distributions include the z score, t score and chi-squared.
7. Find the test statistic and critical value for this statistic. Here we will have to consider whether we are conducting a two-tailed test (typically when the alternative hypothesis contains an "is not equal to" symbol) or a one-tailed test (typically used when an inequality is involved in the statement of the alternative hypothesis).
8. From the type of distribution, significance level, critical value and test statistic, we sketch a graph.
9. If the test statistic is in our critical region, then we must reject the null hypothesis; the alternative hypothesis stands. If the test statistic is not in our critical region, then we fail to reject the null hypothesis. This does not prove that the null hypothesis is true; it only means that the evidence was not strong enough to reject it.
10. We now state the results of the hypothesis test in such a way that the original claim is addressed.

An Example of a Hypothesis Test

Mathematics and statistics are not for spectators. To truly understand what is going on, we should read through and work through several examples. If we know about the ideas behind hypothesis testing and have seen an overview of the method, then the next step is to see an example. The following shows an example of both the traditional method of a hypothesis test and the p-value method.

[Diagram: the test statistic falls within the critical region.]

Suppose that a doctor claims that 17 year olds have an average body temperature that is higher than the commonly accepted average human temperature of 98.6 degrees Fahrenheit. A simple random statistical sample of 25 people, each of age 17, is selected. The average temperature of the 17 year olds is found to be 98.9 degrees, with standard deviation of 0.6 degrees.

The Null and Alternative Hypotheses

The claim being investigated is that the average body temperature of 17 year olds is greater than 98.6 degrees. This corresponds to the statement x > 98.6. The negation of this is that the population average is not greater than 98.6 degrees; in other words, the average temperature is less than or equal to 98.6 degrees. In symbols this is x ≤ 98.6.

One of these statements must become the null hypothesis, and the other should be the alternative hypothesis. The null hypothesis contains equality, so for the above, the null hypothesis is H0: x = 98.6. It is common practice to state the null hypothesis only in terms of an equals sign, not a greater-than-or-equal-to or less-than-or-equal-to.

The statement that does not contain equality is the alternative hypothesis, H1: x > 98.6.

One or Two Tails?

The statement of our problem will determine which kind of test to use. If the alternative hypothesis contains a "not equals to" sign, then we have a two-tailed test. In the other two cases, when the alternative hypothesis contains a strict inequality, we use a one-tailed test. This is our situation, so we use a one-tailed test.

Choice of a Significance Level

Here we choose the value of alpha, our significance level. It is typical to let alpha be 0.05 or 0.01. For this example we will use a 5% level, so alpha will be equal to 0.05.

Choice of Test Statistic and Distribution

Now we need to determine which distribution to use. The sample is from a population that is normally distributed as the bell curve, so we can use the standard normal distribution; a table of z-scores will be necessary. The test statistic is found by the formula for the z-score of a sample mean: rather than the standard deviation, we use the standard error of the mean. Here n = 25, which has square root 5, so the standard error is 0.6/5 = 0.12. Our test statistic is z = (98.9 − 98.6)/0.12 = 2.5.

Accepting and Rejecting

At a 5% significance level, the critical value for a one-tailed test is found from the table of z-scores to be 1.645. This is illustrated in the diagram above. Since the test statistic does fall within the critical region, we reject the null hypothesis.

The p-Value Method

There is a slight variation if we conduct our test using p-values. Here we see that a z-score of 2.5 has a p-value of 0.0062. Since this is less than the significance level of 0.05, we reject the null hypothesis.

Conclusion

We conclude by stating the results of our hypothesis test. The statistical evidence shows that either a rare event has occurred, or that the average temperature of 17 year olds is in fact greater than 98.6 degrees.
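As a check on the arithmetic in this example, here is a short sketch of both the traditional and p-value calculations, using the numbers from the example above (SciPy assumed):

```python
from math import sqrt
from scipy.stats import norm

n, x_bar, mu0, s = 25, 98.9, 98.6, 0.6
se = s / sqrt(n)             # standard error = 0.6/5 = 0.12
z = (x_bar - mu0) / se       # test statistic = 2.5

# Traditional method: compare z to the one-tailed critical value
z_crit = norm.ppf(1 - 0.05)  # 1.645
print(z, z > z_crit)         # 2.5 True -> reject H0

# p-value method: area to the right of the test statistic
p = 1 - norm.cdf(z)          # 0.0062
print(p, p < 0.05)           # True -> reject H0
```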

How to Calculate a Standard Deviation

1. Calculate the mean of your data set.
2. Subtract the mean from each of the data values and list the differences.
3. Square each of the differences from the previous step and make a list of the squares. In other words, multiply each number by itself. Be careful with negatives: a negative times a negative makes a positive.
4. Add the squares from the previous step together.
5. Subtract one from the number of data values you started with.
6. Divide the sum from step four by the number from step five.
7. Take the square root of the number from the previous step. This is the standard deviation.

You may need to use a basic calculator to find the square root. Be sure to use significant figures when rounding your answer.

A Worked Example

Suppose you're given the data set 1, 2, 2, 4, 6. Work through each of the steps to find the standard deviation.

1. Calculate the mean of your data set. The mean of the data is (1+2+2+4+6)/5 = 15/5 = 3.
2. Subtract the mean from each of the data values and list the differences. Subtract 3 from each of the values 1, 2, 2, 4, 6: 1−3 = −2, 2−3 = −1, 2−3 = −1, 4−3 = 1, 6−3 = 3. Your list of differences is −2, −1, −1, 1, 3.
3. Square each of the differences from the previous step and make a list of the squares: (−2)² = 4, (−1)² = 1, (−1)² = 1, 1² = 1, 3² = 9. Your list of squares is 4, 1, 1, 1, 9.
4. Add the squares from the previous step together: 4+1+1+1+9 = 16.
5. Subtract one from the number of data values you started with. You began this process (it may seem like awhile ago) with five data values; one less than this is 5−1 = 4.
6. Divide the sum from step four by the number from step five. The sum was 16, and the number from the previous step was 4: 16/4 = 4.
7. Take the square root of the number from the previous step. This is the standard deviation. Your standard deviation is the square root of 4, which is 2.

Tip: It's sometimes helpful to keep everything organized in a table, like the one shown below.

Data   Data − Mean   (Data − Mean)²
1      −2            4
2      −1            1
2      −1            1
4      1             1
6      3             9
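The same seven steps translate directly into code. Here is a minimal sketch for the worked data set; the standard library's statistics.stdev computes the identical n − 1 formula and gives the same answer.

```python
from math import sqrt
import statistics

data = [1, 2, 2, 4, 6]
mean = sum(data) / len(data)           # step 1: mean = 3
diffs = [x - mean for x in data]       # step 2: [-2, -1, -1, 1, 3]
squares = [d ** 2 for d in diffs]      # step 3: [4, 1, 1, 1, 9]
total = sum(squares)                   # step 4: 16
divisor = len(data) - 1                # step 5: 5 - 1 = 4
sd = sqrt(total / divisor)             # steps 6 and 7: sqrt(16/4) = 2
print(sd)                              # 2.0
print(statistics.stdev(data))          # 2.0, same result
```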

What Is the Formula for the Chi-Square Statistic and How Do You Use It?

To see how to compute a chi-square statistic, recall that the formula compares observed counts to expected counts: χ² = Σ (observed − expected)²/expected, summed over all categories. Suppose that we have the following data from an experiment:

Expected   Observed
25         23
15         20
4          3
24         24
13         10

Next, compute the differences for each of these. Because we will end up squaring these numbers, you may subtract them in any order. Staying consistent with our formula, we will subtract the observed counts from the expected ones:

25 − 23 = 2
15 − 20 = −5
4 − 3 = 1
24 − 24 = 0
13 − 10 = 3

Now square each of these differences and divide by the corresponding expected value:

2²/25 = 0.16
(−5)²/15 = 1.6667
1²/4 = 0.25
0²/24 = 0
3²/13 = 0.6923

Finish by adding the above numbers together: 0.16 + 1.6667 + 0.25 + 0 + 0.6923 = 2.769. Further work involving hypothesis testing would be needed to determine what significance there is with this value of 2.769.
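The computation is just a sum of (observed − expected)²/expected terms, so it is easy to reproduce in a couple of lines of plain Python:

```python
observed = [23, 20, 3, 24, 10]
expected = [25, 15, 4, 24, 13]

# chi-square statistic: sum of (observed - expected)^2 / expected
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_square, 3))  # 2.769
```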

To see how a chi-square hypothesis test works with a multinomial experiment, we will investigate the following two examples.

Example 1: A Fair Coin

A fair coin has an equal probability of 1/2 of coming up heads or tails. We toss a coin 1000 times and record the results: a total of 580 heads and 420 tails. We want to test the hypothesis, at a 95% level of confidence, that the coin we flipped is fair. More formally, the null hypothesis H0 is that the coin is fair. Since we are comparing observed frequencies of results from a coin toss to the expected frequencies from an idealized fair coin, a chi-square test should be used.

Compute the Chi-Square Statistic

We begin by computing the chi-square statistic for this scenario. There are two events, heads and tails. Heads has an observed frequency of f1 = 580 with expected frequency of e1 = 50% × 1000 = 500. Tails has an observed frequency of f2 = 420 with expected frequency of e2 = 500. We now use the formula for the chi-square statistic and see that

χ² = (f1 − e1)²/e1 + (f2 − e2)²/e2 = 80²/500 + (−80)²/500 = 25.6.

Find the Critical Value

Next we need to find the critical value for the proper chi-square distribution. Since there are two outcomes for the coin, there are two categories to consider. The number of degrees of freedom is one less than the number of categories: 2 − 1 = 1. We use the chi-square distribution for this number of degrees of freedom and see that χ²0.95 = 3.841.

Reject or Fail to Reject?

Finally we compare the calculated chi-square statistic with the critical value from the table. Since 25.6 > 3.841, we reject the null hypothesis that this is a fair coin.

Example 2: A Fair Die

A fair die has an equal probability of 1/6 of rolling a one, two, three, four, five or six. We roll a die 600 times and note that we roll a one 106 times, a two 90 times, a three 98 times, a four 102 times, a five 100 times and a six 104 times. We want to test the hypothesis at a 95% level of confidence that we have a fair die.

Compute the Chi-Square Statistic

There are six events, each with expected frequency of 1/6 × 600 = 100. The observed frequencies are f1 = 106, f2 = 90, f3 = 98, f4 = 102, f5 = 100, f6 = 104. We now use the formula for the chi-square statistic and see that

χ² = (f1 − e1)²/e1 + (f2 − e2)²/e2 + (f3 − e3)²/e3 + (f4 − e4)²/e4 + (f5 − e5)²/e5 + (f6 − e6)²/e6 = 1.6.

Find the Critical Value

Next we need to find the critical value for the proper chi-square distribution. Since there are six categories of outcomes for the die, the number of degrees of freedom is one less than this: 6 − 1 = 5. We use the chi-square distribution for five degrees of freedom and see that χ²0.95 = 11.071.
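Both examples can be checked with scipy.stats: chisquare computes the statistic (and a p-value), and chi2.ppf returns the critical value for a given number of degrees of freedom. A sketch, assuming SciPy:

```python
from scipy.stats import chisquare, chi2

# Fair coin: 1000 tosses, expected 500 heads and 500 tails
stat, p = chisquare([580, 420], f_exp=[500, 500])
print(stat, chi2.ppf(0.95, df=1))  # 25.6 > 3.841 -> reject H0

# Fair die: 600 rolls, expected 100 per face
stat, p = chisquare([106, 90, 98, 102, 100, 104], f_exp=[100] * 6)
print(stat, chi2.ppf(0.95, df=5))  # 1.6 < 11.07 -> fail to reject H0
```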

Reject or Fail to Reject?

Finally we compare the calculated chi-square statistic with the critical value from the table. Since the calculated chi-square statistic is 1.6 < 11.071, we do not reject the null hypothesis that this is a fair die.

What Is ANOVA?

Many times when we study a group, we are really comparing two populations. Depending upon the parameter of this group we are interested in and the conditions we are dealing with, there are several techniques available. Statistical inference procedures that concern the comparison of two populations cannot usually be applied to three or more populations. To study more than two populations at once, we need different types of statistical tools. Analysis of variance, or ANOVA, is a technique from statistical inference that allows us to deal with several populations.

Comparison of Means

To see what problems arise and why we need ANOVA, we will consider an example. Suppose we are trying to determine if the mean weights of red, blue, green and orange M&M candies are different from each other. We will denote the mean weights of these populations μ1, μ2, μ3 and μ4, respectively. We may use the appropriate hypothesis test several times, testing C(4,2) = 6 different null hypotheses:

H0: μ1 = μ2 to check if the mean weight of the population of the red candies is different from the mean weight of the population of the blue candies.
H0: μ2 = μ3 to check if the mean weight of the population of the blue candies is different from the mean weight of the population of the green candies.
H0: μ3 = μ4 to check if the mean weight of the population of the green candies is different from the mean weight of the population of the orange candies.
H0: μ4 = μ1 to check if the mean weight of the population of the orange candies is different from the mean weight of the population of the red candies.
H0: μ1 = μ3 to check if the mean weight of the population of the red candies is different from the mean weight of the population of the green candies.
H0: μ2 = μ4 to check if the mean weight of the population of the blue candies is different from the mean weight of the population of the orange candies.

There are many problems with this kind of analysis. We will have six p-values. Even though we may test each at a 95% level of confidence, our confidence in the overall process is less than this because probabilities multiply: .95 × .95 × .95 × .95 × .95 × .95 is approximately .74, or a 74% level of confidence. Thus the probability of a Type I error has increased. At a more fundamental level, we cannot compare these four parameters as a whole by comparing them two at a time. The difference between the mean weights of the red and blue M&Ms may be significant, with the mean weight of red being relatively larger than the mean weight of blue. However, when we consider the mean weights of all four kinds of candy, there may not be a significant difference.

Analysis of Variance

To deal with situations in which we need to make multiple comparisons we use ANOVA. This test allows us to consider the parameters of several populations at once, without getting into some of the problems that confront us by conducting hypothesis tests on two parameters at a time. To conduct ANOVA with the M&M example above, we would test the null hypothesis H0: μ1 = μ2 = μ3 = μ4. This states that there is no difference between the mean weights of the red, blue, green and orange M&Ms. The alternative hypothesis is that there is some difference between the mean weights of the red, blue, green and orange M&Ms. This hypothesis is really a combination of several statements, Ha:

The mean weight of the population of red candies is not equal to the mean weight of the population of blue candies, OR
The mean weight of the population of blue candies is not equal to the mean weight of the population of green candies, OR
The mean weight of the population of green candies is not equal to the mean weight of the population of orange candies, OR
The mean weight of the population of green candies is not equal to the mean weight of the population of red candies, OR
The mean weight of the population of blue candies is not equal to the mean weight of the population of orange candies, OR
The mean weight of the population of orange candies is not equal to the mean weight of the population of red candies.

In this particular instance, in order to obtain our p-value we would utilize a probability distribution known as the F-distribution. Calculations involving the ANOVA F test can be done by hand, but are typically computed with statistical software.

Multiple Comparisons

What separates ANOVA from other statistical techniques is that it is used to make multiple comparisons. This is common throughout statistics, as there are many times where we want to compare more than just two groups. Typically an overall test suggests that there is some sort of difference between the parameters we are studying. We then follow this test with some other analysis to decide which parameter differs.
Analysis of Variance, or ANOVA for short, is a statistical test that looks for significant differences between means. For example, say you are interested in studying the education level of athletes in a community, so you survey people on various teams. You start to wonder, however, if the education level is different among the different teams. You could use an ANOVA to determine if the mean education level is different among the softball team versus the rugby team versus the Ultimate Frisbee team.

ANOVA Models

There are four types of ANOVA models. Following are descriptions and examples of each.

One-way between groups ANOVA. A one-way between groups ANOVA is used when you want to test the difference between two or more groups. This is the simplest version of ANOVA. The example of education level among different sports teams above would be an example of this type of model. There is only one grouping (type of sport played) that you are using to define the groups.

One-way repeated measures ANOVA. A one-way repeated measures ANOVA is used when you have a single group on which you have measured something more than one time. For example, if you wanted to test students' understanding of a subject, you could administer the same test at the beginning of the course, in the middle of the course, and at the end of the course. You would then use a one-way repeated measures ANOVA to see if students' performance on the test changed over time.

Two-way between groups ANOVA. A two-way between groups ANOVA is used to look at complex groupings. For example, the students' grades in the previous example could be extended to see if students from abroad performed differently from local students. So you would have three effects from this ANOVA: the effect of the final grade, the effect of abroad versus local, and the interaction between the final grade and abroad/local. Each of the main effects is a one-way test. The interaction effect simply asks if there is any significant difference in performance when you test the final grade and abroad/local acting together.

Two-way repeated measures ANOVA. A two-way repeated measures ANOVA uses the repeated measures structure but also includes an interaction effect. Using the same example of one-way repeated measures (test grades before and after a course), you could add gender to see if there is any joint effect of gender and time of testing. That is, do males and females differ in the amount of information they remember over time?

Assumptions of ANOVA

The following assumptions exist when you perform an analysis of variance:

The expected values of the errors are zero.
The variances of all errors are equal to each other.
The errors are independent from one another.
The errors are normally distributed.

How an ANOVA is Done


The mean is calculated for each of your groups. Using the example of education and sports teams from the introduction in the first paragraph above, the mean education level is calculated for each sports team.
The overall mean is then calculated for all of the groups combined.
Within each group, the total deviation of each individual's score from the group mean is calculated. This is called within group variation.
Next, the deviation of each group mean from the overall mean is calculated. This is called between group variation.

Finally, an F statistic is calculated, which is the ratio of between group variation to the within group variation.

If the between group variation is significantly greater than the within group variation, then it is likely that there is a statistically significant difference between the groups. The statistical software that you use will tell you if the F statistic is significant or not. All versions of ANOVA follow the basic principles outlined above, but as the number of groups and the interaction effects increase, the sources of variation will get more complex.

Performing an ANOVA

It is very unlikely that you would do an ANOVA by hand. Unless you have a very small data set, the process would be very time consuming. All statistical software programs provide for ANOVA. SPSS is okay for simple one-way analyses; however, anything more complicated becomes difficult. Excel also allows you to do ANOVA from the Data Analysis add-on; however, the instructions are not very good. SAS, STATA, Minitab, and other statistical software programs that are equipped for handling bigger and more complex data sets are all better for performing an ANOVA.

What Is a Degree of Freedom?

In many statistical problems we are required to determine the degrees of freedom. This refers to a positive whole number that indicates the lack of restrictions in our calculations. The degrees of freedom is the number of values in a calculation that we can vary.

A Few Examples

For a moment suppose that we know the mean of a data set is 25 and that the values are 20, 10, 50, and one unknown value. To find the mean of a list of data, we add all of the data and divide by the total number of values. This gives us the formula (20 + 10 + 50 + x)/4 = 25, where x denotes the unknown value. Despite calling this value unknown, we can use some algebra to determine that x = 20.

Let's alter this scenario slightly. Instead we suppose that we know the mean of a data set is 25, with values 20, 10, and two unknown values. These unknowns could be different, so we use two different variables, x and y, to denote them. The resulting formula is (20 + 10 + x + y)/4 = 25. With some algebra we obtain y = 70 − x. The formula is written in this form to show that once we choose a value for x, the value for y is completely determined. This shows that there is one degree of freedom.

Now we'll look at a sample size of one hundred. If we know that the mean of this sample data is 20, but do not know the values of any of the data, then there are 99 degrees of freedom. All values must add up to a total of 20 × 100 = 2000. Once we have the values of 99 elements in the data set, then the last one has been determined.

Student t Distribution

Degrees of freedom play an important role when using the Student t-score table. There are actually several t-score distributions, and we differentiate between them by use of degrees of freedom. Here the probability distribution that we use depends upon the size of our sample: if our sample size is n, then the number of degrees of freedom is n − 1. For instance, a sample size of 22 would require us to use the row of the t-score table with 21 degrees of freedom.

Chi-Square Distribution

The use of a chi-square distribution also requires the use of degrees of freedom. Here, in an identical manner as with the t distribution, the sample size determines which distribution to use. If the sample size is n, then there are n − 1 degrees of freedom.

Standard Deviation

Another place where degrees of freedom show up is in the formula for the standard deviation. This occurrence is not as overt, but we can see it if we know where to look. To find a standard deviation we are looking for the "average" deviation from the mean. However, after subtracting the mean from each data value and squaring the differences, we end up dividing by n − 1 rather than n, as we might expect. The presence of the n − 1 comes from the number of degrees of freedom: since the n data values and the sample mean are being used in the formula, there are n − 1 degrees of freedom.

Advanced Techniques

More advanced statistical techniques use more complicated ways of counting the degrees of freedom. When calculating the test statistic for two means with independent samples of n1 and n2 elements, the number of degrees of freedom has quite a complicated formula. It can be estimated by using the smaller of n1 − 1 and n2 − 1. Another example of a different way to count the degrees of freedom comes with an F test. In conducting an F test we have k samples, each of size n; the degrees of freedom in the numerator is k − 1 and in the denominator is k(n − 1).

Student t Distribution Table

This table is a compilation of data from the Student t distribution. Anytime that a t-distribution is being used, a table such as this one can be consulted to perform calculations. This distribution is similar to the standard normal distribution, or bell curve; however, the table is arranged differently than the table for the bell curve. The table below provides critical t-values for a particular area of one tail (listed along the top of the table) and degrees of freedom (listed along the side of the table). Degrees of freedom range from 1 to 30, with the bottom row of "Large" referring to several thousand degrees of freedom.

Critical Values of the t Distribution
(area in one tail along the top; degrees of freedom along the side)

df        0.40      0.25      0.10      0.05     0.025      0.01     0.005    0.0005
1     0.324920  1.000000  3.077684  6.313752  12.70620  31.82052  63.65674  636.6192
2     0.288675  0.816497  1.885618  2.919986   4.30265   6.96456   9.92484   31.5991
3     0.276671  0.764892  1.637744  2.353363   3.18245   4.54070   5.84091   12.9240
4     0.270722  0.740697  1.533206  2.131847   2.77645   3.74695   4.60409    8.6103
5     0.267181  0.726687  1.475884  2.015048   2.57058   3.36493   4.03214    6.8688
6     0.264835  0.717558  1.439756  1.943180   2.44691   3.14267   3.70743    5.9588
7     0.263167  0.711142  1.414924  1.894579   2.36462   2.99795   3.49948    5.4079
8     0.261921  0.706387  1.396815  1.859548   2.30600   2.89646   3.35539    5.0413
9     0.260955  0.702722  1.383029  1.833113   2.26216   2.82144   3.24984    4.7809
10    0.260185  0.699812  1.372184  1.812461   2.22814   2.76377   3.16927    4.5869
11    0.259556  0.697445  1.363430  1.795885   2.20099   2.71808   3.10581    4.4370
12    0.259033  0.695483  1.356217  1.782288   2.17881   2.68100   3.05454    4.3178
13    0.258591  0.693829  1.350171  1.770933   2.16037   2.65031   3.01228    4.2208
14    0.258213  0.692417  1.345030  1.761310   2.14479   2.62449   2.97684    4.1405
15    0.257885  0.691197  1.340606  1.753050   2.13145   2.60248   2.94671    4.0728
16    0.257599  0.690132  1.336757  1.745884   2.11991   2.58349   2.92078    4.0150
17    0.257347  0.689195  1.333379  1.739607   2.10982   2.56693   2.89823    3.9651
18    0.257123  0.688364  1.330391  1.734064   2.10092   2.55238   2.87844    3.9216
19    0.256923  0.687621  1.327728  1.729133   2.09302   2.53948   2.86093    3.8834
20    0.256743  0.686954  1.325341  1.724718   2.08596   2.52798   2.84534    3.8495
21    0.256580  0.686352  1.323188  1.720743   2.07961   2.51765   2.83136    3.8193
22    0.256432  0.685805  1.321237  1.717144   2.07387   2.50832   2.81876    3.7921
23    0.256297  0.685306  1.319460  1.713872   2.06866   2.49987   2.80734    3.7676
24    0.256173  0.684850  1.317836  1.710882   2.06390   2.49216   2.79694    3.7454
25    0.256060  0.684430  1.316345  1.708141   2.05954   2.48511   2.78744    3.7251
26    0.255955  0.684043  1.314972  1.705618   2.05553   2.47863   2.77871    3.7066
27    0.255858  0.683685  1.313703  1.703288   2.05183   2.47266   2.77068    3.6896
28    0.255768  0.683353  1.312527  1.701131   2.04841   2.46714   2.76326    3.6739
29    0.255684  0.683044  1.311434  1.699127   2.04523   2.46202   2.75639    3.6594
30    0.255605  0.682756  1.310415  1.697261   2.04227   2.45726   2.75000    3.6460
Large 0.253347  0.674490  1.281552  1.644854   1.95996   2.32635   2.57583    3.2905
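Every entry in this table can be reproduced from the t distribution's inverse CDF; a short sketch with SciPy:

```python
from scipy.stats import t

print(t.ppf(1 - 0.05, df=21))     # 1.7207..., the 0.05 one-tail entry for df = 21
print(t.ppf(1 - 0.025, df=4))     # 2.7764..., the 0.025 entry for df = 4
print(t.ppf(1 - 0.05, df=10**6))  # ~1.6449, approaching the "Large" row
```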

Definition of the F Distribution

The F distribution is defined in terms of two independent chi-squared variables. Let u and v be independently distributed chi-squared variables with u1 and v1 degrees of freedom, respectively. Then the statistic

F = (u/u1) / (v/v1)

has an F distribution with (u1, v1) degrees of freedom. As can be computed from the definition of the t distribution, the square of a t statistic may be written

t² = (z²/1) / (v/v1),

where z², being the square of a standard normal variable, has a chi-squared distribution. Thus the square of a t variable with v1 degrees of freedom is an F variable with (1, v1) degrees of freedom; that is, t² = F(1, v1). (Econterms)
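The identity t² = F(1, v1) can be verified numerically: the square of the two-tailed t critical value equals the upper-tail F critical value with (1, v1) degrees of freedom. A sketch using SciPy quantiles:

```python
from scipy.stats import t, f

v = 10
t_crit = t.ppf(1 - 0.025, df=v)         # two-tailed 5% critical value, ~2.2281
f_crit = f.ppf(1 - 0.05, dfn=1, dfd=v)  # upper 5% F critical value, ~4.9646
print(t_crit ** 2, f_crit)              # both ~4.9646, so t^2 = F(1, v)
```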

Chi-Square Goodness-of-Fit Test


The Chi-Square Test for Goodness of Fit tests claims about population proportions. It is a nonparametric test that is performed on categorical (nominal or ordinal) data. Let's try an example. In the 2000 U.S. Census, the ages of individuals in a small town were found to be the following:

[Table: the 2000 age distribution - not reproduced here.]

In 2010, ages of n = 500 individuals from the same small town were sampled. Below are the results:

[Table: the 2010 sample age counts - not reproduced here.]

Using alpha = 0.05, would you conclude that the population distribution of ages has changed in the last 10 years? Using our sample size and expected percentages, we can calculate how many people we expected to fall within each range. We can then make a table separating observed values versus expected values:

[Figure 3: table of observed versus expected counts - not reproduced here.]

Let's perform a hypothesis test on this new table to answer the original question.

Steps for Chi-Square Test for Goodness of Fit

1. Define Null and Alternative Hypotheses
2. State Alpha
3. Calculate Degrees of Freedom
4. State Decision Rule
5. Calculate Test Statistic
6. State Results
7. State Conclusion

1. Define Null and Alternative Hypotheses

[Figure 4: the null hypothesis states that the 2010 age distribution matches the proportions from 2000; the alternative states that it does not.]

2. State Alpha

alpha = 0.05

3. Calculate Degrees of Freedom

df = k − 1, where k = your number of groups. df = 3 − 1 = 2

4. State Decision Rule

Using our alpha and our degrees of freedom, we look up a critical value in the Chi-Square Table. We find our critical value to be 5.99: if χ² is greater than 5.99, reject the null hypothesis.

5. Calculate Test Statistic

The chi-square statistic is found using the same equation as above, comparing observed values to expected values: χ² = Σ (observed − expected)²/expected.

6. State Results

Reject the null hypothesis.

7. State Conclusion

The ages of the 2010 population are different from those expected based on the 2000 population.
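Because the original data tables are not reproduced here, the sketch below uses made-up counts for three age groups purely to illustrate the mechanics; only the degrees of freedom (k − 1 = 2) and the critical value 5.99 carry over from the example.

```python
from scipy.stats import chisquare, chi2

# Hypothetical observed and expected counts for k = 3 age groups, n = 500
# (the real census figures appear in the original article's tables)
observed = [180, 170, 150]
expected = [150, 180, 170]

stat, p = chisquare(observed, f_exp=expected)
crit = chi2.ppf(0.95, df=3 - 1)    # df = k - 1 = 2, critical value = 5.99
print(stat, crit, stat > crit)     # reject H0 when the statistic exceeds 5.99
```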

One-Way ANOVA Steps/Formula

SStotal = Σx² − (Σx)²/N
SSamong = Σ[(Σxgroup)²/n] − (Σx)²/N
SSwithin = SStotal − SSamong
dfamong = r − 1
dfwithin = N − r

x = individual observation
r = number of groups
N = total number of observations (all groups)
n = number of observations in a group

Steps (assuming three groups): first create six columns: x1, x1², x2, x2², x3, and x3².

1. Put the raw data, according to group, in x1, x2, and x3.
2. Calculate the sum, Σx, for group 1.
3. Calculate (Σx)² for group 1.
4. Calculate the mean for group 1.
5. Calculate Σx² for group 1.
6. Repeat steps 2-5 for groups 2 and 3.
7. Set up the SStotal and SSamong formulas and calculate.
8. Calculate SSwithin.
9. Enter the sums of squares into the ANOVA table, and complete the table by calculating dfamong, dfwithin, MSamong, MSwithin, and F.
10. Check to see if F is statistically significant on a probability table with the appropriate degrees of freedom and p < .05.

Problem: Susan Sound predicts that students will learn most effectively with a constant background sound, as opposed to an unpredictable sound or no sound at all. She randomly divides twenty-four students into three groups of eight. All students study a passage of text for 30 minutes. Those in group 1 study with background sound at a constant volume in the background. Those in group 2 study with noise that changes volume periodically. Those in group 3 study with no sound at all. After studying, all students take a 10 point multiple choice test over the material. Their scores follow:

Groups: 1) constant sound, 2) random sound, 3) no sound

x1   x1²    x2   x2²    x3   x3²
7    49     5    25     2    4
4    16     5    25     4    16
6    36     3    9      7    49
8    64     4    16     1    1
6    36     4    16     2    4
6    36     7    49     1    1
2    4      2    4      5    25
9    81     2    4      5    25

Σx1 = 48, (Σx1)² = 2304, M1 = 6, Σx1² = 322
Σx2 = 32, (Σx2)² = 1024, M2 = 4, Σx2² = 148
Σx3 = 27, (Σx3)² = 729, M3 = 3.375, Σx3² = 125

SStotal = Σx² − (Σx)²/N = 595 − 477.04 = 117.96
SSamong = Σ[(Σxgroup)²/n] − (Σx)²/N = 507.13 − 477.04 = 30.08
SSwithin = SStotal − SSamong = 117.96 − 30.08 = 87.88

Source   SS       df   MS      F
Among    30.08    2    15.04   3.59
Within   87.88    21   4.18

According to the F significance/probability table with df = (2, 21), F must be at least 3.4668 to reach p < .05, so the F score is statistically significant.

Interpretation: Susan can conclude that her hypothesis may be supported. The means are as she predicted, in that the constant sound group has the highest score. However, the significant F only indicates that at least two means are significantly different from one another; she can't know which specific mean pairs significantly differ until she conducts a post-hoc analysis (e.g., Tukey's HSD).
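Susan's hand computation can be confirmed with scipy.stats.f_oneway, which takes the raw scores for each group and returns the same F ratio:

```python
from scipy.stats import f_oneway

constant_sound = [7, 4, 6, 8, 6, 6, 2, 9]  # group 1
random_sound   = [5, 5, 3, 4, 4, 7, 2, 2]  # group 2
no_sound       = [2, 4, 7, 1, 2, 1, 5, 5]  # group 3

f_stat, p = f_oneway(constant_sound, random_sound, no_sound)
print(round(f_stat, 2), round(p, 4))  # 3.59 0.0454 -> significant at p < .05
```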

One-Way ANOVA
Let's perform a one-way ANOVA: Researchers want to test a new anti-anxiety medication. They split participants into three conditions (0mg, 50mg, and 100mg), then ask them to rate their anxiety level on a scale of 1-10. Are there any differences between the three conditions using alpha = 0.05?

Steps for One-Way ANOVA

1. Define Null and Alternative Hypotheses
2. State Alpha
3. Calculate Degrees of Freedom
4. State Decision Rule
5. Calculate Test Statistic
6. State Results
7. State Conclusion

Let's begin.

1. Define Null and Alternative Hypotheses

H0: μ1 = μ2 = μ3 (the three dosage conditions have the same mean anxiety rating). Ha: at least one condition mean differs from the others.

2. State Alpha

Alpha = 0.05

3. Calculate Degrees of Freedom

Now we calculate the degrees of freedom using N = 21, n = 7, and a = 3. You should already recognize N and n; "a" refers to the number of groups ("levels") you're dealing with: dfamong = a − 1 = 2 and dfwithin = N − a = 18.

4. State Decision Rule

To look up the critical value, we need to use two different degrees of freedom. We now head to the F-table and look up the critical value using (2, 18) and alpha = 0.05. This results in a critical value of 3.5546, so our decision rule is: if F is greater than 3.5546, reject the null hypothesis.

5. Calculate Test Statistic

To calculate the test statistic, we first need to find three values:

[Figures 5-8: computation of the sums of squares - images not reproduced here.]

All the values we've found so far can be organized in an ANOVA table:

[Figure 9: the completed ANOVA table - image not reproduced here.]

Now we find each MS by dividing each SS by its respective df:

And finally, we can calculate our F:

6. State Results

F = 86.56. Result: reject the null hypothesis.

7. State Conclusion

The three conditions differed significantly on anxiety level, F(2, 18) = 86.56, p < 0.05.
