
GEOG 683: Introduction to Geographic Analysis
Lab 6: The Normal Distribution and Hypothesis Testing

Assigned on October 28, 2004 (Friday, 6th week)
Due at the start of lab, November 4 (Friday, 7th week)
T.A.: Fang Ren (ren.21@osu.edu, 1083 Derby Hall, 688-3936)

1. Overview of Lab 6

The goal of this lab is for students to further explore the normal distribution and to perform hypothesis testing effectively in the SPSS environment.

2. Setting up Lab 6

We will continue to use Beadata.xls for this lab. Open your beadata SPSS file, create a new data set named lab6_data.sav, and keep only BEAFIPS, BEANAME, CREGION, URBAN9, HINCOME9, and POVFAM9 in the data set.

3. Confidence Intervals

In SPSS, we can compute confidence intervals for a sample of data. To do this we need to know the sample mean, the standard deviation, the confidence level that we wish to use, and whether or not the population standard deviation is known.

Important elements:
mean: a sample mean
sd: population standard deviation or sample standard deviation
n: size of a sample
conf. level: a confidence level, e.g. 0.95 (not 95)
distribution: normal or t

In this assignment, you need to calculate confidence intervals for the HIGHSCHL9 variable in sample regions and record your results in a table. To do this, make a table with columns for Confidence Level, Significance Level, Number of Standard Errors, Confidence Interval, and Confidence Interval Range for the MIDWEST region of the HIGHSCHL9 variable in the lab6_data.sav data set. Refer to Table 7.2 in McGrew and Monroe for an example.

To calculate the confidence intervals:

First, create a new blank data set. Add a new variable named CONLEVEL for confidence level. In the Data View, type in the following values for CONLEVEL: .80, .90, .95, .99. These are the confidence levels for which we want to find further information. Save your data set as confidence1.sav.

Now calculate the probability in each tail of the corresponding two-tailed interval by computing TTCONLV as the Target Variable and entering the following formula: (1-CONLEVEL)/2

Calculate t values for each confidence level by using Transform->Compute. Set the Target Variable to TV and use the following inverse t distribution function: IDF.T(TTCONLV, df). Because TTCONLV is a lower-tail probability, IDF.T returns a negative value; use its absolute value as the number of standard errors.

Remember: whether you are working with the population standard deviation or the sample standard deviation makes a difference in how you should calculate your confidence intervals. You will have to decide what df should be (see class notes or textbook).

Calculate the half-width of the confidence interval by setting CI as the Target Variable and typing the following into the formula box: TV*(std. deviation/sqrt(n)). Subtract the resulting number from the sample mean, and add it to the sample mean, to get your confidence interval (i.e. if the mean is 9.6 and CI=.49, your confidence interval's lower bound would be 9.6-.49=9.11 and the upper bound would be 9.6+.49=10.09).
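The arithmetic above can be checked outside SPSS. The following is a minimal Python sketch, not part of the lab's SPSS workflow, for the case where the population standard deviation is known, so the critical value comes from the standard normal distribution rather than from IDF.T. The function name z_confidence_interval and the inputs (sd = 2.5, n = 100) are hypothetical, chosen so that the 0.95 level reproduces the CI = .49 example.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(mean, sd, n, conf_level):
    """Return (lower, upper) for a mean when the population sd is known."""
    tail = (1 - conf_level) / 2          # same quantity as TTCONLV
    z = NormalDist().inv_cdf(1 - tail)   # upper-tail critical value (positive)
    margin = z * sd / sqrt(n)            # same form as TV*(std. deviation/sqrt(n))
    return mean - margin, mean + margin


# Hypothetical inputs, used only to illustrate the arithmetic.
for level in (0.80, 0.90, 0.95, 0.99):
    lower, upper = z_confidence_interval(mean=9.6, sd=2.5, n=100, conf_level=level)
    print(f"{level:.2f}: ({lower:.2f}, {upper:.2f})")
```

When the population standard deviation is unknown, the same structure applies, but the critical value would come from the t distribution with the appropriate df, which the Python standard library does not provide.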

Assignment I

We assume that the different census regions (CREGION) denote different sample sets for the population and that HIGHSCHL9 is normally distributed.

Assume the standard deviation of overall HIGHSCHL9 is the population standard deviation (for the following two questions).
o Calculate confidence intervals for the MIDWEST sample in HIGHSCHL9 at the .80, .90, .95, and .99 levels, report the results in a table format like Table 7.2, and briefly discuss the differences among the confidence levels in terms of the confidence interval width.
o Calculate confidence intervals for the different regions (NE, MW, S, and W) in HIGHSCHL9 at the 0.95 confidence level, report the results in a table format like the one you did above (the first column should be replaced by regions), and briefly discuss the differences among the regions in terms of the confidence interval width.

Assume the population standard deviation is not known.
o Recalculate confidence intervals for the different regions (NE, MW, S, and W) in HIGHSCHL9 at the 0.95 confidence level. Report the results in a table format like that above, and briefly discuss the differences between the two tables, one from above and the other from here, in terms of the confidence interval width.

4. Difference of Means Test

One-Sample Difference of Means Test

The Difference of Means Z-test is as follows:

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

where
Z = test statistic
$\bar{X}$ = the sample mean
$\mu$ = population mean
$\sigma$ = population standard deviation
n = sample size

Read the research task below and conduct a one-sample difference of means test according to the 6-step scheme of hypothesis testing presented in class. Remember to use the Transform->Compute dialogue to calculate any necessary new variables.
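As a way to sanity-check values computed with Transform->Compute, the Z statistic defined above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the lab's required method: the function names and all numeric inputs are hypothetical and are not taken from the BEA data.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """Z = (sample mean - population mean) / (population sd / sqrt(n))."""
    return (sample_mean - pop_mean) / (pop_sd / sqrt(n))


def upper_tail_p_value(z):
    """Probability of a standard normal value at least as large as z."""
    return 1 - NormalDist().cdf(z)


# Hypothetical inputs, used only to illustrate the calculation.
z = one_sample_z(sample_mean=12.0, pop_mean=10.0, pop_sd=4.0, n=16)

# One-tailed critical value for a 0.05 significance level.
critical = NormalDist().inv_cdf(0.95)

print(f"z = {z:.2f}, critical value = {critical:.2f}")
print("reject H0" if z > critical else "fail to reject H0")
```

The comparison `z > critical` corresponds to an upper-tailed decision rule; a lower-tailed or two-tailed test would use a different critical value and comparison.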

[Research Task] We know from previous analysis that the poverty level (POVFAM9) among BEA Economic Analysis zones is highest in the SOUTH region among the four census regions (CREGION), which means that it is higher than the average value for the overall US. Now you are asked to test whether the poverty level for SOUTH is significantly higher than that for the overall US at the 95% confidence level.

Assignment II

Answer the following questions.
o [Step 1] What are your null and alternate hypotheses in an equation format (i.e. $H_0: \mu = 2.61$)?
o [Step 2] What is the appropriate statistical test? Explain your selection.
o [Step 3] What is your significance level ($\alpha$)? Why did you choose it? What level of type errors are you willing to accept?
o [Step 4] What is your decision rule?
o [Step 5] Provide the test statistic to two decimal places.
o [Step 6] What is your decision about the hypothesis (reject or fail to reject)?

5. Normal Probability Distribution Calculations

Assignment III

Answer the following questions by hand calculation using normal distribution tables, or by using SPSS. If you use SPSS, write down in your answer the function you used and its syntax, and make sure you know how it calculates the output values.

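1. Assuming a standard normal distribution, what are the z values for $\alpha$ equal to 0.10, 0.05, and 0.01? What do these values correspond to?
2. Assuming a standard normal distribution, what are the probabilities of a z value of 1, 2, and 2.34?
3. What are the 90%, 95%, and 99% confidence intervals associated with the average herd sizes in Ohio dairy farms? $X_i \sim N(\mu, \sigma^2)$, $n = 37$, $\sigma^2 = 1050$, and the estimated sample mean is 90.2 ($\bar{X} = 90.2$). That is, find:

$$\Pr\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 0.90$$

$$\Pr\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 0.95$$

$$\Pr\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 0.99$$

4. How would these intervals change if the variance were unknown, with $s^2 = 1050$? Calculate the new intervals.
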
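For readers who want to cross-check table look-ups, the two standard normal operations used in the questions above (probability to z value, and z value to cumulative probability) can be sketched in Python. This is only an illustrative sketch: the helper names critical_z and prob_below are hypothetical, and the inputs are deliberately different from the values asked for in the assignment.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def critical_z(alpha, two_tailed=False):
    """z value that leaves alpha (or alpha/2 if two-tailed) in the upper tail."""
    tail = alpha / 2 if two_tailed else alpha
    return STANDARD_NORMAL.inv_cdf(1 - tail)


def prob_below(z):
    """Cumulative probability P(Z <= z) for a standard normal Z."""
    return STANDARD_NORMAL.cdf(z)


# Hypothetical inputs, chosen to differ from the assignment values.
print(f"one-tailed z for alpha = 0.20: {critical_z(0.20):.4f}")
print(f"two-tailed z for alpha = 0.20: {critical_z(0.20, two_tailed=True):.4f}")
print(f"P(Z <= 1.5): {prob_below(1.5):.4f}")
```

Note that a printed table may report the area between 0 and z rather than the cumulative area below z, so the table convention should be checked before comparing numbers.
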
6. Log Out and Wrap Up

o Do not forget to log out when you are finished.
o Your hand-in should look like a professional report: include a cover sheet with your name and lab section number, present documents, tables, and graphs as professionally as possible, and keep them in the order in which the questions are asked.
o Do your homework on your own.
