A statistical procedure to examine the degree of correlation is required. Correlation coefficients are used to quantitatively describe the strength and direction of a relationship between two variables. A statistic that measures the correlation between two 'rank' measurements is Spearman's r.
ADMS 3352 3.0 Sampling Technique and Survey Studies

Correlation and Association between Variables

To answer questions like:
  Are ticket prices for professional basketball games related to attendance at the games? Is there a statistically significant relationship?
  We would like to predict the university grade point average of newly admitted students. Do grade 13 marks or SAT scores predict first-year university grades accurately?
  How accurately can we predict gas consumption from temperatures?
  Is there a relationship between muscle strength and functional capacity in arthritis patients?
a statistical procedure to examine the degree of correlation is required.

If the variables tend to increase or decrease together: positive correlation.
If one variable tends to decrease as the other increases in value: negative correlation.

Correlation Between Interval or Ratio Measurements

Correlation coefficients are used to quantitatively describe the strength and direction of a relationship between two variables.

When both variables are at least interval measurements, one may report the Pearson product-moment coefficient of correlation, which is also known as the correlation coefficient and is denoted by r.

The Pearson correlation coefficient is only appropriate for describing linear correlation. The appropriateness of using this coefficient can be examined through scatter plots. The rationale for using this statistic to measure linear correlation is to be discussed in class.

A statistic that measures the correlation between two rank measurements is Spearman's ρ, a nonparametric analog of Pearson's r. Spearman's ρ is appropriate for skewed continuous or ordinal measurements.

A correlation matrix presents the correlation coefficients for all pairs of variables in matrix form. The appropriateness of using r will be examined.

Statistical tests are available to test hypotheses on ρ.
H0: There is no correlation between the two variables (H0: ρ = 0).
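As an illustration of the difference between Pearson's r and Spearman's rho described above, here is a minimal Python sketch using only the standard library. The function names (pearson_r, average_ranks, spearman_rho) and the sample data are hypothetical and chosen for this example; they are not part of the course material. Spearman's rho is computed here as Pearson's r applied to the ranks, with tied values given their average rank.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical data: y increases with x, but not linearly (y = x squared).
    x = [1, 2, 3, 4, 5]
    y = [1, 4, 9, 16, 25]
    print("Pearson r    =", round(pearson_r(x, y), 3))
    print("Spearman rho =", round(spearman_rho(x, y), 3))
```

With these made-up values the relationship is perfectly monotone but curved, so Spearman's rho equals 1 while Pearson's r is about 0.98. A scatter plot of the same data would show the curvature that makes r a less suitable summary.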
Analysis of Two-way Contingency Tables

Sampling models:
  Multinomial
  Independent Binomial
  Poisson

Correlation between ordinal or nominal measurements is usually referred to as association.

Examine the association through a contingency table. (Try a scattergram: the need for a further display of the information is readily apparent.)

Tests of independence of the Row and Column variables:
  Chi-square test of independence
  Test of independence using the likelihood ratio chi-squared statistic G²
  Fisher's Exact Test of independence
    If one can consider the margins to be fixed, assume a hypergeometric distribution and use Fisher's Exact Test.

Odds Ratio (OR) as a measure of association
  Let p1 = n11 / n1+ and p2 = n21 / n2+.
  OR = [ p1 / (1 - p1) ] / [ p2 / (1 - p2) ]
  Retrospective studies: the OR is used to estimate the relative risk (RR).
  When the outcome is a rare event (n11 and n21 are small): the OR approximates the RR.
  In prospective studies: RR = p1 / p2

For independent groups (say, the Row variable), one may compare the proportion in Column Cj given Row Ri to that given another Row Ri', and test the difference between the two proportions, d. Pearson's Chi-square statistic is proportional to d².

The Chi-square Statistic

Assumptions:
1. Frequencies represent individual counts
2. Categories are exhaustive and mutually exclusive

Rationale:
  Test of independence between the Row and Column variables: compare the observed to the expected cell counts under the assumption of independence.
  Test of goodness of fit: compare the observed to the expected cell counts under the theoretical distribution.

Validity:
  Expected cell size > 5
  Yates' correction (a continuity correction for 2 x 2 tables)

General notes on Chi-square statistics
1. Require large samples
2. The Chi-square statistic is sensitive to sample size: an increase in sample size increases Chi-square even if the association is the same.
3. Ignore the ordering information if the variables are ordinal in nature, and are therefore less powerful

Common Coefficients of Association for Ordinal Variables
  Pearson's product-moment correlation
  Spearman's rho (ρ)
  Cochran-Armitage trend test
  Kendall's tau, Gamma, and Somers' D statistics
  1. Based on the classification of all possible pairs of subjects in the table as concordant or discordant pairs
  2. All take on values from -1 to +1
  3. Somers' D: adjustment for ties is made on the independent variable only
  4. Gamma is the least conservative among the three
  5. Gamma ignores ties

Nominal Ordinal Tables

Mantel-Haenszel correlation statistic
1. Measures association between two (ordinal) variables across strata of a third variable.
2. The MH statistic is approximately Chi-square distributed
3. Validity: requires the across-strata sum of sample sizes to be at least 40.
4. The Mantel-Haenszel test is not sensitive to associations of different directions across strata.

Kappa
  Cohen's kappa coefficient assesses raters' agreement.
  Measures the extent of agreement beyond that expected due to chance.

Lambda coefficient
  Measures how well knowledge of one categorical variable predicts the other.

Correlation versus Comparison

Correlation does not provide any information about the difference between the variables, only about the relative order of the scores. Therefore, it is inappropriate to draw conclusions about the differences or similarities between the distributions of the variables based on a correlation coefficient.

Causation and Correlation

Knowing that two variables, X and Y, correlate does not provide any information on how they relate. The correlation could be a result of:
1. Common response: Both variables X and Y respond to changes in some unobserved variable(s).
2. Confounding: X's effect on Y is hopelessly mixed up with another unobserved variable's effect on Y.
3. X causes Y: The order of events has to be clear.
Usually, valid conclusions about causation can only be based on controlled experiments.
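Returning to the two-way contingency tables discussed earlier, the following is a minimal Python sketch of the Chi-square test of independence, the odds ratio, and the relative risk for a 2 x 2 table. The function name analyze_2x2 and the cell counts are hypothetical and chosen only for illustration. The sketch assumes all four cell counts are positive, applies no Yates correction, and obtains the p-value from the Chi-square distribution with 1 degree of freedom through the standard library's complementary error function.

```python
from math import erfc, sqrt


def analyze_2x2(n11, n12, n21, n22):
    """Pearson Chi-square, odds ratio and relative risk for a 2 x 2 table.

    Rows are the two independent groups; column 1 is the outcome of interest.
    """
    if min(n11, n12, n21, n22) <= 0:
        raise ValueError("this sketch assumes all four cell counts are positive")

    # Row totals (n1+, n2+), column totals (n+1, n+2) and the grand total.
    row1, row2 = n11 + n12, n21 + n22
    col1, col2 = n11 + n21, n12 + n22
    total = row1 + row2

    observed = [n11, n12, n21, n22]
    # Expected counts under independence: (row total * column total) / n.
    expected = [
        row1 * col1 / total, row1 * col2 / total,
        row2 * col1 / total, row2 * col2 / total,
    ]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    p1 = n11 / row1  # proportion with the outcome in row 1
    p2 = n21 / row2  # proportion with the outcome in row 2

    return {
        "chi2": chi2,
        # Upper tail of the Chi-square distribution with 1 degree of freedom.
        "p_value": erfc(sqrt(chi2 / 2)),
        "min_expected": min(expected),  # compare against the "> 5" guideline
        "difference": p1 - p2,          # d, the difference in proportions
        "odds_ratio": (p1 / (1 - p1)) / (p2 / (1 - p2)),
        "relative_risk": p1 / p2,
    }


if __name__ == "__main__":
    # Hypothetical counts: 20 of 100 in group 1 and 10 of 100 in group 2
    # experienced the outcome.
    for name, value in analyze_2x2(20, 80, 10, 90).items():
        print(f"{name:>13} = {value:.4f}")
```

For these made-up counts the odds ratio is 2.25 and the relative risk is 2.0, which shows how the two measures diverge when the outcome is not rare. Doubling every cell count leaves both measures unchanged but doubles the Chi-square statistic, in line with the note that Chi-square is sensitive to sample size.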