Lesson Objectives
After studying this session you will be able to:
- Understand and infer results from data in order to answer associational and differential research questions using different parametric and non-parametric tests.
- Understand, implement and interpret the chi-square, phi and Cramer's V statistics.
- Understand, implement and interpret the correlation statistics.
- Understand, implement and interpret the regression statistics.
- Understand, implement and interpret the t-test statistics.
Lesson Outline
1. Non-parametric tests
   1. Chi-square / Fisher's exact
   2. Phi and Cramer's V
   3. Kendall's tau-b
2. Parametric tests
   1. Correlation
      1. Pearson correlation
      2. Spearman correlation
   2. Regression
      1. Simple regression
      2. Multiple regression
   3. T-test
      1. One-sample t-test
      2. Independent-sample t-test
      3. Paired-sample t-test
Inferential Statistics
Inferential statistics are used to make inferences (conclusions) about a population from a sample. They rely on statistical tests of the relationships or differences between two or more variables and assume that sampling is random, so that the results can be generalized or used to make predictions about the future.
Inferential statistics are used:
- To test a hypothesis, either to check the relationship between two or more variables or to compare two groups and measure the differences between them.
- To generalize results from a sample to a population.
- To make predictions about the future.
- To draw conclusions.
A statistical significance test is a test of a null hypothesis H0, which is the hypothesis that we attempt to reject or nullify, e.g. H0: there is no relationship/difference between variable 1 and variable 2.
- p value > 0.05: H0 is retained (we fail to reject it) and H1 is not supported.
- p value < 0.05: H0 is rejected in favor of H1.
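The decision rule above can be sketched in a few lines of Python. The function name `decide` and the default `alpha=0.05` are assumptions made for illustration; the slide only states the 0.05 cutoff.

```python
def decide(p_value, alpha=0.05):
    """Return the significance-test decision for a p value at level alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value < alpha:
        return "reject H0"          # result is statistically significant
    return "fail to reject H0"      # not enough evidence against H0


print(decide(0.001))  # a p value below .05
print(decide(0.20))   # a p value above .05
```

A p value exactly equal to alpha is treated here as not significant, which is one common convention rather than the only one.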
Confidence Interval
A confidence interval is a range of values constructed for a variable of interest so that this range has a specified probability of including the true value of the variable. The specified probability is called the confidence level, and the end points of the confidence interval are called the confidence limits.
It is one of the alternatives to null hypothesis significance testing (NHST).
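As a rough illustration of confidence limits, the sketch below builds an interval around a sample mean using the normal approximation. The function name and the `scores` values are hypothetical, and for a sample this small a t critical value would usually be preferred over the normal one.

```python
import math
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for a sample mean."""
    n = len(sample)
    mean = statistics.fmean(sample)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for a 95% confidence level.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * sem, mean + z * sem


scores = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]  # hypothetical data
lower, upper = mean_confidence_interval(scores)
print(f"95% confidence limits: {lower:.2f} to {upper:.2f}")
```

Raising the confidence level (for example to 0.99) widens the interval, since a wider range is needed to include the true value with higher probability.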
The effect size (weak, moderate or strong)
Effect size is the strength of the relationship between the independent variable and the dependent variable, and/or the magnitude of the difference between levels of the independent variable with respect to the dependent variable.

Value            Effect size             Relationship
0                No effect               No relationship
>0 to 0.33       Small effect            Weak relationship
>0.33 to 0.70    Medium/typical effect   Moderate relationship
>0.70 to <1      Large effect            Strong relationship
1                Maximum effect          Perfect relationship
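The cutoffs in this table can be turned into a small lookup. This is a minimal sketch: the function name `describe_effect` is hypothetical, the boundaries 0.33 and 0.70 are the ones given on this slide, and other conventions (such as Cohen's) use different cutoffs.

```python
def describe_effect(value):
    """Label an effect size using the slide's cutoffs (0.33 and 0.70)."""
    size = abs(value)  # the sign gives direction, not strength
    if size > 1:
        raise ValueError("effect size must lie between -1 and 1")
    if size == 0:
        return "no effect"
    if size <= 0.33:
        return "small effect"
    if size <= 0.70:
        return "medium/typical effect"
    if size < 1:
        return "large effect"
    return "maximum effect"


print(describe_effect(-0.412))
print(describe_effect(0.85))
```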
Inferential statistics include a wide variety of tests. These tests can be classified into two broad categories:
1. Non-parametric tests
2. Parametric tests
Non-parametric tests are the statistical tests that are used:
- When the level of measurement is nominal or ordinal, e.g. the chi-square test or Kendall's tau-b.
- When the assumption of a normal distribution in the population is not met, e.g. Spearman correlation.
Source: http://www.cliffsnotes.com/WileyCDA/Section/Statistics-Glossary.id-305499,articleId30041.html#ixzz0c38lKKZC (retrieved 07/01/10)
Chi-Squared Test
Assumptions and Conditions for the Chi-Squared Test
- The observations must be independent of each other.
- Both variables should be nominal.
- All the expected counts must be greater than 1.
- At least 80% of the expected frequencies should be greater than or equal to 5.
Checking Assumptions and Conditions for the Chi-Squared test
geometry in h.s. * gender Crosstabulation

                                       gender
geometry in h.s.                  male     female    Total
not taken    Count                10       29        39
             Expected Count       17.7     21.3      39.0
             % of Total           13.3%    38.7%     52.0%
taken        Count                24       12        36
             Expected Count       16.3     19.7      36.0
             % of Total           32.0%    16.0%     48.0%
Total        Count                34       41        75
             Expected Count       34.0     41.0      75.0
             % of Total           45.3%    54.7%     100.0%
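The expected counts in the crosstabulation follow from the row and column totals (row total times column total, divided by N), and the two expected-count conditions can be checked from them. The sketch below assumes the observed counts from the table above; the function names are hypothetical.

```python
# Observed counts: rows = geometry (not taken, taken), columns = (male, female).
observed = [[10, 29], [24, 12]]


def expected_counts(table):
    """Expected count per cell = row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def expected_counts_ok(table):
    """Check: all expected counts > 1 and at least 80% of them >= 5."""
    cells = [e for row in expected_counts(table) for e in row]
    all_above_one = all(e > 1 for e in cells)
    share_at_least_five = sum(e >= 5 for e in cells) / len(cells)
    return all_above_one and share_at_least_five >= 0.80


for row in expected_counts(observed):
    print([round(e, 1) for e in row])
print("conditions met:", expected_counts_ok(observed))
```

Independence of the observations and the level of measurement cannot be checked from the counts; they depend on how the data were collected.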
Case Processing Summary

                                Valid            Missing          Total
                              N    Percent     N    Percent     N    Percent
geometry in h.s. * gender     75   100.0%      0    .0%         75   100.0%
Chi-Square Tests

                               Value      df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                               (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square             12.714a    1    .000
Continuity Correctionb         11.112     1    .001
Likelihood Ratio               13.086     1    .000
Fisher's Exact Test                                          .000         .000
Linear-by-Linear Association   12.544     1    .000
N of Valid Casesb              75
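The Pearson chi-square value in this table can be reproduced from the observed counts. The sketch below is a plain-Python illustration under the assumption that the crosstabulation counts are as reported above; the p value shortcut uses the fact that, for df = 1, the chi-square tail probability equals erfc(sqrt(chi2 / 2)), so it is valid only for 2 x 2 tables.

```python
import math

# Observed counts: rows = geometry (not taken, taken), columns = (male, female).
observed = [[10, 29], [24, 12]]


def pearson_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a crosstab."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n
            chi2 += (obs - exp) ** 2 / exp
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


def p_value_df1(chi2):
    """Upper-tail p value of a chi-square statistic with df = 1 only."""
    return math.erfc(math.sqrt(chi2 / 2))


chi2, df = pearson_chi_square(observed)
print(f"chi-square = {chi2:.3f}, df = {df}, p < .001: {p_value_df1(chi2) < 0.001}")
```

The continuity correction, likelihood ratio, and Fisher's exact rows of the table are different statistics and are not computed here.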
Symmetric Measures

                                     Value
Nominal by Nominal   Phi             -.412
                     Cramer's V      .412
N of Valid Cases                     75
Interpretation:
A chi-square test was conducted to check the association between gender and geometry in h.s. The Case Processing Summary table indicates that no participant has a missing value.

The assumptions are checked through the crosstabulation, which includes the Counts, the Expected Counts, and the percentages. All four expected counts are well above 5, so the conditions for the test are met. The results show that 24 males took geometry, which is 71% of the 34 male students. By contrast, 12 of the 41 females took geometry, which is only 29% of the females. It looks as if a higher percentage of males than females took geometry. The Chi-Square Tests table tells us whether we can be confident that this apparent difference is not due to chance.

Note, in the crosstabulation, that the Expected Count of male students who didn't take geometry is 17.7 and the observed (actual) Count is 10. Thus, there are 7.7 fewer males who didn't take geometry than would be expected by chance, given the Totals shown in the table. The same discrepancy between observed and expected counts appears in the other three cells of the table. The question answered by the chi-square test is whether these discrepancies are bigger than one might expect by chance.

The Chi-Square Tests table is used to determine whether there is a statistically significant relationship between two dichotomous or nominal variables. It tells you whether the relationship is statistically significant but does not indicate the strength of the relationship, as phi or a correlation does. In the output, we use the Pearson Chi-Square or (for small samples) Fisher's exact test to interpret the results. Both are statistically significant (p < .001), which indicates that we can be quite certain that males and females differ in whether they take geometry.

Phi is -.412 and, like the chi-square, it is statistically significant. Phi is also a measure of effect size for an associational statistic; in this case, the effect size is moderate according to Cohen (1988).
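Phi and Cramer's V can be recomputed from the same counts. This is a sketch with hypothetical function names; it assumes the 2 x 2 layout of the crosstabulation (rows: not taken, taken; columns: male, female). The sign of phi depends only on how the categories are ordered, while Cramer's V is always non-negative.

```python
import math


def phi_coefficient(table):
    """Phi for a 2 x 2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V from a chi-square statistic and the table dimensions."""
    return math.sqrt(chi2 / (n * (min(n_rows, n_cols) - 1)))


observed = [[10, 29], [24, 12]]
print(f"phi = {phi_coefficient(observed):.3f}")
print(f"Cramer's V = {cramers_v(12.714, 75, 2, 2):.3f}")
```

For a 2 x 2 table Cramer's V equals the absolute value of phi, which is why the two lines agree apart from the sign.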
KENDALL'S TAU-B
If the variables are ordered (i.e. ordinal), you have several other choices. We will use Kendall's tau-b in this problem.
Symmetric Measures

                                          Value    Approx. Tb
Ordinal by Ordinal   Kendall's tau-b      .494     3.846
N of Valid Cases                          73
a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.
Interpretation:
To investigate the relationship between father's education and mother's education, Kendall's tau-b was used. The analysis indicated a significant positive association between father's education and mother's education, tau = .572, p < .001. This means that more highly educated fathers were married to more highly educated mothers, and less educated fathers were married to less educated mothers. This tau is considered to be a large effect size (Cohen, 1988).
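To make the statistic concrete, the sketch below computes Kendall's tau-b by counting concordant, discordant, and tied pairs. The function name and the two education lists are hypothetical (they are not the data behind the output above), and no significance test is included.

```python
import math
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall's tau-b: (C - D) / sqrt((C + D + Tx) * (C + D + Ty))."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue                 # tied on both variables: not counted
        if dx == 0:
            ties_x += 1              # tied on x only
        elif dy == 0:
            ties_y += 1              # tied on y only
        elif dx * dy > 0:
            concordant += 1          # pair ordered the same way on x and y
        else:
            discordant += 1          # pair ordered in opposite ways
    pairs = concordant + discordant
    denom = math.sqrt((pairs + ties_x) * (pairs + ties_y))
    if denom == 0:
        raise ValueError("tau-b is undefined when a variable is constant")
    return (concordant - discordant) / denom


# Hypothetical ordinal education levels for eight couples.
father_edu = [1, 2, 2, 3, 3, 4, 5, 5]
mother_edu = [1, 1, 2, 2, 3, 4, 4, 5]
print(f"tau-b = {kendall_tau_b(father_edu, mother_edu):.3f}")
```

The tie terms in the denominator are what distinguish tau-b from tau-a and make it suitable for ordinal variables with few categories.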
Interpretation:
Eta was used to investigate the strength of the association between gender and number of math courses taken (eta = .33). This is a weak to medium effect size (Cohen, 1988). Males were more likely than females to take several or all of the math courses.
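Eta relates a nominal grouping variable to a numeric one and can be computed as the square root of the between-groups sum of squares divided by the total sum of squares. The sketch below illustrates this; the function name and the course counts for the two groups are hypothetical, not the data behind the eta reported above.

```python
import math
from statistics import fmean


def eta(groups):
    """Eta = sqrt(SS_between / SS_total) for a list of groups of scores."""
    all_values = [v for group in groups for v in group]
    grand_mean = fmean(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    return math.sqrt(ss_between / ss_total)


# Hypothetical number of math courses taken, by gender.
males = [3, 4, 2, 5, 4]
females = [2, 3, 1, 3, 2]
print(f"eta = {eta([males, females]):.2f}")
```

Eta ranges from 0 (group means are identical) to 1 (all variation lies between the groups), so it carries no sign and says nothing about direction.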