
Uses and interpretation of correlation

Interpretation of Correlation
Correlation refers to a technique used to measure the relationship between two or
more variables. When two things are correlated, it means that they vary together. Positive
correlation means that high scores on one are associated with high scores on the other, and
that low scores on one are associated with low scores on the other. Negative correlation, on
the other hand, means that high scores on the first thing are associated with low scores on the
second, and that low scores on the first are associated with high scores on the second. An
example is the correlation between body weight and the time spent on a weight-loss program.
If the program is effective, the more time spent on the program, the lower the body weight;
the less time spent on the program, the higher the body weight.
Pearson r is a statistic that is commonly used to calculate bivariate correlations.
For example, suppose the correlation between class size and mean reading score is
Pearson r = -0.80, p < .001. What does this mean?
To interpret correlations, four pieces of information are necessary.
1. The numerical value of the correlation coefficient. Correlation coefficients range from
-1.0 to +1.0, so their absolute value varies between 0.0 and 1.0. The closer the absolute
value is to 1.0, the stronger the relationship between the two variables. A correlation of 0.0
indicates the absence of a linear relationship. In the example, the absolute value of the
coefficient is 0.80, which indicates the presence of a strong relationship.
2. The sign of the correlation coefficient. A positive correlation coefficient means that as
variable 1 increases, variable 2 increases, and conversely, as variable 1 decreases, variable 2
decreases. In other words, the variables move in the same direction when there is a positive
correlation. A negative correlation means that as variable 1 increases, variable 2 decreases
and vice versa. In other words, the variables move in opposite directions when there is a
negative correlation. In the example, the negative sign indicates that as class size increases,
mean reading scores decrease.
3. The statistical significance of the correlation. A statistically significant correlation is
conventionally indicated by a probability value of less than 0.05. This means that, if there
were no relationship in the population, a correlation at least this strong would arise by chance
fewer than five times out of 100, so the result is taken as evidence of a relationship. For
r = -0.80 there is a statistically significant negative relationship between class size and
reading score (p < .001): a correlation this strong would occur by chance less than one time
out of 1000.
4. The effect size of the correlation. For correlations, the effect size is called the coefficient
of determination and is defined as r^2. The coefficient of determination can vary from 0 to 1.00
and indicates the proportion of variation in the scores that can be predicted from the
relationship between the two variables. For r = -0.80 the coefficient of determination is 0.64,
which means that 64% of the variation in mean reading scores among the different classes
can be predicted from the relationship between class size and reading scores. (Conversely,
36% of the variation in mean reading scores cannot be explained.)
A correlation can only indicate the presence or absence of a relationship, not the nature of the
relationship. Correlation is not causation. There is always the possibility that a third variable
influenced the results. For example, perhaps the students in the small classes were higher in
verbal ability than the students in the large classes or were from higher income families or
had higher quality teachers.
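
The four pieces of information above can be sketched in a few lines of Python. This is a minimal illustration, not the source's own procedure: the class-size and reading-score numbers are invented for the example, and the helper names `pearson_r` and `t_statistic` are assumptions. The t statistic shown is the usual one for testing whether r differs from zero; turning it into a p-value needs a t table or a statistics package.

```python
import math

def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length (at least 3)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing r = 0; compare against a t table with n - 2 df."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical data (invented for illustration): eight classes.
class_size = [15, 18, 21, 24, 27, 30, 33, 36]
reading_score = [82, 85, 78, 80, 74, 75, 70, 68]

r = pearson_r(class_size, reading_score)        # sign and strength
r_squared = r ** 2                              # coefficient of determination
t = t_statistic(r, len(class_size))             # basis for the significance test

print(f"r = {r:.2f}, r^2 = {r_squared:.2f}, t = {t:.2f} on {len(class_size) - 2} df")
```

The sign of `r` gives the direction, its absolute value the strength, and `r_squared` the share of variation that can be predicted from the relationship.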
F test assumption and uses

Stats: F-Test
The F-distribution is formed by the ratio of two independent chi-square variables divided by
their respective degrees of freedom.
Since F is formed by chi-square, many of the chi-square properties
carry over to the F distribution.

The F-values are all non-negative

The distribution is non-symmetric

The mean is approximately 1

There are two independent degrees of freedom, one for the numerator, and one for the
denominator.

There are many different F distributions, one for each pair of degrees of freedom.

F-Test
The F-test is designed to test if two population variances are equal. It does this by comparing
the ratio of two variances. So, if the variances are equal, the ratio of the variances will be 1.

All hypothesis testing is done under the assumption the null hypothesis is true

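The F test statistic is the ratio of the two sample variances, each divided by its population
variance: F = (s1^2 / sigma1^2) / (s2^2 / sigma2^2). If the null hypothesis is true, the
population variances are equal and cancel, so the statistic simplifies (dramatically) to
F = s1^2 / s2^2. This ratio of sample variances will be the test statistic used. If the ratio is
too far from 1, we reject the null hypothesis that the ratio is equal to 1, and with it our
assumption that the variances are equal.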
There are several different F-tables. Each one has a different level of
significance. So, find the correct level of significance first, and then look up the
numerator degrees of freedom and the denominator degrees of freedom to find
the critical value.
You will notice that all of the tables only give level of significance for right tail tests. Because
the F distribution is not symmetric, and there are no negative values, you may not simply take
the opposite of the right critical value to find the left critical value. The way to find a left
critical value is to reverse the degrees of freedom, look up the right critical value, and then
take the reciprocal of this value. For example, the critical value with 0.05 on the left with 12
numerator and 15 denominator degrees of freedom is found by taking the reciprocal of the
critical value with 0.05 on the right with 15 numerator and 12 denominator degrees of
freedom.

Avoiding Left Critical Values


Since the left critical values are a pain to calculate, they are often avoided altogether. This is
the procedure followed in the textbook. You can force the F test into a right tail test by
placing the sample with the larger variance in the numerator and the smaller variance in the
denominator. It does not matter which sample has the larger sample size, only which sample
has the larger variance.
The numerator degrees of freedom will be the degrees of freedom for whichever sample has
the larger variance (since it is in the numerator) and the denominator degrees of freedom will
be the degrees of freedom for whichever sample has the smaller variance (since it is in the
denominator).
If a two-tail test is being conducted, you still have to divide alpha by 2, but you only look up
and compare the right critical value.

Assumptions / Notes

The larger variance should always be placed in the numerator

The test statistic is F = s1^2 / s2^2 where s1^2 > s2^2

Divide alpha by 2 for a two tail test and then find the right critical value

If standard deviations are given instead of variances, they must be squared

When the degrees of freedom aren't given in the table, go with the value with the
larger critical value (this happens to be the smaller degrees of freedom). This is so that
you are less likely to reject in error (type I error)

The populations from which the samples were obtained must be normal.

The samples must be independent
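
The notes above can be sketched as a small Python helper. It is an illustration under stated assumptions: the function name `f_statistic` and both samples are invented, and the critical value still has to be looked up in an F table (or a statistics package) using the returned degrees of freedom.

```python
import statistics

def f_statistic(sample_a, sample_b):
    """Return (F, numerator df, denominator df) with the larger variance on top."""
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 in the denominator)
    var_b = statistics.variance(sample_b)
    df_a = len(sample_a) - 1
    df_b = len(sample_b) - 1
    # The degrees of freedom follow whichever sample ends up in the numerator.
    if var_a >= var_b:
        return var_a / var_b, df_a, df_b
    return var_b / var_a, df_b, df_a

# Hypothetical measurements from two independent samples.
group_1 = [20.1, 19.8, 20.4, 20.0, 19.7, 20.3]
group_2 = [21.5, 18.2, 22.0, 19.1, 20.8, 17.9, 21.2]

f_value, df_num, df_den = f_statistic(group_1, group_2)
print(f"F = {f_value:.2f} with ({df_num}, {df_den}) degrees of freedom")
```

Because the larger variance is always placed in the numerator, F is never below 1 and only the right critical value is needed. Here the second sample has the larger variance, so its 6 degrees of freedom go in the numerator even though it is listed second.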

F-Test
Source: Explorable.com, https://explorable.com/f-test

Any statistical test that uses the F-distribution can be called an F-test. It is typically used
when the sample size is small, i.e. n < 30.


For example, suppose one wants to test whether there is any significant difference between the
mean height of male and female students in a particular college. In such a situation, a t-test
for difference of means can be applied.
However, one assumption of this t-test is that the variances of the two populations are equal;
here the two populations are the heights of male and female students. Unless this
assumption is true, the t-test for difference of means cannot be carried out.
The F-test can be used to test the hypothesis that the population variances are equal.

F-tests for Different Purposes

There are different types of F-tests, each for a different purpose. Some of the popular types
are outlined below.
1. F-test for testing equality of variance is used to test the hypothesis of equality of two
population variances. The example considered above requires the application of this
test.
2. F-test for testing equality of several means. Test for equality of several means is
carried out by the technique named ANOVA.

For example, suppose that the efficacy of a drug is to be tested at three levels,
say 100mg, 250mg and 500mg. A test is conducted among fifteen human subjects
taken at random, with five subjects being administered each level of the drug.
To test if there are significant differences among the three levels of the drug in terms
of efficacy, the ANOVA technique has to be applied. The test used for this purpose is
the F-test.
3. F-test for testing significance of regression is used to test the significance of the
regression model. The appropriateness of the multiple regression model as a whole
can be tested by this test. A significant F indicates a linear relationship between Y and
at least one of the X's.
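
The drug example in item 2 can be worked through as a one-way ANOVA. The sketch below is illustrative only: the efficacy scores are invented, and `one_way_anova_f` is a hypothetical helper name. It computes F as the between-group mean square divided by the within-group mean square.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of the individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Hypothetical efficacy scores, five subjects per dose level.
dose_100mg = [4, 5, 6, 5, 5]
dose_250mg = [6, 7, 7, 8, 7]
dose_500mg = [9, 8, 10, 9, 9]

f_value, df_between, df_within = one_way_anova_f([dose_100mg, dose_250mg, dose_500mg])
print(f"F = {f_value:.2f} with ({df_between}, {df_within}) degrees of freedom")
# prints "F = 40.00 with (2, 12) degrees of freedom"
```

With these made-up numbers the group means are 5, 7 and 9, the between-group sum of squares is 40 and the within-group sum of squares is 6, so F = (40 / 2) / (6 / 12) = 40.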

Assumptions
Irrespective of the type of F-test used, one assumption has to be met: the populations from
which the samples are drawn have to be normal. In the case of the F-test for equality of
variance, there is a second requirement: by convention, the larger of the sample variances is
placed in the numerator of the test statistic.
Like the t-test, the F-test is also a small-sample test and may be considered for use if the
sample size is < 30.

Deciding
In attempting to reach decisions, we always begin by specifying the null hypothesis against a
complementary hypothesis called the alternative hypothesis. The calculated value of the F-test
with its associated p-value is used to decide whether or not to reject the null hypothesis.
Statistical software packages provide these p-values. If the associated p-value is small, i.e.
< 0.05, we say that the test is significant at 5%, and one may reject the null hypothesis in
favor of the alternative.
On the other hand, if the associated p-value of the test is > 0.05, one does not reject the null
hypothesis. Evidence against the null hypothesis will be considered very strong if the p-value
is less than 0.01. In that case, we say that the test is significant at 1%.

Chi Square Test

Source: Explorable.com

Any statistical test that uses the chi square distribution can be called a chi square test. It is
applicable to both large and small samples, depending on the context.


For example, suppose a person wants to test the hypothesis that the success rate in a
particular English test is similar for indigenous and immigrant students.
If we take a random sample of, say, 80 students and record both the indigenous/immigrant
status and the success/failure status of each student, the chi square test can be applied to
test the hypothesis.
There are different types of chi square test, each for a different purpose. Some of the popular
types are outlined below.

Tests for Different Purposes


1. Chi square test for testing goodness of fit is used to decide whether there is any
difference between the observed (experimental) value and the expected
(theoretical) value.
For example, given a sample, we may like to test if it has been drawn from a normal
population. This can be tested using the chi square goodness of fit procedure.

2. Chi square test for independence of two attributes. Suppose N observations are
considered and classified according to two characteristics, say A and B. We may be
interested to test whether the two characteristics are independent. In such a case,
we can use the chi square test for independence of two attributes.
The example considered above, testing for independence of success in the English test
vis-a-vis immigrant status, is a case fit for analysis using this test.

3. Chi square test for single variance is used to test a hypothesis on a specific
value of the population variance. Statistically speaking, we test the null hypothesis
H0: σ^2 = σ0^2 against the research hypothesis H1: σ^2 ≠ σ0^2, where σ^2 is the
population variance and σ0^2 is a specific value of the population variance that we
would like to test for acceptance.
In other words, this test enables us to test if the given sample has been drawn from a
population with specific variance σ0^2. This is a small sample test, to be used in general
only if the sample size is less than 30.

Assumptions
The chi square test for single variance assumes that the population from which
the sample has been drawn is normal. This normality assumption need not hold for the chi
square goodness of fit test and the test for independence of attributes.
However, while implementing these two tests, one has to ensure that the expected frequency
in any cell is not less than 5. If it is less than 5, the cell has to be pooled with the preceding
or succeeding cell so that the expected frequency of the pooled cell is at least 5.
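
The test for independence and the minimum expected frequency rule can be sketched as follows. The counts are invented for the English test example (80 students in total), and `chi_square_independence` is a hypothetical helper name. The resulting statistic would be compared with a chi square table at the returned degrees of freedom.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom, expected counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, expected

# Hypothetical counts: rows = indigenous, immigrant; columns = success, failure.
observed = [[30, 10],
            [20, 20]]

chi2, df, expected = chi_square_independence(observed)
smallest_expected = min(min(row) for row in expected)

print(f"chi-square = {chi2:.2f} on {df} df")
print("Expected-frequency rule met:", smallest_expected >= 5)
```

For these counts the expected frequencies are 25 and 15 in each row, so every cell satisfies the minimum of 5 and no pooling is needed.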

Non Parametric and Distribution Free


It has to be noted that the Chi square goodness of fit test and test for independence of
attributes depend only on the set of observed and expected frequencies and degrees of
freedom. These two tests do not need any assumption regarding distribution of the parent
population from which the samples are taken.

Since these tests do not involve any population parameters or characteristics, they are also
termed as non parametric or distribution free tests. An additional important fact on these
two tests is they are sample size independent and can be used for any sample size as long
as the assumption on minimum expected cell frequency is met.

Type 1 and 2 errors


The statistical practice of hypothesis testing is widespread not only in statistics, but also
throughout the natural and social sciences. While we are conducting a hypothesis test there
are a couple of things that could go wrong. There are two kinds of errors, which by design
cannot be avoided, and we must be aware that these errors exist. The errors are given the
quite pedestrian names of type I and type II errors.
What are type I and type II errors, and how do we distinguish between them?

Hypothesis Testing
The process of hypothesis testing can seem to be quite varied with a multitude of test
statistics. But the general process is the same. Hypothesis testing involves the statement of a
null hypothesis, and the selection of a level of significance. The null hypothesis is either true
or false, and represents the default claim for a treatment or procedure. For example, when
examining the effectiveness of a drug, the null hypothesis would be that the drug has no
effect on a disease.
After formulating the null hypothesis and choosing a level of significance, we acquire data
through observation. Statistical calculations tell us whether or not we should reject the null
hypothesis.
In an ideal world we would always reject the null hypothesis when it is false, and we would
not reject the null hypothesis when it is indeed true. But there are two other scenarios that are
possible, each of which will result in an error.

Type I Error
The first kind of error that is possible involves the rejection of a null hypothesis that is
actually true. This kind of error is called a type I error, and is sometimes called an error of the
first kind.
Type I errors are equivalent to false positives. Let's go back to the example of a drug being
used to treat a disease. If we reject the null hypothesis in this situation, then our claim is that
the drug does in fact have some effect on a disease. But if the null hypothesis is true, then in
reality the drug does not combat the disease at all. The drug is falsely claimed to have a
positive effect on a disease.
Type I errors can be controlled. The value of alpha, which is the level of significance that we
selected, has a direct bearing on type I errors. Alpha is the maximum probability that we
have a type I error. For a 95% confidence level, the value of alpha is 0.05. This means that
there is a 5% probability that we will reject a true null hypothesis. In the long run, one out
of every twenty hypothesis tests of a true null hypothesis that we perform at this level will
result in a type I error.
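
The "one out of every twenty" claim can be checked with a small simulation. This is a sketch under assumptions not taken from the text: it uses a two-sided z-test on normally distributed data with known standard deviation 1, and the function name, sample size, trial count and seed are all arbitrary choices.

```python
import random
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=25, trials=2000, seed=0):
    """Fraction of two-sided z-tests that reject a null hypothesis that is true."""
    rng = random.Random(seed)
    critical_z = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # The null hypothesis (mean 0, standard deviation 1) is true by construction.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = sample_mean * n ** 0.5  # z = mean / (sigma / sqrt(n)) with sigma = 1
        if abs(z) > critical_z:
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"Observed type I error rate: {rate:.3f}")
```

Every rejection counted here is a type I error, because the data were generated with the null hypothesis true. The observed rate should land close to alpha = 0.05, with some random variation from run to run if the seed is changed.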

Type II Error
The other kind of error that is possible occurs when we do not reject a null hypothesis that is
false. This sort of error is called a type II error, and is also referred to as an error of the
second kind.
Type II errors are equivalent to false negatives. If we think back again to the scenario in
which we are testing a drug, what would a type II error look like? A type II error would occur
if we accepted that the drug had no effect on a disease, but in reality it did.
The probability of a type II error is given by the Greek letter beta. This number is related to
the power or sensitivity of the hypothesis test, denoted by 1 - beta.

How to Avoid Errors


Type I and type II errors are part of the process of hypothesis testing. Although the errors
cannot be completely eliminated, we can minimize one type of error.
Typically when we try to decrease the probability of one type of error, the probability for the
other type increases. We could decrease the value of alpha from 0.05 to 0.01, corresponding
to a 99% level of confidence. However, if everything else remains the same, then the
probability of a type II error will nearly always increase.
Many times the real world application of our hypothesis test will determine if we are more
accepting of type I or type II errors. This will then be used when we design our statistical
experiment.
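
The trade-off between alpha and beta can be made concrete with a short calculation. The sketch below assumes a one-sided z-test of a mean with known standard deviation; the effect size, standard deviation and sample size are hypothetical values chosen for illustration, and `type_ii_error` is an invented helper name.

```python
from statistics import NormalDist

def type_ii_error(alpha, effect, sigma, n):
    """Beta for a one-sided z-test of H0: mean = 0 when the true mean is `effect`."""
    std_normal = NormalDist()
    critical_z = std_normal.inv_cdf(1 - alpha)   # rejection threshold for z
    shift = effect * n ** 0.5 / sigma            # how far the true mean moves z
    # Beta is the chance that z stays below the threshold even though H0 is false.
    return std_normal.cdf(critical_z - shift)

# Same hypothetical study, two significance levels.
beta_05 = type_ii_error(alpha=0.05, effect=0.5, sigma=1.0, n=20)
beta_01 = type_ii_error(alpha=0.01, effect=0.5, sigma=1.0, n=20)

print(f"alpha = 0.05: beta = {beta_05:.2f}, power = {1 - beta_05:.2f}")
print(f"alpha = 0.01: beta = {beta_01:.2f}, power = {1 - beta_01:.2f}")
```

Lowering alpha raises the rejection threshold, so with everything else unchanged beta goes up and the power, 1 - beta, goes down.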

Parametric vs. non-parametric tests



There are two types of test data and consequently different types of analysis. As
the table below shows, parametric data has an underlying normal distribution
which allows for more conclusions to be drawn as the shape can be
mathematically described. Anything else is non-parametric.

                          Parametric                   Non-parametric
Assumed distribution      Normal                       Any
Assumed variance          Homogeneous                  Any
Typical data              Ratio or Interval            Ordinal or Nominal
Data set relationships    Independent                  Any
Usual central measure     Mean                         Median
Benefits                  Can draw more conclusions    Simplicity; less affected by outliers

Tests

Choosing                            Choosing parametric test               Choosing a non-parametric test
Correlation test                    Pearson                                Spearman
Independent measures, 2 groups      Independent-measures t-test            Mann-Whitney test
Independent measures, >2 groups     One-way, independent-measures ANOVA    Kruskal-Wallis test
Repeated measures, 2 conditions     Matched-pair t-test                    Wilcoxon test
Repeated measures, >2 conditions    One-way, repeated measures ANOVA       Friedman's test

How to Conduct a Linear Regression Analysis: A Summary
Know the background literature on your subject. Does previous
literature indicate that your outcome variable is not distributed normally or
is not continuous? Has previous literature found interactions between
explanatory variables? What control variables are important for your model?
Do you anticipate measurement error? What might you do to correct for it?
What theoretical frameworks or conceptual models have been used in the
past? Use your imagination/ intuition about the social or behavioral
processes of interest.
Establish hypotheses and decision rules. Which theory or conceptual
model guides your selection of variables or the ways you think they are
associated? Why do you think the variables are associated? Will you use
directional or non-directional hypotheses? Will you use one- or two-tailed
significance tests to decide whether coefficients are statistically distinct from
zero? Set up the hypotheses.

Know the data set. Are your data longitudinal? If they are, consider using a
regression technique such as a GEE that is designed for longitudinal data.
Do your data come from a survey? Have you read the codebook and
documentation for information about how the data were collected? How are
the variables of interest coded? Are there missing values? How are they
coded? How will you handle missing data? You will probably need to recode
and construct new variables. Are there sample weights in the data set? Will
you use a split-sample approach to verify the model?
Check the distributions of the variables. Use q-q plots and box-and-whisker plots. Run
descriptive statistics. Is your outcome variable distributed
normally? What about the explanatory variables? If your variables are not
distributed normally, consider using a transformation such as the natural
logarithm, square root, or quadratic (depending on the direction and degree
of skewness). If the outcome variable is binary, use logistic regression. Are
there outliers or other influential observations that you can see? Consider
their source. Do you need to compute dummy variables and include them in
your model? Check to make sure you've taken care of missing values; they
can throw everything off if they are not adjusted for in the model!
Assess the bivariate associations in the data. Use scatterplots for
continuous variables. Plot linear and nonlinear lines to determine the
bivariate associations. Compute bivariate correlations for continuous
variables. Look for outliers and potential collinearity problems.
Estimate the regression model. Avoid automated variable selection
procedures unless your goal is simply to find the best prediction model.
Assess the results. Are there unusual coefficients (overly large or small;
negative when they should be positive)? Save the collinearity diagnostics;
save the influential observation diagnostics (studentized residuals, leverage
values, Cook's D, and DFFITS); run a scatterplot of deleted studentized
residuals by standardized predicted values; estimate partial residual plots;
ask for a normal probability plot of the residuals; ask for the Durbin-Watson
statistic, if appropriate; compute Moran's I, if using spatial data and it is
available. Assess the goodness-of-fit statistics (adjusted R^2; F-value and its
accompanying p-value; SE). Run nested models if appropriate and use nested
F-tests to compare them. These are particularly useful for assessing potential
specification errors.
Check the diagnostics. Are there any collinearity problems? If yes, you
might need to combine variables, collect more data, or, as a last resort, drop
variables. If the collinearity problem involves interaction or quadratic terms,
use centered values such as z-scores to recompute them. Are the residuals
normally distributed? If not, consider a transformation. Do the partial
residual plots provide evidence of nonlinear associations? Is there evidence
of heteroscedasticity? (If the plot is inconclusive visually, try White's test or
Glejser's test.) If yes, consider transforming a variable, weighted least
squares regression, or using Huber-White sandwich estimators. Is there
evidence of autocorrelation? Consider the source and try to correct for it. If
you have spatial data, a spatial regression model may be needed. Use
Prais-Winsten regression or time-series techniques for data collected over time, if
appropriate. Are there influential observations? If yes, consider their source.
Are there coding errors? Will a transformation help? If not, use a robust
regression technique to adjust for influential observations.
Interpret and present the results. Interpret the unstandardized slopes and p-values.
What do the goodness-of-fit statistics tell you about the model?
Compare the results to the guiding hypotheses. Given the decision rules, are the
hypotheses or the conceptual model supported by the analysis? Consider
presenting predicted values, especially from models that include interaction
terms. Consider graphical presentations of coefficients, nonlinear associations,
and interactions. These can provide intuitive information that is often lost when
presenting only numbers.
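
As a small illustration of the "estimate" and "assess" steps in the summary above, here is a minimal sketch of a simple linear regression with one explanatory variable. It is not the procedure of any particular statistics package: the data and the helper name `simple_linear_regression` are hypothetical, and a real analysis would add the diagnostics listed in the summary.

```python
def simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    ss_res = sum(e ** 2 for e in residuals)
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    r_squared = 1 - ss_res / ss_tot
    # Adjusted R^2 for one explanatory variable: penalizes the fitted slope.
    adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)
    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": r_squared,
        "adj_r_squared": adj_r_squared,
        "residuals": residuals,
    }

# Hypothetical data: hours studied and test score for six students.
hours = [1, 2, 3, 4, 5, 6]
score = [52, 55, 61, 64, 68, 74]

fit = simple_linear_regression(hours, score)
print(f"slope = {fit['slope']:.2f}, intercept = {fit['intercept']:.2f}")
print(f"R^2 = {fit['r_squared']:.3f}, adjusted R^2 = {fit['adj_r_squared']:.3f}")
```

The returned residuals are the raw material for the diagnostic plots mentioned above, such as a normal probability plot or a plot of residuals against predicted values.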
Descriptive statistics are summative methods to depict the data in succinct ways. I will
have you know it was very difficult to write a definition of descriptive statistics that did not
include the word 'descriptive' or 'describe.' My sixth grade teacher always told us never to
use the word we're defining in the definition.
Here is a list of descriptive statistics, and then we will move onto talking more about them.
Some of these you will find very familiar, and some may be new. Some may have wordings
that you aren't used to:

Mean

Median

Mode

Range

Standard deviation

Coefficient of variation

Mean
In statistics, mean simply means average score of the sample. This is where you add up all
your values and then divide by the number of participants. To expose you to some
additional terms, sum means simply to add up. Using sum may be new to you if you haven't
taken many math classes, but the term makes it easier to write and more educated
sounding.
A quick example would be:
Number of bites it takes me to eat a fun-sized candy bar: 4, 2, 1, 1, 4, 1, 2, 1.
The sum total is 16, which makes the mean 2. Fairly simple, right?
Mean is useful for helping us understand the average participant's score in your study. It
gives us, the readers, a quick way to formulate what a typical or normal variable is in your
study. This is most often used to describe the average age of the participants but could also
be used to describe the average scores on a test or number of years involved in something.

With this in mind, we can take an individual score and compare it to an average. For
example, if I said the average height of the islanders is 5 feet tall and I'm 6 foot 3, then we
know that in comparison, I am much taller than the average islander.

Median
Median is the middle score after the scores have been arranged in numerical order. For
example, if we looked at the candy bar eating numbers from before: 4, 1, 1, 1, 4, 2, 2, 1, we
will need to reorganize them into 4, 4, 2, 2, 1, 1, 1, 1. As you can see, I put the highest
number first, but you could put the lowest number first, and your median will be the same.
We have eight numbers, so we count halfway. Counting obviously works better with odd
numbers since you land on a single number. With even numbers, you will take the middle
two numbers and then average them. Our middle numbers are 2 and 1. This means our
median is 1.5.
Median is useful for similar reasons to mean: It provides the reader with an understanding
of an average, or normal, participant or measured variable. Median, however, reduces the
effect of outliers, or a point of data that is distant from the others, either extremely high or
extremely low. In our candy-eating example, if it took me 15 bites to eat a really chewy and
delicious candy bar, then this would be an outlier. If you add 15 to the scores, our new mean
would be 3.4, with our median at 2. Which better describes the data?

Mode
Mode, defined as the most often occurring value, is by far the easiest to compute,
particularly if you have scores in numerical order. In our candy-chewing example, the most
often occurring number is 1. Another way of thinking about mode is what is the most
common number.
But who cares? Well, you should. Why care? Because the candy bar example is not very good
when we're trying to make important decisions, but there are many studies out there where
individual descriptions matter.
For example, if we were looking at a school with financial issues, and the most common type
of staff is administration, what does that tell us? (That like almost everywhere in the world,
bureaucracy thrives so much so that it makes me crazy.) Or what if you were going to be a
senator, and you needed to know what ethnicity was most common in your district? So
there are a lot of ways that mode can help us describe the world we live in.

Range
Range is simply a single number representing the spread of the data. This is where you take
the highest number you have - in our candy example it's 4 - and subtract the smallest
number. In this example, it's 1. This gives us a total range of candy bar bites at 3.
Range is kind of a funny thing. It tells us how spread out the data is, and whether you want
a high or low range depends on the study. If you want a lot of variability, then you want a
high range. For example, if you're wondering whether a new teaching program educates both
intellectually high and low children, you should have a wider range. If you want little
variability, as in a study on people with severe depression, then you want the range to be
relatively small. It all depends on what you're looking at.
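
The candy bar numbers used in the sections above can be checked with Python's standard `statistics` module. This is only a quick sketch; the variable names are arbitrary.

```python
import statistics

# Bites per fun-sized candy bar, as listed in the Mean section.
bites = [4, 2, 1, 1, 4, 1, 2, 1]

mean_bites = statistics.mean(bites)
median_bites = statistics.median(bites)   # averages the two middle values
mode_bites = statistics.mode(bites)       # most often occurring value
range_bites = max(bites) - min(bites)

print(mean_bites, median_bites, mode_bites, range_bites)

# Add the 15-bite outlier from the Median section.
with_outlier = bites + [15]
print(f"mean = {statistics.mean(with_outlier):.1f}, "
      f"median = {statistics.median(with_outlier)}")
# prints "mean = 3.4, median = 2"
```

The single outlier pulls the mean from 2 up to about 3.4, while the median only moves from 1.5 to 2, which is why the median is often preferred when extreme values are present.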

Standard Deviation and Variance


Standard deviation is a number describing how spread out the data is around the mean; for
bell-curve (normally distributed) data, about two-thirds of the scores fall within one
standard deviation of the mean. Variance is also a numerical value indicating how spread out
the data is. Why are these two together? Because the standard deviation is the square root of
the variance.
Briefly, the math for computing these involves comparing each score with the average,
squaring the distance each score has from the average, and summing up those squared
distances. Dividing that sum by the number of scores minus one gives the sample variance,
and taking its square root gives the standard deviation. It's
complicated, I know.
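
For readers who want to see the arithmetic, here is a short worked sketch using the candy bar data from earlier. It assumes the sample formulas (dividing by the number of scores minus one) and also computes the coefficient of variation from the list of descriptive statistics, taken here as the standard deviation divided by the mean.

```python
import statistics

bites = [4, 2, 1, 1, 4, 1, 2, 1]

mean_bites = statistics.mean(bites)

# Step 1: distance of each score from the mean, squared.
squared_deviations = [(x - mean_bites) ** 2 for x in bites]

# Step 2: sum the squared distances and divide by n - 1 (sample variance).
sample_variance = sum(squared_deviations) / (len(bites) - 1)

# Step 3: the standard deviation is the square root of the variance.
sample_sd = sample_variance ** 0.5

# Coefficient of variation: spread relative to the size of the mean.
coefficient_of_variation = sample_sd / mean_bites

print(f"variance = {sample_variance:.2f}, sd = {sample_sd:.2f}, "
      f"cv = {coefficient_of_variation:.2f}")
```

The squared distances sum to 12, so the sample variance is 12 / 7, or about 1.71, and the standard deviation is about 1.31. The built-in `statistics.variance` and `statistics.stdev` functions give the same results.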

Tables: advantages and disadvantages


Advantages: It's easy to see the relationship between the input value and the
output value. If you are given a few pairs of numbers (the inputs and their
corresponding output values), you can look for a common pattern between
the pairs, i.e. how you get from the input value to the output value.
Disadvantages: It doesn't click with the more visually-oriented audience. You can
only readily see increasing and decreasing values as the x-value increments
uniformly (aka, the x-values are evenly spaced apart). It's tough to find, say, the
y-intercept if the x-value 0 and its pair is not given, or the slope. You have to
run a series of tests to determine what kind of function the relationship between
x and y is (whether it's linear, exponential, quadratic, etc.).
