Using The Chi Square Test

In a perfect world, without the complexities of statistics, we would expect a "large
family" of 160 offspring from our dihybrid cross to have 90 plants that produce green,
inflated pods; 30 plants that produce green, constricted pods; 30 plants that produce
yellow, inflated pods and 10 plants that produce yellow, constricted pods.
Sadly, we do not live in such a simple world and statistics come into play. The ratios
might be slightly "skewed" - a math term for "off" - due to the number of individuals you
collected or just "bad luck". The ratios we expect from a dihybrid cross are not always
what we get in the experiment. One way to combat the problem of statistics is to use
statistics!
Let's take a step back and look at some of Mendel's original work with monohybrids
because they are an easier place to start. You will recall that Mendel did a lot of
monohybrid experiments and collected a lot of data from a lot of plants and crosses.
Here's one of those sets of data that I showed you earlier.
P = smooth seeds crossed with wrinkled seeds
F1 = all smooth seeds (so smooth is dominant and wrinkled is recessive)
F2 = 5,474 smooth seeds and 1,850 wrinkled seeds is a ratio of 2.96 : 1
Mendel and the Punnett square tell us that we should have a ratio of 3 : 1 not 2.96 : 1! So
is Mendel wrong? Is the Punnett square wrong? Is our entire understanding wrong?!
That depends upon how different the actual, observed numbers are from the calculated,
expected numbers. But how close is close enough? Is 2.96 : 1 close enough to 3 : 1 that
we should accept Mendel's ideas? Some folks would argue, "Well, Mendel says it should
be 3 : 1 and it is not 3 : 1, so Mendel is wrong!" But someone else would argue, "Hey,
lighten up! I think 2.96 : 1 is close enough to 3 : 1 so I will not reject Mendel's ideas."
Mendel wasn't bothered by the fact that his data was a little off because he judged that,
statistically speaking, he was "within bounds". But what are those bounds and how do
you calculate them? That's where math comes in: a statistical test can show whether
tiny differences like these are significant enough to throw the whole idea away.
The test we use today is the chi-square (abbreviated χ²) test. It was worked out after
Mendel's time, and it is the one you will use.

The chi-square test, or simply the "chi-square", measures the significance of the data in
comparison with what you expect to get. This statistical test is useful in many problems
in genetics and other sciences, so it is important that you learn how to "do the
chi-square". The beauty of the χ² is that it only requires that you know the number of
individuals observed in each category and what numbers you expected them to be.
Actually, χ² is pretty easy and to some folks it's even obvious!

I will walk you through the entire process shortly but first let me tell you what we are
going to do. [Some folks find this "word version" a nice introduction to the "math
version", while others say it just scares them! I hope it doesn't scare you. Please read it
as an introduction to the ideas because you will see the math soon enough.]
In chi-square analysis you compare the number of individuals of a certain phenotype (or
anything else) that you have found in the experiment to the number you expected to have.
That is, you find the difference between the observed and the expected by simply
subtracting one from the other. [It doesn't matter which one you subtract from which - all
you want is the difference.]
Then, just to make that difference bigger and to make it always a positive number, you
square it (multiply it by itself). That gives you the "squared difference".
Then you divide the "squared difference" by what you expected in the first place in order
to give you a "squared difference per expected" for that group. [This step brings the
numbers into a reasonable zone to work with but the reason you do it has to do with the
theory of statistics and I won't go into that!]
Naturally, you have to take into account all the different types, and you do that by adding
together these "squared differences per expected" values. The final sum (of the "squared
differences per expected") gives you a number called the χ². (Scared yet?)
By the way, chi (χ) is the Greek letter for "c" which mathematicians often use as an
abbreviation for "comparisons". The chi-square (χ²) is a "comparison squared".
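If you like to see ideas as code, here is a minimal sketch of the "word version" above. The function name chi_square and the tiny counts are my own, made up only to keep the arithmetic easy to follow. They are not data from any experiment.

```python
def chi_square(observed, expected):
    """Add up (observed - expected) squared, divided by expected, per category."""
    total = 0.0
    for o, e in zip(observed, expected):
        difference = o - e           # which way you subtract does not matter once squared
        squared = difference ** 2    # the "squared difference", always positive
        total += squared / e         # the "squared difference per expected"
    return total


# Hypothetical counts: 12 and 8 observed where 10 and 10 were expected.
print(chi_square([12, 8], [10, 10]))  # 0.8
```

Each line of the loop is one sentence of the description: subtract, square, divide by the expected, add to the running sum.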
OK, let's look at those F2s again.

F2 = 5,474 smooth seeds and 1,850 wrinkled seeds for a ratio of 2.96 : 1
That ratio is convenient but we don't use the ratio in our calculations of χ². Instead we use
the "raw numbers" of the data and compare it to the "raw numbers" we expected.
So, if Mendel's 3 : 1 ratio is correct, how many smooth seeds would you expect and how
many wrinkled seeds would you expect in the F2s?
Grab your calculator, a pencil and a sheet of paper because here we go!
Step 1: calculate the EXPECTED number of each type.

To do that you must first add together both seed types and when you do that you will see
that you start with a total population of 7,324 seeds. [That's 5,474 smooth seeds plus
1,850 wrinkled seeds equals 7,324 seeds total. 5,474 + 1,850 = 7,324 but don't take my
word for it - check it with your calculator!]
Of those 7,324 seeds you expected a quarter of them (1 in 4) to be wrinkled. [That's
because the wrinkled seeds were the "1" in the 3 : 1 ratio and a 3 : 1 ratio is represented
as fractions of ¾ and ¼.]
So, how many of those 7,324 seeds should be wrinkled?
Just divide 7,324 by 4 (the same as multiplying 7,324 by ¼) to get 1,831 wrinkled seeds
expected.
Notice that Mendel got 1,850 wrinkled seeds in the experiment. Is that significant? We'll
see. (That's what the chi-square is all about!)
Now, how many smooth seeds should you expect from the total of 7,324 seeds?
Well, you expect three times as many smooth as wrinkled so simply multiply 1,831 by 3
to get 5,493. That means you expected to get 5,493 smooth seeds.
Does that make sense? Let's check it to make sure. We expected 5,493 smooth seeds and
1,831 wrinkled seeds. That's a total of 7,324 seeds and that is exactly the total number we
are working with. Another way to check your math is to notice that 1,831/7,324 = 0.25
(which is exactly ¼), the correct fraction of wrinkled seeds expected in the total
population. [You can check it again by looking at the fraction of smooth seeds. That's
5,493/7,324 = 0.75, which is ¾, the fraction of smooth seeds expected.]
By the way, sometimes you will get a fraction and you might think "I've gone wrong -
you cannot have a fraction of a plant!". Well, you are right that you cannot have a
fraction of an individual but when we do this mathematical analysis it is acceptable to get
fractions.
OK, step one is complete!
We expected 5,493 smooth seeds and 1,831 wrinkled seeds.
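Step 1 can be sketched in a few lines of code. The helper name expected_counts is my own invention. It simply splits a total according to whatever ratio you hand it.

```python
def expected_counts(total, ratio):
    """Split a total into expected counts that follow the given ratio."""
    parts = sum(ratio)                       # a 3 : 1 ratio has 4 parts
    return [total * r / parts for r in ratio]


observed = [5474, 1850]                      # smooth, wrinkled (Mendel's F2 data)
total = sum(observed)
smooth, wrinkled = expected_counts(total, [3, 1])
print(total, smooth, wrinkled)               # 7324 5493.0 1831.0
```

Notice that the expected numbers always add back up to the total, which is the same check you did by hand.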
Step 2: calculate the "SQUARE OF THE DIFFERENCE PER EXPECTED".

Let's do the smooth seeds first. We observed 5,474 but expected 5,493. That's a
DIFFERENCE of 19. [5,474 - 5,493 = -19, but we can ignore the minus sign.]
Next we SQUARE THE DIFFERENCE. 19 x 19 (or 19²) = 361.
Next we find the SQUARE OF THE DIFFERENCE PER EXPECTED by dividing that
number (361) by the total number of smooth seeds we expected to see (5,493). That is
361/5,493 = 0.066 (rounded to three decimal places is good enough for us). So the
"squared differences per expected" of the smooth seeds = 0.066.
Let's do the wrinkled the same way. We observed 1,850 wrinkled seeds but expected
1,831. That's a difference of 19 (1,850 - 1,831 = 19) again. [It doesn't always work that
way, unless you are working with an experiment with only two outcomes, like this one. A
dihybrid cross has four outcomes and is more complicated.] So you square the difference
to get 361. You then divide it by the expected number of wrinkled seeds (NOT the
expected number of smooth seeds - a common mistake) so that is 361/1,831 = 0.197
(rounded to three decimal places, like before).
Step 3: congratulate yourself for having gotten through the toughest part!
Step 4: SUM (ADD up) the "squared differences per expected" from all the categories.
In this case there are only two categories so there are only two values to add. Add the
value you calculated for the smooth (0.066) to the value you calculated for the wrinkled
(0.197) to get 0.263.
This experiment has a chi-square equal to 0.263 (χ² = 0.263).
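Steps 2 through 4 can be checked with a short sketch like this one. The variable names are my own. The numbers are the observed and expected counts worked out above.

```python
observed = {"smooth": 5474, "wrinkled": 1850}
expected = {"smooth": 5493, "wrinkled": 1831}

terms = {}
for name in observed:
    difference = observed[name] - expected[name]
    terms[name] = difference ** 2 / expected[name]   # divide by THIS category's expected
    print(name, difference, difference ** 2, round(terms[name], 3))

chi2 = sum(terms.values())
print("chi-square =", round(chi2, 3))                # chi-square = 0.263
```

The two printed rows show the difference, the squared difference and the "squared difference per expected" for each seed type, so you can compare them with your calculator work.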

There. That wasn't too difficult, was it? You found the chi-square value for this set of
data. Now all we have to do is ...
Step 5: COMPARE our chi-square value to the value in a chi-square significance table and
determine if our value is significant.
The chi-square significance table has been developed by statisticians. These tables come
in all shapes and sizes depending upon how exact you want to be and how many
categories you are dealing with. For our work we want to know if these results pass a
significance level of 5%. (This is a fairly good level of significance and is often used as a
"cut-off" in experiments like these.)
Chi Square Significance Table
Degrees of Freedom    5% Significance Levels
1                     3.84
2                     5.99
3                     7.81
4                     9.49
OK, what does this mean? What is this "degrees of freedom"
stuff?
The simple answer is that your degrees of freedom are one less than the number of
categories you have to work with. [The complicated answer is that degrees of freedom
are the number of values that can be randomly assigned while the total is left unchanged.
Don't worry about it.]
We have two categories, smooth and wrinkled, so we have one degree of freedom and
you see from this table that with one degree of freedom we could be allowed a chi-square
as large as 3.84 and the results would still be within bounds at the 5% level. That is, we
would have to get a chi-square value over 3.84 before we would say that our results were
so far from a 3 : 1 ratio that we would have to reject that ratio (and Mendel's explanation
of how he got that ratio). Or, to put that another way, if the 3 : 1 ratio really is correct,
luck alone would push the chi-square over 3.84 only 5% of the time. Our χ² = 0.263 is
nowhere near that cut-off, so we have no reason to reject the 3 : 1 ratio in this experiment.
The chi-square is a kind of "mathematical judge" of probabilities.

There are other "mathematical judges" used in other areas of science but the chi-square is
the only one we will use in this course.
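Step 5 is just a table lookup followed by a comparison, which a sketch like this makes plain. The names CRITICAL_5_PERCENT and fits_ratio are my own. The four cut-off values are copied from the table in this lesson, and 4.0 is a made-up chi-square used only to show the other outcome.

```python
# 5% cut-off values copied from the significance table in this lesson.
CRITICAL_5_PERCENT = {1: 3.84, 2: 5.99, 3: 7.81, 4: 9.49}


def fits_ratio(chi2, categories):
    """True when chi2 does not exceed the 5% cut-off for this many categories."""
    degrees_of_freedom = categories - 1      # one less than the number of categories
    return chi2 <= CRITICAL_5_PERCENT[degrees_of_freedom]


print(fits_ratio(0.263, 2))   # True: no reason to reject the 3 : 1 ratio
print(fits_ratio(4.0, 2))     # False: a hypothetical value over 3.84
```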
Let's think a bit more about the chi-square and what it has told us.
Imagine that Mendel's work had come out with exactly the ratio we expected. That is,
imagine Mendel observed in this experiment 5,493 smooth seeds and 1,831 wrinkled
seeds. That is an exact 3 : 1 ratio.
Let's do a quick chi-square on that imaginary result.
Looking first at the smooth seeds we would see that the difference between the observed
and expected is zero! (That's because 5,493 - 5,493 = 0.) When we square zero we still
get zero. If we divide zero by the expected value we get zero!
The same happens when we calculate the values for the wrinkled seeds too.
Now we would add those two values together (because they are the "squared differences
per expected") to get a final χ² = 0.
In other words, when the chi-square equals zero the experimental results are in exactly
the ratio expected! [This rarely happens.]
Conversely, the farther the chi-square gets from zero the less likely the ratio "rule" is
being followed. If the chi-square had been 1.9 (instead of 0.263) we would have been less
confident but still within the 5% significance range. (Right?)
Degrees of Freedom    5% Significance Levels
1                     3.84
2                     5.99
3                     7.81
4                     9.49
As a matter of fact, we could have gotten a chi-square value as high as 3.84 and still feel
that we were close enough to the 3 : 1 ratio to not be worried. A chi-square of 3.84 or more
turns up by chance (by "accident") only 5% of the time when the 3 : 1 ratio really applies.
But if our chi-square value was larger than 3.84 we would be drifting into uncertainty. If
the chi-square were 13.4 we would not feel at all comfortable and would have a good
reason to suspect that the 3 : 1 ratio did not apply. With a larger and more detailed table
we could even see to what level our confidence had dropped!
Here's another set of results from Mendel's monohybrid cross experiments. Let's do the
chi-square analysis of it.
Here I'll condense the "steps". You'll see it flows a little bit better and there is less "hand
holding" or explanation.
P = green seeds crossed with yellow seeds
F1 = all yellow seeds (So which color is dominant? I hope you agree that yellow
dominates green seeds.)
F2 = 6,022 yellow seeds and 2,001 green seeds
Is this close enough to the 3 : 1 ratio we expect?
First, calculate the expected number of each type.

You have a total of 8,023 seeds (6,022 yellow + 2,001 green = 8,023).
The green seeds should (are expected to) make up a quarter of that population, so
dividing 8,023 by 4 gives you 2,005.75. That means you expected 2,005.75 green seeds. [It's
nonsense to think in terms of ¾ of a seed but for the chi-square it's OK to continue with
these silly fractions.]
The yellow seeds should make up the rest of the sample so we can find their number by
subtracting 2,005.75 from the total 8,023 to get 6,017.25. [Or you could have multiplied
8,023 by ¾ and get the same number. It's a good idea to try it both ways to make sure you
haven't made an error.]
So, from the total of 8,023 seeds you expected 6,017.25 to be yellow and 2,005.75 to be
green.
Second, calculate the "squared differences per expected".

Let's do the greens first. You expected 2,005.75 but observed 2001 and that is a
difference of 4.75 (2,005.75 - 2,001 = 4.75). When you square that number you get 22.56.
When you divide it by the number you expected (2,005.75) you get 0.011 (to three
decimal places).
Now the yellows. You expected 6,017.25 but observed 6,022 and that is a difference of
4.75 (6,017.25 - 6,022 = -4.75, but we can ignore the sign). When you square that number
you get 22.56. Now divide it (22.56) by the number you expected (6,017.25) to
get 0.004 (to three decimal places).
Notice that, because we are working with only two categories, the "squared differences"
are the same in both groups (22.56) because the differences are the same. (They MUST be
the same if there are only two groups! Think about it.) However, the "squared differences
per expected" are different because we have different expectations for the two groups
(6,017.25 to be yellow but 2,005.75 to be green) so we divide by different numbers. I point
this out because it can be used to highlight two of the most common mistakes in doing the
chi-square. Your "squared differences" in an experiment with only two categories (one
degree of freedom) must be the same - if they are not you made a math error. However, it
is very unlikely that your "squared differences per expected" are the same unless you
expected the same number for each group (a 1: 1 ratio) or you made the common mistake
of dividing both groups by the same number. Watch your numbers and pay attention.
Third, sum (add up) the "squared differences per expected" from all the categories.
That's 0.011 + 0.004 = 0.015 so your χ² = 0.015.
Wow, that's even better than before but let's look at the table just to make sure. We are
still working with only one degree of freedom. (Right?)
Degrees of Freedom    5% Significance Levels
1                     3.84
2                     5.99
3                     7.81
4                     9.49
Obviously, the ratio observed in this experiment (6,022 yellow : 2,001 green or
3.01 : 1) is not so far off from the 3 : 1 ratio as to cause concern.
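Here is a sketch of the yellow and green example that also tests the two "common mistake" checks described above. The variable names are my own, and the two assert lines are just one way of writing down those checks.

```python
observed = [6022, 2001]                        # yellow, green
total = sum(observed)                          # 8023
expected = [total * 3 / 4, total * 1 / 4]      # 6017.25 and 2005.75

squared = [(o - e) ** 2 for o, e in zip(observed, expected)]
per_expected = [s / e for s, e in zip(squared, expected)]

# With only two categories the squared differences must be the same.
assert squared[0] == squared[1]
# Dividing by different expected numbers gives different values.
assert per_expected[0] != per_expected[1]

chi2 = sum(per_expected)
print(round(chi2, 3))                          # 0.015
```

If the first assert ever fails on a two-category problem, the expected numbers do not add up to the total and there is a math error somewhere.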
Perhaps you found it difficult to follow through all those steps without a simple
"formula". This is a good time to present the formula in order to show you what you have
been doing and to help you in the future.
χ² = Σ [(O - E)²/E]
"O" is the number observed and "E" is the number expected.
The part within the brackets, (O - E)²/E, is the procedure you use to find the difference
(O - E), then square it, (O - E)², and then divide by the number expected, (O - E)²/E. That's
what you do for all categories (smooth and wrinkled, green and yellow, etc.).
The symbol "Σ" is called "sigma" and is used throughout math to mean "sum". Here it
tells you to add together (sum) the values you calculated for each category.
Some people enjoy equations and some people are panicked by them! Try to get used to
understanding and using this chi-square equation. I will not expect you to memorize the
equation, but I will expect you to "do the chi-square" and this formula will be useful to
help you through all those steps.
Let's do another chi-square (Ugh!) with some other values.

Let's assume the results of a cross were
3,087 yellow seeds and 2,937 green seeds.
Is this close enough to a 3 : 1 ratio? (It doesn't look like it to me but let's do the chi-square
to find out.)
What would be the ratio if it were exactly 3 : 1?

There are 6,024 seeds in total. (That's 3,087 + 2,937 = 6,024 total.)
If the 3 : 1 ratio applies then one quarter of them should be green. That means 1,506
should be green (6,024/4 = 1,506) and the rest 4,518 should be yellow. (That's also 6,024
x ¾ = 4,518 yellows.)
OK. Let's do the greens first. That's (O - E)²/E = (2,937 - 1,506)²/1,506 = 1359.7 [Run
that through your calculator to be sure you can do it.]
The yellows will be (O - E)²/E = (3,087 - 4,518)²/4,518 = 453.2.
Now add them together (that's what Σ means) to get a χ² = 1812.9.
Degrees of Freedom    5% Significance Levels
1                     3.84
2                     5.99
3                     7.81
4                     9.49
Wow! Our calculated chi-square shows that these experimental results are well outside
the acceptable range for a 3 : 1 ratio so we reject the idea that these results represent a
3 : 1 ratio. These offspring are NOT the result of a monohybrid cross or Mendel was wrong!
Hmmm. Look at that data again.

3,087 yellow seeds and 2,937 green seeds
Are they close to any other ratios you've seen? What ratio are they close to and how
would you test the ratio to see if it is close enough?
I hope you decided that the ratio of 3,087 yellow seeds and 2,937 green seeds is close to a
1 : 1 ratio. Actually the observed ratio is 1.05 : 1 but is that close enough? Maybe. Maybe
not. Whenever you are confronted with a problem asking you if the ratio is close enough,
think about using the chi-square.
You can use the chi-square to test ANY ratios you have in mind. All you need to have is
a hunch of what the ratio should be and then use chi-square to see if the experimental
data is close enough. So, let's do another chi-square on that same data (3,087 yellow
seeds and 2,937 green seeds) and see if it is close enough to a 1 : 1 ratio!
First, what would be the "perfect" 1 : 1 ratio among this group of seeds?
There is a total of 6,024 seeds (still) so a 1 : 1 ratio should show us 3,012 yellow seeds and
3,012 green seeds.
OK, that's what we expected. Now let's do the chi-square.
Let's do the greens first. That's (O - E)²/E = (2,937 - 3,012)²/3,012 = 1.8675.
The yellows will be (O - E)²/E = (3,087 - 3,012)²/3,012 = 1.8675 (again).
[In this example the "squared differences per expected" should be the same because here
you expect the same number in each group of this two category puzzle. ONLY when
you have a 1 : 1 ratio to test on a two category chi-square will you get equal "squared
differences per expected".]
Adding them together gives me the χ² = 3.735 and I compare that with the values in the
table.
Degrees of Freedom    5% Significance Levels
1                     3.84
2                     5.99
3                     7.81
4                     9.49
I see that my new χ², using a 1 : 1 ratio, is low enough to be within the range of
significance. (This is less than 3.84.) Therefore, the results of this experiment are far from
being 3 : 1. I think they are really 1 : 1!
Surprised? Well, you shouldn't be. First off, the observed ratios look closer to 1 : 1 than
to 3 : 1. (Right?) Second, I didn't tell you that this experiment was an F2 population of
seeds. (Did I?) Indeed, I made these numbers up to represent the results you would get
from a test cross where the unknown genotype turns out to be heterozygous. In other
words, this is an acceptable ratio if one parent was ss and the other parent was Ss.
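The two tests on the 3,087 : 2,937 data can be run side by side, which shows how the same numbers are judged against any ratio you have a hunch about. This is only a sketch. The function name chi_square_for_ratio is my own and the 3.84 cut-off is the one-degree-of-freedom value from the table.

```python
def chi_square_for_ratio(observed, ratio):
    """Chi-square of the observed counts against an expected ratio."""
    total = sum(observed)
    parts = sum(ratio)
    expected = [total * r / parts for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [3087, 2937]    # yellow, green
CUTOFF = 3.84              # 5% level, one degree of freedom

for ratio in ([3, 1], [1, 1]):
    chi2 = chi_square_for_ratio(observed, ratio)
    verdict = "reject" if chi2 > CUTOFF else "do not reject"
    print(ratio, verdict)
```

The first line printed is the 3 : 1 test (reject) and the second is the 1 : 1 test (do not reject).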
The chi-square is used whenever you want to compare the observed results to the ones
you would expect from a certain ratio. That ratio could be 1 : 1 or 3 : 1 or even (oh, no!)
9 : 3 : 3 : 1. That's right! You can use the chi-square to determine if a dihybrid cross is
producing offspring in acceptable ratios. Note, however, that with four categories (instead
of two) you have three degrees of freedom and twice as many calculations to do.
Do the Chi-Square Workshop (Workshop Three) now and then do the SAQs for this
lesson so you will get plenty of practice with the chi-square.
Chi-square of a monohybrid cross as a "walk through"

Mendel's data from one experiment was ...
P = smooth seeds crossed with wrinkled seeds
F1 = all smooth seeds (so smooth is dominant and wrinkled is recessive)
F2 = 5,474 smooth seeds and 1,850 wrinkled seeds
1. What ratio did he observe?
5474 / 1850 = 2.9589189 : 1 = 2.96 : 1
2. What ratio did he expect?
3:1
You should understand that the chi-square compares the NUMBER (not ratio) observed
to the NUMBER (not ratio) expected. You are given the observed numbers and from that
data you might guess what the ratio should be. You then use that "guessed" ratio to
calculate what the expected numbers would be.
Calculating the expected number is critical to doing the chi-square and many students
have trouble with that first step - they forget how to do it, use it backwards or don't do it
at all!
Let's work through this important step together so you will understand that logic.
You already know the number observed.
Smooth = 5474
Wrinkled = 1850
3. What is the total number of seeds?
7324
4. What number of wrinkled is expected?
7324 / 4 = 1831
5. What number of smooth is expected?
1831 X 3 = 5493 or 7324 X 3/4 = 5493
OK, you now have the expected numbers calculated from the expected ratio.
The best (easiest) way to COMPARE two values is to find their DIFFERENCE (by
SUBTRACTION).
6. What is the difference between observed and expected smooth?
5474 - 5493 = -19
7. What is the difference between observed and expected wrinkled?
1850 - 1831 = 19
For "statistical magnification" we INCREASE those differences by squaring them.
8. What is the square of the difference between the observed and expected smooth?
(-19)² = 361 or -19 X -19 = 361
9. What is the square of the difference between the observed and expected wrinkled?
19² = 361 or 19 X 19 = 361
These "square of the differences" are too large and must be "NORMALISED" by
dividing each by the number EXPECTED (NOT the number observed). This could be
called the "squared differences per expected".
10. What is the square of the difference between the observed and expected smooth,
divided by the expected number of smooth?
361 / 5493 = 0.06572 = 0.066
11. What is the square of the difference between the observed and expected wrinkled,
divided by the expected number of wrinkled?
361 / 1831 = 0.19716 = 0.197
Lastly, we add together these "squared differences per expected" to give us the TOTAL
"squared differences per expected".
12. What is the sum of the "squared differences per expected"?
0.066 + 0.197 = 0.263, so the χ² = 0.263
Therefore, the chi-square for this experiment is χ² = 0.263.
OK - so what?
Statisticians have developed chi-square tables, based upon the probabilities that a
particular chi-square value will come about purely by chance. There are two "features" to
consider.
A. Significance Level….
We (scientists) like to use the level of 5% as our significant "cut-off". Any chi-square
larger than the value from the 5% Table indicates an experiment in which the ratios
observed are so far off the ratios expected that we have to conclude that the ratios
expected are wrong!
B. Degrees of Freedom…
The more "classes" (categories) there are, the more "squared differences per expected" get
added into the chi-square, so the acceptable limit of the chi-square goes up with the
number of classes. The "degrees of freedom" are one less than the number of classes.
13. Name all the different classes in the experiment (earlier)…..
Smooth and Wrinkled
14. How many degrees of freedom were in that experiment?
2-1=1
One degree of freedom.
Here's a portion of the Chi Square Significance Table.
Degrees of Freedom    5% Significance Levels
1                     3.84
2                     5.99
3                     7.81
4                     9.49
15. Is the chi-square you calculated within the boundary of "the possible"?
Yes! We calculated a χ² = 0.263. With one degree of freedom we could have a chi-square
up to 3.84 before we would become suspicious that the observed data was in a ratio too
far removed from the ratio we tested.
Chi-square of a monohybrid cross as a quick table

When doing a Chi-square it helps to set it up as a table and to understand that all we have
been doing is represented by the equation χ² = Σ [(O - E)²/E]
Consider these results among the F2s

4,400 yellow seeds
1,624 green seeds
First, set up a table like the one below

Phenotypes   O      E      O-E    (O-E)²   (O-E)²/E
Yellow
Green
Total
Second, enter the data. Remember, data is what is observed. So data goes in the
"observed" (O) column.
Phenotypes   O      E      O-E    (O-E)²   (O-E)²/E
Yellow       4400
Green        1624
Total        6024
Next you fill in the "expected" (E) column. Using the total as a starting point divide that
number into the two sets of data that would produce the 3 to 1 ratio you expect.
Note that it might be easier to do the 1 (green) of the 3 :1 ratio first. However, if you are
comfortable with fractions it shouldn't be too hard to do them in any order.
Phenotypes   O      E                   O-E    (O-E)²   (O-E)²/E
Yellow       4400   6024 X 3/4 = 4518
Green        1624   6024 X 1/4 = 1506
Total        6024   6024

Notice that the total expected is the same as the total observed. If they don't add up to the
same number you have made an error in the math.
Now fill in the rest of the table. It's a lot of work but, now that you have it all organized,
it should be just a matter of using your calculator correctly. There is no reason to "total"
columns O-E or (O-E)² so leave them blank. However, it is very important to complete the
"total" in the last column, (O-E)²/E, because that is the chi-square!
Fill in the rest of the table.
Phenotypes   O      E                   O-E                  (O-E)²             (O-E)²/E
Yellow       4400   6024 X 3/4 = 4518   4400 - 4518 = -118   (-118)² = 13,924   13,924 / 4518 = 3.08
Green        1624   6024 X 1/4 = 1506   1624 - 1506 = 118    118² = 13,924      13,924 / 1506 = 9.25
Total        6024   6024                                                        12.33
Is the chi-square you calculated here within the boundary of "the possible"?
(To answer that, first go back to the Chi Square Significance Table you saw earlier. Then
page back down to here.)
NO! χ² = 12.33 but, with one degree of freedom we cannot accept any ratio that gives us
a chi-square larger than 3.84.
Do we accept that these results are within acceptable range of a 3 : 1 ratio?
No! We must reject the 3 : 1 ratio. This data is far off the 3 : 1 ratio.
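The quick table can also be built by a short program, one row per phenotype and one column per step of the equation. This is a sketch under my own choice of names and column widths. It keeps full precision while adding up, so the last digit of the total can differ slightly from a hand calculation that rounds each row first.

```python
rows = [("Yellow", 4400, 3), ("Green", 1624, 1)]   # phenotype, observed, part of the 3 : 1 ratio
total = sum(o for _, o, _ in rows)                 # 6024
parts = sum(r for _, _, r in rows)                 # 4

print(f"{'Phenotype':<10}{'O':>7}{'E':>7}{'O-E':>7}{'(O-E)^2':>10}{'(O-E)^2/E':>11}")
chi2 = 0.0
for name, o, r in rows:
    e = total * r / parts                          # expected number for this row
    term = (o - e) ** 2 / e                        # last column of the table
    chi2 += term
    print(f"{name:<10}{o:>7}{e:>7.0f}{o - e:>7.0f}{(o - e) ** 2:>10.0f}{term:>11.2f}")
print(f"{'Total':<10}{total:>7}{total:>7}{'':>17}{chi2:>11.2f}")
```

Only the O, E and last columns get a total, just as in the hand-drawn table.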
Chi-square of a dihybrid cross as a quick table

Consider these results from a dihybrid cross
30 red tall
65 white tall
83 red short
206 white short
Before we dive into the chi-square we have to first determine what ratio we will test and
which category (class) fits with each part of the ratio.
Based upon these numbers, which phenotypes are dominant and recessive for the two
loci? (Remember, these are the F2s from a dihybrid cross so they should be close to a
specific ratio that you learned earlier. And you also learned which traits end up in each
part of that ratio.)
Also, as best you can, assign genotypes to these phenotypes.
A dihybrid cross should produce a 9 : 3 : 3 :1 ratio in the F2s and a simple look at the
numbers will give you an idea of which belongs to each category.
The biggest group is the white shorts so they must be the doubly dominant class. In other
words, white shorts can be assigned the genotype W-S-.
On the opposite end of the ratio, the least represented group, would be the doubly
recessive so the red talls are the "1" in the 9 : 3 : 3 :1 ratio and have the genotype wwss.
You can deduce the other two classes, making up the "3" in the ratio. The white talls have
the genotype W-ss and the red shorts are wwS-.
Now that you have identified each category and assigned it to the ratio, we can begin the
chi-square to determine if it fits.
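The "simple look at the numbers" used above amounts to sorting the classes from biggest to smallest and lining them up with 9, 3, 3 and 1. Here is a sketch of that idea, with names of my own choosing. Treat it only as a first guess, because it is the chi-square that decides whether the guess is acceptable.

```python
observed = {"red tall": 30, "white tall": 65, "red short": 83, "white short": 206}
ratio_parts = [9, 3, 3, 1]     # already in descending order

# The largest class pairs with the "9" and the smallest with the "1".
ranked = sorted(observed.items(), key=lambda item: item[1], reverse=True)
for (phenotype, count), part in zip(ranked, ratio_parts):
    print(part, phenotype, count)
```

The two middle classes both pair with a "3", so their order does not change the expected numbers.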
Let's begin by first arranging our computation table. It will be twice the size of the
previous table. It might help to arrange them in the table in a descending order to
represent the 9 : 3 : 3 : 1 ratio. Draw the appropriate table including the observed
numbers.
Phenotypes               O     E     O-E    (O-E)²   (O-E)²/E
White and short (W-S-)   206
Red and short (wwS-)     83
White and tall (W-ss)    65
Red and tall (wwss)      30
Total                    384
Great! We are ready to start. First determine the "expecteds". It might be easier to do the
"1" part of the ratio first and work up the table. [The 9 : 3 : 3 : 1 ratio has 9 + 3 + 3 + 1 = 16
parts, so one part is 384 / 16 = 24.] Regardless, take your time and calculate
what the expected numbers should be and fill in the "E" column.
Phenotypes               O     E                O-E    (O-E)²   (O-E)²/E
White and short (W-S-)   206   24 X 9 = 216
Red and short (wwS-)     83    24 X 3 = 72
White and tall (W-ss)    65    24 X 3 = 72
Red and tall (wwss)      30    24 X 1 = 24
Total                    384   384
I hope you were able to work through that and get these numbers too. Did you check your
math by adding up the column to make sure the E column equals the O column?
Now it is time to fill in the rest of the table and calculate the chi-square.
Go ahead and complete the calculations before paging down.
Phenotypes               O     E                O-E               (O-E)²         (O-E)²/E
White and short (W-S-)   206   24 X 9 = 216     206 - 216 = -10   (-10)² = 100   100 / 216 = 0.463
Red and short (wwS-)     83    24 X 3 = 72      83 - 72 = 11      11² = 121      121 / 72 = 1.681
White and tall (W-ss)    65    24 X 3 = 72      65 - 72 = -7      (-7)² = 49     49 / 72 = 0.681
Red and tall (wwss)      30    24 X 1 = 24      30 - 24 = 6       6² = 36        36 / 24 = 1.500
Total                    384   384                                               4.325
Did you get 4.325 for the answer?
If you didn't, look over my answer and figure out where you went wrong - and try to
learn from your error so you can do it right next time. [A common mistake occurs in the
last column - many students divide by either the observed or by some other expected
number. Remember to always divide by the expected number for that category.]
OK, you have calculated the chi-square and it is now time to do something with it.
Here's a portion of the Chi Square Significance Table.
Degrees of Freedom    5% Significance Levels
1                     3.84
2                     5.99
3                     7.81
4                     9.49
How many "classes" (categories, groups) are in this experiment?
Four (Red and tall, White and tall, Red and short, White and short)
Some students get through the difficult chi-square but then make a simple mistake at this
point. Some get confused and pick a number out of the ratio and say there are nine classes!
Or three. Or some other number and I cannot figure out where it came from. So, just to
keep yourself thinking clearly, it is smart to list the categories.
Now, how many degrees of freedom are in this experiment?
Three (4 - 1)
Degrees of Freedom    5% Significance Levels
1                     3.84
2                     5.99
3                     7.81
4                     9.49
Does the 9 : 3 : 3 : 1 ratio fit the data?
Yes! With three degrees of freedom you can have a chi-square as large as 7.81 before we
would be beyond our 5% significance.
Notice that if you had been so foolish as to stick with the one degree of freedom (that we
were using with the monohybrid crosses) you would have decided that the chi-square was
too large and would have (WRONGLY) rejected the ratio!
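The whole dihybrid calculation, including the degrees-of-freedom trap just described, fits in a short sketch. The variable names are my own. Because the code adds up unrounded values it gets about 4.324 rather than the 4.325 you get by adding the rounded values in the table, which makes no difference to the conclusion.

```python
observed = [206, 83, 65, 30]       # W-S-, wwS-, W-ss, wwss
ratio = [9, 3, 3, 1]
total = sum(observed)              # 384
expected = [total * r / sum(ratio) for r in ratio]     # 216, 72, 72, 24

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
degrees_of_freedom = len(observed) - 1                 # four classes, so three

CRITICAL_5_PERCENT = {1: 3.84, 2: 5.99, 3: 7.81, 4: 9.49}
print(chi2 <= CRITICAL_5_PERCENT[degrees_of_freedom])  # True: the 9 : 3 : 3 : 1 ratio fits
print(chi2 <= CRITICAL_5_PERCENT[1])                   # False: wrong row of the table
```

Counting the classes with len(observed) instead of typing the number by hand is one way to avoid picking the wrong row.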
The chi-square can be used whenever there is an expected ratio
What is the expected ratio of boys to girls?
1 : 1
How many degrees of freedom are there in that example?
There are two categories (classes) so there is one degree of freedom.
There are in vitro fertilization (IVF) methods that can increase the chances that a girl will
be born or a boy will be born. You can use the chi-square to determine if a particular IVF
clinic is really increasing the chances of having a boy or girl. You could look at the
number of girls and boys born to women who wanted girls or boys and calculate the chi-
square.
If a particular IVF clinic can, indeed, increase the odds, would you expect the chi-square
to be above or below the value of 3.84 (which I got from the table above)?
If the IVF clinic can change the ratio from the expected 1 : 1 then the chi-square,
calculated on the number of daughters or sons born, would be greater than 3.84.
I hope you understand that here we are "hoping" that the ratio will NOT be 1 : 1. (In point
of fact, scientists aren't supposed to "hope" for results but the fact remains that they often
hope a lot!)
Let's consider another situation.
You are the district manager of three fast food restaurants and you are looking over the
revenues. You see that store A made $1,000,000, store B made $3,000,000 and store C
brought in $5,000,000. You wonder if that is just a statistical blip. How would you use
the chi-square to test the idea that these stores are different - beyond luck? (Don't do the
chi-square - just tell me how you would set it up.)
You would "expect" a 1 : 1 : 1 ratio in the revenues if they were all the same. In other
words, the total revenues of $9,000,000 would be distributed evenly. You would
expect ...
Store A = $3,000,000
Store B = $3,000,000
Store C = $3,000,000
You could now find, for each store, the difference between expected and observed
revenues, square the difference, divide that by the expected and then add all three
together to get a chi-square value.
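The setup just described can be sketched in Python as follows. This is only a sketch of the bookkeeping, using the revenues given above; the variable names are my own, and since the question asks only for the setup, it stops at the sum and makes no claim about significance.

```python
# Observed revenues from the problem, in dollars.
observed = {"A": 1_000_000, "B": 3_000_000, "C": 5_000_000}

# A 1 : 1 : 1 ratio means the total is split evenly among the stores.
total = sum(observed.values())
expected_each = total / len(observed)

chi_square = 0.0
for store, obs in observed.items():
    difference = obs - expected_each          # observed minus expected
    term = difference ** 2 / expected_each    # squared difference per expected
    chi_square += term                        # add the terms together
    print(store, expected_each, difference, term)
```

Each pass through the loop is one row of the usual chi-square table: expected, difference, then the squared difference per expected.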
Suppose the manager of store A complains that you are not being fair because you
haven't taken into account the differences in local population around each store. His store
serves a smaller community. So, you go to the population records and discover that store
A serves a population that is only a quarter the size of the communities served by stores
B and C. Can you redo the chi-square? How?
The information about the populations tells you that there are four times as many likely
customers for stores B and C as A. You can express that as a ratio of 1 : 4 : 4. If revenues
are dependent upon population you would expect ("expect" is the magic word that means
"here comes a chi-square")
Store A = $1,000,000
Store B = $4,000,000
Store C = $4,000,000
The observed revenues were
Store A = $1,000,000
Store B = $3,000,000
Store C = $5,000,000
Now you would do another chi-square to determine if these numbers fit a 1 : 4 : 4 ratio
(thus showing that revenues are probably dependent upon population).
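To see how the expected values change with the ratio, here is a small Python sketch. The helper `expected_from_ratio` is a name I made up; the revenues and ratios are the ones from the problem.

```python
def expected_from_ratio(total, ratio):
    """Split a total according to an expected ratio such as [1, 4, 4]."""
    parts = sum(ratio)
    return [total * share / parts for share in ratio]


# Observed revenues for stores A, B and C, in dollars.
observed = [1_000_000, 3_000_000, 5_000_000]
total = sum(observed)

equal = expected_from_ratio(total, [1, 1, 1])
by_population = expected_from_ratio(total, [1, 4, 4])

print(equal)          # [3000000.0, 3000000.0, 3000000.0]
print(by_population)  # [1000000.0, 4000000.0, 4000000.0]

# Differences between observed and population-based expected revenues.
differences = [obs - exp for obs, exp in zip(observed, by_population)]
print(differences)    # [0.0, -1000000.0, 1000000.0]
```

The observed numbers never change. Only the expected numbers move when you change your idea about what the ratio should be, and that is what the second chi-square tests.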
And finally, what are the degrees of freedom for this three-store problem?
There are three categories (Stores A, B and C) so there are two degrees of freedom.
These last few puzzles, about sex ratios and revenue ratios, are to show you that the chi-
square has many uses and that all you have to do is identify how to think about the ratios,
expectations and outcomes.

