
Stage I: Fresh, inflammatory, usually discolored marks.

Stage IIa: White, superficial marks without laddering and without a palpable depression at the surface of the skin.

Stage IIb: White, superficial marks without laddering but with a palpable depression at the surface of the skin.

Stage IIIa: White, atrophic striae with laddering, measuring less than 1 cm in width, without deep pearliness.

Stage IIIb: White, atrophic striae with laddering, measuring less than 1 cm in width, with deep pearliness.

Stage IV: White, atrophic striae with laddering, measuring more than 1 cm in width, with or without deep pearliness.
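The staging criteria above form a simple decision table, which can be sketched as an illustrative Python function. The function and argument names here are hypothetical, chosen only to mirror the features named in the scale; they are not part of any standard clinical API.

```python
# Illustrative sketch of the staging scheme as a decision function.
# All names are made up for this example; the boundary handling for
# "less than 1 cm" vs. "more than 1 cm" is an assumption.
def striae_stage(fresh, laddering, width_cm=0.0, depression=False, pearliness=False):
    """Return the stage label for a mark with the given features."""
    if fresh:
        return "I"                            # fresh, inflammatory, discolored
    if not laddering:                         # white, superficial marks
        return "IIb" if depression else "IIa"
    if width_cm > 1.0:                        # wide atrophic striae
        return "IV"                           # pearliness irrelevant at this stage
    return "IIIb" if pearliness else "IIIa"   # narrow atrophic striae

print(striae_stage(fresh=True, laddering=False))                          # I
print(striae_stage(fresh=False, laddering=True, width_cm=0.5,
                   pearliness=True))                                      # IIIb
```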

bottom or low-variability case. Why? Because there is relatively little overlap between the two bell-shaped curves. In the high variability case, the group difference appears least striking because the two bell-shaped distributions overlap so much.

Figure 2. Three scenarios for differences between means.

Mean

The mean (or average) of a set of data values is the sum of all of the data values divided by the number of data values. That is:

Mean = (sum of all data values) / (number of data values)

This leads us to a very important conclusion: when we are looking at the differences between scores for two groups, we have to judge the difference between their means relative to the spread or variability of their scores. The t-test does just this.

Statistical Analysis of the t-test

The formula for the t-test is a ratio. The top part of the ratio is just the difference between the two means or averages. The bottom part is a measure of the variability or dispersion of the scores. This formula is essentially another example of the signal-to-noise metaphor in research: the difference between the means is the signal that, in this case, we think our program or treatment introduced into the data; the bottom part of the formula is a measure of variability that is essentially noise that may make it harder to see the group difference. Figure 3 shows the formula for the t-test and how the numerator and denominator are related to the distributions.

Example 1

The marks of seven students in a mathematics test with a maximum possible mark of 20 are given below:

15 13 18 16 14 17 12

Find the mean of this set of data values.

Solution:

Mean = (15 + 13 + 18 + 16 + 14 + 17 + 12) / 7 = 105 / 7 = 15

Figure 3. Formula for the t-test. The top part of the formula is easy to compute -- just find the difference between the means. The bottom part is called the standard error of the difference. To compute it, we take the variance for each group and divide it by the number of people in that group. We add these two values and then take their square root. The specific formula is given in Figure 4:

So, the mean mark is 15.
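The same computation can be sketched with Python's standard library, using the marks from Example 1:

```python
# Computing the mean of the seven test marks from Example 1.
from statistics import mean

marks = [15, 13, 18, 16, 14, 17, 12]

# mean() sums the values and divides by the count, exactly as defined above.
average = mean(marks)
print(average)  # 15
```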

The T-Test

The t-test assesses whether the means of two groups are statistically different from each other. This analysis is appropriate whenever you want to compare the means of two groups, and especially appropriate as the analysis for the posttest-only two-group randomized experimental design.

Figure 4. Formula for the standard error of the difference between the means. Remember that the variance is simply the square of the standard deviation. The final formula for the t-test is shown in Figure 5:

Figure 1. Idealized distributions for treated and comparison group posttest values.

Figure 1 shows the distributions for the treated (blue) and control (green) groups in a study. Actually, the figure shows the idealized distribution -- the actual distribution would usually be depicted with a histogram or bar graph. The figure indicates where the control and treatment group means are located. The question the t-test addresses is whether the means are statistically different.

What does it mean to say that the averages for two groups are statistically different? Consider the three situations shown in Figure 2. The first thing to notice about the three situations is that the difference between the means is the same in all three. But, you should also notice that the three situations don't look the same -- they tell very different stories. The top example shows a case with moderate variability of scores within each group. The second situation shows the high variability case. The third shows the case with low variability. Clearly, we would conclude that the two groups appear most different or distinct in the

Figure 5. Formula for the t-test.
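The calculations described for Figures 4 and 5 can be sketched with made-up posttest scores. The data below are hypothetical, invented only for illustration; 2.228 is the two-tailed critical value from a standard t table for alpha = .05 with 10 degrees of freedom.

```python
# A minimal sketch of the t-test computation, assuming made-up scores.
from math import sqrt
from statistics import mean, variance

treatment = [22, 25, 21, 28, 24, 26]   # hypothetical posttest scores
control = [18, 20, 17, 22, 19, 21]

# Standard error of the difference (Figure 4): each group's variance
# divided by its sample size, summed, then square-rooted.
se_diff = sqrt(variance(treatment) / len(treatment)
               + variance(control) / len(control))

# t-value (Figure 5): difference between the means over its standard error.
t = (mean(treatment) - mean(control)) / se_diff

# Degrees of freedom: persons in both groups minus 2.
df = len(treatment) + len(control) - 2

print(round(t, 2), df)    # 3.71 10
print(abs(t) > 2.228)     # True -- significant at alpha = .05 for df = 10
```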

The t-value will be positive if the first mean is larger than the second and negative if it is smaller. Once you compute the t-value you have to look it up in a table of significance to test whether the ratio is large enough to say that the difference between the groups is not likely to have been a chance finding. To test the significance, you need to set a risk level (called the alpha level). In most social research, the "rule of thumb" is to set the alpha level at .05. This means that five times out of a hundred you would find a statistically significant difference between the means even if there was none (i.e., by "chance"). You also need to determine the degrees of freedom (df) for the test. In the t-test, the degrees of freedom is the sum of the persons in both groups minus 2. Given the alpha level, the df, and the t-value, you can look the t-value up in a standard table of significance (available as an appendix in the back of most statistics texts) to

determine whether the t-value is large enough to be significant. If it is, you can conclude that the difference between the means for the two groups is significant (even given the variability). Fortunately, statistical computer programs routinely print the significance test results and save you the trouble of looking them up in a table. The t-test, one-way Analysis of Variance (ANOVA) and a form of regression analysis are mathematically equivalent (see the statistical analysis of the posttest-only randomized experimental design) and would yield identical results.

Posttest-Only Analysis

To analyze the two-group posttest-only randomized experimental design we need an analysis that meets the following requirements:

ways to do the same thing. So, what are the three ways? First, we can compute an independent t-test as described above. Second, we could compute a one-way Analysis of Variance (ANOVA) between two independent groups. Finally, we can use regression analysis to regress the posttest values onto a dummy-coded treatment variable. Of these three, the regression analysis approach is the most general. In fact, you'll find that I describe the statistical models for all the experimental and quasi-experimental designs in regression model terms. You just need to be aware that the results from all three methods are identical.

OK, so here's the statistical model in notational form:

yi = b0 + b1*Zi + ei

You may not realize it, but essentially this formula is just the equation for a straight line with a random error term thrown in (ei). Remember high school algebra? OK, for those of you with faulty memories, you may recall that the equation for a straight line is often given as:

y = mx + b

which, when rearranged, can be written as:

y = b + mx

(The complexities of the commutative property make you nervous? If this gets too tricky you may need to stop for a break. Have something to eat, make some coffee, or take the poor dog out for a walk.) Now, you should see that in the statistical model yi is the same as y in the straight line formula, b0 is the same as b, b1 is the same as m, and Zi is the same as x. In other words, in the statistical formula, b0 is the intercept and b1 is the slope. It is critical that you understand that the slope, b1, is the same thing as the posttest difference between the means for the two groups.

How can a slope be a difference between means? To see this, you have to take a look at a graph of what's going on. In the graph, we show the posttest on the vertical axis. This is exactly the same as the two bell-shaped curves shown in the graphs above except that here they're turned on their side. On the horizontal axis we plot the Z variable.
This variable only has two values, a 0 if the person is in the control group or a 1 if the person is in the program group. We call this kind of variable a "dummy" variable because it is a "stand in" variable that represents the program or treatment conditions with its two values (note that the term "dummy" is not meant to be a slur against anyone, especially the people participating in your study). The two points in the graph indicate the average posttest value for the control (Z=0) and treated (Z=1) cases. The line that connects the two dots is only included for visual enhancement purposes -- since there are no Z values between 0 and 1 there can be no values plotted where the line is. Nevertheless, we can meaningfully speak about the slope of this line, the line that would connect the posttest means for the two values of Z.

Do you remember the definition of slope? (Here we go again, back to high school!) The slope is the change in y over the change in x (or, in this case, Z). But we know that the "change in Z" between the groups is always equal to 1 (i.e., 1 - 0 = 1). So, the slope of the line must be equal to the difference between the average y-values for the two groups. That's what I set out to show (reread the first sentence of this paragraph). b1 is the same value that you would get if you just subtracted the two means from each other (in this case, because we set the treatment group equal to 1, this means we are subtracting the control group out of the treatment group value: a positive value implies that the treatment group mean is higher than the control, a negative means it's lower). But remember, at the very beginning of this discussion I pointed out that just knowing the difference between the means was not good enough for estimating the treatment effect because it doesn't take into account the variability or spread of the scores. So how do we do that here?
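The claim that the slope on a dummy-coded treatment variable equals the difference between the group means can be checked with a small sketch. The group assignments and scores below are made up for illustration.

```python
# A minimal sketch, with hypothetical data, showing that the OLS slope
# on a dummy-coded treatment variable equals the posttest mean difference.
from statistics import mean

z = [0, 0, 0, 1, 1, 1]        # dummy variable: 0 = control, 1 = treatment
y = [18, 20, 22, 24, 26, 28]  # hypothetical posttest scores

# Ordinary least-squares slope: covariance(z, y) / variance(z).
zbar, ybar = mean(z), mean(y)
cov = sum((zi - zbar) * (yi - ybar) for zi, yi in zip(z, y))
var = sum((zi - zbar) ** 2 for zi in z)
b1 = cov / var

treat_mean = mean(yi for zi, yi in zip(z, y) if zi == 1)
ctrl_mean = mean(yi for zi, yi in zip(z, y) if zi == 0)

print(b1)                      # 6.0
print(treat_mean - ctrl_mean)  # 6
```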
Every regression analysis program will give, in addition to the beta values, a report on whether each beta value is statistically significant. They report a t-value that tests whether the beta value differs from zero. It turns out that the t-value for the b1 coefficient is the exact same number that you would get if you did a t-test for independent groups. And, it's the same as the square root of the F value in the two-group one-way ANOVA (because t² = F). Here are a few conclusions from all this:

- has two groups
- uses a post-only measure
- has two distributions (measures), each with an average and variation
- assesses treatment effect = statistical (i.e., non-chance) difference between the groups

Before we can proceed to the analysis itself, it is useful to understand what is meant by the term "difference" as in "Is there a difference between the groups?" Each group can be represented by a "bell-shaped" curve that describes the group's distribution on a single variable. You can think of the bell curve as a smoothed histogram or bar graph describing the frequency of each possible measurement response. In the figure, we show distributions for both the treatment and control group. The mean values for each group are indicated with dashed lines. The difference between the means is simply the horizontal difference between where the control and treatment group means hit the horizontal axis.

Now, let's look at three different possible outcomes, labeled medium, high and low variability. Notice that the differences between the means in all three situations are exactly the same. The only thing that differs between these is the variability or "spread" of the scores around the means. In which of the three cases would it be easiest to conclude that the means of the two groups are different? If you answered the low variability case, you are correct! Why is it easiest to conclude that the groups differ in that case? Because that is the situation with the least amount of overlap between the bell-shaped curves for the two groups. If you look at the high variability case, you should see that there are quite a few control group cases that score in the range of the treatment group and vice versa.

Why is this so important? Because, if you want to see if two groups are "different" it's not good enough just to subtract one mean from the other -- you have to take into account the variability around the means! A small difference between means will be hard to detect if there is lots of variability or noise. A large difference between means will be easily detectable if variability is low.
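The point about variability can be made numerically: with the same mean difference, the t-value is much larger when the spread is small. The scores below are made up purely for illustration.

```python
# Illustrative sketch: an identical 5-point mean difference is easier
# to detect (larger t) when within-group variability is low.
from math import sqrt
from statistics import mean, variance

def t_value(a, b):
    """Independent-groups t: mean difference over its standard error."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

low_var_treat = [24, 25, 26, 25, 24, 26]    # mean 25, tight spread
low_var_ctrl = [19, 20, 21, 20, 19, 21]     # mean 20
high_var_treat = [10, 40, 15, 35, 20, 30]   # mean 25, wide spread
high_var_ctrl = [5, 35, 10, 30, 15, 25]     # mean 20

print(t_value(low_var_treat, low_var_ctrl))    # large t
print(t_value(high_var_treat, high_var_ctrl))  # much smaller t
```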
This way of looking at differences between groups is directly related to the signal-to-noise metaphor -- differences are more apparent when the signal is high and the noise is low. With that in mind, we can now examine how we estimate the differences between groups, often called the "effect" size. We do this with a ratio. The top part of the ratio is the actual difference between means; the bottom part is an estimate of the variability around the means. In this context, we would calculate what is known as the standard error of the difference between the means. This standard error incorporates information about the standard deviation (variability) that is in each of the two groups. The ratio that we compute is called a t-value and describes the difference between the groups relative to the variability of the scores in the groups.

There are actually three different ways to estimate the treatment effect for the posttest-only randomized experiment. All three yield mathematically equivalent results, a fancy way of saying that they give you the exact same answer. So why are there three different ones? In large part, these three approaches evolved independently and, only after that, was it clear that they are essentially three

- the t-test, one-way ANOVA, and regression analysis all yield the same results in this case
- the regression analysis method utilizes a dummy variable (Z) for treatment
- regression analysis is the most general model of the three
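The t² = F equivalence claimed above can be verified with a small sketch. The two groups below are hypothetical, with equal sample sizes (the case where the unpooled standard error used here matches the pooled one exactly).

```python
# Sketch checking that t squared equals the one-way ANOVA F for two
# equal-sized groups, using made-up posttest scores.
from math import isclose, sqrt
from statistics import mean, variance

treatment = [24, 26, 25, 27, 23, 25]   # hypothetical scores
control = [20, 22, 21, 23, 19, 21]

# Independent-groups t-value.
se = sqrt(variance(treatment) / len(treatment) + variance(control) / len(control))
t = (mean(treatment) - mean(control)) / se

# One-way ANOVA F for the same two groups: between-groups mean square
# over within-groups mean square.
all_scores = treatment + control
grand = mean(all_scores)
ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in (treatment, control))
ss_within = sum((x - mean(g)) ** 2 for g in (treatment, control) for x in g)
df_between = 2 - 1
df_within = len(all_scores) - 2
F = (ss_between / df_between) / (ss_within / df_within)

print(isclose(t * t, F))  # True
```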
