
Confounding

Identification of Extraneous Variables

Handout

1. Whether psychology students should be taught statistics in a computer lab using software or in a classroom using calculator-based analyses has been a point of controversy for many years. To settle the issue, Dr. Frank N. Furter plans to compare two groups of students, one taught by each method. PBA teaches statistics using only calculator-based math. "This is fine for one group," Dr. Furter says. "Now I must find a college that uses the computer-based approach." Accordingly, a visit is made to another school that uses the computer-based approach, where a sample of students is tested to see how well they can deal with statistical analysis and interpretation. After an extensive test is administered, it is found that the students who learned through calculator-based course work are reliably superior to those who learned by the computer-based approach. It is then concluded that the calculator-based course is superior to the computer-based method. Do you accept or reject this conclusion? Why?

The confounding in this study is especially atrocious. The participants in the two groups undoubtedly differ in a large number of respects other than type of method. For instance, there may be differences in intelligence, opportunity to study, socioeconomic level, as well as differences in math and analytical proficiency prior to learning by either approach, and certainly there were different instructors. The proper approach would be to randomly assign participants from the same class in a given school to two groups, and then to randomly determine which group is taught by each method, both groups being taught by the same instructor.
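The random-assignment procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the roster names, the function name `randomly_assign`, and the seed value are hypothetical and do not come from the handout.

```python
import random


def randomly_assign(students, seed=None):
    """Split one class roster into two groups at random, then let a
    second random draw decide which teaching method each group gets."""
    rng = random.Random(seed)

    # Step 1: shuffle the roster so group membership is left to chance.
    roster = list(students)
    rng.shuffle(roster)
    half = len(roster) // 2
    first_group, second_group = roster[:half], roster[half:]

    # Step 2: randomly decide which group receives which method.
    methods = ["calculator", "computer"]
    rng.shuffle(methods)
    return {methods[0]: first_group, methods[1]: second_group}


# Hypothetical roster of 20 students from the same class and school.
roster = [f"student_{i:02d}" for i in range(1, 21)]
assignment = randomly_assign(roster, seed=1)
for method, group in assignment.items():
    print(method, len(group))
```

Because both steps rely on chance, pre-existing differences such as prior math proficiency are expected to balance out across the two groups on average. Having the same instructor teach both groups still has to be arranged separately.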

2. An industrial/organizational psychologist is interested in whether training on a simulator improves accuracy in an actual assembly process. A group of new employees with no previous assembly experience is randomly divided into two groups. One group receives training on the simulator; the other does not. Both groups are then tested on their ability to assemble the target commodity under actual assembly line conditions. There are two assembly lines, each with its own equipment. The simulator-trained group is assigned to one assembly line and a corresponding set of target items, whereas the control group assembles the target items on the second assembly line. The tests show that the group previously trained on the simulator is significantly faster and makes fewer errors than the control group. The conclusion is that simulator training facilitates actual assembly line performance.

The characteristics of the individual assembly lines and target items are confounded with the independent variable. It may be that the machinery is more accurate on one assembly line than on the other, or that one set of targets is easier to produce than the other. To control these variables, one might have all participants work on the same assembly line with the same run of target items. Alternatively, half of the participants from each group could work on each assembly line with each set of targets. Note also that the control group received no comparable activity during the training period, so the design cannot separate the specific benefit of the simulator from the benefit of simply spending extra time on assembly-related practice.
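The second control option, in which half of each group works on each assembly line, can be sketched as follows. This is a minimal illustration; the employee labels, the line names, and the function name `counterbalance` are hypothetical.

```python
import random
from collections import Counter


def counterbalance(groups, lines, seed=None):
    """Within each training condition, randomly spread participants
    evenly across the assembly lines so that line is not tied to condition."""
    rng = random.Random(seed)
    schedule = {}
    for condition, members in groups.items():
        shuffled = list(members)
        rng.shuffle(shuffled)
        # Alternate lines down the shuffled list: equal numbers per line.
        for index, person in enumerate(shuffled):
            schedule[person] = (condition, lines[index % len(lines)])
    return schedule


# Hypothetical groups of 10 employees each.
groups = {
    "simulator": [f"sim_{i}" for i in range(10)],
    "control": [f"ctl_{i}" for i in range(10)],
}
schedule = counterbalance(groups, ["line_1", "line_2"], seed=7)

# Count how many participants fall in each condition-by-line cell.
cell_counts = Counter(schedule.values())
for cell, count in sorted(cell_counts.items()):
    print(cell, count)
```

With every condition represented equally on every line, any difference between the lines contributes equally to both groups and can no longer masquerade as a training effect.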

3. A psychologist tests the hypothesis that early toilet training leads to a personality of excessive compulsiveness about cleanliness and, conversely, that late toilet training leads to sloppiness. Previous studies have shown that middle-class children receive their toilet training earlier than do lower-class children, so one group is formed of middle-class children and another of lower-class children. Both groups are given a finger-painting task, and data are recorded on the extent to which the children smear their hands and arms with paint, whether they clean up after the session, and how many times they wash the paint from their hands. Comparisons of the two groups on these criteria indicate that the middle-class children are reliably more concerned about cleanliness than are the lower-class children. It is thus concluded that early toilet training leads to compulsive cleanliness, whereas later toilet training results in less concern about personal cleanliness.

Undoubtedly these social classes differ in a number of respects, only one of which is the age at which the children are toilet trained. The dependent variable results may thus be due to some other difference between the groups, such as amount of social stimulation or amount of money spent on family needs. The obvious, but difficult, way to conduct this experiment in order to establish a causal relation would be to randomly select a group of children, randomly assign them to two groups, and then randomly determine the age at which each group is toilet trained.

4. A hypothesis is that emotionally loaded words like "sex" and "prostitute" must be exposed for a longer time to be perceived than neutral words, an idea important for the construction of ads and other messages. To test this hypothesis, various words are exposed to participants for extremely short intervals. In fact, the initial exposure time is so short that no participant can report any of the words. The length of exposure is then gradually increased until each word is correctly reported. The length of exposure necessary for each word to be reported is recorded. It is found that the exposure time necessary to report the emotionally loaded words is longer than that for the neutral words. It is concluded that the hypothesis is confirmed.

There may be reasons for not reporting an emotionally loaded word other than that it is not perceived. For instance, the word "sex" may actually be perceived, but the participant waits until being absolutely sure that that is the word, possibly saving the person from a "social blunder." In addition, the frequency with which the loaded and neutral words are used in everyday life undoubtedly differs, thus affecting the threshold for recognition of the words. A better approach would be to start with a number of words that are emotionally neutral (or with nonsense syllables), and make some of them emotionally loaded (for example, by associating an electric shock or other negative consequence with them). The loaded and neutral words should be equated for frequency of use.

5. A health psychologist conducted an experiment to study the effect of acupuncture on pain. Half of the participants were treated for painful shoulders through acupuncture, whereas the other half received no special treatment. The participants who received acupuncture treatment reported a reliable improvement in shoulder discomfort to a "blind" evaluator after treatment. However, no statistically reliable improvement was reported by the control group. The psychologist concluded that acupuncture is an effective treatment for chronic shoulder pain.

One should not accept this conclusion, because there is no control for the effects of suggestion. The control group should have experienced some treatment similar to that of the experimental group, such as having a different pattern of needles inserted beneath the skin. Studies that have controlled for the effects of suggestion have indicated no such differences between acupuncture and placebo groups. Other scientific studies have independently confirmed that merely suggesting that pain will be reduced through experimental techniques is sufficient to lead patients to report decreased pain. Finally, the evidence would have been stronger had the experimental group been shown, in a direct comparison, to improve reliably more than the control group. Finding a reliable improvement in one group and none in the other does not by itself show that the two groups differ from each other.

6. Dr. Statsdud wanted to study the effects of grades as rewards or punishments. He chose two Introductory Psychology classes, both sections taught by Dr. Psycho. In one class students were given A, B, C, D, or F grades, whereas students in the other class either passed or failed. Tests indicated that there were no reliable differences between the two classes in terms of achievement, attitudes, or values. The conclusion was that students learn just as well without the reward or punishment of grades. Dr. Statsdud also observed a difference in classroom atmosphere: the pass-fail class was more relaxed and free of grade-oriented tensions, with better rapport between Dr. Psycho and the students.

This research does not allow us to draw any conclusions because of its faulty methodology in a number of respects. The two groups of students probably differed before the research was started. Not randomly assigning students to the two classes, but allowing them to be selected on the basis of class hours, produces a confound. The instructor's belief in the relative efficacy of the methods may have influenced performance of students in the two classes. Hence the students being taught by the pass-fail method may have outperformed what would have been normal for them. The most serious methodological criticism is that failure to find a difference between groups does not allow one to conclude that the two methods are equally effective. There are an infinite number of possible reasons why two conditions in any study may not differ significantly, only one of which is that the (population) means on the dependent variable scores of the two groups are equal. Failure to reject the null hypothesis is not equivalent to accepting the null hypothesis. A much more likely reason for failing to find a reliable difference between groups is that there is excessive experimental error in the conduct of the research; typically this is due to poor control methodology. Finally, casual observation of a difference in "classroom atmosphere" hardly provides the kind of information upon which educational curricula should be based. Regardless, as an extension of this kind of conclusion, one could probably predict that a course in which there were no grades at all would result in total freedom from "grade oriented tensions."
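The point that a nonsignificant result can come from excessive experimental error, rather than from equal population means, can be illustrated with a small simulation. This is a sketch under stated assumptions: the score scale, the true 5-point difference, the two noise levels, and the function names are all hypothetical, and the cutoff of |t| > 2.0 is only an approximation to the two-tailed .05 critical value for groups of 30.

```python
import random
import statistics


def rejects_null(group_a, group_b, critical_t=2.0):
    """Welch-style t statistic compared against an approximate cutoff."""
    standard_error = (
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    ) ** 0.5
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return abs(t) > critical_t


def rejection_rate(true_diff, noise_sd, n=30, runs=500, seed=0):
    """Fraction of simulated studies that detect a difference that really exists."""
    rng = random.Random(seed)
    detections = 0
    for _ in range(runs):
        graded = [rng.gauss(70 + true_diff, noise_sd) for _ in range(n)]
        pass_fail = [rng.gauss(70, noise_sd) for _ in range(n)]
        detections += rejects_null(graded, pass_fail)
    return detections / runs


# Same true 5-point difference; only the amount of uncontrolled error changes.
well_controlled = rejection_rate(true_diff=5, noise_sd=5)
poorly_controlled = rejection_rate(true_diff=5, noise_sd=30)
print(well_controlled, poorly_controlled)
```

In both conditions the population means genuinely differ, yet the noisy version fails to reject the null hypothesis in most simulated studies. A null result from a poorly controlled study therefore says little about whether the two grading methods are equally effective.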

7. Wexley and Thornton (1972) conducted an experiment to test the hypothesis that providing employees in training with verbal feedback about assessment results would facilitate improvement. During the training period, 169 trainees were given four assessments. After each assessment, some of the participants were reshown 18 of the 35 multiple-choice items. The training instructor read each of these 18 items and gave the correct answer along with a brief explanation of it. The instructor indicated that time did not permit going over all 35 items; therefore participants did not see or receive feedback on 17 items per quiz. The remaining trainees, the control group, were not reshown the items and thus did not receive verbal feedback. The final assessment in the training period consisted of 38 of the feedback questions and 38 of the nonfeedback questions. Wexley and Thornton found that, in general, trainees did better on items for which feedback was given than on those for which feedback was not given. Wexley and Thornton concluded that the results confirm the assumption that post-assessment verbal feedback does facilitate workplace improvement.

In problem 7, verbal feedback is completely confounded with the number of exposures to the assessment items. Recall that participants in the verbal feedback condition were reshown 18 test items, whereas participants in the nonfeedback condition were not reshown the questions. Could the observed effect be attributed to the fact that some participants were simply exposed to the questions more often than others were? This confounding cannot easily be eliminated as such. An additional control group could, however, be added to help rule out the alternative explanation. A third group could be reshown the questions but not be given the feedback and explanation of the correct answer. Comparisons would then be made among the three groups to see just how feedback affects workplace training.

8. Okay, now have some fun! In an attempt to show a relationship between anxiety-producing conditions and sexual attraction, Dutton and Aron (1974) conducted a field experiment. Male participants were approached by either an attractive male or female experimenter and asked to participate in a psychological study on the effects of exposure to scenic attractions on creative expression. Participants were approached at one of two locations. One location was a 450-foot-long foot-bridge consisting of wooden slats supported by wire cables. This bridge was suspended 230 feet over a canyon (high bridge). The second location was a solid wood bridge 10 feet over a small, shallow river (low bridge). Participants were given an item from the Thematic Apperception Test (TAT) and asked to make up a story about it. After completing the task participants were told that they could receive more information about the study by contacting the experimenter directly. At that point the experimenter tore a small piece of paper from a sheet of paper and wrote his or her telephone number on it and invited the participant to call. The major measure of sexual attraction was the number of participants calling the experimenter for information. The results showed that fewer participants called the male experimenter than called the female and that participants who were approached on the high bridge were more likely to call the female experimenter than those who were approached on the low bridge.

The confounding factor in this experiment, which was acknowledged by the authors, was a participant selection bias. It could be argued that participants who choose to cross the more dangerous bridge are more the type who would subsequently call an attractive female than participants who choose to cross the safer bridge. Participants who are willing to take the chance of crossing the high bridge may, for example, be bolder in general, more arousal seeking, and less inhibited. Participant selection biases like the one in this problem are not easily eliminated. You could, however, conduct follow-up studies to test the limits of potential alternative explanations. For example, you could interview participants 10 or 15 minutes after they cross each bridge. If you found no differences in participants' responses, you could be reasonably sure that the original data were not due to selection biases. You could then attribute the observed differences to the arousal actually produced on the bridge. If, on the other hand, participants still differed, then participant selection might still be a problem. In fact, Dutton and Aron (1974) did conduct such follow-up experiments. These experiments support the conclusion that heightened anxiety is related to sexual attraction.
