
Team Training in the Skies: Does Crew Resource Management (CRM) Training Work?
Eduardo Salas, C. Shawn Burke, Clint A. Bowers, and Katherine A. Wilson
University of Central Florida, Orlando, FL, USA
Key Words:
crew resource management, teamwork, aviation, team training, training evaluation, multi-level evaluation, safety
Shortened Title: Team Training in the Skies
The aviation community has invested great amounts of money and effort into crew resource management (CRM) training. Using Kirkpatrick's (1976) framework for evaluating training, we reviewed 58 published accounts of CRM training to determine its effectiveness within aviation. Results indicated that CRM training generally produced positive reactions, enhanced learning, and desired behavioral changes. However, we cannot ascertain whether CRM has an effect on an organization's bottom line (i.e., safety). We discuss the state of the literature with regard to evaluation of CRM training programs and, as a result, call for systematic, multi-level evaluation efforts that will show the true effectiveness of CRM training.
Address correspondence to:
Dr. Eduardo Salas, Department of Psychology, University of Central Florida, P.O. Box 161390, Orlando, FL 32816-1350; (407) 823-2552 (w); 823-5862 (fax); esalas@pegasus.cc.ucf.edu
It is well acknowledged that 60-80% of accidents and mishaps in aviation have been attributed to human error (Freeman & Simmon, 1991). A large part of these are due to failures in coordination among cockpit crews. For example, poor pilot performance and faulty crew resource management (CRM) have been cited as contributing factors in numerous accidents and incidents reported by major airlines during the period 1983-1985 (U.S. GAO, 1997). In addition, CRM deficiencies (e.g., lack of coordination among cockpit crews, captains' failure to assign tasks to other members, and a lack of effective crew supervision) were a contributing cause in approximately of the above reported accidents that involved one or more fatalities (U.S. GAO, 1997). Other reviews have found similar factors at work within cited accident reports (see Leedom & Simon, 1995; Chidester, Helmreich, Gregorich, & Geis, 1991; Gregorich, Helmreich, & Wilhelm, 1990).
Within the aviation environment, teamwork deficiencies are not only embarrassing and highly publicized, but can also lead to tragic consequences. For example, Eastern Airlines Flight 401 crashed in the Florida Everglades in December 1972 because the crew permitted their fully operational Lockheed L-1011 to fly into the ground. What the crew failed to realize was that the altitude hold feature of the autopilot had been accidentally disconnected (as cited in Kayten, 1993). The investigation revealed that, at the time of the accident, the entire three-person crew was preoccupied with a landing gear light that had failed to illuminate. Many other aviation accidents with disastrous consequences have also been attributed to faulty CRM skills (Allegheny Airlines, 1971; 1978; Mohawk Airlines, 1972; United Airlines, 1978; and others as cited in Kayten, 1993; U.S. GAO, 1997).
In an effort to manage some of these problems with teamwork and the resulting safety issues, the aviation industry introduced the concept of CRM (Wiener, Kanki, & Helmreich, 1993; Salas, Bowers, & Edens, 2001). CRM was introduced as a way to train aircrews to use all available resources (equipment, people, and information) by communicating and coordinating as a team. CRM has now been used within the aviation industry for over 20 years and has undergone several evolutions with varying foci (Helmreich, Merritt, & Wilhelm, 1999; Helmreich & Foushee, 1993; Maurino, 1999). During the first evolution, the emphasis was on changing individual styles and correcting deficiencies in individual behavior, with a heavy focus on psychological testing. The second evolution focused on cockpit group dynamics, was more modular, and dealt more with specific aviation concepts related to flight operations. With the third evolution came a broadening of scope: training began to recognize the characteristics of the aviation systems in which crews must function, and expanded to areas outside the cockpit (e.g., cabin crews, maintenance personnel). With the fourth generation came integration and proceduralization. Specifically, under the Advanced Qualification Program (AQP), carriers were allowed to tailor training to fit the needs of their specific organization; they were required to provide both CRM and line-oriented flight training (LOFT) to all crews; and CRM training was integrated with technical training. The fifth and latest evolution represents an awareness that human error is inevitable and can provide a great
deal of information. CRM is now being used as a way to manage these errors by focusing on training teamwork skills that will promote: (1) error avoidance, (2) early detection of errors, and (3) minimization of consequences resulting from CRM errors. Programs are beginning to go beyond error management to include a focus on threat recognition and management.
The evolutions that CRM has witnessed have occurred over roughly two decades of use within the aviation community; research during this time has produced several lessons learned (see Salas, Bowers, & Edens, 2001). For example, research has yielded information about how to maximize the design and delivery of CRM training through scenario design (Prince, Oser, Salas, & Woodruff, 1993; Prince & Salas, 1999), scenario feedback (Salas, Rhodenizer, & Bowers, 2000; Prince, Brannick, Prince, & Salas, 1997), and the training of operational personnel as observers and raters (Brannick, Salas, & Prince, 1997). In this vein, Helmreich and Wilhelm (1987) found that the systematic training of raters in CRM concepts made a significant difference in the quality of ratings and scale use (e.g., raters using the entire scale) as compared to raters who were trained less systematically (as cited in Helmreich, Chidester, Foushee, Gregorich, & Wilhelm, 1990). Evidence has also been provided that low-fidelity simulations can be used to practice and train CRM-related skills (Bowers, Salas, Prince, & Brannick, 1992; Baker, Prince, Shrestha, Oser, & Salas, 1993; Jentsch & Bowers, 1998). Furthermore, we have learned that national culture plays a powerful role in determining the effectiveness of CRM training programs (Maurino, 1994; Merritt & Helmreich, 1995b). Specifically, attitudes that define the core concepts of CRM differ dramatically across national borders (e.g., individualism/collectivism, power distance, uncertainty avoidance, and division of roles between sexes; see Hofstede, 1988).
As such, initial attempts to apply CRM globally were often unsuccessful due to a failure to recognize the power of national culture (Helmreich, Wilhelm, Klinect, & Merritt, in press). Finally, we know that anecdotal evidence, as well as reactions to CRM training, generally suggests that CRM training can prevent accidents (see Diehl, 1991; Kayten, 1993), and that CRM is being applied in domains outside aviation (see Flin, 1995; Howard, Gaba, Fish, Yang, & Sarnquist, 1992; Merritt & Helmreich, 1995a).
Although positive lessons have been learned, there are areas in need of improvement. To begin with, we know that, despite its long history, there remains a lack of consistency within the aviation industry with regard to definitions of CRM, training content, and methods of delivery (Wilhelm, 1991; Helmreich & Wilhelm, 1987; Salas, Prince, Bowers, Stout, Oser, & Cannon-Bowers, 1999). Second, we know that with the development of the Advanced Qualification Program (AQP) guidelines, each airline now has two options for implementing CRM training: (1) following the traditional requirements mandated by Federal Aviation Regulation (FAR) part 121, or (2) using the AQP guidelines. FAR part 121 states the general operating requirements for domestic, flag, and supplemental operations, and contains the general requirements for CRM training. However, it leaves the methods for CRM curriculum design, development, and evaluation ambiguous. The AQP guidelines were created in an effort to resolve this lack of guidance from the FAA regarding CRM training. Under the AQP guidelines, airlines are provided with: (1) a process for developing the curriculum in order to integrate traditional CRM programs with technical training, (2) the required level of performance that must be achieved, and (3) guidelines that provide
inspectors with evaluation criteria to determine if the curriculum meets all FAA requirements (U.S. GAO, 1997; Helmreich et al., 1999). Despite the creation of the AQP guidelines, airlines are not required to use them in place of the FAR 121 requirements; therefore, there is still much ambiguity and a lack of consensus when it comes to the design, delivery, and evaluation of CRM training programs.
Although ambiguity surrounds the design (i.e., content) and evaluation of these programs, the aviation community continues to invest millions of dollars in CRM training, and other communities (e.g., medical, offshore oil, maritime shipping companies) are now starting to jump on the bandwagon. Given the increased adoption of CRM as a worthwhile team training approach, there is a need to summarize the current state of knowledge about the effectiveness of CRM training in a systematic manner. That is, we need to assess the effectiveness of the implemented team training systems. Therefore, the purpose of this paper is to use Kirkpatrick's (1976) typology for training evaluation as a framework to evaluate the effectiveness of CRM training programs in aviation. Specifically, the review is organized by the type of evidence collected after training (i.e., reactions, learning, behavior, and/or organizational effectiveness).
THE NEED FOR EVALUATION
Training evaluation has been defined as "the systematic collection of descriptive and judgmental information necessary to make effective training decisions related to the selection, adoption, value, and modification of various instructional activities" (Goldstein, 1993, p. 147). Although it is acknowledged that systematic training evaluation is not an easy task, it is the only way to ensure that training programs are having the desired effect and are a worthwhile investment for the organization. Similarly, Cannon-Bowers et al. (1989) have argued that training evaluation may serve a number of important functions.
First and most obvious, program evaluation results can indicate whether the goals and objectives of a program are appropriate to achieve the desired outcome. Second, evaluation can indicate whether the content and methods used in training will result in achievement of the overall program goal. Third, evaluation data can be used to determine how to maximize transfer of training. Fourth, it can serve as feedback at both the individual and team level to suggest areas in need of improvement or revision. Goldstein (1993) offers similar arguments as to the benefits of evaluation.
The most popular framework for guiding training evaluations is Kirkpatrick's (1976) typology. Kirkpatrick argued for a multi-level approach to training evaluation consisting of four levels: (1) reactions, (2) learning, (3) behavior (i.e., extent of performance change), and (4) results (i.e., degree of impact on organizational effectiveness or mission success). In recent years, this typology has been expanded by several researchers (see Kraiger, Ford, & Salas, 1993; Salas & Cannon-Bowers, 2001). For example, Kraiger et al. (1993) expanded Kirkpatrick's typology by arguing that learning is multi-dimensional and results in cognitive, affective, and skill-based learning outcomes. Moreover, Kraiger et al. suggest potential methods that can be used to evaluate each of these outcomes: (1) cognitive (verbal knowledge, knowledge organization, cognitive strategies), (2) affective (attitudinal, motivational), and (3) skill-based (compilation, automaticity). Goldsmith and Kraiger (1997) have built upon this work by describing a method for the structural assessment of an individual learner's knowledge and skill, which has been successfully used in aviation research efforts (see Kraiger, Salas, & Cannon-Bowers, 1995; Stout, Salas, & Kraiger, 1997). The utilization
of Kirkpatrick's typology and its corresponding revisions serves several important functions within the training evaluation process. First, it has served to organize the type of information that should be collected in the assessment of training. Second, it has served to argue for the added benefit of collecting more than one level of evaluation information. Although both points perform important functions, the second has been more difficult to put into practice than the first. Specifically, Alliger and Janak (1989) reported that fewer than 10% of organizations assess training programs at all four levels of evaluation, as argued for by Kirkpatrick.
HEEDING THE CALL: EVALUATION OF CRM TRAINING
Our review identified 58 studies that appeared to evaluate the effectiveness of aviation CRM training programs. We next describe the state of CRM evaluation efforts with respect to each of the levels of evaluation identified by Kirkpatrick (1976). Specifically, studies that assessed training at only one level will be reviewed first, beginning with those collecting reaction data and ending with those collecting results/organizational effectiveness data. Following this will be a brief review of studies that assessed training at multiple levels, as argued for by Kirkpatrick. For a summary of individual studies see Table 1, which describes the findings of each of the identified studies in relation to reaction, learning, behavioral, and organizational effectiveness data.
Do Aviators Like CRM?: Reaction Evidence
Reaction evidence is the first level of Kirkpatrick's (1976) typology and amounts to an assessment of trainees' feelings toward the training program. Reaction data are assessed post-training and examine the degree to which participants perceive that training was worthwhile, relevant, interesting, and/or well conducted.
Reaction evidence is perhaps the easiest to collect and usually takes the form of a paper-and-pencil questionnaire with a Likert-type response format. A small number of studies have also gathered reaction data through questionnaires that ask participants to rank order the perceived usefulness of the CRM components included in training. Our review of the available literature indicates that reaction data are a commonly collected type of evaluation data. Specifically, 27 of the 58 studies included in our review (46%) involved the collection of reaction data (see Table 1). Of these 27 studies, 9 collected information solely related to participant reactions. Furthermore, the reaction data collected in the reviewed studies tended to reflect both affective feelings towards CRM programs and the perceived utility of these programs. Alliger, Tannenbaum, Bennett, and Traver (1997) argue that liking of training is the most common form of training assessment, and the results of the current review tend to support this argument in that 12 of the 27 studies (44%) assessed participants' affective reactions towards training. Overall, the results of these studies suggest that most participants like CRM training. In addition to assessing overall affective reactions to training, some studies (e.g., Baker et al., 1993; Horman, Goeters, Maschke, & Schiewe, 1995) also assessed affective reactions to particular components of CRM training. For example, Schiewe (1995) found that units that were based on case studies or used role play were very well liked by participants, while those based mostly on lecture were not rated favorably. Others have reported similar findings (see Baker, Bauman, & Zalesny, 1991), suggesting
that perhaps methods that promote interaction among participants are liked better than those that are more passive.
How well participants liked training is not the only type of reaction that may be collected. Alliger et al. (1997) also suggest assessing participant reactions to the utility of training. Questions related to the assessment of utility "attempt to ascertain the perceived utility value, or usefulness, of training for subsequent job performance" (p. 344). Nine of the 27 studies (33%) assessed the utility of training, while the remaining 7 studies assessed both affective and utility reactions. In terms of utility, the reviewed CRM training programs were seen to be worthwhile, useful, and applicable. Specifically, themes were seen as relevant (Grau & Valot, 1997; Horman et al., 1995) and participants felt that the CRM class should be expanded to other fleets/populations (Incalcaterra & Holt, 1999). See Table 1 for further information.
Positive reactions to CRM (affective, utility) were found in single-airline studies, as well as in multi-fleet and multi-airline studies (see Butler, 1993; Helmreich & Wilhelm, 1991). Furthermore, the teaching of teamwork behaviors (e.g., communication, decision-making, leadership [see Alkov, 1991; Alkov & Gaynor, 1991]), the use of role-playing exercises (see Baker et al., 1991), and the inclusion of cabin crewmembers in training (see Vandermark, 1991) have all contributed to obtaining positive affective and utility reactions to training.
Although the aviation community should be applauded for beginning to assess both affective reactions to training and the perceived utility of training, there are a few suggested areas of improvement.
First, in the assessment of the perceived utility of training, very few studies actually asked participants how they would apply the newly learned behaviors back on the job (i.e., specific instances of how and when these newly acquired skills might be beneficial). Second, Goldstein (1993) has argued for guidelines that should be followed in developing assessments of participant reactions, yet some of the reviewed studies seemed to fall short of these. For example, although Goldstein recommends that responses should be able to be tabulated and quantified, most of the studies reviewed, or at least the data presented, were not in a format where it was apparent that the data could be quantified. Typical evidence took forms such as "most of the participants reported liking the training" or "a few selected cases reported liking the training." A third concern is that while many studies reported using a Likert-type scale to assess participant reactions, few assessed the reliability of the scales used. A final concern is that many of the reviewed studies did not mention the specific components of CRM that were trained within the evaluated program. As CRM training is still not uniformly taught, the delineation of the particular skills taught is important in attempting to make sense of presented findings, as well as in determining the extent to which findings may generalize. This last comment is not so much a critique of the data collection process as it is of the dissemination of results.
Summary. Despite the shortcomings mentioned above, the studies that assessed participant reactions to CRM training provide sufficient rigor and converging evidence to lead to the conclusion that CRM training in aviation settings does produce positive reactions. For the most part, aviators like CRM training and perceive it as worthwhile and useful to the safe conduct of their tasks.
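One of the concerns raised above is that few studies reported the reliability of their Likert-type reaction scales. As a minimal sketch of what such a check could look like, the following computes Cronbach's alpha, one common internal-consistency index. This is an illustrative choice, not a method named in the reviewed studies; the function name and the sample ratings are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a questionnaire scale.

    responses: one list of item scores per respondent, e.g. 1-5 Likert
    ratings; every respondent must answer the same items.
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(r) != n_items for r in responses):
        raise ValueError("every respondent must answer every item")

    # Sample variance of each item across respondents.
    item_variances = [variance(item) for item in zip(*responses)]
    # Sample variance of each respondent's summed scale score.
    total_variance = variance([sum(r) for r in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings: four trainees answering three reaction items.
sample_ratings = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
]

if __name__ == "__main__":
    print(f"alpha = {cronbach_alpha(sample_ratings):.2f}")
```

Reporting an index of this kind alongside the mean ratings would address both the quantification and the reliability concerns in a single table.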
Although this is the simplest form of evaluation criteria, it serves an important purpose. Positive reactions to training are
crucial in that they can provide an avenue by which to garner the top-level support that is essential for a program's lasting success, provide evidence of a program's credibility, and may enhance trainee motivation to learn. Conversely, negative reactions may point to areas of training that need to be revised, as well as providing insight into why desired changes at other levels of evaluation (e.g., learning, behavior) have not occurred (Orlady & Foushee, 1987).
Do Aviators Learn About CRM?: Learning Evidence
In a multi-level evaluation effort, learning evidence is the second level of evaluation, and it refers to the "principles, facts, and skills which were understood and absorbed by participants" (Kirkpatrick, 1976, p. 11). Although evidence at this level includes the learning that occurred during the program, it does not include the actual exhibition of learned behaviors (i.e., skills). Also included in this level is the extent to which training leads to desired attitude changes (i.e., positive attitudes towards CRM training). Attitude change is considered here because both learning and attitude change are cognitive events and both mediate performance; as such, they should be evaluated together. The bottom line is that assessment of learning criteria provides evidence as to how successful the training program was in imparting the targeted knowledge, skills, and attitudes, and provides a basis for feedback and for identifying areas in need of further refinement.
Within the reviewed studies, 52% (30 of 58) collected information related to participant learning, with 11 of these studies solely assessing learning criteria (see Table 1). Within these efforts, the most common type of evidence offered in support of CRM affecting learning was changes in attitudes regarding CRM. Evaluation of attitudes was usually done by collecting information with the Cockpit Management Attitudes Questionnaire (CMAQ; Helmreich, 1984), or a modification of this instrument.
The CMAQ is composed of three major scales: (1) communication and coordination (i.e., communication of intent and plans, delegation of tasks, assignment of responsibilities, and monitoring of crewmembers), (2) command responsibility (i.e., leadership), and (3) recognition of stressor effects (i.e., consideration and compensation for stressors). Overall, studies that assessed learning via attitude change seem to indicate that CRM training can produce positive changes in attitudes that are somewhat stable given top management support (see Table 1).
For the most part, CRM programs seem to produce positive examples of participant learning, primarily as indexed by attitude change (see Table 1); however, there are a few reported instances of CRM programs producing a boomerang effect (i.e., negative attitude change; Helmreich, 1991). One study found that personality type influenced whether participants showed positive or negative (i.e., boomerang) attitude change as a result of CRM training (Chidester et al., 1991). The remaining studies reporting this effect, however, did not report the possible cause of the negative attitudes (see Irwin, 1991). For example, was this boomerang effect due to: (1) something within the training itself, (2) crews having very high, positive attitudes prior to training (Chidester et al., 1991), or (3) personality or cultural aspects that may have played a role (Helmreich, 1991; Chidester et al., 1991)? Despite evidence of a boomerang effect with some participants, the preponderance of evidence suggests that the majority of participants attending CRM training do learn, in the sense that there is typically a positive change in targeted attitudes.
Although attitude change is the most popular means of assessing learning, a few studies used other methods (see Hayward & Alston, 1991; Incalcaterra & Holt, 1999; Salas, Fowlkes et al., 1999). For example, Hayward and Alston (1991) reported that CRM workshops increased awareness of human factors, crew performance, and potential stressors, as well as of methods for handling these stressors. Assessing another form of learning (i.e., knowledge acquisition), Salas, Fowlkes et al. (1999) found that, compared to teams not trained in CRM, CRM-trained groups exhibited higher levels of knowledge regarding CRM principles. Finally, several studies by Salas and colleagues (see also Stout, Salas, & Kraiger, 1997) have shown the positive effects of CRM training by assessing learning via a change in participant knowledge structures (i.e., mental models).
Although the overall picture suggests that CRM training does have a positive impact on participant learning, another important factor to consider is the stability of these learning changes over time. Although many of the cited studies (see Table 1) show CRM affecting initial learning or change, fewer studies have assessed the long-term stability of these changes. The few studies that have examined whether the changes produced by training programs remain stable over time have found varying results. For example, although some have indicated that positive attitudes are stable anywhere from two (Incalcaterra & Holt, 1999) to five years out (Byrnes & Black, 1993), others have reported that the initial attitude change produced by CRM programs declines over time, regressing towards pre-CRM levels (Irwin, 1991; Helmreich, 1991; Helmreich et al., 1999; Gregorich, 1993).
Empirical results have suggested that one factor contributing to whether attitudes decline over time is whether management reinforces and supports the knowledge, skills, and attitudes learned in CRM programs or only pays "lip service" to that material (Gregorich, 1993; Helmreich, 1991; Helmreich et al., 1999). Recent work by Helmreich and colleagues has begun to examine the impact of organizational culture, including the safety culture within an organization, on the initial learning of targeted CRM knowledge, skills, and attitudes, as well as on the stability of changes over time (see Helmreich et al., in press).
Summary. Similar to the assessment of reaction data, the overall picture provided by learning criteria suggests that CRM training is effective in producing changes in aviator knowledge and attitudes. This conclusion is further supported in that, although the predominant form of collecting learning data is the assessment of attitude change, several studies assessed other forms of learning as well (e.g., knowledge structures, paper-and-pencil tests) and found evidence of learning. The positive evidence offered by multiple measures of learning makes a stronger case for the effectiveness of CRM on learning. Although heading in the right direction, evaluation efforts must continue to strive for the inclusion not only of measures of attitude change, but also of measures of declarative knowledge, as well as the assessment of knowledge structures and shared mental models (see Cannon-Bowers et al., 1989; Kraiger et al., 1993).
Do Aviators Apply The Learned CRM Behaviors in the Cockpit?: Behavioral Evidence
Behavioral evidence, the third level identified by Kirkpatrick (1976), provides an assessment of whether the lessons learned in training transfer to actual behavior on the job or in a similar simulated environment. It has also been argued to
indicate: (1) the extent to which trainees learned how to use the knowledge, skills, and attitudes (KSAs) taught during training, as well as when to apply them, and (2) trainee readiness and overall program effectiveness (Cannon-Bowers et al., 1989). Of the 58 studies reviewed for this effort, 32 (55%) gathered some type of behavioral data, as defined by Kirkpatrick (1976). Furthermore, 12 of the reviewed studies collected information solely at this level. The reader is referred to Table 1 for a specific breakdown of studies that collected behavioral evidence.
Within the reviewed studies, the most common method of assessing behavioral change was the measurement of CRM-related behaviors while participants performed line-oriented simulation, such as LOFT. More specifically, this type of assessment was evident in 18 of the 32 studies (56%). Less common was on-line assessment of behavior (11 of 32), although a small subset of studies evaluated both behavior in line-oriented simulators and on-line behavior (3 of 32). The studies that collected behavioral data tended to use a combination of the following tools: behavioral observation forms, behavioral checklists, analysis of crew communication (via real-time ratings or post-hoc analysis of videotapes), and peer or self evaluations/reports.
Most of the reviewed studies that collected some form of behavioral evidence indicated that CRM training had a positive impact on behavior (see Table 1). Specifically, results tended to indicate that CRM-trained crews exhibited: (1) improved performance as measured by behaviors indicative of CRM (e.g., decision making, mission analysis, adaptability, situation awareness, communication, leadership) or (2) improved performance as compared to crews not trained in behaviors indicative of CRM.
For example, Leedom and Simon (1995) found that after receiving CRM training, crews exhibited improved team communication patterns, more efficient management of crew resources, fewer team errors, and improved team coordination.
Summary. Behavioral evidence has been argued to be highly valuable in determining the effectiveness of a training program because it provides a means to assess whether training participants can actually translate the knowledge, behaviors, and attitudes learned in training into action. Overall, the behavioral evidence we reviewed suggests rather strongly that CRM training does have an impact on behavior (primarily as evidenced through LOFT or similar evaluations). Aviators do exhibit more teamwork behavior in the cockpit, and presumably this will lead to safer outcomes. Although more evaluations were found at the behavioral level than initially expected, most of these evaluations have been conducted in simulated environments rather than on the job. Although resource constraints may make it harder to capture behavioral data on the actual job than under simulated conditions, on-the-job behavior would provide even stronger support for the effectiveness of CRM training with regard to behavioral change. However, behavioral data collected during simulated situations are a close surrogate and a welcome start in the right direction.
Are The Skies Safer?: Results/Evidence of Organizational Impact
Organizational impact (i.e., increased safety, fewer errors) is the highest level of evaluation in Kirkpatrick's (1976) framework. Although this type of evidence is highly valued, very few evaluations are conducted at this level: only 10% (6 studies) of the
reviewed studies collected evaluation data at this level (see Table 1). Due to the difficulty of collecting this type of information (in terms of time, resources, identification of a clear criterion, and low occurrences of accidents and mishaps), evidence of CRM training's impact on the organization as a whole is not often sought or obtained. Specifically, evidence of this type is difficult to collect because it generally requires some type of longitudinal data, as it takes time for the impact of training to appear at the organizational level. In addition, criterion measures are difficult to identify and it is hard to control the various extraneous variables that may influence (e.g., moderate, mediate) the relationship between CRM training and organizational effectiveness.
Of the six studies that collected some form of results measure, two collected information on organizational effectiveness alone, while the others also collected additional types of evaluation evidence (more on this later). Within the reviewed studies that collected data at the organizational level, most of the information tended to come from one of two sources: (1) anecdotal evidence (e.g., accident reports, incident reports) or (2) longitudinal studies. The predominant type of evidence that has been used to illustrate CRM's impact on aviation safety is anecdotal reports contained in incident or accident investigations conducted by the National Transportation Safety Board (NTSB). For example, Kayten (1993) cites several examples of NTSB reports in which good CRM practices were reported to limit the detrimental effects of either human or mechanical error. Although anecdotal reports contained in accident reports are perhaps the most common and easiest evidence of organizational impact to collect, there are some problems with accident data that argue for other types of organizational effectiveness measures.
Perhaps the most prominent is the rarity with which accidents happen (see Maurino, 1999; Gregorich & Wilhelm, 1993; Helmreich & Foushee, 1993; Salas, Prince, et al., 1999); as such, the investigation of incidents (as opposed to accidents) has been suggested.

The other source of evidence used within the reviewed studies to assess the impact of CRM training on organizational effectiveness is the longitudinal collection of data. For example, Byrnes and Black (1993) evaluated a CRM program implemented at Delta Airlines and found indications of CRM's impact on organizational effectiveness: quarterly air carrier discrepancy reports were found to significantly decrease after CRM training was implemented. Although it assessed organizational impact, the above study was designed with no control group to act as a comparison against potential intervening environmental confounds across the years covered by the discrepancy reports. As such, although the evidence provided is positive, it is not conclusive; more controlled studies need to be conducted.

Summary. So, what can we conclude from the reviewed evidence? Unfortunately, not much. Although anecdotal reports indicate that CRM behaviors may contribute to reducing the impact of human and mechanical error within the aviation community, much stronger evidence is needed. And this is easier said than done. There are a number of difficulties in establishing a clear cause-and-effect relationship between CRM and safety. For example, various factors may intervene between the point at which a CRM program is implemented and the assessment of the program's impact, making inference somewhat problematic. Clearly, evaluation efforts that systematically track the impact of a particular CRM program on safety provide a stronger argument than anecdotal reports. The bottom line appears to be that although there is evidence of a positive trend regarding the impact of CRM on safety, more and better evaluations need to be conducted. In fact, the evaluations that are reported need to be more detailed, for several evaluations found in the course of the literature search, while hinting at the impact of CRM on a particular subset of crews, did not provide enough information to reach any real conclusions. Specifically, the information presented was either ambiguous or very general in nature; as such, those efforts were not included in the current review, as it was felt that the results were too speculative.

Multi-Level Evaluation Efforts

Reviewing the 58 identified studies via Kirkpatrick's typology has suggested that when each level of training evaluation criteria is considered independently, there is a fair amount of evidence to suggest that CRM training does have a positive impact on participant reaction, learning, and behavior, although the impact on safety cannot yet be determined. However, the above conclusions are, in some sense, based upon weak data, for in order to truly assess whether CRM training is effective, evaluation efforts should focus on collecting information from multiple levels (e.g., reaction, learning, behavior, results). The collection of multi-level assessment data is important in that it provides a cleaner, more complete picture of training efficacy because the evaluator is not looking at one piece of evidence in isolation. In order for training to be truly effective, it must impact participant learning, learning must transfer to behavior, and behavior must transfer to a difference at the organizational level. The assessment of participant reactions is also an important part of the multi-level assessment, for reactions provide an initial check as to whether training is relevant to the knowledge, skills, and abilities needed on the job, as perceived by participants.
In addition, reaction data serve as an important piece of information in that liking the training will serve to motivate participants in the learning process.

Of the 58 identified studies, 24 (41%) collected information at multiple levels of Kirkpatrick's (1976) typology. Of these 24, most collected information at only two levels (13 studies), typically the lower levels (e.g., reaction-learning, reaction-behavior; see Table 1). Although a review of those studies indicated that CRM training generally produced positive results, assessment at only two levels still provides a limited view of the overall effectiveness of the program. A clearer, more complete picture would be provided by collecting information either at more levels or at least at the higher levels (i.e., behavior-results). A fair number of studies evaluated training programs at three of Kirkpatrick's four levels (10 studies), but only one study was found that evaluated training at all four levels (see Table 1). As the studies that assess three or four levels of Kirkpatrick's typology provide the clearest picture as to the actual effectiveness of CRM training, a brief overview of the value added by these studies follows.

The studies that assessed at least three levels of evidence attest to the fact that the various types of evaluation evidence provide different pieces of information, which may lead to a different overall conclusion as to the effectiveness of CRM training. For example, Smith (1994) found that when participant reactions and learning (attitudes) were assessed, a self-analysis technique used to develop CRM skills in 10 undergraduate flight students was not highly valued and produced little change in attitudes. However, when behavior was assessed, it was found that self-analysis did have an impact in that it helped crews to perform significantly better in three LOFT sessions and moderately better in three others. Another example of the benefit of conducting multi-level evaluation is a study conducted by Stout et al. (1997). Specifically, these researchers found positive reactions to training, evidence of learning via changed knowledge structures, and evidence of behavioral change in that trained participants performed an average of 8% more desired CRM behaviors than a control group. However, when learning was assessed via attitude change, there was a positive but nonsignificant change. Had either of the above studies collected only single-level evaluation data, they might have come to different overall conclusions as to how to improve the implemented training program.

Alkov and colleagues (see Alkov, 1991; Alkov & Gaynor, 1991) provided the only reviewed study that evaluated CRM training at all four levels of Kirkpatrick's (1976) typology. Specifically, they evaluated a CRM training program targeted at 45 naval aviation training squadrons (helicopters, attack bombers, and multi-placed fighters). Results suggest that squadrons reported the training as useful, and they indicated a desire for training to continue. Positive attitude changes regarding CRM training were noted, and squadron commanding officers reported that training was found to contribute to better communication between instructors and students. Finally, some evidence of organizational impact was found in that, following CRM training, the overall aircrew mishap rate declined in all three communities. However, it should be noted that this study is not meant to serve as an example to others of how to conduct multi-level evaluations, as it is a preliminary analytical effort and hence many of the conclusions offered are tentative. We do, however, commend their efforts in attempting to evaluate CRM training at all four levels.

Summary.
It has been said that in the absence of a well-defined, measurable, ultimate criterion (which rarely exists in the real world), it is important to assess training at multiple levels, for each additional source of data serves to increase confidence in the overall evaluation (Cannon-Bowers et al., 1989). For example, although reaction data can indicate whether the trainee felt the program was worthwhile, they have little if any relation to whether the participant learned the material. Similarly, just because a trainee learned the knowledge during training does not guarantee that he or she can translate this knowledge into effective behavior, nor does it guarantee that, if applied, the behavior will have an effect on organizational outcomes. Each source of data provides a limited picture of the results. The studies reviewed above indicate that this advice concerning the assessment of multiple levels is beginning to take hold within the aviation community. However, all the parties involved (aviators, researchers, regulators, the public, the airlines) need to continue to push for assessment at all levels of evaluation. Although the efforts reviewed within this paper suggest that CRM training programs are generally effective in producing some level of change in participants (e.g., reaction, learning, behavior), the lack of multi-level evaluation efforts makes it difficult to answer whether CRM is truly effective. Specifically, of the 58 reviewed studies, the majority (34) collected data at only one level, whereas the remainder broke down in the following manner: 13 studies collected data at two levels, 10 studies collected data at three levels, and only 1 study collected data at all four levels, as argued for by Kirkpatrick (1976) and others (Robertson, Taylor, Stelly, & Wagner, 1995; Cannon-Bowers et al., 1989). For CRM programs to be truly effective and worth the investment that companies and airlines make in them, positive reactions must transfer to learning, learning must transfer to behavior, and finally changes in behavior must translate into reducing aviation-related mishaps and accidents. We recognize that achieving four levels of evaluation might be impractical, if not impossible, in many situations. But we need to establish stronger links between CRM training and the reduction of accidents. At this point in time, there are not enough multi-level evaluations to assess whether or not this link is there.

DOES CRM TRAINING WORK?

As reported by Salas et al. (1999), the data are encouraging. Although some have previously argued that there is no evidence that CRM is effective (Besco, 1995, 1997, 1998; Simmon, 1997; Komich, 1997), this review concludes that some evidence does exist. And this is important. The picture that has emerged after reviewing the existing evidence within the current framework suggests that CRM training is effective. But as stated earlier, the picture is not as clear as it should be after 20 years. The lack of systematic studies that can clearly show cause and effect, as well as the transfer of learned material to behavior and of behavior to results, is a key factor in this unclear picture. Nevertheless, given that CRM training is one of a number of factors that may influence the practice and effectiveness of CRM behaviors, it may be argued that, although imperfect, the current evidence for the effectiveness of CRM training programs is impressive. Specifically, what can be said is that CRM training (generally) produces: (1) positive reactions; (2) enhanced learning, primarily as measured through attitude change, although other learning criteria are also used (e.g., knowledge tests, knowledge structures); and (3) desired behavioral change in the cockpit (simulated or real). However, what cannot be answered with certainty is whether CRM training has an effect on the bottom line, which in this case is aviation safety.
At this point, we believe the tools to determine this are there; what we need are the resources and a mandate to make it happen.

WHERE DO WE GO FROM HERE?

In terms of evaluation, the current review would seem to suggest two areas where future efforts should be concentrated. First, additional evaluation efforts need to begin to assess CRM training at multiple levels, using multiple sources of criteria within each level. Currently, less than half (41%) of the studies published used multiple levels to evaluate training. As noted, analyses of these studies suggest that multi-level evaluations provide a much clearer and more diagnostic picture of training efficacy than do single-level evaluations. Furthermore, we found that reaction data were primarily gathered via questionnaires or verbal reports of how well participants liked the course or thought it was worthwhile. Learning criteria were primarily collected through the assessment of attitude change via the CMAQ. Although several other types of learning criteria have been argued for (structural knowledge, knowledge tests), few studies collected more than one type of learning criterion within the same evaluation study. In most of these studies, attitude change was the only criterion collected. Similar arguments can be made with regard to the collection of behavioral data and data examining organizational impact. Using multiple methods to assess the same type of criterion (e.g., learning) increases confidence in the results obtained. The case for longitudinal evaluation efforts that use multiple measures and methodologies has been made (see Helmreich & Wilhelm, 1987; Helmreich, Wilhelm, Gregorich, & Chidester, 1990), and such efforts need to be conducted.

The second area that would strengthen the evaluation effort is to ensure that the evaluation and dissemination of results regarding CRM training provide diagnostic information. Although multi-level evaluation is one piece of providing diagnostic information, there are at least three others. First, many of the evaluation efforts reviewed for the current paper have been written up in such a way as to make it difficult to assess the degree to which the tools used to collect evaluation data were theoretically driven or possessed acceptable psychometric properties (e.g., reliability, validity). Related to this need is Helmreich and Wilhelm's (1987) plea for more guidance for the aviation community in how to implement and evaluate CRM programs. The ambiguity that exists with regard to the content of CRM training, implementation, and evaluation methods, combined with the lack of explicit descriptions of the content of evaluated CRM programs, makes it hard to compare findings across studies or provide general guidance. Finally, the majority of the studies that we reviewed provided descriptive data only or generalized anecdotal evidence. For example, in relation to reaction data, several researchers report that participants generally found training useful. This type of information is very subjective in interpretation. Specifically, there is no relational meaning, nor does the reader know how "useful" is defined (e.g., whether it was rated on a scale, whether there were anchor points, or whether responses were totally open). Although descriptive data are obviously better than generalized anecdotal evidence, they still make it hard to determine the effect of CRM training, as neither significance nor effect sizes can be determined.
Future studies should attempt to report data at higher levels than merely descriptive information so that better conclusions can be drawn.

Exploration and Expansion

As a field in general, CRM training and the corresponding evaluation efforts are moving in two general directions. First, as the aviation community continues to invest money in CRM training, other communities are taking note and beginning to implement similar programs (Salas et al., 2001). The emerging extension of CRM training to other domains further drives the need for multi-level evaluation efforts so that the question of the effectiveness of CRM training can be answered once and for all. A second area that has begun to be investigated (Merritt & Helmreich, 1995b; Helmreich & Merritt, 1998; Maurino, 1999; Chidester et al., 1991; Helmreich, 1997), but must be further examined, is the identification of variables that may moderate or mediate the relationship between CRM training and performance. Understanding how such factors as culture (e.g., national, professional, safety, and organizational), personality, and organizational climate affect the message delivered in CRM training will help in the design, delivery, and evaluation phases of CRM. This in turn would allow the aviation community, as well as other areas of industry, to get the most bang for their buck.

CONCLUDING REMARKS

After reviewing the 58 identified evaluations of CRM training programs within the aviation community, the following can be said. First, CRM training programs seem to produce positive participant reactions, learning, and application of learned behavior via simulators or on-line/on the job. However, the final word on whether CRM has an impact on safety remains to be seen. Second, although the aviation community should be commended in that multi-level evaluations are becoming more common, evaluation needs to become a systematically accepted cost of business. Third, the review raised a few methodological concerns in that, many times, descriptions of the components of CRM training or the methodology used to develop and test evaluation measures were not very clear within the literature, making it hard to determine the reliability, validity, and transferability of reported results. At this point in time it is unclear whether this is a concern with pure methodology or with a combination of methodology and the procedures used to disseminate findings. Finally, the current review illustrated that although there are still some rough spots in terms of evaluating implemented CRM training programs, the picture is not as bleak as some opponents would make it out to be; trends seem to indicate that CRM training does have an impact on multiple aspects of the individuals and crews completing the program. However, more and better evaluations are needed, and the aviation community should demand them. We believe that time and continued systematic evaluations will reveal its long-term impact: improved safety in the skies.

ACKNOWLEDGEMENTS

We would like to thank Eleana Edens, Deborah A. Boehm-Davis, Janis A. Cannon-Bowers, and two anonymous reviewers for their thoughtful comments on earlier drafts. This work was performed under the auspices of the UCF/FAA/NAWCTSD Partnership for Aviation Team Training.
REFERENCES

Alkov, R. A. (1991). U.S. Navy aircrew coordination training: A progress report. In R. S. Jensen (Ed.), Proceedings of the 6th International Symposium on Aviation Psychology (pp. 368-371). OH: The Ohio State University.
Alkov, R. A., & Gaynor, J. A. (1991). Attitude changes in Navy/Marine flight instructors following an aircrew coordination training course. The International Journal of Aviation Psychology, 1(3), 245-253.
Alliger, G. M., Tannenbaum, S. I., Bennett, W., Jr., & Traver, H. (1997). A meta-analysis of the relations among training criteria. Personnel Psychology, 50(2), 341-358.
Alliger, G. M., & Janak, E. A. (1989). Kirkpatrick's levels of training criteria: Thirty years later. Personnel Psychology, 42, 331-342.
Arnold, R. L., & Jackson, D. L. (1985). Recurrent cockpit resource management training at United Airlines. In R. S. Jensen & J. Adrion (Eds.), Proceedings of the 3rd Symposium on Aviation Psychology (pp. 345-351). OH: The Ohio State University.
Baker, D. P., Bauman, M., & Zalesny, M. D. (1991). Development of aircrew coordination exercises to facilitate transfer. In R. S. Jensen (Ed.), Proceedings of the 6th International Symposium on Aviation Psychology (pp. 314-319). OH: The Ohio State University.
Baker, D., Prince, C., Shrestha, L., Oser, R., & Salas, E. (1993). Aviation computer games for crew resource management training. The International Journal of Aviation Psychology, 3(2), 143-156.
Barker, J. M., Clothier, C., Woody, J. R., McKinney, E. H., & Brown, J. L. (1996, January). Crew resource management: A simulator study comparing fixed versus formed aircrews. Aviation, Space, and Environmental Medicine, 67(1), 3-7.
Besco, R. O. (1998). Crew resource management training: What to teach and how to teach it! Unpublished manuscript.
Besco, R. O. (1997). The need for operational validation of human relations-centered CRM training assumptions. In R. S. Jensen & L. A. Rakovan (Eds.), Proceedings of the 9th International Symposium on Aviation Psychology (pp. 536-540). OH: The Ohio State University.
Besco, R. O. (1995). The potential contributions and scientific responsibilities of aviation psychologists. In N. Johnston, R. Fuller, & N. McDonald (Eds.), Aviation psychology: Training and selection. Proceedings of the 21st Conference of the European Association for Aviation Psychology, Volume 2, 141-148. England: Avebury Aviation.
Bowers, C. A., Salas, E., Prince, C., & Brannick, M. (1992). Games teams play: A method for investigating team coordination and performance. Behavior Research Methods, Instruments, and Computers, 24, 503-506.
Brannick, M. T., Prince, A., Prince, C., & Salas, E. (1995). The measurement of team process. Human Factors, 37(3), 641-651.
Brannick, M. T., Salas, E., & Prince, C. (Eds.). (1997). Team performance assessment and measurement: Theory, methods, and applications. Mahwah, NJ: LEA.
Butler, R. E. (1993). LOFT: Full mission simulation as crew resource management training. In E. L. Wiener, B. G. Kanki, & R. L. Helmreich (Eds.), Cockpit resource management (pp. 231-259). CA: Academic Press.
Butler, R. E. (1991). Lessons from cross-fleet/cross-airline observations: Evaluating the impact of CRM/LOFT training. In R. S. Jensen (Ed.), Proceedings of the 6th International Symposium on Aviation Psychology (pp. 326-331). OH: The Ohio State University.
Byrnes, R. E., & Black, R. (1993). Developing and implementing CRM programs: The Delta experience. In E. L. Wiener, B. G. Kanki, & R. L. Helmreich (Eds.), Cockpit resource management (pp. 421-443). CA: Academic Press.
Cannon-Bowers, J. A., Prince, C., Salas, E., Owens, J., Morgan, B., Jr., & Gonos, G. (1989). Determining aircrew coordination training effectiveness. Paper presented at the 11th Interservice/Industry Training Systems Conference, Fort Worth, TX.
Chidester, T. R., Helmreich, R. L., Gregorich, S. E., & Geis, C. E. (1991). Pilot personality and crew coordination: Implications for training and selection. The International Journal of Aviation Psychology, 1(1), 25-44.
Chute, R. D., & Wiener, E. L. (1995). Cockpit-cabin communication: I. A tale of two cultures. The International Journal of Aviation Psychology, 5(3), 258-276.
Clark, R. E., Nielsen, R. A., & Wood, R. L. (1991). The interactive effects of cockpit resource management, domestic stress, and information processing in commercial aviation. In R. S. Jensen (Ed.), Proceedings of the 6th International Symposium on Aviation Psychology (pp. 776-781). OH: The Ohio State University.
Clothier, C. C. (1991). Behavioral interactions across various aircraft types: Results of systematic observations of line operations and simulations. In R. S. Jensen (Ed.), Proceedings of the 6th International Symposium on Aviation Psychology (pp. 332-337). OH: The Ohio State University.
Connolly, T. J., & Blackwell, B. B. (1987). A simulator approach to training in aeronautical decision making. In R. S. Jensen (Ed.), Proceedings of the 4th International Symposium on Aviation Psychology (pp. 251-258). OH: The Ohio State University.
Diehl, A. (1991). The effectiveness of training programs for preventing aircrew error. In R. S. Jensen (Ed.), Proceedings of the 6th International Symposium on Aviation Psychology (pp. 640-655). OH: The Ohio State University.
Flin, R. (1995). Crew resource management for teams in the offshore oil industry. Journal of European Industrial Training, 19(9), 23-27.
Fonne, V. M., & Fredriksen, O. K., Capt. (1995). Resource management and crew training for HSV-navigators. In R. S. Jensen & L. A. Rakovan (Eds.), Proceedings of the 8th International Symposium on Aviation Psychology (pp. 585-590). OH: The Ohio State University.
Fowlkes, J. E., Lane, N. E., Salas, E., Franz, T., & Oser, R. (1994). Improving the measurement of team performance: The TARGETs methodology. Military Psychology, 6, 47-61.
Fowlkes, J. E., Lane, N. E., Salas, E., Oser, R. L., & Prince, C. (1992). TARGETs for aircrew coordination training. Proceedings of the 14th Interservice/Industry Training Systems and Education Conference (pp. 344-352).
Freeman, C., & Simmon, D. A. (1991). Taxonomy of crew resource management: Information processing domain. In R. S. Jensen (Ed.), Proceedings of the 6th International Symposium on Aviation Psychology (pp. 391-397). OH: The Ohio State University.
Geis, C. E. (1987). Changing attitudes through training: A formal evaluation of training effectiveness. In R. S. Jensen (Ed.), Proceedings of the 4th International Symposium on Aviation Psychology (pp. 392-398). OH: The Ohio State University.
Goldsmith, T., & Kraiger, K. (1997). Structural knowledge assessment and training evaluation. In J. Ford, S. Kozlowski, K. Kraiger, E. Salas, & M. Teachout (Eds.), Improving training effectiveness in work organizations (pp. 19-46). New Jersey: Lawrence Erlbaum.
Goldstein, I. L. (1993). Training in organizations: Needs assessment, development, and evaluation (3rd ed.). Monterey, CA: Brooks/Cole Publishing Company.
Grau, J. Y., & Valot, C. (1997). Evolvement of crew attitudes in military airlift operations after CRM course. In R. S. Jensen & L. A. Rakovan (Eds.), Proceedings of the 9th International Symposium on Aviation Psychology (pp. 556-561). OH: The Ohio State University.
Gregorich, S. E. (1993). The dynamics of CRM attitude change: Attitude stability. In Proceedings of the 7th International Symposium on Aviation Psychology (pp. 509-512). OH: The Ohio State University.
Gregorich, S. E., Helmreich, R. L., & Wilhelm, J. A. (1990). Structure of cockpit management attitudes. Journal of Applied Psychology, 75(6), 682-690.
Gregorich, S. E., & Wilhelm, J. A. (1993). Crew resource management training assessment. In E. L. Wiener, B. G. Kanki, & R. L. Helmreich (Eds.), Cockpit resource management (pp. 173-198). CA: Academic Press.
Grubb, G., Morey, J. C., & Simon, R. (1999). Applications of the theory of reasoned action model of attitude assessment in the Air Force CRM program. In R. S. Jensen, B. Cox, J. D. Callister, & R. Lavis (Eds.), Proceedings of the 10th International Symposium on Aviation Psychology (pp. 298-301). OH: The Ohio State University.
Halliday, J. T., Maj., Biegalski, C. S., Lt Col., & Inzana, A., Maj. (1987). CRM training in the 349th Military Airlift Wing. In H. W. Orlady & H. C. Foushee (Eds.), Proceedings of the NASA/MAC workshop on cockpit resource management (NASA Conference Publication No. 2455, pp. 148-158). Moffett Field, CA: NASA-Ames Research Center.
Hansberger, J. T., Holt, R. W., & Boehm-Davis, D. (1999). Instructor/evaluator evaluations of ACRM effectiveness. In R. S. Jensen, B. Cox, J. D. Callister, & R. Lavis (Eds.), Proceedings of the 10th International Symposium on Aviation Psychology (pp. 279-284). OH: The Ohio State University.
Hayward, B., & Alston, N. (1991). Team building following a pilot labour dispute: Extending the CRM envelope. In R. S. Jensen (Ed.), Proceedings of the 6th International Symposium on Aviation Psychology (pp. 377-383). OH: The Ohio State University.
Helmreich, R. L. (1991). Strategies for the study of flightcrew behavior. In R. S. Jensen (Ed.), Proceedings of the 6th International Symposium on Aviation Psychology (pp. 338-343). OH: The Ohio State University.
Helmreich, R. L. (1984). Cockpit management attitudes. Human Factors, 26, 583-589.
Helmreich, R. L., Chidester, T. R., Foushee, H. C., Gregorich, S., & Wilhelm, J. A. (1990, May). How effective is cockpit resource management training? Exploring issues in evaluating the impact of programs to enhance crew coordination. Flight Safety Digest, 1-17.
Helmreich, R. L., & Foushee, H. C. (1993). Why crew resource management? Empirical and theoretical bases of human factors in aviation. In E. L. Wiener, B. G. Kanki, & R. L. Helmreich (Eds.), Cockpit resource management (pp. 3-45). CA: Academic Press.
Helmreich, R. L., & Merritt, A. C. (1998). Culture at work in aviation and medicine: National, organizational, and professional influences. Aldershot: Ashgate.
Helmreich, R. L., Merritt, A. C., & Wilhelm, J. A. (1999). The evolution of crew resource management training in commercial aviation. The International Journal of Aviation Psychology, 9(1), 19-32.
Helmreich, R. L., & Wilhelm, J. A. (1991). Outcomes of crew resource management training. The International Journal of Aviation Psychology, 1(4), 287-300.
Helmreich, R. L., & Wilhelm, J. A. (1987). Evaluating cockpit resource management training. In R. S. Jensen (Ed.), Proceedings of the 4th International Symposium on Aviation Psychology (pp. 440-446). OH: The Ohio State University.
Helmreich, R. L., Wilhelm, J. A., Gregorich, S. E., & Chidester, T. R. (1990). Preliminary results from the evaluation of cockpit resource management training: Performance ratings of flightcrews. Aviation, Space, and Environmental Medicine, 61(6), 586-589.
Helmreich, R. L., Wilhelm, J. A., Klinect, J. R., & Merritt, A. C. (in press). Culture, error, and crew resource management. In E. Salas, C. A. Bowers, & E. Edens (Eds.), Applying resource management in organizations: A guide for professionals. Hillsdale, NJ: Erlbaum.
Hofstede, G. (1988). McGregor in southeast Asia. In D. Sinha & H. S. R. Kao (Eds.), Social values and development: Asian perspectives (pp. 304-314). Thousand Oaks, CA: Sage Publications.
Holt, R. W., Boehm-Davis, D. A., & Hansberger, J. T. (1999). Evaluating the effectiveness of ACRM using LOE and line-check data. In R. S. Jensen, B. Cox, J. D. Callister, & R. Lavis (Eds.), Proceedings of the 10th International Symposium on Aviation Psychology (pp. 273-278). OH: The Ohio State University.
Hormann, H. J., Goeters, K. M., Maschke, P., & Schiewe, A. (1995). Implementation and initial evaluation of DLR/LH CRM-training. In R. S. Jensen & L. A. Rakovan (Eds.), Proceedings of the 8th International Symposium on Aviation Psychology (pp. 591-596). OH: The Ohio State University.
Howard, S., Gaba, D., Fish, K., Yang, G., & Sarnquist, F. (1992). Anesthesia crisis resource management training: Teaching anesthesiologists to handle critical incidents. Aviation, Space, and Environmental Medicine, 63, 763-770.
Ikomi, P. A., Boehm-Davis, D. A., Holt, R. W., & Incalcaterra, K. A. (1999). Jump seat observations of advanced crew resource management (ACRM) effectiveness. In R. S. Jensen, B. Cox, J. D. Callister, & R. Lavis (Eds.), Proceedings of the 10th International Symposium on Aviation Psychology (pp. 292-297). OH: The Ohio State University.
Incalcaterra, K. A., & Holt, R. W. (1999). Pilot evaluations of ACRM programs. In R. S. Jensen, B. Cox, J. D. Callister, & R. Lavis (Eds.), Proceedings of the 10th International Symposium on Aviation Psychology (pp. 285-291). OH: The Ohio State University.
Irwin, C. M. (1991). The impact of initial and recurrent cockpit resource management training on attitudes. In R. S. Jensen (Ed.), Proceedings of the 6th International Symposium on Aviation Psychology (pp. 344-349). OH: The Ohio State University.
Jackson, D. L. (1983). United Airlines cockpit resource management training. In R. S. Jensen (Ed.), Proceedings of the 2nd Symposium on Aviation Psychology (pp. 131-137). OH: The Ohio State University.
Jentsch, F., & Bowers, C. A. (1998). Evidence for the validity of PC-based simulations in studying aircrew coordination. The International Journal of Aviation Psychology, 8(3), 243-260.
Jentsch, F., Bowers, C. A., & Holmes (1995). The acquisition and decay of aircrew coordination skills. In R. S. Jensen & L. A. Rakovan (Eds.), Proceedings of the 8th International Symposium on Aviation Psychology (pp. 1063-1068). OH: The Ohio State University.
Johnston, J. H., Smith-Jentsch, K. A., & Cannon-Bowers, J. A. (1997). Performance measurement tools for enhancing team decision-making training. In M. T. Brannick, E. Salas, & C. Prince (Eds.), Team performance assessment and measurement: Theory, methods, and applications (pp. 311-327). NJ: Lawrence Erlbaum.
Kayten, P. J. (1993). The accident investigator's perspective. In E. L. Wiener, B. G. Kanki, & R. L. Helmreich (Eds.), Cockpit resource management (pp. 283-314). CA: Academic Press.
Kirkpatrick, D. L. (1976). Evaluation of training. In R. L. Craig (Ed.), Training and development handbook: A guide to human resources development (pp. 18.1-18.27). New York, NY: McGraw-Hill.
Komich, J. (1997). CRM training: Which crossroads to take now? In R. S. Jensen & L. A. Rakovan (Eds.), Proceedings of the 9th International Symposium on Aviation Psychology (pp. 541-546). OH: The Ohio State University.
Kraiger, K., Ford, J. K., & Salas, E. (1993). Application of cognitive, skill-based, and affective theories of learning outcomes to new methods of training evaluation. Journal of Applied Psychology, 78(2), 311-328.
Kraiger, K., Salas, E., & Cannon-Bowers, J. A. (1995). Measuring knowledge organization as a method for assessing learning during training. Human Factors, 37, 804-816.
Lassiter, D. L., Vaughn, J. S., Smaltz, V. E., Morgan, B. B., Jr., & Salas, E. (1990). A comparison of two types of training interventions on team communication performance. Proceedings of the Human Factors Society 34th Annual Meeting, 2, 1372-1376.
Leedom, D. K., & Simon, R. (1995). Improving team coordination: A case for behavioral-based training. Military Psychology, 7, 109-122.
Margerison, C., Davies, R., & McCann, D. (1987). High-flying management development. Training and Development Journal, 41(2), 38-41.
Maschke, P., Goeters, K. M., Hormann, H. J., & Schiewe, A. (1995). The development of the DLR/Lufthansa crew resource management training program. In N. Johnston, R. Fuller, & N. McDonald (Eds.), Aviation psychology: Training and selection. Proceedings of the 21st Conference of the European Association for Aviation Psychology, 2, 23-31. England: Avebury Aviation.
Maurino, D. E. (1999). Safety prejudices, training practices, and CRM: A mid-point perspective. International Journal of Aviation Psychology, 9, 413-427.
Maurino, D. E. (1994). Cross-cultural perspectives in human factors training: Lessons from the ICAO Human Factors Program. The International Journal of Aviation Psychology, 4, 173-181.
Merritt, A. C., & Helmreich, R. L. (1995a). CRM in 1995: Where to from here? In B. J. Hayward & A. R. Lowe (Eds.), Applied aviation psychology: Achievement, change, and challenge. Proceedings of the Third Australian Aviation Psychology Symposium (pp. 111-126). Aldershot: Avebury Aviation.
Merritt, A. C., & Helmreich, R. L. (1995b). Culture in the cockpit: A multi-airline study of pilot attitudes and values. Proceedings of the 8th International Symposium on Aviation Psychology (pp. 676-681). OH: The Ohio State University.
Morey, J. C., Grubb, G., & Simon, R. (1997). Towards a new measurement approach for cockpit resource management attitudes. In R. S. Jensen & L. A. Rakovan (Eds.), Proceedings of the 9th International Symposium on Aviation Psychology (pp. 478-483). OH: The Ohio State University.
Mudge, R. W. (1983). Cockpit management training for the professional pilot. In R. S. Jensen (Ed.), Proceedings of the 2nd Symposium on Aviation Psychology (pp. 165-172). OH: The Ohio State University.
Naef, W., Cpt. (1995). Practical application of CRM concepts: Swissair's human aspects development program (HAD). In R. S. Jensen & L. A. Rakovan (Eds.), Proceedings of the 8th International Symposium on Aviation Psychology (pp. 597-602). OH: The Ohio State University.
Nullmeyer, R. T., & Spiker, V. A. (under review). The importance of crew resource management in MC-130P mission performance: Implications for training effectiveness evaluation. Military Psychology.
Orlady, H. W., & Foushee, H. C. (1987). Cockpit resource management training (Technical Report Number NASA CP-2455). Moffett Field, CA: NASA Ames Research Center.
Predmore, S. C. (1991). Microcoding communication in accident investigation: Crew coordination in the United 811 and United 232. In R. S. Jensen (Ed.), Proceedings of the 6th International Symposium on Aviation Psychology (pp. 350-355). OH: The Ohio State University.
Prince, C., Brannick, M., Prince, C., & Salas, E. (1997). The measurement of team process behaviors in the cockpit: Lessons learned. In M. T. Brannick, E. Salas, & C. Prince (Eds.), Team performance assessment and measurement: Theory, methods, and applications (pp. 289-310). Mahwah, NJ: LEA.
Prince, C., Oser, R., Salas, E., & Woodruff, W. (1993). Increasing hits and reducing misses in CRM/LOS scenarios: Guidelines for simulator scenario development. International Journal of Aviation Psychology, 3(1), 69-82.
Prince, C., & Salas, E. (1999). Team processes and their training in aviation. In D. Garland, J. Wise, & D. Hopkins (Eds.), Handbook of aviation human factors (pp. 193-213). Mahwah, NJ: LEA.
Robertson, M. M., & Taylor, J. C. (1995). Team training in aviation maintenance settings: A systematic evaluation. In B. J. Hayward & A. R. Lowe (Eds.), Applied aviation psychology: Achievement, change, and challenge. Proceedings of the Third Australian Aviation Psychology Symposium (pp. 373-383). Ashgate: Avebury Aviation.
Robertson, M. M., Taylor, J. C., Stelly, J. W., & Wagner, R. (1995). A systematic training evaluation model applied to measure the effectiveness of an aviation maintenance team training program. In R. S. Jensen & L. A. Rakovan (Eds.), Proceedings of the 8th International Symposium on Aviation Psychology (pp. 631-636). OH: The Ohio State University.
Rollins, M. L. (1995). A descriptive study of crew resource management attitude change. In N. Johnston, R. Fuller, & N. McDonald (Eds.), Aviation psychology: Training and selection. Proceedings of the 21st Conference of the European Association for Aviation Psychology, 2, 45-50. England: Avebury Aviation.
Salas, E., Bowers, C. A., & Edens, E. (Eds.) Improving teamwork in organizations: Applications of resource management training. Hillsdale, NJ: LEA, Inc.
Salas, E., & Cannon-Bowers, J. A. (2001). The science of training: A decade of progress. Annual Review of Psychology, 52, 471-499.
Salas, E., Fowlkes, J. E., Stout, R. J., Milanovich, D. M., & Prince, C. (1999). Does CRM training improve teamwork skills in the cockpit? Two evaluation studies. Human Factors, 41(2), 326-343.
Salas, E., Prince, C., Bowers, C., Stout, R., Oser, R. L., & Cannon-Bowers, J. A. (1999). A methodology for enhancing crew resource management training. Human Factors, 41(1), 161-172.
Salas, E., Rhodenizer, L., & Bowers, C. A. (2000). The design and delivery of CRM training: Exploiting available resources. Human Factors, 42(3), 490-511.
Schiewe, A. (1995). On the acceptance of CRM-methods by pilots: Results of a cluster analysis. In R. S. Jensen & L. A. Rakovan (Eds.), Proceedings of the 8th International Symposium on Aviation Psychology (pp. 540-545). OH: The Ohio State University.
Silverman, D. R., Spiker, V. A., Tourville, S. J., & Nullmeyer, R. T. (1997). Team coordination and performance during combat mission training. Paper presented at the Interservice/Industry Training, Simulation, and Education Conference, Orlando, FL.
Simmon, D. A., Capt (Ret.) (1997). How to fix CRM. In R. S. Jensen & L. A. Rakovan (Eds.), Proceedings of the 9th International Symposium on Aviation Psychology (pp. 550-553). OH: The Ohio State University.
Simpson, P., & Wiggins, M. (1995). Human factor attitudes. In B. J. Hayward & A. R. Lowe (Eds.), Applied aviation psychology: Achievement, change, and challenge. Proceedings of the Third Australian Aviation Psychology Symposium (pp. 185-192). Ashgate: Avebury Aviation.
Smith, G. M. (1994). Active learning strategies in undergraduate CRM flight training. In N. Johnston, R. Fuller, & N. McDonald (Eds.), Aviation psychology: Training and selection. Proceedings of the 21st Conference of the European Association for Aviation Psychology (pp. 17-22). England: Avebury Aviation.
Spiker, V. A., Nullmeyer, R. T., Tourville, S. J., & Silverman, D. R. (1998, July). Combat mission training research at the 58th special operations wing: A summary (iii-52). In USAF AMRL Technical Report (Brooks), July 1998, AL-HR-TR-1997-0182.
Spiker, V. A., Silverman, D. R., Tourville, S. J., & Nullmeyer, R. T. (1998). Tactical resource management effects on combat mission training performance (iii-90). In USAF Technical Report (Brooks), July 1998, AL-HR-TR-1997-0137.
Stout, R. J., Salas, E., & Fowlkes, J. E. (1997). Enhancing teamwork in complex environments through team training. Group Dynamics: Theory, Research, and Practice, 1(2), 169-182.
Stout, R. J., Salas, E., & Fowlkes, J. E. (1996). The efficacy of enhancing team performance in complex environments. In E. Salas & R. J. Stout (Co-Chairs), The science and practice of enhancing teamwork in organizations. Symposium conducted at the 104th annual meeting of the American Psychological Association, Toronto, Canada.
Stout, R. J., Salas, E., & Kraiger, K. (1997). The role of trainee knowledge structures in aviation team environments. The International Journal of Aviation Psychology, 7, 235-250.
Taggart, W. R. (1994). Crew resource management: Achieving enhanced flight. In N. Johnston, N. McDonald, & R. Fuller (Eds.), Aviation psychology in practice. England: Avebury.
United States General Accounting Office (1997). Human Factors: FAA's guidance and oversight of pilot crew resource management training can be improved. (GAO/RCED-98-7). Washington, DC: GAO Report to Congressional Requesters.
Vandermark, M. J. (1991). Should flight attendants be included in CRM training? A discussion of a major air carrier's approach to total crew training. The International Journal of Aviation Psychology, 1(1), 87-94.
Wiener, E. L., Kanki, B. G., & Helmreich, R. L. (Eds.) (1993). Cockpit resource management. CA: Academic Press.
Wilhelm, J. (1991). Crew member and instructor evaluations of line orientated flight training. In R. S. Jensen (Ed.), Proceedings of the 6th International Symposium on Aviation Psychology (pp. 362-367). OH: The Ohio State University.
Yamamori, H., & Mito, T. (1993). Keeping CRM is keeping the flight safe. In E. L. Wiener, B. G. Kanki, & R. L. Helmreich (Eds.), Cockpit resource management (pp. 399-420). CA: Academic Press.
Young, J. P. (1995). Using group dynamics to reinforce CRM concepts in a collegiate course. In R. S. Jensen & L. A. Rakovan (Eds.), Proceedings of the 8th International Symposium on Aviation Psychology (pp. 1189-1191). OH: The Ohio State University.
Table 1. Summary of CRM Evaluative Efforts

Source | Community | CRM training content | Type of Study/Data collection | Findings: Reactions | Learning | Behavior | Results/Organizational Impact
Quasiexperimental Self-report survey
Reaction Studies
Baker, Bauman, & Zalesny (1991) 41 CH-46 pilots - Pre-Flight Brief - Assertiveness - Review of means indicated that Pre-Flight Brief exercise was a worthwhile addition to ACT course and likely to have an impact on next briefing experience (approx 4.4 on 5 pt scale) - Able to cite specific ways they planned to use information gained in both exercises - Open ended questionnaire revealed favorable impressions of both exercises - 75% ranked role play assertiveness exercise as 1st or 2nd choice (n=4) - 90% of aviators agreed that tabletop system could be used for CRM skills training - Most felt good way of learning - Most agreed system demonstrated importance of CRM - Good program that should be reinstated (joint) - Good reactions, suggest extend to cabin crew (other)
Baker, Prince, Shrestha, Oser, & Salas (1993)
112 male military aviators
- No specification of skills taught - Acceptability of tabletop training system to augment CRM training
Chute & Wiener (1995)
Survey conducted at 2 airlines: 1 had joint CRM program for 1 yr (cockpit/cabin crew), 1 had CRM for pilots; 135 commercial airline pilots
- No specification of skills taught
Clark, Nielsen, & Wood (1991)
- No specification of exact skills taught (mention importance of stress management, communication, interpersonal skills)
Quasiexperimental Self-report survey
- Felt CRM enhanced information processing ability - Majority indicated that negative effects of aviation stress outweigh positive effects of CRM
Maschke, Goeters, Hormann, & Schiewe (1995) Cockpit crews; DLR/Lufthansa - Judgement/decision making - Communication - Leadership/teamwork
Quasiexperimental Self-report survey
- 80% thought course was useful or extremely useful (as per Likert scale)
Hormann, Goeters, Maschke, & Schiewe (1995)
750 participants in Lufthansa cockpit crews
- Judgment/ decision making - Communication - Leadership/teamwork
Schiewe (1995)
724 cockpit members
- Comm. - Judgement/DM - Teamwork
Vandermark (1991)
America West flight attendants and cockpit crew (n = approx. 1200); 42 junior-year flight students at Purdue University
- Those suggested by Hackman (1989) but no actual specification
Quasiexperimental Self-report survey Quasiexperimental
- Preliminary data indicate that 90% rated the course content as highly relevant - Preliminary data indicate that 84% thought the method of presentation was attractive - Units that were based on case studies or those that used role play in job-related scenarios had very positive reactions - Methods based mostly on lecture were not rated favorably - Favorable reactions (Likert scale)
Young (1995)
- Communication skills - Stress management - Leadership - Psychological factors - Team building - Crew coordination - Conflict resolutions - Situational awareness - Decision making/ problem solving
- Positive reaction; class considered worthwhile
Self-report survey
Learning Studies
Alkov & Gaynor (1991) 58 CRM training instructors -Two-week instructor training course -No specification of CRM skills taught - No specification of CRM skills taught Quasiexperimental Self-report survey/CMAQ Quasiexperimental Self-report - Positive shifts in attitudes were detected as a result of instructor training course
Chidester, Helmreich, Gregorich, & Geis (1991)
528 aviators from USAF military airlift command
- Training produced both positive and negative attitude change via type of personality
Gregorich (1993)
1191 participants major air carrier
- No specification of skills
survey/CMAQ Quasiexperimental Self-report survey
- Initial training produced significant positive attitude change - Initial increase followed by significant reductions in attitude levels between training cycles
Quasiexperimental Self-report survey/Revised CMAQ Quasiexperimental Self-report survey
Gregorich, Helmreich, & Wilhelm (1990)
National air carrier (696 participants)
- No specification of skills
- Positive attitude change (CMAQ) - Reduction in response variation for 2 scales of CMAQ after training - Results combined for all platforms indicate significant positive attitude change on all 8 behaviors
Grubb, Morey, & Simon (1999)
Air Force flight crews (n=2095) and mission crews (n=564)
- Air Force CRM behaviors (group dynamics, stress awareness, mission planning, risk mgmt behaviors, workload mgmt, comm., situation awareness, human performance behaviors) - No specification of skills taught
Helmreich, Merritt, & Wilhelm (1999)
Irwin (1991)
Pilots in several organizations surveyed just after completion of CRM course and several years later; Major U.S. air carrier
- Decay in attitudes not immediately apparent, but some decay over time
- No specification of exact content
Morey, Grubb, & Simon (1997)
188 fighter, 198 transport, and 77 bomber Air Force pilots undergoing CRM training
Air Force CRM training: 8 core CRM behaviors
- Significant positive attitude change (CMAQ) - Few examples of boomerang - Found attitudes decline over time - Recurrent training results in positive attitude change - Pre-training attitudes varied by pilot group (transport vs. fighter, bomber) - For all 3 groups pilot attitudes toward CRM significantly improved with training
Rollins (1995)
3 commercial airlines (2 US, 1 Canadian); 1 US military (n=508); Approximately 88 general aviation pilots
- No description of specific CRM skills trained - Training lasted 1-2 days
Quasiexperimental Self-report survey/CMAQ Quasiexperimental Self-report survey
- Improvement rate varied across groups with most positive post training attitudes being shown by transport, followed by bomber and fighter pilots - Measurable improvement in positive CRM attitude (CMAQ)
Simpson & Wiggins (1995)
Yamamori & Mito (1993)
2300 crew members of Japan airline
- Recognize and understand different interpersonal styles and effects on crew interaction in cockpit (inquiry, advocacy, conflict, problem def, critique, problem solving)
- Pilots who had previously completed a human factors course differed significantly in their attitudes from those who had not - Pilots who had a human factors course were more confident in their ability to cope with emergency situations and exhibited a relatively heightened level of self-awareness - Strengthens and crystallizes attitudes toward more effective CRM in cockpit
Behavior Studies
Arnold & Jackson (1985) 96 flight crews at United Airlines - Recurrent training (topic changes each year) - Decision making - No specification of CRM skills previously taught - Compared fixed and formed crews Quasiexperimental Rating Quasiexperimental Observation - No statistical analysis presented - On a 6 pt scale that evaluates overall CRM performance in LOFT most fall between 3-6 (most 4-5) - No significant differences in CRM behaviors between 2 crews - Average overall rating using LOS checklist was 3.35 (fixed) and 3.38 (formed) - Fixed crews committed more minor errors than formed crews (4.4 vs 2.6), while there were no significant differences in the number of major errors committed via crew formation - On average participants were rated as slightly above average in performing behaviors related to CRM (mean=3.56, 3.67 Scenario A and B)
Barker, Clothier, Woody, McKinney, & Brown (1996)
17 crews of active duty USAF
Brannick, Prince, Prince, & Salas (1995)
51 military air crews (Navy)
- No specification of skills trained - Skills measured (assertiveness, decision making/mission analysis, adaptability,
Quasiexperimental Observation
situation awareness, leadership, comm.) Clothier (1991) Major domestic airline (3,000 crews; 2000 untrained, 1000 trained); also looked at data across 5 airlines, same results Post-hoc - Significant improvement seen in LOFT exercises (485 trained outperformed 1625 untrained crews) - Crews flying on the line exhibited improved performance (significant improvement in 12 of 14 areas) - Crew interactions earned higher scores (LOS) as they progressed through initial training to recurrent training
Experimental Observation
Connolly & Blackwell (1987)
Aeronautical Science students at Embry Riddle University (16 exp., 13 control)
- Risk assessment - Decision making - Hazardous thought patterns
- Checklist scores and flight ratings indicated exp. group performed significantly better on post-test than control - Compared with control group, exp. group showed a significantly greater amount of change on both checklist scores and flight ratings after training
Helmreich, Wilhelm, Gregorich, & Chidester (1990)
Major airline; over 2000 line flights & LOFT sessions; for 859 crews LOFT involved in-flight emer.; Major airline
Quasiexperimental Observational ratings
Helmreich & Foushee (1993); Taggart (1994)
Quasiexperimental Observations
- Global rating of overall crew performance using LINE/LOFT worksheet behavior in both on line and in LOFT indicates positive changes in CRM performance - Wide variation in specific CRM behaviors used between fleets at same airline - Significant positive shifts in process behavior during line operations across a 3-year period.
Hansberger, Holt, & Boehm-Davis (1999)
19 I/E evaluating 2 above fleets; considering all pilots that these I/Es have evaluated over past 6 months
- Communication - Situation assessment - Planning/Decision Making
Quasiexperimental Instructor Ratings after the fact
- ACRM pilots assessed higher than non ACRM pilots on workload management, comm., and planning - Expected questions from behavioral observation form to factor into workload, comm., and situation awareness; situation awareness factored into planning instead - I/E has positive evaluations for ACRM training especially in area of communication - ACRM strong in facilitating establishment of bottom lines and back up plans, not so strong in reducing distractions or helping crew to be generally organized and composed - Communication ratio decreased for groups receiving training - Crews tended to maintain efficient comm. patterns through all 7 flights - Comm. frequencies did not change significantly after 45 days - Assessed performance differences, none - Team coordination behaviors positively related to mission performance - Crew coordination processes were differentially related to performance across missions - Quality of mission planning related to mission performance - Self report of mission performance and crew coordination positively related
Jentsch, Bowers, & Holmes (1995)
20 instrument rated pilots from Embry Riddle
- Mission analysis - Situation awareness
Experimental Observation
Nullmeyer & Spiker (under review); Spiker, Nullmeyer, Tourville, & Silverman (1998); Spiker, Silverman, Tourville, & Nullmeyer (1998); Silverman, Spiker, Tourville, & Nullmeyer (1997)
11 Air Force SOC MC-130P aircrews
- Mission planning/debrief - Task management - Situational awareness - Crew coordination - Communication - Risk management - Tactics employment
Quasiexperimental Instructor ratings, Self-report survey, Observation
Nullmeyer & Spiker (under review)
87 students in MC-130P
- Mission planning/debrief - Task management - Situation awareness - Crew coordination - Communication - Risk management - Tactics employment
Quasiexperimental Instructor comments translated into ratings
- Based on a rating system devised from instructor comments in grade folders, students were rated above average in mission preparation and crew coordination - Students rated below average in decision making and comm.
Predmore (1991)
United Flight 232 (comm. analysis)
- No specification of CRM skills taught at United
Post-hoc Case Study
- Efficient use of resources - Distribution of communication across multiple tasks and members - Maximum utilization of 4th crew member - Explicit prioritizing of tasks - Active involvement of Captain through entire process
Post-hoc
Results Studies
Diehl (1991) Summary paper - No specification of specific skills - Reduction in accident rates cited from 4 environments (USAF, US Navy helicopters, US Navy fighter-bomber, Petroleum Helicopter Inc) - Several near incidents saved by good CRM practices (cites incidents) - Overall aircrew mishap rate declined in all three communities
Kayten (1993)
NTSB reports
- No specification of skills taught within particular incidents
Post-hoc
Multi-Level Studies
Alkov (1991) 45 Naval aviation training squadrons (helicopters, attack bombers, multiplaced fighters) - Pilot judgment - Situation awareness - Decision making - Policy and regulations - Command authority - Workload performance - Use of available resources - Communication skills - Crew dynamics (leadership/followership) - Communication - Decision making - Stress management Quasiexperimental Self-report survey - Squadrons report benefit and wish to continue - Positive change in attitudes regarding CRM behaviors - Contributed to better communication between instructors and students
Byrnes & Black (1993)
Delta airlines
Quasiexperimental Self-report survey Quasiexperimental Self-report survey - 140 felt material applicable to their job
- Positive attitude change as measured by CMAQ - Stability of change over 5 year period
Geis (1987)
838 US Army pilots 163 completed both pre and post CMAQ (results based on these)
- Decision making - Communication - Resolving conflict - Assertiveness - Feedback/criticism - Pilot attitudes
- Paired t-tests indicated significant positive attitude change - At an item level most of the individual items of the CMAQ witnessed positive
- Flight attendants reported that cockpit crew treated them with more respect after CRM - Flight attendants reported that cockpit crew made them feel more a part of crew and were included in more pre-flight briefings -Effective in helping them to reduce potential for mishaps; consciously apply techniques taught - 135 felt had become safer pilots - Some were able to comment on specific instances where had used
- Quarterly air carrier discrepancy reports significantly decreased
Grau & Valot (1997)
Follow-up: 290 questionnaires mailed to random subjects who had participated from the original 838; 3 months later data analyzed based on 142 responses. Questionnaire sent to 312 crew members that had participated in CRM training (response of 172); French Air Force
- Situation awareness - Judgment - Problem solving - Workload mgmt - Stress management - Identification of resources - More.
change - Training had positive effect regardless of position in cockpit, level of experience, aircraft flown
principles learned (40 comments in all)
- Crew topics - Communication - Understanding the situation - Confidence/Doubt - Occupational stress - Fatigue - Human error
- 95% of trainees see themes as relevant
Halliday, Biegalski, & Inzana (1987)
349th Military Airlift Wing Approximately 250 crew members trained
- Problem solving - Decision-making
Quasiexperimental Self-report survey, Peer survey
- 90% response rate to survey; indications are that students developed a highly receptive attitude toward seminar format
- Argue that attitudes toward CRM are also improving
- 50% said they frequently changed their cockpit behavior with other crew members - 65% report that the way they analyze situations or their own behavior has changed - 65% reported a difference in the ways that crews operate after CRM training - 80% self reported changes during flight, 53% mentioned changes during preparation phase, 35% reported changes during debriefing phase - 10% report a large impact in terms of changing life at squadron level (51% felt some occasional change) - Of those not trained in CRM who had flown with a person trained in CRM, 75% indicated that CRM trained crew members exhibited recognizable behavioral changes - 80% of untrained individuals felt that they had observed better coordination and flightdeck atmosphere from crew members who had undergone training
Hayward & Alston (1991)
Pilots and spouses/ partners
- No specification of specific skills taught
- Most participants responded enthusiastically
Helmreich (1991)
Summary article Based on data collected through NASA/University of Texas Crew Research Project
- 8,000 surveys from 3 airlines indicate that LOFT is valued by crews as a training technique, considerable variability in quality of scenarios
- Report that workshops promoted increased awareness of human factors, crew performance, and safety implications - Report increased awareness of potential stressors and ways they could be dealt with - Without reinforcement, flightdeck management attitudes began to regress to pre-CRM levels - Cite normative cultural differences in attitudes
Helmreich & Wilhelm (1991)
2 major airlines; independently developed CRM courses
- No specification of exact content
developed - Overall pattern of evaluation was extremely positive - Lots of variability across seminars
toward CRM concepts - Significant positive change on all 3 scales of CMAQ
Quasiexperimental Observation
- Trained fleet performed significantly better overall (observer summary evaluation jump seat observations)
Ikomi, Boehm-Davis, Holt, & Incalcaterra (1999);
50 crews eastern US regional airline; 2 fleets (experimental/control fleet)
- Crews from ACRM group showed superior performance in 13 of 20 items (5 pt Likert scale jump seat observations)
Incalcaterra & Holt (1999)
600 active pilots in above airline (184 trained, 84 non ACRM group)
- 93% of pilots voted to expand use of ACRM to other fleets
- Pilots trained in ACRM showed positive attitudes toward CRM (CMAQ) - Positive attitudes toward ACRM (neutral on workload) - Correct knowledge of content and timing of ACRM procedures (2 yrs after training)
- 91% felt it had improved their flight performance
Holt, Boehm-Davis, & Hansberger (1999); Jackson (1983)
All pilots in ACRM fleet and control fleet
- Communication - Situation assessment - Planning/Decision Making - Inquiry - Advocacy - Conflict resolution - Critique - Decision making
Quasiexperimental Observation Quasiexperimental Self-report survey - Descriptive stats only - Participants tend to find experience quite rewarding (avg. response 6.8 on a 9 pt scale) - Over 95% of trained crew members felt CRM training has value to crew and can be applied to operation of their aircraft - LOFT program seems quite successful overall; most participants rate LOFT experience as very good - 41% shift in self-perceived style of interaction; crew members more aware of own interaction style
- Performance for 9 of 10 behaviors significantly higher for trained group (LOE)
- Trained fleet superior on 6 of 12 line check items
4059 participants at United Airlines
- Argues that improvement has been noted in nearly all of the targeted areas during each of four quarters of the first year (sounds like self-reported improvement) - Steady decline in the number of proficiency check failures since advent of LOFT at United
Lassiter, Vaughn, Smaltz,
Undergraduates from University of Central
- Communication - 3 conditions (control,
Experimental
- Attitude change via CMAQ non-significant
- Teams receiving skill based comm. training exhibited significantly
Morgan, & Salas (1990)
Florida (n=90)
knowledge based training, skill based training)
Leedom & Simon (1995)
32 US Army UH-60 aviators (battle-rostered crews)
- No specification of exact skills
Self-report survey, Observational ratings Quasiexperimental Self-report survey, Observations
better communication skills than other two conditions (behavioral observation rating scale) - Small positive shift of attitudes (AACQ) - Improved team communication patterns, more efficient mgmt. of crew resources, fewer team errors; team coordination improved (ACE checklist) - Mission performance ratings improved as did mission accomplishment (# completed as well as quality)
30 US Army AH-64 aviators (battle-rostered crews)
- No specification of exact skills
Quasiexperimental Self-report survey, Observations Quasiexperimental Self-report survey Quasiexperimental Self-report survey - No final conclusions as program is ongoing; present trends - No pilot has rated a study unit at less than good; more flight time = higher ratings
- No significant change (already had positive attitude toward CRM)
- Improvement across all 13 dimensions of team coordination - Higher flight proficiency ratings - Higher mission performance
Margerison, Davies, & McCann (1987)
Australian airline
- Decision making - Planning/priority setting - Delegation - Communication - Interpersonal Skills - No specification of skills taught
- Favorable reactions
- Provides evidence of some learning (aircrew members better understand interpersonal issues; discussions more productive) - Too early to answer if produces positive behavioral changes, but positive indications - The four pilots that have graduated at this point self report that their actions on the flight deck are changing as a result of program - Also have unsolicited comments from several still in program that their own actions are changing - 53% response rate on survey taken 6 months later indicated that 97% of flight crew members reported one or more positive behavior transfer
Mudge (1983)
25 pilots enrolled some with FAA, corporate pilots, individual pilots, airline executives testing program
Naef (1995)
Swissair line pilots and instructors
- Communication - Feedback - Decision making - Judgment - Functioning under pressure
Salas, Fowlkes, Stout, Milanovich, &
35 pilots & 34 enlisted aircrewmen from Navy transport
- Assertiveness - Communication - Situational awareness
Quasiexperimental
- Majority of pilots felt course presentation was above average (approx. 560 out of 640) - Majority of instructor pilots felt the course related to their needs and was worthwhile - 53% response rate 6 months later indicated that 70% of instructors would invest an off duty day for this type of training - Strong endorsement of training usefulness
- Trained group showed more positive attitudes towards use of CRM
- Overall trained teams performed better than untrained teams (TARGETS)
Prince (1999)
helicopter squadron
- Mission analysis
Self-report survey, Multiple choice, Observation
- Significant increases in positive attitudes via overall attitude scale (ACAQ) and CMAQ Communication and Coordination subscale - Trained group exhibited higher levels of knowledge regarding CRM principles - Strong endorsement of training - Positive support for CRM training - Reported that skills learned would be implemented for more effective prebriefs and debriefs - No significant change (restriction of range) (ACAQ) - Trained scored higher on knowledge test than untrained
- Correctly managed 15% more during prebrief and 9% more during higher workload segment
27 aviators from naval helicopter community (15 experimental, 12 control)
- Decision Making - Assertiveness - Mission analysis - Communication - Coordination - Leadership - Adaptability - Situational awareness
Quasiexperimental Self-report survey, Multiple choice, Observation
- Trained teams performed better during preflight brief (TARGETS) - No differences in low workload times (TARGETS) - Trained teams engaged in greater number of teamwork behaviors during high workload segment (TARGETS)
Quasiexperimental Self-report survey, Observation
Smith (1994)
10 undergraduate flight students
- Self analysis vs. traditional debrief to develop CRM skills
Stout, Salas, & Fowlkes (1996); Stout, Salas, & Fowlkes (1997)
42 student pilots in Navy Advanced Maritime Curriculum (20 experimental, 22 control)
22 aviators, helicopter community
- Communication - Assertiveness - Situation Awareness
Experimental Self-report survey, Multiple choice
Experimental Self-report survey, Observation
- Self analysis was not valued highly as a training technique - LOFT reported as being more helpful than self analysis technique - Positive reaction to active learning technique - Positive reaction to training
- Self analysis produced little change in attitudes
- Self analysis helped crews to perform significantly better in 3 LOFT sessions, moderately better in 3
Stout, Salas, & Kraiger (1997); Fowlkes, Lane, Salas, Franz, & Oser (1994); Fowlkes, Lane, Salas, Oser, & Prince (1992)
Wilhelm (1991) Subset of this also presented in
- Communication - Assertiveness
- Positive reaction to training (n=12)
- Examples provided of how to use new skills on the job - Trained group scored higher on knowledge test - Positive attitude change found (CMAQ) - Evidence of learning via changed knowledge structures (n=22) - Attitude change positive, but not significant (n=12) - Knowledge test did not show learning effects (suggest restriction of range)
- Trained teams performed better (TARGETs) as per both real-time instructor ratings and post-hoc ratings of video
- Trained participants performed an average of 8% more desired behaviors than control (TARGETS, n=12)
8300 crew members from 4 airlines
- Summary article; results obtained from NASA/UT/LOFT survey - No specification of skills
Quasiexperimental Self-report
- Crew members value LOFT as a training technique
Butler (1991; 1993)
across airlines
survey, Instructor evaluation (1 organization)
- Crews typically think they do a much better than average job during LOFT (self evaluation) - Found low to moderate agreement in performance ratings between self evaluation and instructor evaluation (one organization) - When asked about the exhibition of specific CRM behaviors in LOFT, large differences were found across organizations in CRM behavior patterns, although self reports all tended to be high - Significant, but small relationship between instructor ratings and self ratings of specific CRM behaviors - Over a 2-year period at one airline, CRM behaviors decreased, as did ratings of scenario and instructor quality in LOFT - Over a 3-year period at one airline, scenario, instructor, and LOFT delivery remained constant in perceived quality and self reports of CRM behavior steadily increased; overall this airline received lower ratings than others on several LOFT scales
Biographies

Eduardo Salas
Affiliation: Institute for Simulation and Training, University of Central Florida, Orlando, FL 32826 and University of Central Florida Psychology Department, Orlando, FL 32816
Degree/Institution: Ph.D. Industrial/Organizational Psychology, 1984, Old Dominion University, Norfolk, VA

C. Shawn Burke
Affiliation: Institute for Simulation and Training, University of Central Florida, Orlando, FL 32826
Degree/Institution: Ph.D. Industrial/Organizational Psychology, 2000, George Mason University, Fairfax, VA

Clint A. Bowers
Affiliation: Team Performance Laboratory, University of Central Florida, Orlando, FL 32826 and University of Central Florida Psychology Department, Orlando, FL 32816
Degree/Institution: Ph.D. Clinical and Community Psychology, 1987, University of South Florida, Tampa, FL

Katherine A. Wilson
Affiliation: Institute for Simulation and Training, University of Central Florida, Orlando, FL 32826
Degree/Institution: Graduate Student, Human Factors and Applied Experimental Doctoral Program, University of Central Florida, Orlando, FL

Table 1 Summary of CRM Evaluative Efforts

Results/ Organizational Impact

Baker, Prince, Shrestha, Oser, & Salas (1993)

112 male military aviators

Quasiexperimental Self-report survey

Chute & Wiener (1995)

- No specification of skills taught

Quasiexperimental Self-report survey

Clark, Nielsen, & Wood (1991)

- No specification of exact skills taught (mention

- Felt CRM enhanced information processing

experimental Self-report survey Quasiexperimental Self-report survey

Hormann, Goeters, Maschke, & Schiewe (1995)

750 participants in Lufthansa cockpit crews

- Judgment/ decision making - Communication - Leadership/teamwork

Quasiexperimental Self-report survey

724 cockpit members

- Comm. - Judgement/DM - Teamwork

Quasiexperimental Self-report survey

- Those suggested by Hackman (1989) but no actual specification

Quasiexperimental Self-report survey Quasiexperimental

- Positive reaction class worthwhile

Chidester, Helmreich, Gregorich, & Geis (1991)

528 aviators from USAF military airlift command

1191 participants major air carrier

survey/CMAQ Quasiexperimental Self-report survey

Gregorich, Helmreich, & Wilhelm (1990)

National air carrier (696 participants)

Grubb, Morey, & Simon (1999)

Air Force flight crews (n=2095) and mission crews (n=564)

Helmreich, Merritt, & Wilhelm (1999)

Quasiexperimental Self-report survey

- No specification of exact content

Quasiexperimental Self-report survey

Morey, Grubb, & Simon (1997)

Air Force CRM training: 8 core CRM behaviors

Quasiexperimental Self-report survey

- No description of specific CRM skills trained - Training lasted 1-2 days

Quasiexperimental Self-report survey/CMAQ Quasiexperimental Self-report survey

Simpson & Wiggins (1995)

- No specification of skills taught

Yamamori & Mito (1993)

2300 crew members of Japan airline

Quasiexperimental Self-report survey

Barker, Clothier, Woody, McKinney, & Brown (1996)

17 crews of active duty USAF

Brannick, Prince, Prince, & Salas (1995)

51 military air crews (Navy)

Connolly & Blackwell (1987)

Aeronautical Science students at Embry-Riddle University (16 exp., 13 control)

- Risk assessment - Decision making - Hazardous thought patterns

Helmreich, Wilhelm, Gregorich, & Chidester (1990)

- No specification of skills taught

Quasiexperimental Observational ratings

Helmreich & Foushee (1993); Taggart (1994)

- No specification of skills taught

Hansberger, Holt, & Boehm-Davis (1999)