
Evaluation, Vol. 15(3): 307-319. Copyright © 2009 SAGE Publications (Los Angeles, London, New Delhi, Singapore and Washington DC). DOI: 10.1177/1356389009105883

How Legitimate and Justified are Judgments in Program Evaluation?


MARTHE HURTEAU, SYLVAIN HOULE AND STÉPHANIE MONGIAT
Université du Québec à Montréal, Canada

The main function of program evaluations is to describe programs in order to generate judgments of value. To be considered credible, judgments should be both legitimate and justified. The research presented in this article posed the following question: do program evaluation practitioners generate legitimate and justified judgments? A meta-analysis of 40 program evaluation reports was carried out, which found that only 50 percent of the reports generated judgments. While these judgments seemed legitimate, they were rarely justified. However, the elements required to support legitimate and justified judgments were present in the reports in similar proportions, whether the reports generated a judgment or not.

KEYWORDS: judgment; judgment in program evaluation; legitimate; justified

Introduction
The main purpose of a program evaluation is to determine the quality of a program by formulating a judgment (Stake and Schwandt, 2006). During the past 30 years, it has been the subject of many conceptual and methodological developments but still faces the fundamental issue addressed by Davidson (2005: 28): 'Unlike medicine, evaluation is not a discipline that has been developed by practicing professionals over thousands of years, so we are not yet at the stage where we have huge encyclopaedias that will walk us through any evaluation step-by-step.' This underlines why program evaluation is periodically called into question as an original process whose primary function is the production of legitimate and justified judgments, which serve as the bases for relevant recommendations. Guba (1972), and more recently Scriven (1995), attribute such questioning to the difficulty practitioners have not only in determining evaluands (i.e. the subject of an evaluation), but also in developing the criteria required to generate a judgment.


A number of converging studies provide support for this position. The Treasury Board of Canada (2004) sponsored a meta-analysis based on 130 evaluation reports produced for different government departments; the analysis concludes that 50 percent of these reports lacked credibility as the results were not based on information relevant to the aim of the evaluation, and, in 32 percent of the cases, the supporting arguments were not sufficient to generate a judgment. Further, Toulemonde (2005) maintains that managers frequently do not base their decisions on available program evaluations because most such evaluations do not provide the required information. Datta (2006: 434) states that: 'from an often huge body of relevant evaluations and reports, only about 10% of these reports, if that much, tend to survive reasonable screening for trustworthiness and evaluation quality. The costs are too high to ignore this.' Fournier and Smith (1993: 322) comment that: 'when evaluation findings are challenged or utilization has failed, it was because stakeholders and clients found the inferences weak or the warrants unconvincing.' House (1980: 89) adds: 'unless an evaluation provides an explanation for a particular audience, and enhances the understanding of that audience by the content and form of the arguments it presents, it is not an adequate evaluation for that audience, even though the facts on which it is based are verifiable by other procedures.' For Scriven (1990), judgments constitute the Achilles' heel of the whole evaluation effort. Such discourse focuses on the quality (i.e. legitimacy and justification) of the judgment being rendered. However, while this problem has been the subject of much discussion, it has seldom been well described empirically: do program evaluation practitioners generate legitimate and justified judgments?

Key Concepts
In order to answer the research question, one must understand the concepts underlying the term judgment, or more specifically those relating to legitimate and justified judgments.

Legitimate Judgment
The specific sphere of program evaluation, established by House and Howe (1999) and Schwandt (2002), is described by Schwandt (2008: 33) as follows:
Like all professional practices, evaluation has its own unique kind of practical knowledge, comprising the tact, dispositions, and considered character of decision making called for in various situations faced in doing the practice. The very act of coming to an interpretation of the value of a social or educational program or policy is a practical matter, involving the exercise of judgment.

Legitimate judgment takes the form of a statement, appraisal, or opinion concerning the merit, worth, or significance of a program, and is formed by comparing the findings and interpretations regarding the program against one or more selected standards of performance (Wheeler et al., 1992). For Scriven (1991a: 203): 'The most important fact about judgment is not that it isn't as objective as measurement (true) but that one can distinguish good judgment from bad judgment (and train good judges).'

In order to fully understand the concept of judgment, one must be introduced to the concept of the logic of evaluation, first described by Scriven (1980), and considered as a meta-theory in the field of program evaluation by Shadish et al. (1991). It offers a theoretical base for the definition of program evaluation, and more specifically for that of judgment. The logical process includes the following four basic operations:

- selecting criteria of merit: identifying elements or components that influence the performance of the object being studied (the evaluand);
- setting standards of performance based on those criteria, which, in turn, become the anticipated level of performance;
- gathering data pertaining to the performance of the evaluand in terms of the established standards (i.e. analysis, which identifies the extent to which performance responds to the standards); and
- integrating the results into a final value judgment (i.e. synthesis).

More recently, Scriven (1991b) specifies that the evaluand serves a critical function in his theory in orienting the evaluation question. Further, Scriven (1990) conceptualizes an evaluation 'double pyramid' encompassing dual processes. The first of these, an analysis process, consists in assessing the merit of the product by identifying its object, the general dimensions (i.e. criteria) and indicators required to describe it, as well as benchmarks or other data relative to each of the dimensions (i.e. standards). The second, a synthesis process, consists in inferring conclusions about each indicator from the performance data, then about each dimension of merit from the indicators, and finally moving from these inferences to judgment, a conclusion about overall merit.

Several authors (Fournier, 1995; Hurteau et al., 2006; Stake, 2004) have pursued Scriven's (1980) original idea, developing a more functional framework. The general and working logic of Fournier (1995) follows Scriven's four operations, referring to the evaluation's global strategic process, as well as considering the evaluation's context and, as such, renders the evaluation process operational. Stake (2004) provides insight into the underlying thought processes involved in initiating an evaluation by means of his concepts of critical thinking (i.e. standards-based evaluation) and responsive thinking. For this author, critical thinking also refers to the elements of the logic of evaluation (i.e. criteria and standards), and responsive thinking refers, like working logic, to the operational process required to conduct an evaluation. More recently, Hurteau et al. (2006) developed and validated a model of the act of evaluating, integrating Scriven's logic and offering a more operational process much like Fournier's logic.

In summary, for these authors, the following characteristics of the judgment should be present in order for it to be considered legitimate: (a) it refers to the evaluation questions/goals; (b) it is supported by criteria; and (c) it is supported by a standard. As we will see, these characteristics are also recognized in the work of Toulmin (1964) and Arens (2005, 2006).

Justified Judgment
While the criteria so far discussed allow for the establishment of a judgment's legitimacy, Fournier and Smith (1993) and House (1980) also refer to the importance of a judgment being justified (i.e. linked to the evidence gathered and consistent with the agreed-upon values or standards of stakeholders) (Joint Committee on Standards for Educational Evaluation, 1994). In other words, justification refers to the argumentation supporting a judgment. In order to better understand this concept of justification or argumentation, the notion of evaluative reasoning should be taken into consideration; this type of reasoning promotes judgments based on valid claims considered defensible, accurate, warranted, acceptable, and justified (Fournier and Smith, 1993; Habermas, 1979; McCarthy, 1973; Redding, 1989; Scriven, 1995; Taylor, 1961; Toulmin, 1964). A model for such reasoning has been developed by Arens (2005, 2006) based on the works of Toulmin (1964), Toulmin et al. (1984), and Fournier (1995). The model incorporates two operations, the evaluative judgment and its argumentation, which are simultaneously distinct and interrelated. The first operation consists in gathering relevant information in order to formulate a legitimate judgment (i.e. going beyond the evidence). The second operation links the evidence by argumentation that supports and justifies the judgment (Fournier, 1995; Trelogan, 2001; Valovirta, 2002).

The judgment process is illustrated in Figure 1, which depicts the production of a legitimate judgment involving a succession of relations and syntheses starting with data, empirical evidence, and claims (i.e. assertions usually referring to criteria and standards) (Merriam-Webster, 2004; Schwandt, 2002), and concluding with judgment. Scriven's (1995) position is that evaluation should go beyond evidence and claims in order to culminate in a sort of conclusion or synthesis (i.e. justified judgment) and should lay the groundwork for argumentation, the second operation proper to the model (Fournier, 1995; Trelogan, 2001; Valovirta, 2002). The procedural logic underlying the argumentation of a judgment is illustrated by Figure 2. In Figure 2, warrants are seen as assumptions or premises that confirm the logical path between claims and evidence, and legitimate inferences (for example, 'the program reaches 75 percent of its goals'). Backings are further support for the warrants if they are called into question; these take the form of principles, laws, or formulae (Toulmin, 1964), and are based on experience, authority, a belief system, or general knowledge (Booth et al., 2003). Finally, qualifiers are statements that provide a context for the warrants or produce leverage for the warrants, and so in turn impact on the judgment (for example, 'when', 'with the exception', etc.) (Arens, 2005, 2006).
Figure 1. Procedural Logic in the Development of a Judgment (elements shown: evaluation question(s) or goal; data and empirical evidence; claims; judgment)

Figure 2. Procedural Logic Underlying the Argumentation of a Judgment (elements shown: claims; warrants ('since'); backings ('on account of'); qualifiers ('unless'); legitimate judgment)
How, then, can one establish that a judgment is justified? Toulmin (1964) and Arens (2005, 2006) propose the following conditions: (a) justification of the criteria (i.e. warrants, backings); (b) justification of the standards (i.e. qualifiers); and (c) documentation of the procedure used to synthesize the information into a judgment. Because multiple standards can be applied to any given program, stakeholders might reach different or even conflicting judgments. Conflicting claims regarding a program's quality, value, or importance often serve as strong indicators that stakeholders are using different standards for judgment. Nevertheless, in the context of an evaluation, such disagreement can act as a catalyst for clarifying relevant values and for negotiating the appropriate base on which the program should be judged (Wheeler et al., 1992). Panoptic argumentation takes into consideration the points of view of the various stakeholders, permitting a judgment that is both justified and balanced (House, 1995; Patton, 1997; Scriven, 1993). However, while theorists of program evaluation recognize the centrality of argumentation, such a view remains scantily documented and rather poorly developed (Fournier and Smith, 1993; House, 1995; Trelogan, 2001). Within this context, House's (1995: 93) comment that evaluations are 'the best judgments we can arrive at in the situation' seems particularly apropos.

In summary, evaluative practice must take into consideration a procedural logic during the process of evaluation in order to produce a legitimate and justified judgment (Rog, 1995). That the literature does not clearly document how a judgment can be considered balanced is troublesome, because, as early as 1972 (Guba) and as recently as 1995 (Scriven), reflections and dialogue on the failure of program evaluation have included balanced judgment in the discourse.
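To make these two sets of conditions concrete, the following minimal sketch (not drawn from the article; the class and function names are invented for illustration, in Python) encodes the Toulmin/Arens elements and the three-part checks for legitimacy and justification described above.

    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class Argumentation:
        """Toulmin-style support for an evaluative judgment (illustrative labels only)."""
        claims: List[str]                                     # assertions referring to criteria and standards
        warrants: List[str] = field(default_factory=list)    # premises linking evidence to claims
        backings: List[str] = field(default_factory=list)    # support for warrants: principles, laws, formulae
        qualifiers: List[str] = field(default_factory=list)  # context or limits ('when', 'unless', ...)
        synthesis_procedure: Optional[str] = None             # how findings were combined into the judgment

    def is_legitimate(refers_to_question: bool, has_criteria: bool, has_standards: bool) -> bool:
        # The three characteristics of a legitimate judgment named above.
        return refers_to_question and has_criteria and has_standards

    def is_justified(arg: Argumentation) -> bool:
        # The three conditions of a justified judgment: justified criteria (warrants/backings),
        # justified standards (qualifiers), and a documented synthesis procedure.
        return bool(arg.warrants or arg.backings) and bool(arg.qualifiers) and arg.synthesis_procedure is not None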

Research Question and Objectives


In response to the situation within the field of program evaluation and the current state of knowledge in this area, the research question was formulated as follows: do program evaluation practitioners generate legitimate and justified judgments? Meeting the following three objectives would allow the question to be answered:

1) establishing if practitioners generate any kind of judgment;
2) determining if these judgments are legitimate, i.e. (a) they refer to the evaluation questions/goals; (b) they are supported by criteria (warrants, backings, and qualifiers); and (c) they are supported by a standard (Toulmin, 1964; Scriven, 1980, 1990; Arens, 2005, 2006); and
3) establishing if these judgments can be considered justified, i.e. there is: (a) justification of the criteria (i.e. warrants, backings); (b) justification of the standards (i.e. qualifiers); and (c) documentation of the procedure used to synthesize the information into a judgment (Toulmin, 1964; Arens, 2005, 2006).

Methodology
In order to answer the question, a meta-analysis of program evaluation reports was undertaken, providing a valid description of theoretical attributes and identifying meaningful relationships (Krippendorf, 2004; Neuendorf, 2002).

Sampling
Evaluation reports (available, public, and referring to various fields of application) were chosen as a data source utilizing the ERIC database (Educational Resources Information Centre). While the number of sources was limited, constituting a possible bias, the selection of the sample was carried out rigorously according to the two following criteria: 1) reports that were produced between 2000 and 2006 inclusive; and 2) reports that included the full process of evaluation (reports including only the planning of an evaluation were not retained). Out of the 80 initial articles meeting the first criterion, 22 were excluded because they did not meet the second. Of the remaining 58, 40 reports were randomly selected for the purpose of the meta-analysis, and the other 18 were retained in order to pre-test the grid; of these, 17 were used to establish inter-judge reliability.

Variables
The variables are: the presence or not of a judgment; the elements required to produce a legitimate judgment (i.e. presence of a question or a goal, presence of the criteria, and presence of the standards); and the elements required to produce a justified judgment (i.e. justification of the criteria, justification of the standards, and documentation of the procedures used to synthesize the information into a judgment, a methodological consideration).

Data Collection Instrument


A grid was developed, composed of the variables on one axis and, on the other axis, whether the corresponding information was present or not in the report. The grid also included the definition of each variable. It was validated by three university professors recognized for their expertise in the field. Coders were not only trained, but also established an inter-judge reliability of at least .80 before starting to code the actual data.
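The article reports an inter-judge reliability of at least .80 but does not specify the statistic used; assuming simple percentage agreement over binary present/absent codes, a minimal sketch (in Python, with hypothetical codes) would be:

    from typing import List

    def percent_agreement(coder_a: List[int], coder_b: List[int]) -> float:
        """Proportion of coding decisions (1 = information present, 0 = absent) on which two coders agree."""
        assert len(coder_a) == len(coder_b)
        matches = sum(1 for a, b in zip(coder_a, coder_b) if a == b)
        return matches / len(coder_a)

    # Hypothetical codes for the six elements across three pre-test reports (18 decisions).
    coder_a = [1, 1, 0, 1, 0, 0,  1, 1, 1, 1, 1, 0,  0, 1, 0, 1, 0, 0]
    coder_b = [1, 1, 0, 1, 1, 0,  1, 1, 1, 1, 1, 0,  0, 1, 0, 0, 0, 0]
    print(percent_agreement(coder_a, coder_b))  # 0.89 here; coding began once agreement reached .80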

Results
The results allowed us to attain the objectives of the current research, that is: 1) to establish if practitioners generate any kind of judgment; 2) to determine if these judgments are legitimate; and 3) to establish if these judgments can be considered justified. From these results emerges a response to the initial research question: do program evaluation practitioners generate legitimate and justified judgments?

As shown in Table 1, out of the 40 reports analysed, only 20 (50 percent) generated some kind of judgment, meaning that they went beyond evidence in presenting a conclusion or affirmation. Concerning the elements of legitimate judgments, out of these 20 reports, 14 (70 percent) stated the question that initiated the evaluation process or its goal; 20 (100 percent) presented the criteria and 13 (65 percent) presented standards. This lower percentage of reports incorporating standards is not surprising; such a lack has already been noted in the literature (Arens, 2005). As for the elements supporting a justified judgment, out of the 20 reports, 19 (95 percent) presented a justification of their criteria; 13 (65 percent) a justification of their standards; and no report offered information on the procedures used to synthesize the information into a judgment.
Table 1. Legitimate and Justified Judgments According to Frequencies and Characteristics

Elements and scores                                                         Reports
Judgment generated from the evaluation process                              20/40 (50%)

Out of the 20 reports, the judgment is considered legitimate:
  Presence of a question or a goal                                          14 (70%)
  Presence of the criteria                                                  20 (100%)
  Presence of the standards                                                 13 (65%)
  Scores: 0 = 0/3 elements                                                  0 (0%)
          1 = 1/3 elements                                                  1 (5%)
          2 = 2/3 elements                                                  11 (55%)
          3 = 3/3 elements                                                  8 (40%)

Out of the 20 reports, the judgment is considered justified:
  Justification of the criteria                                             19 (95%)
  Justification of the standards of performance                             13 (65%)
  Documentation of the procedures used to synthesize the
    information into a judgment                                             0 (0%)
  Scores: 0 = 0/3 elements                                                  1 (5%)
          1 = 1/3 elements                                                  6 (30%)
          2 = 2/3 elements                                                  13 (65%)
          3 = 3/3 elements                                                  0 (0%)


In analysing these results, we created (non-parametric) scores ranging from 0 to 3, representing the number of elements found in each report (0 = none; 1 = one of the three elements; 2 = two of the three elements; 3 = three of the three elements), for both legitimate and justified judgments. We were unable to apply ROC curve analysis to establish a cut-off point, as it requires that the two groups (in our case, reports producing a judgment and reports not producing a judgment) be distinct, which in this analysis is not the case. We therefore set a theoretical cut-off point of two out of three elements in each case (legitimate and justified). As shown in Table 1, 95 percent of the reports have at least two of the three required elements of a legitimate judgment (incorporation of explicit standards being more problematic). As for the elements of the justified judgment, 65 percent of the reports contain two of the three elements. In summary, it may then be said that, when present, the judgments in the reports are most often legitimate but less often justified.
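The scoring can be illustrated with a short sketch (not part of the article; the report coding shown is hypothetical), which counts the elements present in a report and applies the two-out-of-three cut-off:

    from typing import Dict

    CUT_OFF = 2  # theoretical cut-off of two out of three elements

    def score(elements: Dict[str, bool]) -> int:
        """Non-parametric score from 0 to 3: the number of elements present in a report."""
        return sum(elements.values())

    # Hypothetical coding of one report.
    legitimate_elements = {'question_or_goal': True, 'criteria': True, 'standards': False}
    justified_elements = {'criteria_justified': True, 'standards_justified': False, 'synthesis_documented': False}

    print(score(legitimate_elements) >= CUT_OFF)  # True: 2/3 elements present
    print(score(justified_elements) >= CUT_OFF)   # False: 1/3 elements present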

Comparison between Reports Generating a Judgment and Not Generating a Judgment


As shown in Table 2, the almost identical distributions of the elements required for a legitimate and justified judgment appear somewhat troublesome. This observation led us to compare the patterns of two groups of reports (those generating a judgment and those not generating a judgment). Indeed, no significant differences were found through a chi-square analysis for each element required to generate a legitimate judgment (presence of a question or a goal: χ² = 1.29, df = 1, p = .256; presence of the criteria: χ² = .58, df = 1, p = .75; presence of the standards of performance: χ² = 3.87, df = 1, p = .78). Similar results are observed concerning the elements required to generate a justified judgment: no significant differences were found through chi-square analysis for the first two variables (justification of the criteria: χ² = .51, df = 1, p = .48; justification of the standards: χ² = .03, df = 1, p = .86), and the third variable produced a frequency equal to 0 in each situation.
Table 2. Elements Related to a Legitimate and Justified Judgment

                                                    Judgment       Judgment not    Total
                                                    produced       produced
                                                    (n = 20)       (n = 20)        (n = 40)
Elements concerning a legitimate judgment
  Presence of a question or a goal                  14 (70%)       17 (85%)        31 (78%)
  Presence of the criteria                          20 (100%)      20 (100%)       40 (100%)
  Presence of the standards of performance          13 (65%)       10 (50%)        23 (58%)
Elements concerning a justified judgment
  Justification of the criteria                     19 (95%)       18 (90%)        37 (93%)
  Justification of the standards of performance     13 (65%)       10 (50%)        23 (58%)
  Documentation of the procedures used to
    synthesize the information into a judgment
    (methodological considerations)                 0 (0%)         0 (0%)          0 (0%)
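As an illustration of the element-by-element comparison reported above, the short sketch below (not part of the article) runs a chi-square test on the first row of Table 2 with scipy; disabling Yates' continuity correction is our assumption, made because it reproduces the reported value of 1.29.

    from scipy.stats import chi2_contingency

    # 2 x 2 table for 'presence of a question or a goal' (Table 2):
    # rows = judgment produced / not produced; columns = element present / absent.
    table = [[14, 6],
             [17, 3]]

    chi2, p, df, expected = chi2_contingency(table, correction=False)
    print(round(chi2, 2), df, round(p, 3))  # 1.29, 1, 0.256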



Table 3. Scores Related to a Legitimate and Justified Judgment

                                          Judgment       Judgment not    Total
                                          produced       produced
                                          (n = 20)       (n = 20)        (n = 40)
Scores concerning a legitimate judgment
  0 = 0/3 elements                        0 (0%)         0 (0%)          0 (0%)
  1 = 1/3 elements                        1 (5%)         2 (10%)         3 (7.5%)
  2 = 2/3 elements                        11 (55%)       9 (45%)         20 (50%)
  3 = 3/3 elements                        8 (40%)        9 (45%)         17 (42.5%)
Scores concerning a justified judgment
  0 = 0/3 elements                        1 (5%)         2 (10%)         3 (7.5%)
  1 = 1/3 elements                        6 (30%)        8 (40%)         14 (35%)
  2 = 2/3 elements                        13 (65%)       10 (50%)        23 (57.5%)
  3 = 3/3 elements                        0 (0%)         0 (0%)          0 (0%)

As shown in Table 3, we then compared the scores in each situation (judgment produced, or not produced). No significant differences were found through the chi-square analysis of the scores related to a legitimate judgment (χ² = .592, df = 2, p = .744) or to a justified judgment (χ² = 1.010, df = 2, p = .603). Finally, we searched for a significant pattern in the application of the six elements in the production of a judgment. The reports generating a judgment use a mean of 3.95 elements (S.D. = 1.05), compared to 3.75 elements (S.D. = 1.29) for those which do not produce a judgment. The result of the Mann-Whitney test (preferred over the t-test because the distribution of the current data set is not normal) confirms that there is no significant difference between the two groups (z = .372, p = .738).

In summary, the present results lead us to postulate that judgments in program evaluation mostly rest on the elements required for a legitimate judgment, while the elements required to justify a judgment are used less frequently. These elements are present in similar proportions even when they are not used to produce a judgment.
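For readers wishing to retrace these analyses, the sketch below (not part of the article) reruns the chi-square comparison on the score distributions of Table 3 (the empty score categories are dropped so that expected frequencies are non-zero) and shows the form of the Mann-Whitney test; since the per-report counts of the six elements are not published, the two count vectors are placeholders only.

    from scipy.stats import chi2_contingency, mannwhitneyu

    # Score distributions from Table 3; rows = judgment produced / not produced.
    legitimate_scores = [[1, 11, 8],   # scores 1, 2, 3
                         [2, 9, 9]]
    justified_scores = [[1, 6, 13],    # scores 0, 1, 2
                        [2, 8, 10]]

    for label, table in (('legitimate', legitimate_scores), ('justified', justified_scores)):
        chi2, p, df, _ = chi2_contingency(table)
        print(label, round(chi2, 3), df, round(p, 3))  # 0.592, df 2, p .744 and 1.010, df 2, p .603

    # Placeholder (hypothetical) counts of the six elements per report, for illustration only.
    elements_judgment = [4, 5, 3, 4, 4]
    elements_no_judgment = [3, 5, 2, 4, 5]
    u_stat, p_value = mannwhitneyu(elements_judgment, elements_no_judgment, alternative='two-sided')
    print(u_stat, p_value)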

Strengths and Limitations of the Research


The topic of this research is important and contributes to a better understanding of the evaluation community's use of the basis of evaluative judgment, as well as of the occurrence and non-occurrence of judgments in program evaluation reports. The choice of the meta-analysis lends credence to this understanding as it goes beyond the possible bias of practitioners' impressions. Furthermore, the study respects the steps proposed by Rourke and Anderson (2004) as required to produce valid data. However, sampling relied on only one source, and the size (less than 100) of the sample retained restricted both the type of statistical analysis utilized (for example, having to use non-parametric rather than parametric tests) and the scope of the conclusions. Also, the limited source of the information (one database: ERIC) could lead to bias, as it often does not include governmental reports that may be of a high standard but difficult to obtain.

Discussion
Do program evaluation practitioners generate legitimate and justified judgments? The present study indicates that only 50 percent of the reports analysed generated a form of judgment. The limitations concerning the sample encourage caution with regard to the conclusions, in that it is difficult to establish to what extent the results do indeed reflect evaluation practice in general.

The results probably raise more questions than they answer. Could it be that practitioners do not establish a distinction between an inquiry that merely describes and a program evaluation that describes in order to generate a judgment or an evaluative conclusion? It could also mean that practitioners do not accept that a program evaluation should necessarily generate a judgment. As Shadish et al. (1991) state, most theorists agree with the first three operations of Scriven's logic of evaluation but not with the final one concerning judgment. Could it be that the clients of the programs evaluated would not want judgments presented in the written report because such judgments would not serve their purposes?

The pattern concerning the three elements constituting the foundation of a justified judgment (justification of the criteria and the standards, and information on the procedures used to synthesize the information) is more troublesome, in that it does not find any explanation in the present context. The global impression emerging from the present analysis is that practitioners mostly produce the required information but that it is not used in a comprehensive manner; this observation holds in particular for justified judgment. Could it be that the construction of a judgment rests on other theoretically grounded aspects? This would confirm the conclusion, reached after interviews with highly experienced practitioners, that Scriven's vision of program evaluation logic (the logic of evaluation and the double pyramid) does not always fit practice (Stake et al., 1997).

In order to establish if these elements remain focal, these results lead us to suggest that the next research step should be an inquiry among a large number of practitioners to better understand their operations and their intentions. And if these elements are not central to program evaluation, it is crucial to determine what elements do delineate a legitimate and justified judgment. Such knowledge is important in order to maintain and further develop practice and training in the field.

Conclusion
After identifying and documenting the elements required to produce a legitimate and justified judgment, a meta-analysis was completed on 40 program evaluation reports. The results tend to show that, while program evaluation practitioners do not systematically generate judgments, the judgments they do generate are usually legitimate, though rarely justified. They also tend to show that the internal logic of the development in their reports is not always obvious. Such results support Guba's (1972) and Scriven's (1995) observations that program evaluation periodically needs to be called into question as an original process with the primary function of producing legitimate and justified judgments.

References
Arens, S. A. (2005) 'Understanding Practice through Exemplary Evaluation Cases', paper presented at the 2005 Joint Conference of CES/AEA, Toronto.
Arens, S. A. (2006) 'L'étude du raisonnement dans les pratiques évaluatives', Mesure et évaluation en éducation 29(3): 45-56.
Booth, W., G. Colomb and J. Williams (2003) The Craft of Research. Chicago: University of Chicago Press.
Datta, L. E. (2006) 'The Practice of Evaluation: Challenges and New Directions', in I. F. Shaw, J. C. Greene and M. M. Mark (eds) The Sage Handbook of Evaluation, pp. 419-38. Thousand Oaks, CA: SAGE.
Davidson, E. J. (2005) Evaluation Methodology Basics: The Nuts and Bolts of Sound Evaluation. Thousand Oaks, CA: SAGE.
Fournier, D. M. (1995) 'Establishing Evaluative Conclusions: A Distinction between General and Working Logic', in D. M. Fournier (ed.) Reasoning in Evaluation: Inferential Links and Leaps, pp. 15-31. New Directions for Evaluation, 68. San Francisco: Jossey-Bass.
Fournier, D. M. and N. L. Smith (1993) 'Clarifying the Merits of Argument in Evaluation Practice', Evaluation and Program Planning 16(4): 315-23.
Guba, E. G. (1972) 'The Failure of Educational Evaluation', in C. H. Weiss (ed.) Evaluating Action Programs: Readings in Social Action and Education, pp. 251-66. Boston, MA: Allyn & Bacon.
Habermas, J. (1979) Communication and the Evolution of Society. Boston, MA: Beacon Press.
House, E. R. (1980) 'Logic of Evaluative Argument', in E. R. House (ed.) Evaluating with Validity, pp. 67-96. Beverly Hills, CA: SAGE.
House, E. R. (1995) 'Putting Things Together Coherently: Logic and Justice', in D. M. Fournier (ed.) Reasoning in Evaluation: Inferential Links and Leaps, pp. 33-48. New Directions for Evaluation, 68. San Francisco: Jossey-Bass.
House, E. R. and K. R. Howe (1999) Values in Evaluation and Social Research. Thousand Oaks, CA: SAGE.
Hurteau, M. and S. Houle (2005) 'Identifying a Core Body of Knowledge for Evaluators', paper presented at the 2005 Joint Conference of CES/AEA, Toronto.
Hurteau, M., G. Lachapelle and S. Houle (2006) 'Comprendre les pratiques évaluatives afin de les améliorer: la modélisation du processus spécifique à l'évaluation de programme', Mesure et évaluation en éducation 29(3): 27-44.
Joint Committee on Standards for Educational Evaluation (1994) Program Evaluation Standards: How to Assess Evaluations of Educational Programs. Thousand Oaks, CA: SAGE.
Krippendorf, K. (2004) Content Analysis: An Introduction to its Methodology. Thousand Oaks, CA: SAGE.
McCarthy, T. A. (1973) 'A Theory of Communicative Competence', Philosophy of the Social Sciences 3: 135-56.

Merriam-Webster Online Dictionary (2004) Available at: http://www.MerriamWebsterCollegiate.com
Neuendorf, K. A. (2002) The Content Analysis Handbook. Thousand Oaks, CA: SAGE.
Patton, M. (1997) Utilization-Focused Evaluation: The New Century Text. Thousand Oaks, CA: SAGE.
Redding, P. (1989) 'Habermas' Theory of Argumentation', Journal of Value Inquiry 23: 15-32.
Rog, D. J. (1995) 'Reasoning in Evaluation: Challenges for the Practitioner', in D. M. Fournier (ed.) Reasoning in Evaluation: Inferential Links and Leaps, pp. 93-100. New Directions for Evaluation, 68. San Francisco: Jossey-Bass.
Rourke, L. and T. Anderson (2004) 'Validity in Quantitative Content Analysis', Educational Technology Research and Development 52(1): 5-17.
Schwandt, T. A. (2002) Evaluation Practice Reconsidered. New York: Peter Lang.
Schwandt, T. A. (2008) 'The Relevance of Practical Knowledge Traditions to Evaluation Practice', in N. L. Smith and P. R. Brandon (eds) Fundamental Issues in Evaluation, pp. 29-40. New York: Guilford Press.
Scriven, M. (1980) The Logic of Evaluation. Inverness, CA: Edgepress.
Scriven, M. (1990) 'The Evaluation of Hardware and Software', Studies in Educational Evaluation 16: 3-40.
Scriven, M. (1991a) Evaluation Thesaurus. Newbury Park, CA: SAGE.
Scriven, M. (1991b) 'The Science of Valuing', in W. E. Shadish, T. D. Cook and L. C. Leviton (eds) Foundations of Program Evaluation: Theories of Practice, pp. 73-118. Thousand Oaks, CA: SAGE.
Scriven, M. (1993) 'Evaluation and Critical Reasoning: Logic's Last Frontier', in R. Talaska (ed.) Critical Reasoning in Contemporary Culture, pp. 353-406. Albany, NY: State University of New York Press.
Scriven, M. (1995) 'The Logic of Evaluation and Evaluation Practice', in D. M. Fournier (ed.) Reasoning in Evaluation: Inferential Links and Leaps, pp. 49-70. New Directions for Evaluation, 68. San Francisco: Jossey-Bass.
Shadish, W. E., T. D. Cook and L. C. Leviton (1991) Foundations of Program Evaluation: Theories of Practice. Thousand Oaks, CA: SAGE.
Stake, R. E. (2004) Standards-Based and Responsive Evaluation. Thousand Oaks, CA: SAGE.
Stake, R., C. Migotsky, R. Davis, E. J. Cisneros, G. Depaul, C. Dunbar, R. Farmer, J. Feltovich, E. Johnson, B. Williams, M. Zurita and I. Chaves (1997) 'The Evolving Syntheses of Program Value', Evaluation Practice 18(2): 89-103.
Stake, R. E. and T. A. Schwandt (2006) 'On Discerning Quality in Evaluation', in I. F. Shaw, J. C. Greene and M. M. Mark (eds) The Sage Handbook of Evaluation, pp. 404-18. Thousand Oaks, CA: SAGE.
Taylor, P. W. (1961) Normative Discourse. Englewood Cliffs, NJ: Prentice Hall.
Toulemonde, J. (2005) 'Appropriation des résultats de l'évaluation: leçons de la pratique en Région Limousin', paper presented at the 2005 Colloque de la Société française de l'évaluation, Lille.
Toulmin, S. E. (1964) The Uses of Argument. New York: Cambridge University Press.
Toulmin, S. E., R. D. Rieke and A. Janik (1984) An Introduction to Reasoning. New York: Macmillan.
Trelogan, T. K. (2001) 'Arguments and their Evaluation', Department of Philosophy, University of Northern Colorado, unpublished manuscript.



Treasury Board of Canada Secretariat (2004) Examen de la qualité des évaluations dans les ministères et les organismes. URL (consulted Sept. 2006): http://www.tbs-sct.gc.ca/eval/pubs/rev-exam_f.asp
Valovirta, V. (2002) 'Evaluation Utilization as Argumentation', Evaluation 8(1): 60-80.
Wheeler, P., G. D. Haertel and M. Scriven (1992) Teacher Evaluation Glossary. Kalamazoo, MI: CREATE Project, Evaluation Center, Western Michigan University, unpublished manuscript.

MARTHE HURTEAU is an Associate Professor in the Faculty of Education at the Université du Québec à Montréal, where she teaches and supervises students in the field of program evaluation. The fundamentals of program evaluation are her main interest and the focal point of her research activities. Please address correspondence to: Département d'éducation et de Pédagogie, Université du Québec à Montréal, Case Postale 8888, succursale Centre-ville, Montréal (Québec), H3C 3P8, Canada. [email: hurteau.marthe@uqam.ca]

SYLVAIN HOULE is an Associate Professor in the Faculty of Business Administration at the Université du Québec à Montréal. Performance indicators are his main interest. Address: Département des finances, Université du Québec à Montréal, Case Postale 8888, succursale Centre-ville, Montréal (Québec) H3C 3P8. [email: houle.sylvain@uqam.ca]

STÉPHANIE MONGIAT is a full-time member of an evaluation team at the Commission de la santé et de la sécurité du travail (CSST). Address: CSST, 1199 Bleury, Montréal (Québec). [email: smongiat@hotmail.com]
