
Thinking Skills and Creativity 13 (2014) 183–194


More than appropriateness and novelty: Judges' criteria of assessing creative products in science tasks

Haiying Long
Department of Leadership and Professional Studies, Florida International University, 11200 SW 8th Street, Miami, FL 33199, USA

Article info

Article history:
Received 15 January 2014
Received in revised form 25 March 2014
Accepted 6 May 2014
Available online 2 June 2014
Keywords:
Creativity
Assessment
Judges
Criteria
Framing analysis

Abstract
The present research used a qualitative methodology to examine the criteria that judges employed in assessing creative products elicited by two science tasks. Forty-eight responses were produced by sixth-grade students and were then assessed by three groups of judges with different levels of expertise. Verbal protocols and interviews were conducted to collect data, and framing analysis was used to analyze the data. Overall, judges employed appropriateness, novelty, thoughtfulness, interestingness, and cleverness as their assessment criteria. Each criterion included several interpretations, and the criteria were related to each other. Moreover, the three judge groups differed in their use of criteria, and the criteria also varied by task.
© 2014 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/3.0/).

1. Introduction
Of the six major strands used to categorize and understand the phenomenon of creativity (Kozbelt, Beghetto, & Runco, 2010), namely process, product, person, place (Rhodes, 1961), persuasion (Simonton, 1990), and potential (Runco, 2003), creative products have been used extensively during the past few decades. According to a recent review of publications in five prestigious creativity journals (Long, 2014a), about one fourth of the quantitative studies required participants to engage in creative problem solving and come up with solutions or products, which were then assessed by a panel of judges. Researchers believed that this approach provided a valid alternative for measuring creativity and regarded the Consensual Assessment Technique (CAT), the most popular product assessment technique, as the "gold standard" of creativity assessment (Carson, 2006; Kaufman, Baer, & Gentile, 2004; Kaufman, Plucker, & Baer, 2008b; Plucker & Makel, 2010; Sternberg & Lubart, 1999).
Due to the popularity of product assessment, an increasing number of studies have examined various issues in this approach, such as the use of judges with different levels of expertise (e.g., Kaufman & Baer, 2012; Kaufman, Baer, & Cole, 2009; Kaufman, Baer, Cole, & Sexton, 2008; Kaufman, Gentile, & Baer, 2005) and the use of different procedures (e.g., Dollinger & Shafran, 2005). Assessment criteria, however, a significant issue, have received little attention. This issue is closely related to the long-standing criterion problem in the field of creativity. Furthermore, it is fundamental to understanding creativity assessment as well as the construct of creativity, given that judges rely on these criteria, whether manifest or latent, to make decisions about the creativity of the products. The present research employs a qualitative methodology to investigate the criteria for assessing creativity in two science tasks, particularly from the perspective of the judges.

Tel.: +1 305 348 3228.
E-mail address: haiying.long@fiu.edu
http://dx.doi.org/10.1016/j.tsc.2014.05.002
1871-1871/© 2014 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/3.0/).


1.1. Criteria of assessing creative products


The criterion of a creative product is always the first question arising in any discussion of the assessment. It is also a problem that has plagued the field for many decades (Amabile, 1982; Kaufman & Baer, 2012; McPherson, 1963; Shapiro, 1970). Although researchers (e.g., Cropley, 1967; Guilford, 1950; Stein, 1953; Sternberg, 1988) generally agree that creativity is composed of novelty and appropriateness (Amabile, 1982; Runco & Jaeger, 2012), "most definitions do not include conceptualizations that are readily translated into useful assessment criteria, let alone ultimate criteria" (Amabile, 1982, p. 999). Therefore, traditional creativity assessment approaches, including divergent thinking tests and experts' judgment of products, are operating in "a definitional void"; more specifically, "[T]here is no clear, explicit statement of the criteria that conceptually underlie the assessment procedures" (p. 999).
In order to resolve this problem, Amabile (1982) developed the Consensual Assessment Technique (CAT), or Subjective Assessment Technique, under the theoretical framework of the social psychology of creativity. This technique is rooted in the assumption that "if appropriate judges independently agree that a given product is highly creative, then it can and must be accepted as such" (Amabile, p. 1002). In this procedure, appropriate judges, who are "those familiar with the domain in which the product was created or the response articulated" (Amabile, p. 1000), are not provided with any criteria but are required to use subjective criteria, or their own definition of creativity, to assess the products. Because it avoids the difficulty of defining creativity and its related terms, the CAT has become the most widely used product assessment tool in the field and has been called "a clever solution for the criterion problem in creativity research" (Plucker & Makel, 2010, p. 59).
However, Sadler (1989) contended that individuals typically follow two sets of criteria, manifest and latent, in the process of making human judgments. When manifest criteria are not required, people rely heavily on their latent criteria. This idea is consistent with implicit theories of psychological constructs. Sternberg (1985, p. 608) defined these theories as "constructions by people (whether psychologists or laypersons) that reside in the minds of these individuals." He further noted, "Such theories need to be discovered rather than invented because they already exist, in some form, in people's heads" (p. 608). This suggests that the implicit criteria used in the assessment should be identified in order to better understand creativity and its construct. Furthermore, an in-depth understanding of the criteria is essential to the validity of creativity assessment. According to the Standards for Educational and Psychological Testing (Standards hereafter) (AERA, APA, & NCME, 1999, p. 9), "Validity refers to the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests." In the product assessment of creativity, validity therefore concerns the meaning and interpretation of the ratings provided by the judges, rather than the ratings themselves, which are just the numerical representation of the criteria.
Often, product assessment has yielded acceptable interrater reliability as estimated by Cronbach's alpha (Plucker & Makel, 2010); however, an acceptable alpha only suggests that each judge is consistent in classifying the products based on his or her own criteria; it does not speak to the consistency of the criteria that all the judges apply (Stemler, 2004). On the other hand, a few studies examining judges with different levels of expertise have indicated low agreement or interrater reliability. For instance, Runco, McCarthy, and Svenson (1994) focused on three-dimensional artworks produced by 47 college students and compared the works' creativity ratings obtained from three professional artists, the students themselves, and their peers. They found that the students and their peers noticed significant differences among the works, but the artists did not. Hickey (2001) used five groups of judges with different expertise to assess musical compositions created by fourth- and fifth-grade students. The results showed that the ratings of novices (i.e., 2nd- and 7th-grade students) did not significantly correlate with those of experts (i.e., teachers and music theorists). Kaufman, Baer, Cole, and Sexton (2008) used 10 poets and 106 college students as expert and novice judges, respectively, to assess SciFaiku poems produced by over 200 college students. They found that the ratings of experts and novices barely correlated (r = .22). Long (2014b) examined rater effects in creativity assessment and demonstrated that the inconsistency of judges' ratings was partly due to the use of different criteria. However, the assessment criteria were not specifically addressed in these studies.
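Stemler's (2004) distinction between consistency and consensus can be made concrete with a small computation. The sketch below is purely illustrative and is not taken from any of the studies cited here: the two judges and their ratings are invented, and it shows that Cronbach's alpha over a products-by-judges matrix can be perfect even though the judges never assign the same score, as long as each judge rank-orders the products consistently.

```python
import numpy as np

def cronbach_alpha(ratings: np.ndarray) -> float:
    """Cronbach's alpha for a (products x judges) matrix of 1-5 ratings."""
    k = ratings.shape[1]                          # number of judges
    item_vars = ratings.var(axis=0, ddof=1)       # each judge's rating variance
    total_var = ratings.sum(axis=1).var(ddof=1)   # variance of per-product sums
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Invented ratings of six products. Judge B rank-orders the products exactly
# like judge A but applies a stricter standard (one point lower throughout),
# as a judge weighting appropriateness more heavily might.
judge_a = np.array([5, 4, 4, 3, 2, 1])
judge_b = judge_a - 1
ratings = np.column_stack([judge_a, judge_b])

print(f"alpha = {cronbach_alpha(ratings):.2f}")                # 1.00
print(f"exact agreement = {np.mean(judge_a == judge_b):.2f}")  # 0.00
```

The high alpha here reflects a shared rank ordering, not shared criteria; the two judges could be applying entirely different standards, which is exactly the gap the present study probes.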
Besemer and colleagues (Besemer & O'Quin, 1986, 1987, 1999; Besemer & Treffinger, 1981; Besemer, 1998, 2006) are among the few researchers who have investigated the assessment criteria of creative products, or "what attributes of the product contribute to its creativity" (Besemer & O'Quin, 1986, p. 115). Their work was built on a model called the Creative Product Analysis Matrix (CPAM), which included three conceptual dimensions, novelty (newness of processes, materials, and design), resolution (functionality, usefulness, and workableness of the product), and elaboration and synthesis (the stylistic attributes of the finished product), along with the original 14 criteria developed to support the dimensions (Besemer, 1998). Over more than a decade, they refined the criteria to the final nine component facets and examined the psychometric properties of the matrix through several empirical studies. Along this line of work, Cropley and colleagues (Cropley & Cropley, 2008; Cropley, 2005; Haller, Courvoisier, & Cropley, 2011) created and revised the Cropley Solution Diagnosis Scale (CSDS), which was intended especially for nonexpert judges and the evaluation of mini-c creativity. These studies made significant contributions to the understanding of the criteria for assessing creative products. As in most quantitative research, however, these dimensions and scales were derived from literature reviews and were seldom confirmed in empirical studies.
Besides the CAT, in which no criteria are provided, the approach of providing judges with predetermined criteria has also been used frequently in the field during the past few decades. Mumford


and colleagues (e.g., Friedrich & Mumford, 2009; Reiter-Palmon, Mumford, Boes, & Runco, 1997) obtained creativity indexes of solutions based on quality and originality scores. Quality was defined as the completeness, coherence, and usefulness of the solutions, and originality referred to novelty, imagination, and structure. Separate quality and originality scores were first calculated for each participant by averaging the quality and originality scores for each problem across raters, and a creativity score was then generated by multiplying the average quality and originality scores. Although judges in these studies were able to discuss and further revise the rating criteria, these criteria were assigned to them in advance. Hence, judges' perspectives were not taken into account. This is not a trivial issue given that judges are at the center of the assessment.
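The scoring arithmetic described above can be sketched in a few lines. This is a minimal illustration of the averaging-then-multiplying scheme as summarized here, not code from the cited studies; the array shapes, the 1-5 scale, and the random data are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Invented data: 4 participants x 3 problems x 5 raters, rated on a 1-5 scale.
quality = rng.integers(1, 6, size=(4, 3, 5)).astype(float)
originality = rng.integers(1, 6, size=(4, 3, 5)).astype(float)

# Average across raters for each problem, then across problems, yielding one
# quality and one originality score per participant.
mean_quality = quality.mean(axis=2).mean(axis=1)
mean_originality = originality.mean(axis=2).mean(axis=1)

# The creativity index is the product of the two averages.
creativity = mean_quality * mean_originality
print(np.round(creativity, 2))  # one index per participant
```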
More important, the criteria for assessing creative products are influenced by many factors: task, domain, and expertise, to name a few. First, the assessment criteria vary with the nature of the tasks involved. Runco, Illies, and Eisenman (2005) provided participants with realistic and unrealistic divergent thinking tasks and concluded that individuals generated more appropriate ideas on realistic tasks, whereas they generated more original and flexible ideas on unrealistic tasks. In the same manner, Reiter-Palmon, Illies, Buboltz, Cross, and Nimps (2009) required participants to creatively solve one of three realistic problems that differed in complexity, involvement, and problem-solving efficacy. They showed that less complex problems elicited more solutions but fewer original solutions, and more complex problems elicited solutions of lower quality. Two studies (Kaufman, Baer, & Cole, 2009; Kaufman, Baer, Cole, & Sexton, 2008) using the same assessment technique (i.e., the CAT) and the same two groups of judges (i.e., 10 poets as expert judges and 106 college students as novice judges) reported somewhat dissimilar results. The ratings from experts and novices in the 2008 study, which focused on SciFaiku poems, barely correlated (r = .22), whereas those in the 2009 study, which focused on short stories, correlated highly (r = .71). This suggests that different tasks, even in the same domain (i.e., writing), can produce different outcomes.
In addition, the domain of the products and the level of expertise of the judges, or the interaction between these two aspects, also affect the assessment criteria. Certainly, the assessment of an artwork might involve originality, aesthetics, and technical skills (Kozbelt, 2004), whereas the assessment of an engineering product might focus on utility, function, and value (Charyton & Snelbecker, 2007; Larson, Thomas, & Leviness, 1999). Several studies introduced earlier (e.g., Hickey, 2001; Kaufman, Baer, & Cole, 2009; Kaufman, Baer, Cole, & Sexton, 2008; Runco et al., 1994) and others (e.g., Kaufman, Niu, Sexton, & Cole, 2010) demonstrated the impact of expertise level on the product assessment of creativity. Very recently, Kaufman, Baer, Cropley, Reiter-Palmon, and Sinnett (2013) used different types of quasi-experts, novices, and experts to evaluate the creativity of short stories and engineering design products in order to investigate the effect of judges' expertise on the assessment of creative products in different domains. They found that quasi-experts were appropriate judges for short stories but not for engineering products, indicating that some domains, such as engineering, require more domain knowledge than others, such as writing. But little is known from these studies about exactly how the assessment criteria of judges with different levels of expertise vary from task to task.
Furthermore, although creativity has been studied in various domains, including writing (Baer, 1994, 2003; Kaufman et al., 2004), the arts (Amabile, 1982; Kozbelt, 2008; Perez-Fabello & Campos, 2011), music (Brinkman, 1999; Eisenberg & Thompson, 2003; Ziv & Keydar, 2009), film (Plucker, Holden, & Neustadter, 2008), and design (Allen, 2010; Demirkan & Hasirci, 2009), few studies have examined creativity in science or the assessment criteria in this domain. Understanding the assessment of creative science products is especially meaningful given the significance of creativity and innovation to science and the increasing interest in the science, technology, engineering, and mathematics (STEM) areas.
1.2. The present research
This research examined the criteria used to assess scientific creativity in the product assessment approach. It recruited three groups of judges with different levels of expertise to assess creativity in two science tasks that differed in content and requirements. Three research questions were investigated: What criteria do judges use to assess creative products in science tasks? Do the criteria vary by judges' expertise level? Do the criteria vary by task?
2. Methodology
This research employs a qualitative methodology. Unlike quantitative methodology, which aims to detect universal laws about the regularities of variables, qualitative methodology considers participants' perspectives and interprets individual experience (Cohen, Manion, & Morrison, 2011). In addition, qualitative research involves an inductive process, which builds concepts and theories from data. This approach is particularly useful when the study is exploratory (Merriam, 2002).
Framing methodology was chosen as the specific qualitative methodology for this research. According to Goffman (1974), framing methodology draws heavily on phenomenology and structuralism. Specifically, it focuses on assumptions about the nature of the self and the role of meaning in the analysis. It is also a way of "selecting, organizing, interpreting, and making sense of a complex reality to provide guideposts for knowing, analyzing, persuading and acting" (Rein & Schon, 1993, p. 146). This methodology has been used to examine how significant relations emerge from complex processes in various fields, such as micro-sociology (Goffman), media (Chong & Druckman, 2007; Scheufele, 1999), policy analysis (Rein and Schon), and curriculum analysis (Ortloff, 2007).


2.1. Participants
The responses used in the assessment were collected from 48 sixth-grade students. They were selected from one gifted and talented (GT) class and one general education class at a local school in a Midwestern city. The purpose of selecting students from two different classes was to increase sample variance, because previous research demonstrated that the creativity of gifted and nongifted individuals differs qualitatively and quantitatively (Runco & Albert, 1985; Runco, 1985, 1986).
Three groups of judges (educational researchers, elementary school teachers, and education-major undergraduate students) were selected to represent different levels of expertise. Educational researchers are typically experts in education and creativity but might not know as much about science and the characteristics of elementary school students as elementary school teachers do. Elementary school teachers might know less about creativity and educational research but are more experienced with students and domain knowledge; they were considered quasi-experts. Comparatively speaking, education-major undergraduate students are novices in both educational research and elementary school settings. Each judge group consisted of 15 judges, and a total of 45 judges were recruited from local schools and a university.
2.2. Tasks
Two open-ended science questions (see Appendix A) were created for the purpose of this study, for three reasons. First, asking people to solve problems creatively is a good way to measure their creativity, especially creative cognition (Mumford, Vessey, & Barrett, 2008). Second, previous studies of scientific creativity (Hu & Adey, 2002; Hu et al., 2013; Lin, Hu, Adey, & Shen, 2003; Taylor & Barron, 1963) mainly used divergent thinking tests and their scoring criteria and seldom employed creative products based on open-ended questions. Third, the items designed by Hu and colleagues were intended specifically for secondary students, not for elementary school students.
The two questions were recommended by two sixth-grade teachers and were further examined by an educational psychology professor. One question was about water and water evaporation, and the other was about earth tilt and climate change. Both were imaginary scenarios created to stimulate students' creative thinking. Moreover, the two questions were constructed differently: the first placed students in a scenario where there might be no right or wrong answers, whereas the second asked students to explain a scientific phenomenon before solving the problem.
2.3. Data collection
The two science questions were each administered to the students by their teachers after the relevant content had been taught. Students were not given a time constraint and were instructed to be as creative as they could, because this instruction can increase the creativity of responses (Christensen, Guilford, & Wilson, 1957; Runco et al., 2005; Silvia et al., 2008). Considering the long time involved in assessing 48 students' works on two tasks, the responses of 24 students were randomly selected from the two classes, of which 12 were from the GT class and the other half from the general education class.
Students' responses were then typed into a Word file and an online venue. Spelling mistakes and grammar errors were corrected to remove issues of handwriting illegibility, which in turn helped reduce the burden of rating. All students' responses were labeled with ID numbers and put in random order before being released to the judges. Each judge assessed the 48 responses obtained from the two tasks. Judges were not provided with any criteria, and all of their assessments were based on their subjective criteria. A 1–5 scale was used, with 1 being the least creative and 5 being the most creative (see Appendix B for the instruction). Judges could choose a paper-and-pencil or online format, but the contents were the same. Given the possibility of fatigue, they were encouraged to stop at any point and return to continue the rating. Judges were also instructed to write down their rationales for giving a rating to a response, along with any comments. This process is similar to the verbal protocol, or think-aloud, method that has been shown to be valid in assessing individuals' cognitive thinking (Crisp, 2008). It helped keep track of how judges made their judgments and what criteria they employed in rating. At the end of the rating, judges were asked to answer a few questions explaining what creativity meant to them.
Based on the answers provided by the judges, follow-up questions were asked if ambiguity or confusion arose. Semi-structured interviews were conducted with the two judges in each rater group whose average ratings were the highest and the two whose average ratings were the lowest (i.e., a total of 12 judges across the three rater groups), aiming to provide an in-depth understanding of their assessment criteria.
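For readers who want the sampling and interview-selection logic spelled out, the sketch below restates it as code. It is a hypothetical reconstruction: the student and judge identifiers, the judges' average ratings, and the use of Python's random module are all invented for illustration.

```python
import random

random.seed(1)  # fixed seed so the illustration is reproducible

# Invented rosters: 24 students per class, one GT and one general education class.
gt_class = [f"GT{i:02d}" for i in range(1, 25)]
general_class = [f"GE{i:02d}" for i in range(1, 25)]

# Randomly select 12 students per class (24 students x 2 tasks = 48 responses).
selected = random.sample(gt_class, 12) + random.sample(general_class, 12)

# Label responses with ID numbers and present them in random order.
random.shuffle(selected)
id_map = {f"ID{n:02d}": student for n, student in enumerate(selected, start=1)}

# Interviewees: the two highest- and two lowest-averaging judges per rater group.
judge_means = {"J01": 2.23, "J02": 3.90, "J03": 2.56, "J04": 4.10, "J05": 3.05}
ranked = sorted(judge_means, key=judge_means.get)  # ascending by average rating
interviewees = ranked[:2] + ranked[-2:]
print(interviewees)  # ['J01', 'J03', 'J02', 'J04']
```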
2.4. Data analysis
In line with framing theory, framing analysis (Engel & Ortloff, 2009; Ortloff, 2007) was used to analyze the data. A frame is defined as "a perspective from which an amorphous, ill-defined, problematic situation can be made sense of and acted on" (Rein & Schon, 1993, p. 145). In content analysis, a common research method for analyzing texts in all kinds of verbal, pictorial, and symbolic forms (Krippendorff, 2004), a category or theme is typically developed and only one single idea is emphasized. However, when several categories represent ideas that imply some shade of difference as well as some degree of similarity, redundancy is created if these ideas are put into different categories. This is not a problem if the ideas are grouped together under one frame, because a frame incorporates several similar meanings which


are related to the same perspective, and the ideas become more concise. Moreover, meanings are made more salient with framing analysis (Entman, 1993). By identifying frames in the text, a range of perspectives and meanings that are implicit in the subtleties of the language can naturally emerge (Gamson, 1997; Terkildsen & Schnell, 1997). In this sense, framing analysis is deeply rooted in interpretation and consistent with the basic principles of qualitative data analysis. It also echoes Carspecken's (1996) notion of a meaning field, which offers several possible interpretations of one idea. Furthermore, framing analysis shows the relationships among all the meanings that are identified based on frames.
There are three levels in framing analysis. The first level is to identify main categories or themes in the data based on their meanings or the ranges of their meanings. This is similar to the initial coding in most qualitative data analysis. The second level is to generate frames based on the meanings of the categories and to further identify the relationships between each frame and the others. Categories at the first level become sub-frames at this level. The third level of analysis is to map all the relationships and analyze them in one picture. This is a step beyond Ortloff's (2007) method and also a necessary step for interpreting the complexities of the relationships. It is especially important in this research because all of the frames are entangled together and their relationships are crucial to understanding the meaning and criteria of creativity. In this picture of relationships, an idea or an aspect can be both an independent frame and a sub-frame of other frames.
2.5. Validity
In order to establish validity, or trustworthiness, for this study, two strategies were used to validate the data: triangulation and member checking. Triangulation was used to check the accuracy of the data with the help of a graduate student who was not a participant in this research. All transcriptions and interpretations were sent to the participants for further checking. Member checking was also used when there was confusion or vagueness in the data (Seidman, 2006).
3. Results
3.1. Frames
Five frames were developed from the nine categories obtained in the first level of analysis (see Fig. 1). They represented the criteria of creativity that the judges employed in their assessment. These criteria were based on two considerations: one was the meaning of creativity directly related to the contents of the responses and reflected in the judges' own definitions of creativity; the other included aspects related to the forms of the responses and the characteristics of the respondents, which might be independent of response contents. When judges are mentioned or quoted in the following analysis, their ID number and the group they belonged to are included.
3.1.1. Frame of appropriateness
Judges used various words or phrases to describe the appropriateness of students' responses, such as whether a response was appropriate, possible, viable, realistic, feasible, plausible, effective, functional, or useful; whether it made sense; or whether it worked. In summary, this frame captured how the responses were appropriate in some sense. It included six sub-frames: four related to the contents of the responses and two related to response forms and respondents. Solutions that were useful for solving the problems described in the two tasks were regarded as appropriate. Solutions that fitted the parameters provided in the two scenarios were appropriate because they were within the range of the preset conditions. Logical solutions were also appropriate because they were reasonable and practical to the judges. Correctness referred to whether students showed a correct understanding of the science used in their solutions. This was considered one aspect of appropriateness because correct science was closely related to the usefulness and logic of the responses. In addition, one requirement of the tasks was to demonstrate students' correct understanding of the science involved in the problem solving. Completeness of the responses was another requirement of the tasks, although it was not explicitly stated in the questions. Judges' consideration of the respondents' characteristics in assessing a response, such as students' age level and the knowledge appropriate for their age, was another sub-frame of appropriateness. Appropriateness was further related to the frame of thoughtfulness, because solutions are more appropriate when more thinking is involved.
Even though the judges agreed that appropriateness was an important criterion of creativity, they did not agree on which solutions were appropriate. For example, judges differed in their reactions to student #4's answer to the water task (i.e., the student suggested using a water bottle to condense water). Judge 2 in the teacher group gave it a 5 and thought it was a very creative solution. She explained, "The question said you were stranded on an island without water, you might have water with you and you had a bottle. So I didn't rule that out as impractical. At this age everybody is carrying water bottles." However, judge 11 in the researcher group, who gave this solution a 3, questioned its appropriateness in terms of fit and logic: "Where do you get the bottle? How would the water be collected from the sides of the bottle with enough to drink?"
3.1.2. Frame of novelty
The frame of novelty is another criterion of creativity that the judges considered. Words such as "unique," "original," "different," and "thinking outside the box" were used to describe it. Judges who focused on novelty rated an original response higher even if it was not practical. For example, judge 2 in the undergraduate group gave a 5 to student #24's eight-step response to the water task. Her reason was, "In my opinion, this is thinking outside of the box. This solution would take longer than simply heating the water with a fire, but it does take an unusual mind to come up with this kind of idea." An interesting finding was that novelty carried different connotations for different judges, depending on their point of reference. Some judges compared one response to the others in the sample. Some compared the solutions to what they had seen or experienced. If a solution was not mentioned by a second person in the sample or had never been seen by the judge, it was regarded as unique or original and was assigned a higher score. Some judges also noticed uniqueness in the presentation or form, rather than the content, of the responses. For instance, some highly rated responses listed the materials needed to solve the problems, and some used a chart to compare information included in the response.

Fig. 1. Frames of the criteria. Note: Each frame and its sub-frames in this figure are illustrated by the same line patterns, which were randomly selected for the purpose of illustration. The frames are rectangular and all of the sub-frames are represented by circles. The sub-frames representing characteristics of response contents are shown by solid circles, and the sub-frames representing other aspects, such as response forms and characteristics of the respondents, are shown by dotted circles. The relationships between a frame and its sub-frames are shown by solid arrows. If one frame is related to another, they are connected by a dotted arrow. When all of the frames are put together, some independent frames become the sub-frames of other frames. If one frame is not only an independent frame but also a sub-frame of another frame, it is shown by a rectangle representing the frame itself as well as circles representing its sub-frames.

3.1.3. Frame of thoughtfulness


Judges usually assigned higher ratings to responses that demonstrated careful thinking, or whose plans indicated well thought-out processes. Judge 15 in the teacher group explained, "To me, creativity is exhibited by a thorough thought process, thinking something through to the most complete state possible by using any means available." But judges had different foci when they talked about thoughtfulness. Some focused specifically on critical thinking; some emphasized how many details were provided in a response; others stressed how many ideas were proposed. Details shown in the responses indicated the depth of thinking, whereas the number of ideas indicated the breadth of thinking. The highest rated response to the water task contained multiple ideas, each provided with details. Other highly rated responses either included one idea with detailed descriptions or offered several workable ideas with few explanations. Thoughtfulness was also shown in the presentation of the responses. For some judges, different ways of presenting the solutions, such as a chart or numbering, showed not only uniqueness of form but also good thinking. Well thought-out ideas were closely related to the appropriateness of the ideas: a response that included more ideas and details was more likely to solve the problem and to convince the judges that the solution could help people achieve their goals.


Table 1
Rater groups' use of criteria.

Rater group               Appropriateness   Novelty   Thoughtfulness   Interestingness   Cleverness
Researcher (n = 15)       15 (11)           15 (4)    7 (3)            7                 3
Teacher (n = 15)          15 (9)            12 (5)    8 (2)            1                 1
Undergraduate (n = 15)    12 (3)            15 (11)   11 (1)           3                 0
Total (N = 45)            42 (23)           42 (20)   26 (6)           11                4

Note: (a) The first numbers (those without parentheses) in the Appropriateness, Novelty, and Thoughtfulness columns represent the number of judges in each rater group, and in total, who included this criterion in their assessment. (b) The numbers in parentheses represent the number of judges in each rater group, and in total, who regarded this criterion as their primary one for the two tasks during their assessment.

3.1.4. Frame of interestingness


Interestingness was also considered by a few judges as a criterion for assessing creativity. It included two sub-frames: funniness and novelty. Judge 6 in the undergraduate group gave a 3 to student #11's answer to the earth task (i.e., the student wrote that we would live closer to the equator, all the life on earth would be killed, and all the wardrobes would change), and she explained, "'Might kill life on earth' followed by 'all the wardrobes would change' made me laugh so hard I bumped it up a point." Interestingness also implied novelty. Judge 12 in the teacher group explained, "In regard to questions I found amusing, I may have rated them higher for being novel as generally the responses I found amusing were unexpected." Here something unexpected was similar to a novel idea.
3.1.5. Frame of cleverness
A couple of judges mentioned how they liked clever responses. Even judges who tended to assign low ratings offered higher ratings to such answers. In general, clever responses were not only original and feasible but also thought-provoking and interesting. In this sense, the frame of cleverness comprised the other four frames (i.e., appropriateness, novelty, thoughtfulness, and interestingness); in other words, the four frames were sub-frames of cleverness. The judges identified a few examples of clever responses: responses in which students suggested that if they had this, then they would or could do this or that; responses in which students made assumptions but realized that their solutions might not work and then provided a justification; and some funny responses. Judge 4 in the researcher group made this point: "The highest rated responses struck me as clever. They not only seemed feasible in reality, they were thought-provoking and interesting in their own right." He further explained, "I'm not sure that I can put into words the subjective experience of cleverness, other than it strikes me as something interesting. It's a subtle emotion that's unaccompanied by much else aside from an appreciation for what I've just considered."
3.1.6. Relationships among the frames
As seen in Fig. 1, the criterion of cleverness encompasses the other four criteria and represents the core of all the relationships. In addition, similarities and differences between frames can be easily observed from the overlapping and independent parts. For example, the frame of interestingness includes novelty, but it is not equal to novelty because the sub-frame of funniness is not a part of novelty. Thoughtfulness cannot be viewed as a sub-frame of appropriateness because it contains aspects that do not overlap with appropriateness. Further, novelty is not only a sub-frame of cleverness but also of interestingness, which suggests a close relationship between novelty and interestingness. On the other hand, the frame of appropriateness is more closely associated with the frame of thoughtfulness. However, there seems to be no relationship between novelty and appropriateness, which suggests that they might be two distinct criteria for assessing creativity.
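One compact way to capture these relationships is as a small containment graph. The sketch below simply encodes Fig. 1 as a data structure; the dictionary layout, key names, and helper function are inventions for illustration, not part of the study's analysis.

```python
# Frames from Fig. 1: "sub_frames" mirrors the solid containment arrows,
# "related_to" mirrors the weaker dotted links between frames.
frames = {
    "cleverness": {
        "sub_frames": ["appropriateness", "novelty", "thoughtfulness", "interestingness"],
        "related_to": [],
    },
    "interestingness": {"sub_frames": ["funniness", "novelty"], "related_to": []},
    "appropriateness": {
        "sub_frames": ["usefulness", "fit", "logic", "correctness",
                       "completeness", "respondent characteristics"],
        "related_to": ["thoughtfulness"],
    },
    "thoughtfulness": {"sub_frames": [], "related_to": ["appropriateness"]},
    "novelty": {"sub_frames": [], "related_to": []},  # no link to appropriateness
}

def is_subframe(child: str, parent: str) -> bool:
    """True if `child` appears anywhere under `parent` via containment."""
    direct = frames.get(parent, {}).get("sub_frames", [])
    return child in direct or any(is_subframe(child, d) for d in direct)

assert is_subframe("novelty", "cleverness")          # novelty sits inside cleverness
assert is_subframe("funniness", "interestingness")   # funniness only via interestingness
assert not is_subframe("funniness", "novelty")       # interestingness is not novelty
```

A query like is_subframe("novelty", "cleverness") makes the asymmetry explicit: cleverness contains the other criteria, while novelty and appropriateness share no edge at all.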
3.1.7. Criteria combination
In their assessments, most judges included appropriateness (n = 42), novelty (n = 42), and thoughtfulness (n = 26) among their criteria (see Table 1). In contrast, fewer judges considered interestingness (n = 11) and cleverness (n = 4). Furthermore, none of the judges used the same set of criteria with the same interpretation. For instance, judges 6 and 13 in the researcher group, whose average ratings were very close (i.e., 2.23 and 2.56), incorporated all five frames (i.e., appropriateness, novelty, thoughtfulness, interestingness, and cleverness) into their rating criteria. However, for judge 13, appropriateness specifically included three sub-frames: usefulness ("does it work?"), correctness ("did they show correct understanding of the situation?"), and completeness of the responses ("did they answer all parts of the question?"). For judge 6, appropriateness focused on correctness and the characteristics of the respondents. In addition, judge 13's criterion of novelty was based on the question "did the response point something out others didn't mention or that I think is out of the box because I didn't think of it?" But judge 6 emphasized subjective sense in assessing novelty. For the criterion of thoughtfulness, judge 6 centered on both details and multiple ideas, but judge 13 attended more to details.
Judges also chose different criteria as their primary consideration and assigned more weight to them. They typically viewed appropriateness (n = 23), novelty (n = 20), or thoughtfulness (n = 6) as their primary criterion (see Table 1). For judges who included the same aspects in their assessment, those who placed an emphasis on one aspect would give a higher rating to responses with that characteristic. For instance, consider student #6's response to the earth task, which indicated that places like Mexico and Florida would become glaciers and Alaska would become "the new sunshine state." Judge 7 in the researcher group and judge 10 in the undergraduate group both included appropriateness and thoughtfulness as criteria and agreed that it was a thoughtful but not appropriate idea. Yet the former assigned a 4 and the latter a 2, because the primary criterion of judge 7 was thoughtfulness and that of judge 10 was appropriateness.
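The effect of a primary criterion can be mimicked with a toy weighting model. This is not the judges' actual mental arithmetic, only a hypothetical sketch: the weight of 3 for the primary criterion and the two sub-scores for student #6's response are invented so that the model reproduces the 4-versus-2 split described above.

```python
def weighted_rating(scores: dict[str, float], primary: str,
                    primary_weight: float = 3.0) -> int:
    """Weighted 1-5 rating with the primary criterion counted more heavily."""
    weights = {c: (primary_weight if c == primary else 1.0) for c in scores}
    total = sum(weights[c] * s for c, s in scores.items())
    return round(total / sum(weights.values()))

# Invented sub-scores: the response was thoughtful (5) but not appropriate (1).
scores = {"appropriateness": 1.0, "thoughtfulness": 5.0}

print(weighted_rating(scores, primary="thoughtfulness"))   # 4, like judge 7
print(weighted_rating(scores, primary="appropriateness"))  # 2, like judge 10
```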
3.1.8. Judge groups' use of criteria
There were similarities and differences in the three judge groups' use of criteria (see Table 1). Specifically, the same number of judges (n = 15) in the researcher group included appropriateness and novelty among their criteria, while only about half (n = 7) in that group included thoughtfulness. Fifteen judges in the teacher group included appropriateness and twelve included novelty, whereas only eight included thoughtfulness. In contrast, twelve judges in the undergraduate group included appropriateness, fifteen included novelty, and eleven included thoughtfulness. Comparatively speaking, fewer judges in the undergraduate group than in the other two groups viewed appropriateness as a criterion; fewer judges in the teacher group viewed novelty as a criterion; and fewer judges in the researcher group viewed thoughtfulness as a criterion.
When selecting their primary criterion, the three groups also showed different patterns. Eleven judges in the researcher group selected appropriateness, four selected novelty, and three selected thoughtfulness as their primary criterion. Nine judges in the teacher group selected appropriateness, five selected novelty, and only two selected thoughtfulness. Only three judges in the undergraduate group selected appropriateness, whereas eleven selected novelty and only one selected thoughtfulness. In other words, many more judges in the researcher and teacher groups assigned more weight to appropriateness than those in the undergraduate group, while many more judges in the undergraduate group assigned more weight to novelty. This might suggest that the researcher and teacher groups were more similar in this respect.
3.2. Criteria for tasks
Judges applied different criteria to the two tasks. Most focused more on appropriateness in the water task and more on novelty and elaboration in the earth task. Seven of the 45 judges specifically described applying different primary criteria to the two tasks. All of them used appropriateness as their primary criterion for the water task. Four judges used thoughtfulness as their first criterion for the earth task, and one used novelty. Judge 8 in the teacher group explained, "In the first question, I was looking for a creative, yet practical solution to the problem posed. In the second, I considered both detail and imagination, yet looked for realistic or plausible predictions." This difference was related to the design of the two tasks. As introduced earlier, the water task required students to create a plan to obtain drinkable water, whereas the earth task required them not only to imagine what they would do to help people survive but also to show a correct understanding of the impact of the new tilt on living things. For judges, the explanation-of-the-phenomenon part of the earth task could only be viewed as right or wrong, rather than creative or not.
4. Discussion
Although science tasks were used in this research, the assessment criteria employed by the judges are consistent with prior research. Appropriateness and novelty have long been regarded as the most important components of creativity. However, defining them accurately has proved difficult (Amabile, 1982). Like the judges in this study, researchers have defined appropriateness differently in the literature. For instance, Rothenberg and Hausman (1976, p. 7) viewed appropriateness as "a value, or intrinsic worth and/or pragmatic usefulness." Runco (1988) equated appropriateness with utility, and Sternberg and Lubart (1999, p. 3) defined appropriateness as "useful, adaptive concerning task constraints." Novelty is even harder to define, and few researchers have attempted to do so. As Wilson, Guilford, and Christensen (1953, p. 362) contended, "Many writers define an original idea as a new idea; that is, an idea that did not exist before. They are frequently not in agreement, however, in their interpretation of new, since they use it with different connotations."
Although the criteria of thoughtfulness, interestingness, and cleverness were employed by fewer judges in this research, they too have been addressed in the field from different perspectives. Creativity and thinking are naturally bonded, and a massive body of research has been conducted to understand creative thinking and the cognitive processes involved (Amabile, 1983; Mumford, Blair, & Marcy, 2006; Mumford, Mobley, Uhlman, Reiter-Palmon, & Doares, 1991). According to the judges in this study, thoughtfulness meant multiple ideas and details. These two aspects correspond to two criteria used to assess the ideas generated in divergent thinking tests: fluency and elaboration. Interestingness has seldom been discussed in the literature on creativity assessment, but the relationship between humor and creativity has been investigated by a large body of studies. Koestler (1964, p. 31) argued that humor and creative behavior share some cognitive skills, such as the ability to combine information from two or more disciplines. He stated, "Humor is the only domain of creative activity where a stimulus on a high level of complexity produces a massive and sharply defined response on the level of physiological reflexes." Other researchers (Amabile, 1987; Murdock & Ganim, 1993; Torrance, 1979; Van Gundy, 1984) viewed humor as a factor of creativity. Some studies (Barron, 1982; Fabrizi & Pollio, 1987; Humke & Schaefer, 1996; Thorson & Powell, 1993; Weisberg & Springer, 1961) confirmed that humor is related to creative personality and to scores on the Torrance Tests of Creative Thinking.


Cleverness was regarded as a significant aspect of creativity by Guilford and his colleagues (Christensen, Guilford, & Wilson, 1957; Kettner, Guilford, & Christensen, 1959; Wilson et al., 1953; Wilson, Guilford, Christensen, & Lewis, 1954). In their 1953 study, they contended that it was impossible to verify whether a new idea was something that had not existed before, so it was better to regard originality as "a continuum" (p. 363). They then proposed three alternative definitions of a novel idea: "uncommon," "remote," and "clever." According to these researchers, the three definitions include "significant aspects of what is commonly meant by the term original" (p. 363). Several factor analysis studies they conducted (Kettner et al., 1959; Wilson et al., 1954) supported cleverness as a factor of originality, which is distinct from what the present study suggests (i.e., that originality or novelty is a component of cleverness). But cleverness was seldom clearly defined by Guilford and his colleagues, and whether an idea was clever was mostly decided by judges. Several other researchers followed Guilford's notion of cleverness and defined it in their studies. Treadwell (1970, p. 55) conceptualized cleverness as the combination of "aptness" and "funniness." Silvia et al. (2008, p. 85) described clever ideas as those that strike people as "insightful, ironic, humorous, fitting, or smart." Here "aptness" and "fitting" are similar to the appropriateness identified in this study; "humorous" and "funniness" can be viewed as one sub-frame of interestingness; "insightful" implies the idea of thoughtfulness. In this regard, this research verified the meaning of cleverness identified in previous studies.
Furthermore, as indicated in this research, judges combined different criteria and primary criteria to assess creativity, which suggests that judgment about creativity is probably personal. Even Amabile (1982, p. 1011) indicated that one limitation of product assessment is that "it seems unreasonable to expect that universal and enduring criteria – even subjective criteria – could ever be agreed upon." This is also in line with the concept of personal creativity. According to Runco (1996, p. 5), creativity relies on the individual's "own personal logic, with personal criteria for the usefulness and originality of a solution . . .". This notion was further expanded in the concept of mini-c (Beghetto & Kaufman, 2007, p. 73), which focuses on "the novel and personally meaningful interpretation of experiences, actions, and events."
This research also demonstrated that appropriateness and novelty were not closely related. This supports the findings of Runco and Charles's (1993) study, in which they examined the contribution of originality and appropriateness to judgments of the creativity of idea pools, with 71 college students as judges. The study showed that when the ideas in the pools were both original and appropriate, students' ratings of creativity increased. But when the ideas were not original, creativity ratings decreased as more appropriate ideas were identified. A recent study along the same line of work (Runco et al., 2005) also reported low shared variance (i.e., 7%) between the originality and appropriateness of ideas.
The present study indicated that judges with various levels of expertise employed distinct criteria in their assessment and that the criteria also varied by task. These results confirm prior studies of different judges and their ratings. For instance, Hekkert and Van Wieringen (1996) used art experts and nonexperts to assess artworks produced by young artists and found that experts and nonexperts agreed highly on originality, but not on craftsmanship and quality. Runco (1984) also suggested that teachers might make more meaningful and accurate judgments of creativity than novices due to their familiarity with students. Runco and Charles (1993) further argued that college students might use evaluative standards that differ from those of other populations.
There are several limitations to this research. First, not all of the judges were asked follow-up questions or given in-depth interviews. Therefore, some information might be missing from the whole picture. Second, the judges did not use the same survey format: some took online surveys and some took the paper-and-pencil survey, depending on their preference. Overall, more judges preferred the online survey due to its convenience. Generally speaking, it might have been easier for judges to gain an overview of all the responses when using the paper-and-pencil format; judges taking the online survey could view the responses on the same web page but could not easily read all the responses across several pages. This might have affected their assessment to some degree. Third, although the verbal protocol method has been shown to be valid for examining individuals' cognitive processes, Crisp (2012) suggested that the process of thinking aloud might interfere with the rating process. Fourth, qualitative studies are good for enhancing understanding and exploring personal experience and multiple realities, but their results have limited generalizability.
In the future, a follow-up quantitative study is needed to examine whether the criteria identified in the present study can be used to assess creativity in other science tasks and in other domains. Reliability and factor analyses of these criteria will be conducted. Other questions, such as what characteristics of raters influence their selection of criteria and their assessment of creativity in general, will be further addressed.
Acknowledgements
I thank Dr. Janice Bizzari, Miss Erin N. Cerwinske, and Mrs. Meighan H. Scott for their help in collecting the original data.
Publication of this article was funded in part by Florida International University Open Access Publishing Fund.
Appendix A: Two science tasks.
1. You are on a cliff-bound, unknown oceanic island where you find several coconut trees, some unidentified flowers, and a few animals (like koalas, water rats, and possums). You are thirsty but have run out of water. You at last find a stream nearby, but unfortunately it is murky, salty, and obviously undrinkable. In this situation, how would you get drinkable water? Use what you have learned about water and water evaporation to explain your ideas. Make your ideas as creative as possible.


2. In the year 2050, a meteorite narrowly brushes the earth. While a major explosion is avoided, it results in tipping the earth 75° more than its current 22° tilt. How will this change in tilt affect the climate of North America and the lives of the people who live there? How will people need to adjust in order to survive (for example, food, agriculture, clothing, etc.)? Use what you have learned about seasons and climate to explain your ideas. Make your ideas as creative as possible.
Appendix B: Instruction for rating.
Please read through the following answers to each science question, which were obtained from 24 sixth graders, and assign a numerical rating between 1 and 5 based on your own criteria of creativity, with 1 being the least creative and 5 being the most creative. It is very important, however, that you use the full 1–5 scale and not assign almost all the answers the same rating. After each rating, there is a place where you can share your rationale for the rating you gave. This is just an opportunity to share any thoughts or comments you have about the solution you rated. You DO NOT have to write anything; however, a few words about what you think of the works would be very helpful.
References
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: AERA.
Allen, A. D. (2010). Complex spatial skills: The link between visualization and creativity. Creativity Research Journal, 22, 241–249. http://dx.doi.org/10.1080/10400419.2010.503530
Amabile, T. M. (1982). Social psychology of creativity: A consensual assessment technique. Journal of Personality and Social Psychology, 43, 997–1013. http://dx.doi.org/10.1037/0022-3514.43.5.997
Amabile, T. M. (1983). The social psychology of creativity. New York: Springer-Verlag.
Amabile, T. M. (1987). The motivation to be creative. In S. Isaksen (Ed.), Frontiers in creativity: Beyond the basics (pp. 223–254). New York: Bearly Limited.
Baer, J. (1994). Performance assessments of creativity: Do they have long-term stability? Roeper Review, 17, 7–12.
Baer, J. (2003). Impact of the Core Knowledge Curriculum on creativity. Creativity Research Journal, 15, 297–300. http://dx.doi.org/10.1080/10400419.2003.9651422
Barron, F. (1982). Personality and intelligence. In R. J. Sternberg (Ed.), Handbook of human intelligence (pp. 308–351). New York: Cambridge University Press.
Beghetto, R. A., & Kaufman, J. C. (2007). Toward a broader conception of creativity: A case for mini-c creativity. Psychology of Aesthetics, Creativity, and the Arts, 1, 73–79. http://dx.doi.org/10.1037/1931-3896.1.2.73
Besemer, S. P. (1998). Creative Product Analysis Matrix: Testing the model structure and a comparison among products – Three novel chairs. Creativity Research Journal, 11, 333–346. http://dx.doi.org/10.1207/s15326934crj1104_7
Besemer, S. P. (2006). Creating products in the age of design. Stillwater, OK: New Forums.
Besemer, S. P., & O'Quin, K. (1986). Analyzing creative products: Refinement and test of a judging instrument. Journal of Creative Behavior, 20, 115–126. http://dx.doi.org/10.1002/j.2162-6057.1986.tb00426.x
Besemer, S. P., & O'Quin, K. (1987). Creative product analysis: Testing a model by developing a judging instrument. In S. G. Isaksen (Ed.), Frontiers of creativity research: Beyond the basics (pp. 341–357). Buffalo, NY: Bearly Limited.
Besemer, S. P., & O'Quin, K. (1999). Confirming the three-factor Creative Product Analysis Matrix model in an American sample. Creativity Research Journal, 12, 287–296. http://dx.doi.org/10.1207/s15326934crj1204_6
Besemer, S. P., & Treffinger, D. J. (1981). Analysis of creative products: Review and synthesis. Journal of Creative Behavior, 15, 158–178. http://dx.doi.org/10.1002/j.2162-6057.1981.tb00287.x
Brinkman, D. J. (1999). Problem finding, creativity style and the musical composition of high school students. Journal of Creative Behavior, 33, 62–68. http://dx.doi.org/10.1002/j.2162-6057.1999.tb01038.x
Carson, S. (2006, April). Creativity and mental illness. Invitational panel discussion hosted by Yale's Mind Matters Consortium, New Haven, CT.
Carspecken, P. F. (1996). Critical ethnography in educational research: A theoretical and practical guide. New York and London: Routledge.
Charyton, C., & Snelbecker, G. E. (2007). General, artistic, and scientific creativity attributes of engineering and music students. Creativity Research Journal, 19, 213–225. http://dx.doi.org/10.1080/10400410701397271
Chong, D., & Druckman, J. N. (2007). Framing theory. Annual Review of Political Science, 10, 103–126. http://dx.doi.org/10.1146/annurev.polisci.10.072805.103054
Christensen, P. R., Guilford, J. P., & Wilson, R. C. (1957). Relations of creative responses to working time and instruction. Journal of Experimental Psychology, 53, 82–88. http://dx.doi.org/10.1037/h0045461
Cohen, L., Manion, L., & Morrison, K. (2011). Research methods in education (7th ed.). London: Routledge.
Crisp, V. (2008). The validity of using verbal protocol analysis to investigate the processes involved in examination marking. Research in Education, 79, 1–12.
Crisp, V. (2012). An investigation of rater cognition in the assessment of projects. Educational Measurement: Issues and Practice, 31(3), 10–20.
Cropley, A. J. (1967). Creativity. London, UK: Longmans, Green.
Cropley, A. J. (2005). Creativity and problem-solving: Implications for classroom assessment. Leicester, UK: British Psychological Society.
Cropley, D. H., & Cropley, A. J. (2008). Elements of a universal aesthetic of creativity. Psychology of Aesthetics, Creativity, and the Arts, 2, 155–161. http://dx.doi.org/10.1037/1931-3896.2.3.155
Demirkan, H., & Hasirci, D. (2009). Hidden dimensions of creativity elements in design process. Creativity Research Journal, 21, 294–301. http://dx.doi.org/10.1080/10400410902861711
Dollinger, S. J., & Shafran, M. (2005). Notes on Consensual Assessment Technique in creativity research. Perceptual and Motor Skills, 100, 592–598.
Eisenberg, J., & Thompson, W. F. (2003). A matter of taste: Evaluating improvised music. Creativity Research Journal, 15, 287–296. http://dx.doi.org/10.1080/10400419.2003.9651421
Engel, L. C., & Ortloff, D. H. (2009). From the local to the supranational: Curriculum reform and the production of the ideal citizen in two federal systems, Germany and Spain. Journal of Curriculum Studies, 41, 179–198. http://dx.doi.org/10.1080/00220270802541147
Entman, R. M. (1993). Framing: Toward clarification of a fractured paradigm. Journal of Communication, 43, 51–58. http://dx.doi.org/10.1111/j.1460-2466.1993.tb01304.x
Fabrizi, M. S., & Pollio, H. R. (1987). Are funny teenagers creative? Psychological Reports, 61, 751–761. http://dx.doi.org/10.2466/pr0.1987.61.3.751
Friedrich, T. L., & Mumford, M. D. (2009). The effects of conflicting information on creative thought: A source of performance improvements or decrement? Creativity Research Journal, 21, 265–281. http://dx.doi.org/10.1080/10400410902861430
Gamson, W. A. (1989). News as framing: Comments on Graber. American Behavioral Scientist, 33, 157–161. http://dx.doi.org/10.1177/0002764289033002006
Goffman, E. (1974). Frame analysis: An essay on the organization of experience. Boston: Northeastern University Press.
Guilford, J. P. (1950). Creativity. American Psychologist, 5, 444–454. http://dx.doi.org/10.1037/h0063487
Haller, C. S., Courvoisier, D. S., & Cropley, D. H. (2011). Perhaps there is accounting for taste: Evaluating the creativity of products. Creativity Research Journal, 23, 99–109. http://dx.doi.org/10.1080/10400419.2011.571182


Hekkert, P., & Van Wieringen, P. C. W. (1996). Beauty in the eye of expert and nonexpert beholders: A study in the appraisal of art. American Journal of Psychology, 109, 389–407.
Hickey, M. (2001). An application of Amabile's Consensual Assessment Technique for rating the creativity of children's musical composition. Journal of Research in Music Education, 49, 234–244. http://dx.doi.org/10.2307/3345709
Hu, W., & Adey, P. (2002). A scientific creativity test for secondary school students. International Journal of Science Education, 24, 389–403. http://dx.doi.org/10.1080/09500690110098912
Hu, W., Wu, B., Jia, X., Yi, X., Duan, C., Meyer, W., et al. (2013). Increasing students' scientific creativity: The "Learn to Think" intervention program. Journal of Creative Behavior, 47, 3–21. http://dx.doi.org/10.1002/jocb.20
Humke, C., & Schaefer, C. E. (1996). Sense of humor and creativity. Perceptual and Motor Skills, 82, 544–546. http://dx.doi.org/10.2466/pms.1996.82.2.544
Kaufman, J. C., & Baer, J. (2012). Beyond new and appropriate: Who decides what is creative? Creativity Research Journal, 24, 83–91. http://dx.doi.org/10.1080/10400419.2012.649237
Kaufman, J. C., Baer, J., & Gentile, C. A. (2004). Differences in gender and ethnicity as measured by ratings of three writing tasks. Journal of Creative Behavior, 39, 56–69. http://dx.doi.org/10.1002/j.2162-6057.2004.tb01231.x
Kaufman, J. C., Baer, J., Cole, J. C., & Sexton, J. D. (2008). A comparison of expert and nonexpert judges using the Consensual Assessment Technique. Creativity Research Journal, 20, 171–178. http://dx.doi.org/10.1080/10400410802059929
Kaufman, J. C., Plucker, J. A., & Baer, J. (2008). Essentials of creativity assessment. New Jersey: Wiley.
Kaufman, J. C., Baer, J., & Cole, J. C. (2009). Expertise, domains, and the Consensual Assessment Technique. Journal of Creative Behavior, 43, 223–233. http://dx.doi.org/10.1002/j.2162-6057.2009.tb01316.x
Kaufman, J. C., Niu, W., Sexton, J. D., & Cole, J. C. (2010). In the eye of the beholder: Differences across ethnicity and gender in evaluating creative work. Journal of Applied Social Psychology, 40, 496–511. http://dx.doi.org/10.1111/j.1559-1816.2009.00584.x
Kaufman, J. C., Baer, J., Cropley, D. H., Reiter-Palmon, R., & Sinnett, S. (2013). Furious activity vs. understanding: How much expertise is needed to evaluate creative work? Psychology of Aesthetics, Creativity, and the Arts, 7, 332–340. http://dx.doi.org/10.1037/a0034809
Kettner, N. W., Guilford, J. P., & Christensen, P. R. (1959). A factor-analytic study across the domains of reasoning, creativity, and evaluation. Psychological Monographs: General and Applied, 73(Whole No. 479), 1–31. http://dx.doi.org/10.1037/h0093745
Koestler, A. (1964). The act of creation. London: Macmillan.
Kozbelt, A. (2004). Originality and technical skill as components of artistic quality. Empirical Studies of the Arts, 22, 157–170. http://dx.doi.org/10.2190/NDR5-G09N-X7RE-34H7
Kozbelt, A. (2008). Hierarchical linear modeling of creative artists' problem solving behaviors. Journal of Creative Behavior, 42, 181–200. http://dx.doi.org/10.1002/j.2162-6057.2008.tb01294.x
Kozbelt, A., Beghetto, R. A., & Runco, M. A. (2010). Theories of creativity. In J. C. Kaufman, & R. J. Sternberg (Eds.), The Cambridge handbook of creativity (pp. 20–47). NY: Cambridge University Press.
Krippendorff, K. (2004). Content analysis: An introduction to its methodology. Thousand Oaks, CA: Sage.
Larson, M. C., Thomas, B., & Leviness, P. O. (1999). Assessing the creativity of engineers. Design Engineering Division: Successes in Engineering Design Education, Design Engineering, 102, 1–6.
Lin, C., Hu, W., Adey, P., & Shen, J. (2003). The influence of CASE on scientific creativity. Research in Science Education, 33, 143–162. Retrieved from http://link.springer.com/journal/11165
Long, H. (2014a). An empirical review of research methodologies in creativity studies. Manuscript submitted for publication.
Long, H. (2014b). Rater effects in creativity assessment: A mixed methods approach. Manuscript submitted for publication.
McPherson, J. H. (1963). A proposal for establishing ultimate criteria for measuring creative output. In C. W. Taylor, & F. Barron (Eds.), Scientific creativity: Its recognition and development (pp. 24–29). New York, NY: Wiley.
Merriam, S. B. (2002). Qualitative research in practice: Examples for discussion and analysis. New York, NY: John Wiley & Sons.
Mumford, M. D., Mobley, M. I., Uhlman, C. E., Reiter-Palmon, R., & Doares, C. (1991). Process-analytic models of creative capacities. Creativity Research Journal, 4, 91–122. http://dx.doi.org/10.1080/10400419109534380
Mumford, M. D., Blair, C. S., & Marcy, R. T. (2006). Alternative knowledge structures in creative thought: Schema, associations, and cases. In J. C. Kaufman, & J. Baer (Eds.), Creativity and reason in cognitive development (pp. 117–136). Cambridge: Cambridge University Press.
Mumford, M. D., Vessey, W. B., & Barrett, J. D. (2008). Commentary: Measuring divergent thinking: Is there really one solution to the problem? Psychology of Aesthetics, Creativity, and the Arts, 2, 86–88. http://dx.doi.org/10.1037/1931-3896.2.2.86
Murdock, M. C., & Ganim, R. M. (1993). Creativity and humor: Integration and incongruity. Journal of Creative Behavior, 27, 57–70. http://dx.doi.org/10.1002/j.2162-6057.1993.tb01387.x
Ortloff, D. H. (2007). Holding to tradition: Citizenship, diversity and education in post-unification Germany, a case-study of Bavaria (unpublished doctoral dissertation). Bloomington: Indiana University.
Perez-Fabello, M. J., & Campos, A. (2011). Dissociative experiences, creative imagination, and artistic production in students of fine arts. Thinking Skills and Creativity, 6, 44–48. http://dx.doi.org/10.1016/j.tsc.2010.11.00
Plucker, J. A., & Makel, M. C. (2010). Assessment of creativity. In J. C. Kaufman, & R. J. Sternberg (Eds.), The Cambridge handbook of creativity (pp. 48–73). NY: Cambridge University Press.
Plucker, J. A., Holden, J., & Neustadter, D. (2008). The criterion problem and creativity in film: Psychometric characteristics of various measures. Psychology of Aesthetics, Creativity, and the Arts, 2, 190–196. http://dx.doi.org/10.1037/a0012839
Rein, M., & Schon, D. (1993). Reframing policy discourse. In F. Fischer, & J. Forester (Eds.), The argumentative turn in policy analysis and planning. London: UCL Press Limited and Duke University Press.
Reiter-Palmon, R., Mumford, M. D., Boes, J. O., & Runco, M. A. (1997). Problem construction and creativity: The role of ability, cue consistency, and active processing. Creativity Research Journal, 10, 9–23. http://dx.doi.org/10.1207/s15326934crj1001_2
Reiter-Palmon, R., Illies, M. Y., Buboltz, C., Cross, K. L., & Nimps, T. (2009). Creativity and domain specificity: The effect of task type on multiple indexes of creative problem-solving. Psychology of Aesthetics, Creativity, and the Arts, 3, 73–80. http://dx.doi.org/10.1037/a0013410
Rhodes, M. (1961). An analysis of creativity. Phi Delta Kappan, 42, 305–311.
Rothenberg, A., & Hausman, C. R. (1976). The creativity question. Durham, NC: Duke University Press.
Runco, M. A. (1984). Teachers' judgments of creativity and social validation of divergent thinking tests. Perceptual and Motor Skills, 59, 711–717. http://dx.doi.org/10.2466/pms.1984.59.3.711
Runco, M. A. (1985). Reliability and convergent validity of ideational flexibility as a function of academic achievement. Perceptual and Motor Skills, 61, 1075–1081. http://dx.doi.org/10.2466/pms.1985.61.3f.1075
Runco, M. A. (1986). Maximal performance on divergent thinking tests by gifted, talented, and nongifted children. Psychology in the Schools, 23, 308–315. http://dx.doi.org/10.1002/1520-6807(198607)23:3<308::AID-PITS2310230313>3.0.CO;2-V
Runco, M. A. (1988). Creativity research: Originality, utility, and integration. Creativity Research Journal, 1, 1–7. http://dx.doi.org/10.1080/10400418809534283
Runco, M. A. (1996). Personal creativity: Definition and developmental issues. New Directions for Child Development, 72, 3–30. http://dx.doi.org/10.1002/cd.23219967203
Runco, M. A. (2003). Education for creative potential. Scandinavian Journal of Educational Research, 47, 317–324.
Runco, M. A., & Albert, R. S. (1985). The reliability and validity of ideational originality in the divergent thinking of academically gifted and nongifted children. Educational and Psychological Measurement, 45, 483–501. http://dx.doi.org/10.1177/001316448504500306


Runco, M. A., & Charles, R. E. (1993). Judgments of originality and appropriateness as predictors of creativity. Personality and Individual Differences, 15, 537–546. http://dx.doi.org/10.1016/0191-8869(93)90337-3
Runco, M. A., & Jaeger, G. J. (2012). The standard definition of creativity. Creativity Research Journal, 24, 92–96. http://dx.doi.org/10.1080/10400419.2012.650092
Runco, M. A., McCarthy, K. A., & Svenson, E. (1994). Judgments of the creativity of artwork from students and professional artists. Journal of Psychology, 128, 23–31.
Runco, M. A., Illies, J. J., & Eisenman, R. (2005). Creativity, originality, and appropriateness: What do explicit instructions tell us about their relationships? Journal of Creative Behavior, 39, 137–148. http://dx.doi.org/10.1002/j.2162-6057.2005.tb01255.x
Sadler, D. R. (1989). Formative assessment and the design of instructional systems. Instructional Science, 18, 119–144.
Scheufele, D. A. (1999). Framing as a theory of media effects. Journal of Communication, 49, 103–122. http://dx.doi.org/10.1111/j.1460-2466.1999.tb02784.x
Seidman, I. (2006). Interviewing as qualitative research: A guide for researchers in education and the social sciences (3rd ed.). NY: Teachers College Press.
Shapiro, R. J. (1970). The criterion problem. In P. E. Vernon (Ed.), Creativity (pp. 257–269). New York: Penguin.
Silvia, P. J., Winterstein, B. P., Willse, J. T., Barona, C. M., Cram, J. T., Hess, K. I., et al. (2008). Assessing creativity with divergent thinking tasks: Exploring the reliability and validity of new subjective scoring methods. Psychology of Aesthetics, Creativity, and the Arts, 2, 68–85. http://dx.doi.org/10.1037/1931-3896.2.2.68
Simonton, D. K. (1990). History, chemistry, psychology, and genius: An intellectual autobiography of historiometry. In M. A. Runco, & R. S. Albert (Eds.), Theories of creativity (pp. 92–115). Newbury Park, CA: Sage.
Stein, M. I. (1953). Creativity and culture. Journal of Psychology, 36, 311–322. http://dx.doi.org/10.1080/00223980.1953.9712897
Stemler, S. E. (2004). A comparison of consensus, consistency, and measurement approaches to estimating interrater reliability. Practical Assessment, Research & Evaluation, 9(4). Retrieved from http://PAREonline.net/getvn.asp?v=9&n=4
Sternberg, R. J. (1985). Implicit theories of intelligence, creativity, and wisdom. Journal of Personality and Social Psychology, 49, 607–627. http://dx.doi.org/10.1037/0022-3514.49.3.607
Sternberg, R. J. (1988). The triarchic mind: A new theory of human intelligence. New York: Viking.
Sternberg, R. J., & Lubart, T. I. (1999). The concept of creativity: Prospects and paradigms. In R. J. Sternberg (Ed.), Handbook of creativity (pp. 3–15). New York, NY: Cambridge University Press.
Taylor, C. W., & Barron, F. (Eds.). (1963). Scientific creativity: Its recognition and development. New York, NY: Wiley.
Terkildsen, N., & Schnell, F. (1997). How media frames move public opinion: An analysis of the Women's Movement. Political Research Quarterly, 50, 879–900. http://dx.doi.org/10.1177/106591299705000408
Thorson, J. A., & Powell, F. C. (1993). Development and validation of a multidimensional sense of humor scale. Journal of Clinical Psychology, 49, 13–23. http://dx.doi.org/10.1002/1097-4679(199301)49:1<13::AID-JCLP2270490103>3.0.CO;2-S
Torrance, E. P. (1979). The search for satori and creativity. Buffalo, NY: Creative Education Foundation.
Treadwell, Y. (1970). Humor and creativity. Psychological Reports, 26, 55–58.
Van Gundy, A. B. (1984). Managing group creativity. New York: American Management Association.
Weisberg, P. S., & Springer, K. J. (1961). Environmental factors in creative function. Archives of General Psychiatry, 5, 554–564. http://dx.doi.org/10.1001/archpsyc.1961.01710180038005
Wilson, R. C., Guilford, J. P., & Christensen, P. R. (1953). The measurement of individual differences in originality. The Psychological Bulletin, 50, 362–370. http://dx.doi.org/10.1037/h0060857
Wilson, R. C., Guilford, J. P., Christensen, P. R., & Lewis, D. J. (1954). A factor-analytic study of creative-thinking abilities. Psychometrika, 19, 297–311. http://dx.doi.org/10.1007/BF02289230
Ziv, N., & Keydar, E. (2009). The relationship between creative potential, aesthetic response to music, and music preferences. Creativity Research Journal, 21, 125–133. http://dx.doi.org/10.1080/10400410802633764
