
December 2017

Volume 1, Issue 2

Editorial Notes

Welcome to the second edition of The CALT Report. In this issue, we feature an article about implementing a teacher-based assessment system in a general English program in Australia, and we learn from the experiences of a teacher tasked with creating a new program-wide assessment instrument. Our member spotlight shines on Emma Tudor in San Francisco, and quotes provide "food for thought" about assessment. In addition, all our photos were taken at the 2017 CALT Conference, November 3-4 at the University of Arizona, Tucson.

We hope you were able to attend one of our online workshops this past year. If not, look for more sessions in 2018. We hold two types of 60-minute meetings:
• Talkshops are unstructured Q&A discussions about anything related to classroom assessment.
• Workshops are more structured sessions about a particular area of classroom assessment, e.g., assessing speaking, and typically involve a background reading to help focus the discussion. Attendees can bring any questions or issues related to the topic to the session for discussion.

The CALT Group is a professional community of language teachers, assessment coordinators, and other educators. It is a community of practice for anyone interested in effective classroom assessment in ESL/EFL programs. Visit our website at http://groupspaces.com/CALT.

Please feel free to send your feedback on the newsletter or any of the online workshops to CALT@groupspaces.com. Email us if you would like to make a contribution to The CALT Report, as well.

CALT Conference co-chairs Mariana Menchola-Blanco and Eddy White get ready for the Welcome Ceremony. (Eddy shows off his leg brace after recent knee surgery.)
1
Feature Article

Implementing and learning from teacher-based assessment in a General English course
BY KYLE SMITH AND LAUREN FAULL-ORTIZ
Kyle is the Assistant Programs Manager and Lauren is the Coordinator of the General English program
at Queensland University of Technology International College in Brisbane, Australia.
INTRODUCTION
"Classroom assessment is an ongoing process through which teachers and students interact to promote greater learning. The assessment process involves using a range of strategies to make decisions regarding instruction and gathering information about student performance or behavior in order to diagnose students' problems, monitor their progress, or give feedback for improvement" (Butler & McMunn, 2006).

The above definition of 'classroom assessment' was included in the very first CALT Report. It is, however, somewhat problematic: on the basis of this definition, it is difficult to distinguish 'classroom assessment' from similar terms, including 'formative assessment' and 'assessment for learning'. In a 2009 paper about 'teacher-based assessment' (TBA), Davison and Leung claim that all these various terms are "used interchangeably to refer to the same practices and procedures" and...

"...tend to be used to signify a more teacher-mediated, context-based, classroom-embedded assessment practice, explicitly or implicitly defined in opposition to traditional externally set and assessed large scale formal examinations used primarily for selection and/or accountability purposes" (p. 395).

As Davison and Leung (2009) conceive of it, TBA has several important characteristics:
1. It happens in classrooms, as opposed to testing centers or halls;
2. Students' own teachers are directly involved in:
     a. Planning, designing and modifying assessment tasks and activities,
     b. Giving immediate and constructive feedback, and
     c. Making judgments about their students' performance and work;
3. Students are more actively involved, perhaps through self- and peer-assessment; and
4. Teachers' judgments are based on a range of samples of student work, collected over a period of time.

Certainly, this type of assessment practice is something that we have sought to introduce to the General English (GE) program at the Queensland University of Technology International College (QUTIC). In this article, we will discuss Davison and Leung's (2009) model of TBA, what it looks like in practice here at QUTIC, and what we have learned in 2017 as we have developed and implemented a TBA system.

BACKGROUND
Many students in QUTIC's GE program are taking the first step in their pathway through our English for Academic Purposes (EAP) program into undergraduate and postgraduate courses. Pathway students typically enroll in two five-week sessions of GE, ahead of an EAP Entry Test; at the end of each session, teachers make a summative judgment as to whether their students will progress to the next five-week stage of the program or repeat the current one.

Prior to 2017, to assist teachers in making these progression recommendations, GE students took eight standardized tests – two for every macro-skill – every five weeks. The reading and listening tests were adapted from commercially available materials by a team of four ongoing language educators, who also developed the writing and speaking tests from scratch. More broadly, this assessment system had essentially the same goal as

continued on p. 3

2
"Implementing and learning from TBA" continued...
large-scale English language proficiency testing: maximizing fairness through reliability, understood in psychometric terms. To achieve this goal, student and teacher attention was focused "on highly discrete and defined tasks and … standardized procedures and standardized outcomes" (Murphy, 1994, p. 194).

In the case of reading and listening tests, these tasks consisted of selected response and short gap-fill items. Results were reported to students as a percentage for each test; sometimes students also received feedback on how they performed on the different task types, e.g., "You got all of the multiple-choice questions correct but had difficulty with gap fill so work on your vocabulary and spelling."

In the case of writing tests, students completed one timed, impromptu task consisting of either one or two paragraphs, depending on their level; at Pre-Intermediate and Intermediate levels, they were essentially writing mini-essays. Students received an overall grade for each test of either Satisfactory or Needs work and feedback in several different forms, including a checklist and error correction codes, all created internally. Speaking tests often consisted of highly structured and well-rehearsed dialogues or monologues; the grading and feedback were very similar to the writing tests. These tasks generally led to "simplistic judgments" of students' capabilities (Hamp-Lyons, 1994, p. 52) and made it quite difficult for teachers to confidently "evaluate higher-order composing skills including creativity, problem solving, and metacognition which are badly needed in students' future study and professional careers" (Ibid, p. 86). Given that these are also increasingly important as students progress through the GE program, onto the EAP program and into university entry programs and beyond, their under-representation in the assessment system was a significant drawback.

Additionally, the work produced by students in response to the standardized tasks seemed to us to demonstrate a very narrow dynamic range. This is an audio production concept which, very simply, refers to the difference in, say, a recording of a vocal performance between the loudest and quietest notes; to standardize the listening experience across different playback devices (e.g., car stereos, smartphones, etc.), this difference is typically reduced by lowering the volume of the loud parts and raising the volume of the quiet parts. We've applied the concept to language assessment to refer to the qualitative differences (in relation to criteria such as overall effectiveness, use of grammar and vocabulary, etc.) between stronger and weaker responses by students to test tasks, both written and spoken. We noticed that the GE writing and speaking test tasks were significantly reducing the dynamic range of responses – students of higher ability were producing mediocre responses which were hard to distinguish from those produced by students of lower ability – making it harder for teachers to make confident progression recommendations.

Feedback to aid the learning and teaching processes had not been given much attention during the development of the assessment system; the emphasis was much more on the summative function than the formative. Consequently, the system did not facilitate direct or unambiguous communication with students about their learning; rather, it used "a form of code that only the most effective learners can decode" (Boud, 2000, p. 155). This code took several different forms (i.e., percentages, Satisfactory, Needs Work, error correction codes) but these seemed likely to "act as a barrier to student understanding" because, when using such a code, "teachers encode their comments into a symbol whose meaning is not shared only to have students attempt to decode this symbol to gain feedback" (Ibid, p. 157).

continued on p. 4
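The audio analogy can be made concrete with a small sketch. The numbers below are ours, purely illustrative: compression pulls every level part-way toward a middle value, shrinking the gap between the loudest and quietest, which is the effect the authors describe in students' responses.

```python
# A minimal sketch of dynamic-range compression, the audio concept the
# authors borrow: loud parts are turned down and quiet parts turned up,
# so the gap between strongest and weakest shrinks. Values are invented.

def compress(levels, target=0.5, ratio=0.5):
    """Pull each level part-way toward `target`; ratio=0.5 halves
    each level's distance from the target."""
    return [target + (x - target) * ratio for x in levels]

levels = [0.1, 0.3, 0.9]     # a quiet, a middling, and a loud "performance"
squeezed = compress(levels)  # roughly [0.3, 0.4, 0.7]

original_range = max(levels) - min(levels)
squeezed_range = max(squeezed) - min(squeezed)
print(round(original_range, 2), round(squeezed_range, 2))  # 0.8 0.4
```

In the article's terms, the standardized tasks acted like the compressor: strong and weak responses ended up close together, which made progression decisions harder.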

Jarred Brinkman from Iowa State University presents at CALT 2017.

Jacqueline Church and Cyndriel Miemban from Northern Arizona University prepare for their talk.
3
"Implementing and learning from TBA" continued...
IMPLEMENTING TBA AT QUTIC
Starting in March 2017, the current GE team, consisting of the authors and English language teachers, took several important steps away from this system towards TBA. We organised a workshop with the teachers and discussed with them Wiggins's belief that "all assessment is subjective; the task is to make the judgment defensible and credible" (1994, p. 136). We informed them that, in order to achieve a more defensible and credible assessment system, we would:
     • Reduce the number of tests per session to just one per macro-skill
     • Use 'assessment task cycles' to 'fill the gap' created by the missing tests (Figure 1).

In designing this 'task cycle', we had three goals:
    1. To improve the formative assessment in terms of both quantity and quality,
    2. To indicate to teachers and students the importance of giving, receiving and acting on feedback, and
    3. To help teachers gather a richer and more trustworthy representation of students' capabilities and achievement to support summative judgments.

In Figure 1, the arrows labelled 'Collect' indicate stages where the teacher could gather evidence of students' capabilities, usually in the form of scanned documents or audio recordings saved to a shared drive.

continued on p. 9

Figure 1

4
CALT Best Practices
This section presents recommended assessment practices
and principles that teachers should incorporate in the
courses they teach to maximize and measure student learning,
as well as to develop their own assessment literacy.
(Compiled by Eddy White, PhD) 

High-Quality Assessment
High-quality assessment practices are those that provide results verifying and promoting targeted student learning.
There are a number of fundamental aspects of such high-quality practices.
----------------------------------------------------------------------------------------------------------------------------------------------------------
                                                     High Quality Assessment Practices (Cheng & Fox, 2017)
----------------------------------------------------------------------------------------------------------------------------------------------------------
1. Alignment: The degree of agreement among curriculum, instruction, standards, and assessments (tests). In order to
achieve alignment, we need to select appropriate assessment methods which reflect or represent clear and appropriate
learning outcomes or goals.
----------------------------------------------------------------------------------------------------------------------------------------------------------
2. Validity: The appropriateness of inferences, uses, and consequences that result from assessment. This means that a
high-quality assessment process (i.e., the gathering, interpreting, and using of the information elicited) is sound,
trustworthy and legitimate based on the assessment results.
----------------------------------------------------------------------------------------------------------------------------------------------------------
3. Reliability: The consistency, stability, and dependability of the assessment results are related to reliability. This
quality criterion guards against the various errors of our assessments (e.g., inconsistency in marking student errors).
----------------------------------------------------------------------------------------------------------------------------------------------------------
4. Fairness: This is achieved when students are provided with an equal opportunity to demonstrate achievement and
assessment yields scores that are comparably valid. It requires transparency, in that all students know the learning
targets, criteria for success, and on what and how they will be assessed. Fairness also means students are given equal
opportunities to learn, and assessment tasks and procedures avoid stereotyping and bias.
----------------------------------------------------------------------------------------------------------------------------------------------------------
5. Consequences: This term is associated with the results of the use or misuse of assessment results. The term
washback (the influence of testing on teaching and learning) is now commonly used. Assessment can motivate (and
when it is of low quality, potentially demotivate) students to learn. The student-teacher relationship is influenced by
the nature of assessment.
----------------------------------------------------------------------------------------------------------------------------------------------------------
6. Practicality and Efficiency: Considerations are given to the information that is gathered by assessment. A teacher’s
life is extremely busy, and this influences the choice of assessment events, tools and processes. Are the resources,
effort and time required for assessment worth the investment? 
----------------------------------------------------------------------------------------------------------------------------------------------------------
Our classroom assessment principles and practices should incorporate these fundamental aspects of high-quality
assessment (EW). 

Best practices in this edition of the newsletter come from the recent book, Assessment in the Language Classroom
(Cheng & Fox, 2017, pp.11-12). 
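The reliability criterion above can be illustrated with a quick check of marking consistency. A minimal sketch using simple percent agreement; the marks are hypothetical, and real reliability work would use a formal coefficient such as Cohen's kappa rather than raw agreement:

```python
# Percent agreement between two raters marking the same ten scripts with
# the Satisfactory / Needs work scale mentioned in the feature article.
# All marks are invented, purely to illustrate the calculation.

rater_a = ["S", "S", "N", "S", "N", "S", "S", "N", "S", "S"]  # S = Satisfactory
rater_b = ["S", "N", "N", "S", "N", "S", "S", "S", "S", "S"]  # N = Needs work

matches = sum(a == b for a, b in zip(rater_a, rater_b))
agreement = matches / len(rater_a)
print(f"{agreement:.0%} agreement")  # 8 of 10 marks match -> 80% agreement
```

Low agreement between raters is exactly the "inconsistency in marking" that the reliability criterion guards against.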

5
Teacher as Learner
BY MICHAEL SCHWARTZ, St. Cloud University, MN

Considering the teacher also as a learner in the classroom (and workplace), in this section we ask a member to describe a memorable assessment experience and the learning that occurred.

Authenticity, Integration, and Assessment: Lessons Learned

Norm referenced, criterion referenced, summative, formative, discrete point, global, top-down, bottom-up, integrated, authentic, teacher-made, standardized, multiple choice, fill in the blank, cloze, short answer, etc.… Am I doing the right thing? These terms, their applications, implementation, and most importantly, implications for students' futures were all swirling around in my head and turning in my stomach as I was writing the mid-term programmatic assessment instrument for the EAP Listening and Speaking bridge classes at the university where I work.

I have been working in adult ESL for over 25 years and have written unit tests and final exams for specific classes, proctored standardized tests, served as a holistic grader for writing and speaking assessments, and even directed calibration trainings. However, not until I was asked to take on the responsibilities of director for an established English for Academic Purposes program, where teachers and staff had a better understanding of the semester processes than I, had I been given the responsibility of creating an assessment instrument intended to be used for measuring student progress in the first seven weeks of the semester. I felt the weight of the institution, the students' futures, the teachers' expectations, and more on my shoulders. The pressure had become nearly debilitating. What if the instrument I created was ineffective? What if the format was too difficult or too easy? What would it ultimately tell us about our students' listening and speaking skills? More specifically, what would it tell us about their academic oral/aural skills? What impact would the test have on the students' identities, motivation, and investment?

I decided to design an "integrative" test in which students listened to a talk from the popular TED Talks website. Students were asked to listen to the talk, take notes, and then answer several short-answer or multiple-choice test items. Students were only allowed to listen once, since that is their reality in their other university classes. The results were, as you might predict, awful. To be fair, I decided to curve the test by 30 points, allowing students who scored 40%, 50%, and 60% to receive a passing grade. Of course, this meant that some students earned over 100%, but that was okay.

On reflection, I learned several important things. First and foremost, I learned that you must try, but that in trying you must also be open to learning from your mistakes. My first attempt at program-wide assessment was a disaster, but with adjustments, the final exam will be a better instrument for measuring language development. Second, I learned that while integrative assessment instruments are still the most authentic, there are some realities to authenticity that cannot be replicated within the context of

continued on p. 7

6
"Teacher as Learner" continued...

summative assessment. In this instance, I was reminded that students had not been prepared for the TED Talks topic in the same way that students enrolled in Biology 101, for example, had been. After all, the students in Bio 101 have had half of a semester's worth of biology content presented to them. Thus, in this sense, an integrative skills assessment instrument is not authentic at all. Rather, from the students' perspective it is a crap-shoot in which they have no control. Third, I know that I have a lot to learn about assessment and have compiled a holiday reading list. While it is important to be concerned about and to understand the terms and processes identified at the beginning, ultimately, if you are fair to your students, then their identities, motivations, and investments will remain intact and they, too, will learn.
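Michael's 30-point curve is easy to state precisely. A small sketch, with a hypothetical 70% passing threshold (the article implies but does not state the pass mark) and invented raw scores:

```python
# Flat 30-point curve as described: scores of 40%, 50%, and 60% become
# passing, and high scorers can exceed 100%. The 70 threshold and the
# raw scores below are assumptions, purely for illustration.

PASS_MARK = 70
CURVE = 30

def curved(score):
    # Deliberately uncapped: the article notes some students exceeded 100%.
    return score + CURVE

raw_scores = [40, 55, 60, 90]
results = [(s, curved(s) >= PASS_MARK) for s in raw_scores]
print(results)     # [(40, True), (55, True), (60, True), (90, True)]
print(curved(90))  # 120, i.e., over 100%
```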

Left: Nasrin Nazemi from the University of Washington presents at CALT 2017 in Tucson.

Below Left: Justin Jernigan, Ph.D., from Georgia Gwinnett College and Yingling Liu, Ph.D., a visiting scholar at the University of Arizona, give their conference presentation.

-----------------------------------------------
THE CALT REPORT CONTRIBUTORS
 DECEMBER 2017
Editor
Eddy White
Production & Design
Holly Wehmeyer
Contributors
Isabella Anikst
Jarred Brinkman
Jacqueline Church
Lauren Faull-Ortiz
Justin Jernigan
Yingliang Liu
Mariana Menchola-Blanco
Cyndriel Miemban
Nasrin Nazemi
Michael Schwartz
Kyle Smith
Michael Thomas
Emma Tudor
---------------------------------------------------
7
Member Spotlight

Emma Tudor

Emma is the Director of Academic Management for EF International Campuses in the U.S. and Canada.

Q. Describe your current position and teaching situation.
A. I am currently the Director of Academic Management for EF International Campuses in the USA and Canada, responsible for academics in 14 language schools.

Q. What assessment responsibilities do you have?
A. I work with the global central team on developing assessment policies and processes which are implemented across all our worldwide schools.

Q. Does your program have an Assessment Committee?
A. Essentially, yes. It's part of the central academic team but we do not work solely on assessment.

Q. If yes, how does the Assessment Committee function?
A. We have monthly online meetings and annual in-person meetings where we review assessment policies and processes. We take feedback from the school staff, teachers, and administrators, as well as students, and look at feedback and assessment results to inform decisions.

Q. In your context, with your students, what assessment-related issues do you and your colleagues have?
A. Consistency of grading, calculating grade averages, students' comprehension of the grading system, and communication of that.

Q. What specific assessment challenges do you or your colleagues face in the classroom?
A. A lot of our assessment is task based or communication focused, which is quite subjective. It can, therefore, vary from class to class and teacher to teacher. While rubrics are provided, they can be interpreted differently.

Q. What have you learned about assessment in the past year that you have tried to incorporate into your teaching practices?
A. Transparency and communication about assessment with students are vital. Students need to know what they are being graded on, how that is weighted, and how they can improve upon that. I have encouraged giving students a copy of the assessment criteria policy and talking through it with them in class, giving 1-to-1 feedback to students about their grades and assessment results, and giving them advice on how they can improve. Mentoring and providing feedback is an essential part of the assessment process, I believe.

Q. What is an area of assessment that you want to learn more about?
A. How we can be more consistent and accurate with formative assessment.

Q. When you want to forget about assessment, teaching, and work, what do you like to do to relax?
A. Travelling the world and walking my Boston Terrier.
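The weighting Emma wants students to understand is plain arithmetic. A quick sketch with invented category weights and scores (not EF policy, purely illustrative):

```python
# Weighted course grade: each category's score is scaled by its weight,
# and the weights sum to 1. All numbers here are hypothetical.

weights = {"tasks": 0.5, "participation": 0.2, "final": 0.3}
scores  = {"tasks": 80,  "participation": 90,  "final": 70}

grade = sum(weights[k] * scores[k] for k in weights)
print(round(grade, 1))  # 0.5*80 + 0.2*90 + 0.3*70 = 79.0
```

Showing students this breakdown makes it clear why, say, a strong final cannot fully offset weak coursework when coursework carries half the weight.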

8
"Implementing and learning from TBA" continued...
At this workshop in March, we also initiated a curriculum development process comprising the following steps:
------------------------------------------------------------------------------
Step            Activity
------------------------------------------------------------------------------
   1                Teachers work in groups to design one assessment task cycle (typically involving integration of at least two macro-skills) for each five-week stage of the course. The GE Coordinator also designs task cycles.
------------------------------------------------------------------------------
   2                Completed task cycles are collected by the GE Coordinator and shared with the entire GE teaching team.
------------------------------------------------------------------------------
   3                New task cycles are implemented in classrooms by teachers.
------------------------------------------------------------------------------
   4                Students' spoken and written responses are analysed for key performance features.
------------------------------------------------------------------------------
   5                Insights into the tasks, the teaching, and the learners gained through these analyses lead to further changes to assessment task design and are shared with the teaching team.
------------------------------------------------------------------------------

This process has been repeated seven times between April and December 2017, with each iteration taking about five weeks. The overall result has been to align the assessment system more with the four key features of Davison and Leung's (2009) TBA model listed above. We have also learnt a great deal about task design, our teachers, and also the learners and their capabilities.

LESSONS LEARNED

Active involvement of students
TBA has yielded valuable insights for all of us about involving students as active participants in their own learning. We have learned that students can edit their work effectively, making numerous grammar corrections with a high degree of efficacy when converting their work to a different mode (for instance transcribing an audio file or typing written work), with little or no intervention from their teacher or just by referring back to a stimulus text. Also, we have seen that students can justify changes made to their work, demonstrating higher-order processing skills (e.g., when asked to use the comment tool in Word to defend changes to a draft, student comments focused on clarity of message, the need for further explanation or clarification, emphasis, missing key information, and misquoted or incorrect details).

Dynamic range
One clear trend that has emerged from the collected data is the distinction between strong, average, and weak student performances. The integration of skills in the task cycles and the requirement for students to act on feedback as they refine their initial responses help to highlight students who are confident responding to the challenges of the task (and, by extension, more likely to cope well in the next stage of their academic pathway); those with ability but who are underperforming; and those with a potential need for further support or intervention. Overall, we have found that TBA creates more favorable conditions for students to demonstrate their language abilities than the timed impromptu independent writing tasks and, in this sense, is more fair to students, although it is important to acknowledge that, from a psychometric perspective, reliability is somewhat reduced.

Teachers' judgments
As samples of student work are collected throughout the course, they reflect a wide variety of tasks, topics, and approaches, and also highlight students' abilities to act on feedback; in this way, our TBA system supports more credible and defensible summative judgments about students' work. The tasks involve more authentic, spontaneous language and behaviors, and they are more closely aligned with the practices and values that teachers wish to assess. Also, through this collection process, we have effectively compiled individual student portfolios which have given us a clearer picture as to how different individuals progress through our course, and also some of the risk factors which can help identify students who are not making progress.

continued on p. 10

Isabella Anikst and Michael Thomas from UCLA Extension listen to Eddy make an important point.
9
"Implementing and learning from TBA" continued...
TEACHER LEARNING / DEVELOPMENT If you would like to keep up to date with these efforts, or
By involving teachers in all stages of the TBA decisions — find out more about our activities this year, please feel free
planning, designing, implementing, giving feedback, and to email us at k15.smith@qut.edu.au or
making judgment s— we feel that we have initiated a lauren.faullortiz@qut.edu.au.
teacher learning project which has been both challenging
and rewarding for all concerned. The collected evidence has NOTE
created a record of the GE teachers’ skills, approaches, 1. For more on this, see Fulcher, et al. (2011, p. 23), who
strengths, and learning, and indicated teacher learning in describe an approach to assessment based on ‘thick
the following areas: description’ of student performance and “an analysis of how
     • Increased confidence in designing assessment tasks and people use language in actual communicative contexts”
       ‘owning’ assessment decisions, which is more flexible than a traditional psychometric
     • Feedback targeting practical steps to future approach and does “not assume a linear, unidimensional,
       improvement, and reified view of how second language learners
     • Critical analysis of student work for evidence of communicate”; and “pragmatic, focusing as [it does] upon
       comprehension of reading and listening texts. observable action and performance, while attempting to
relate actual performance to communicative competence.”
Additionally, teacher energy has been redirected from
repetitively marking and preparing students for “highly REFERENCES
discrete and defined tasks” (Murphy, 1994, p. 194) which Boud, D. (2000). Sustainable assessment: Rethinking
generate ambiguous, coded information, to more authentic     assessment for the learning society. Studies in Continuing
tasks that integrate both skills and materials and provide     Education, 22(2), 151-167.
the basis for meaningful discussions with students Butler, S. M., & McMunn, N. D. (2006). A Teacher’s Guide to
regarding what they can do next to further develop.      Classroom Assessment: Understanding and Using
    Assessment to Improve Student Learning. Jossey-Bass.
CONCLUSION Davison, C. & Leung, C. (2009). Current issues in language
To sum up, we have seen that an assessment system with     teacher-based assessment. TESOL Quarterly, 43(3), 393-
the TBA features described by Davison and Leung (2009) can     415.
be successful in our context (i.e., an Australian university Fulcher, G., Davidson, F. & Kemp, J. (2011). ‘Effective rating
language centre GE program). Not only that, we have     scale development for speaking tests: Performance
repeatedly seen teachers and students rising to the     decision trees’. Language Testing, 28(1), 5-29.
challenge and producing work which is delightful, Hamp-Lyons, L. (1994). Interweaving assessment and
surprising, and inspiring. Our misgivings about the previous     instruction in college ESL writing classes. College ESL,
assessment system and our belief in TBA as a better     4(1), 43-55.
alternative have been, we feel, validated many times over Murphy, S. (1994). Portfolios and curriculum reform:
this year.     Patterns in practice. Assessing Writing, 1(2), 175-206.
Wiggins, G. (1994). The constant danger of sacrificing
At the end of 2017, however, our assessment system is a     validity to reliability: Making writing assessment serve
hybrid of TBA and the three standardized tests which occur     writers. Assessing Writing, 1(1), 129-139.
every five-week session. In 2018, our focus will be on
resolving the contradictions and tensions inherent in such a
hybrid, most likely by designing new tests which align more
closely with the TBA tasks. 

10
