Study Overview Measurement and Evaluation Perspectives On Scaling Teacher Affect with Multiple Measures
Judy R. Wilkerson, Ph.D., Florida Gulf Coast University The International Conference on Educational Measurement and Evaluation (ICEME 2012) SEAMEO-INNOTECH Manila, Philippines August 9-11, 2012
(c) 2012. Judy R. Wilkerson, Ph.D. All rights reserved.
Teacher dispositions include the values, attitudes, and beliefs about children, subject matter, and the skills of teaching that cause teachers to act in positive or negative ways. While measurement and evaluation interact closely, they are typically reported separately. The Dispositions Assessment Aligned with Teacher Standards (DAATS) scale of commitment to teaching skills is reviewed using both measurement and evaluation professional standards. Evidence of validity and reliability (measurement) and evidence of utility and feasibility (evaluation) are presented.
7/11/2012
Research Purposes
1. To determine if the 2008 results could be replicated and improved through enhanced scoring rubrics, a more diverse sample of students (undergraduate, master's level, and advanced graduate students), and better connectivity (higher completion rate for all three instruments).
2. To model and describe the integration of measurement and evaluation standards in the review and use of assessment instruments.
Significance
To illustrate the potential for using a mix of quantitative and qualitative analysis to identify high and low levels of teacher dispositions, a critical area that is often underassessed, in the identification, celebration, and remediation of teachers and teacher candidates.
Research Questions
1. What are the psychometric qualities of three DAATS instruments when combined into a single decision-making measure?
2. To what extent do measurement and evaluation standards support the use of the DAATS battery?
Sample
3 instruments: BATS, ETQ, and SRA
190 students in two public universities in Florida
Rasch Measures
Instruments were calibrated using the Andrich rating scale model (Andrich, 1988) of IRT and Winsteps software, version 3.71 (Wright & Linacre, 1998; Linacre, 2011). Items were combined into a single scale that included both dichotomous items (BATS) and rating scale items (ETQ and SRA). To facilitate use, the traditional logit metric (mean of zero, scale of one) was linearly transformed to a mean of 50 and a scale of 10, so that 10 scaled units equal one logit.
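The rescaling can be sketched in a few lines of Python. This is a minimal illustration, assuming the transformation is the simple linear form scaled = 50 + 10 × logit implied by the stated mean of 50 and scale of 10. The function names are hypothetical and are not part of Winsteps.

```python
# Hypothetical helpers illustrating the linear rescaling of Rasch measures.
# Assumption (not confirmed by the source): scaled = 50 + 10 * logit.

SCALE_MEAN = 50.0             # scaled value assigned to 0 logits
SCALE_UNITS_PER_LOGIT = 10.0  # scaled units per logit


def logit_to_scaled(logit: float) -> float:
    """Convert a measure in logits to the 50/10 reporting scale."""
    return SCALE_MEAN + SCALE_UNITS_PER_LOGIT * logit


def scaled_to_logit(scaled: float) -> float:
    """Convert a measure on the 50/10 reporting scale back to logits."""
    return (scaled - SCALE_MEAN) / SCALE_UNITS_PER_LOGIT


def scaled_spread_to_logits(spread: float) -> float:
    """Convert a spread (e.g. a standard deviation) from scaled units to logits.

    Spreads are unaffected by the shift of 50, so only the factor applies.
    """
    return spread / SCALE_UNITS_PER_LOGIT


if __name__ == "__main__":
    print(logit_to_scaled(0.0))
    print(scaled_to_logit(60.0))
```

Under this assumption, a difference of 10 scaled units between two persons or items corresponds to a difference of one logit.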
Sample composition by student status (N = 190): 92 undergraduates, 49 master's level, 10 alternative certification, 19 advanced graduate (Ed.S. candidates), 3 other, and 17 with unknown student status.
Selected Statistics
Means: 50 for items and 58 for persons.
Ranges: 11 to 84 for items and 43 to 83 for persons.
Standard deviations: almost two logits (18.4) for items and about one logit (10.9) for persons.
Mean fit statistics: near the expected values of 1.0 for mean squares and 0.0 for standardized z values.
Of the 70 items, only three exceeded the 1.5 outfit MNSQ expectation (the highest was 2.05), and none exceeded 1.5 in infit.
Cronbach's alpha (KR-20) is estimated at .96.
The person reliability is .87, with separation of about three levels (2.67).
Item reliability and separation are .98 and 7.63, respectively.
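The reliability and separation figures are linked by a standard Rasch relation: reliability R = G² / (1 + G²), where G is the separation index. The sketch below uses hypothetical function names and is not Winsteps output. It checks that each reported pair is internally consistent; because the slide reports two-decimal values, agreement to within about .01 is the most that can be expected.

```python
import math


def reliability_from_separation(separation: float) -> float:
    """Separation reliability R = G^2 / (1 + G^2)."""
    g_squared = separation ** 2
    return g_squared / (1.0 + g_squared)


def separation_from_reliability(reliability: float) -> float:
    """Inverse relation: G = sqrt(R / (1 - R)), defined for 0 <= R < 1."""
    if not 0.0 <= reliability < 1.0:
        raise ValueError("reliability must be in [0, 1)")
    return math.sqrt(reliability / (1.0 - reliability))


if __name__ == "__main__":
    # Separation values as reported on the slide.
    for label, separation in (("person", 2.67), ("item", 7.63)):
        implied = reliability_from_separation(separation)
        print(f"{label}: separation {separation} implies reliability {implied:.3f}")
```

The person separation of 2.67 implies a reliability of roughly .877, and the item separation of 7.63 implies roughly .983, both close to the reported .87 and .98.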
Category Structure
For all 6 categories, the MNSQs are near the expected 1.0 (range .83 to 1.02), with only 3 dropping below .9. Category probabilities are as expected.
Utility
Examined through review of four students whose scores matched expectations. Low scoring student should be counseled; high scoring student could be an effective leader.
Faculty perceptions of individual students matched DAATS results (construct validity).
Scores rise with degree level (another study by Quinn), indicating predictive validity.
No statistically significant differences were found among gender and ethnic categories.
Utility (cont.)
Feasibility
BATS is easy to administer and score; other instruments require a commitment to taking the time. Rater reliability requires training, but FACETS results (next study) indicate it is working.
Conclusions: General
1. The INTASC Principles provide a useful construct definition that can be measured holistically and by Principle.
2. The Thurstone agree/disagree scale contributes to the identification of strongly and weakly committed teachers.
3. The Bloom and Krathwohl affective taxonomy works in assessment, yielding proficiency levels with a credible category structure.
4. Combining affective instruments using different methods into a single Rasch scale overcomes weaknesses inherent in the instrument types.
5. A well-designed measurement device leads to useful, feasible, and accurate evaluation decisions.
6. A qualitative analysis of individual constructed response items enhances Rasch score interpretations, making them more useful for evaluation at the individual and program levels.
Validity: empirical and judgmental evidence (construct, content, predictive).
Reliability: see RQ 1.
Utility: high-quality data for individual and program evaluation (see also the next study).
Feasibility: requires time commitment and rater training.
A mixed methods approach was used to assess the dispositions of 40 early childhood pre-service teachers using four instruments from the DAATS (Dispositions Assessments Aligned with Teacher Standards) battery:
BATS: Beliefs About Teaching Scale
ETQ: Experiential Teaching Questionnaire
SRA: Situational Reflection Assessment
CDC: Candidate Disposition Checklist
The study included quantitative and qualitative analysis of two case studies (one excellent and one needing improvement) and an analysis of four INTASC Principles targeted for program improvement.
Purposes
Determining the extent to which:
(1) quantitative and qualitative data about teacher candidates' dispositions converged with faculty perceptions, and
(2) the instruments and measures provided useful information for candidate counseling and program improvement efforts.
Persons 18 and 22
Person 18:
Faculty perceptions: Enthusiastic about teaching and high in the cognitive domain.
DAATS pinpointed specific, but limited, needs for improvement.
Person 22:
Faculty perceptions: Average student whose interactions with faculty and students have not always shown positive affect toward the teaching profession or people. Lacks enthusiasm for, and knowledge of, the essential skills to become a successful teacher.
Results from 4 instruments converged.
Needed remediation with Principles 3 and 9 (diverse learners and reflection/CI), with a specific focus on people-interaction issues.
Program Evaluation
Three INTASC Principles were targeted for program improvement: 1, 7, and 9.
Goals:
1. Take responsibility for continuous learning about content and skills
2. Make content meaningful to support learning
3. Use non-verbal behaviors, including demeanor, to motivate and teach
4. Work collegially with others to plan, counsel, advocate, and teach
5. Reflect and remain flexible
6. Adapt to evolving needs
7. Think positively about all (students, parents, colleagues, supervisors)
8. Remain open to feedback and input from all stakeholders
9. Report ethical violations of colleagues and friends
Strategies:
1. Lessons and activities in courses
2. Focused discussions with supervising teachers and univ. coordinators
3. Systematic faculty observations
4. Monthly discussions of progress
Conclusions
Research Purpose #1: Quantitative and qualitative data about teacher dispositions converged with faculty perceptions in the three cases analyzed. This supports the use of a mixed methods approach and provides judgmental evidence of construct validity.
Research Purpose #2: The instruments and measures provided useful information for candidate counseling and program improvement efforts. Faculty are now implementing minor and major remedial efforts based on data rather than intuition, with these cases as well as others not described in this article. The study supported the utility of the instruments.