
Running head: TEST REVIEW 1

Test Review for TOEFL iBT and IELTS Academic

Xiayu Guo

Colorado State University



Introduction

With the development of globalization, an increasing number of Chinese students are preparing
to continue their studies in English-speaking countries. The majority of them go to the United
States or Commonwealth countries. Before they pursue further study, they need to decide which
test to take. When they compare tests, they should consider the following questions: Does the
test meet their needs? Does it assess their ability validly? Are the test tasks helpful for
their future study in English-speaking countries? I am writing this review to compare two such
tests for my students.

The tests I review are TOEFL iBT and IELTS Academic. The TOEFL test was originally developed
at the Center for Applied Linguistics and first administered by the Modern Language
Association. In 1973, ETS began administering the exam under the guidance of the TOEFL Board.
The IELTS Academic is managed by the British Council, IDP: IELTS Australia and Cambridge
English Language Assessment. I review these two tests because they are official language
proficiency tests for students who want to be admitted to universities in English-speaking
countries. Both tests assess the 4 language skills to provide an overall proficiency measure.
By reviewing them, I hope to learn how a test is designed and how it assesses students'
language abilities. I envision using the tests in an English class for adult learners (18 to
22 years old) in China who want to continue their studies in the United States.

International English Language Testing System (IELTS)

Publisher: Cambridge English Language Assessment and the British Council

Publication Date: 1989

Target Population: Students seeking entry to a university or institution of higher education

offering degree and diploma courses

Cost: $215-$240 USD

Overview

IELTS is jointly managed by the British Council, IDP: IELTS Australia and Cambridge English
Language Assessment. Its predecessor, the English Language Testing Service (ELTS), was
launched in 1980 by Cambridge English Language Assessment (then UCLES) and the British
Council. IELTS itself went live in 1989 and was revised in 1995; a further revision went live
in 2001. There are three types of IELTS test: IELTS for study, IELTS for migration and IELTS
for work. IELTS for study comprises IELTS Academic and IELTS General Training. The IELTS
Academic test is suitable for entry to study at undergraduate or postgraduate level, and also
for professional registration. An extended description of the IELTS Academic is provided in
Table 1.

Table 1

Test Purpose: The IELTS Academic test is a paper-based test designed for non-native speakers
who want to enroll at undergraduate or postgraduate level or who need professional
registration. The test assesses whether test takers are ready to begin studying in an
English-language environment. It can be used as a proficiency test. The IELTS Academic test
is accepted by most Australian, British, Canadian and New Zealand academic institutions.

Test Structure: The IELTS Academic test consists of 4 components: listening, reading, writing
and speaking. Listening, reading and writing are completed on the same day with no breaks
between them. The speaking component is completed up to a week before or after the other
components. The total test time is 2 hours and 45 minutes.
Listening: There are 4 sections with 10 questions each. Sections 1 and 2 are set in social
situations: section 1 is a conversation between 2 people and section 2 is a monologue.
Sections 3 (a conversation) and 4 (a monologue) are set in academic situations. Each
recording is heard once. The accents include British, Australian, New Zealand, American and
Canadian. The listening component lasts 40 minutes. The task types are multiple choice,
matching, plan/map/diagram labelling, form/note/table/flow-chart/summary completion and
sentence completion.
Reading: There are 40 questions. Test-takers need to read for gist, main ideas and detail,
skim, understand logical argument and recognize opinions, attitudes and purposes. There are
3 long texts, ranging from the descriptive and factual to the discursive and analytical. The
reading component lasts 60 minutes. The task types are multiple choice, identifying
information, identifying the writer's views, matching
information/headings/features/sentence endings,
sentence/summary/note/table/flow-chart/diagram label completion and short-answer questions.
Writing: There are 2 tasks. In task 1, test-takers are presented with a graph, table, chart
or diagram and need to describe, summarize or explain the information in their own words. In
task 2, test-takers are asked to write an essay in response to a point of view, argument or
problem. Responses to both tasks must be in a formal style. Notes or bullet points are not
acceptable. The writing component lasts 60 minutes.
Speaking: There are 3 parts. In part 1, the test-taker is asked general questions about
himself/herself and about familiar topics such as family, interests and study. This part
lasts 4 to 5 minutes. In part 2, the test-taker is given a card which asks him/her to talk
about a particular topic. The test-taker has 1 minute to prepare and 2 minutes to answer.
Then the examiner asks 1 or 2 questions on the same topic. In part 3, the test-taker is asked
further questions about the topic in part 2 and is expected to discuss both abstract and
specific ideas. This part lasts 4 to 5 minutes.

Scoring of the Test: There is no pass or fail. Test-takers receive a score for each
component. The individual scores are averaged and rounded to produce an Overall Band Score.
The IELTS Academic is scored on a nine-band scale, and each band corresponds to a level of
competence in English. If the average score across the 4 components ends in .25, it is
rounded up to the next half band. If it ends in .75, it is rounded up to the next whole band.
The 9 bands are described as follows (Retrieved from
https://en.wikipedia.org/wiki/International_English_Language_Testing_System#Scoring)
9 Expert User: Has full operational command of the language: appropriate, accurate and
fluent with complete understanding.
8 Very Good User: Has fully operational command of the language with only occasional
unsystematic inaccuracies and inappropriateness. Misunderstandings may occur in unfamiliar
situations. Handles complex detailed argumentation well.
7 Good User: Has operational command of the language, though with occasional inaccuracies,
inappropriateness and misunderstandings in some situations. Generally handles complex
language well and understands detailed reasoning.
6 Competent User: Has generally effective command of the language despite some inaccuracies,
inappropriateness and misunderstandings. Can use and understand fairly complex language,
particularly in familiar situations.
5 Modest User: Has partial command of the language, coping with overall meaning in most
situations, though is likely to make many mistakes. Should be able to handle basic
communication in own field.
4 Limited User: Basic competence is limited to familiar situations. Has frequent problems in
understanding and expression. Is not able to use complex language.
3 Extremely Limited User: Conveys and understands only general meaning in very familiar
situations. Frequent breakdowns in communication occur.
2 Intermittent User: No real communication is possible except for the most basic information
using isolated words or short formulae in familiar situations and to meet immediate needs.
Has great difficulty understanding spoken and written English.
1 Non User: Essentially has no ability to use the language beyond possibly a few isolated
words.
0 Did not attempt the test: No assessable information.

Statistical Distribution of Scores:
Mean band scores for female test takers: listening: 6.2, reading: 6.1, writing: 5.6,
speaking: 5.9, overall: 6.0
Mean band scores for male test takers: listening: 6.1, reading: 6.0, writing: 5.5,
speaking: 5.8, overall: 5.9
(IELTS Official Website, 2016)

Standard Error of Measurement: SEM for listening: 0.37, SEM for academic reading: 0.38.
The SEM is interpreted in terms of the final band score reported for the listening and
reading components.

Evidence of Reliability: There is no reliability report for the speaking and writing
components because they are not item-based. The reliability of reading and listening is
reported using Cronbach's alpha, a reliability estimate which measures the internal
consistency of the 40-item test. The type of reliability reported is therefore internal
consistency, not inter-rater reliability.
The average alpha across listening versions: 0.91
The average alpha across academic reading versions: 0.90

Evidence of Validity: I found no overall validity report for the IELTS Academic, but I did
find information about construct validity for the reading test. Construct validity is a
measure of how closely a test reflects the model of reading underlying the test (Moore,
Morton & Price, 2007). For the reading test, the target domain is study at university level.
Therefore, if the ability to scan for specific information is an important part of
university reading requirements, then the reading test should assess the ability to quickly
locate specific information (Alderson, 2000).
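
The rounding rule in the scoring row of Table 1 can be made concrete with a short sketch. The function name `overall_band` is hypothetical, and the code assumes the rule amounts to rounding the mean of the four component bands to the nearest half band, with averages ending in .25 or .75 rounded upward. It is an illustration of the rule as described in this review, not an official IELTS implementation.

```python
import math


def overall_band(listening, reading, writing, speaking):
    """Average four component bands and round to the nearest half band.

    Assumption: averages ending in .25 or .75 are rounded up, as
    described in the scoring row of Table 1.
    """
    mean = (listening + reading + writing + speaking) / 4
    # Work in half-band units; adding 0.5 before flooring rounds ties up.
    return math.floor(mean * 2 + 0.5) / 2


print(overall_band(6.5, 6.5, 5.0, 7.0))  # mean 6.25 -> 6.5
print(overall_band(6.5, 6.5, 7.0, 7.0))  # mean 6.75 -> 7.0
print(overall_band(6.5, 6.5, 5.5, 6.0))  # mean 6.125 -> 6.0
```

For my students, working through a few such cases shows how one weak component (for example, writing) can hold the overall band down by half a band.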

Test of English as a Foreign Language (TOEFL)

Publisher: Center for Applied Linguistics under the direction of Stanford University

Publication Date: 1964

Target Population:

• Students planning to study at a higher education institution

• English-language learning program admission and exit

• Scholarship and certification candidates

• English-language learners who want to track their progress

• Students and workers applying for visas

Cost: $160-$250 USD

Overview

The TOEFL test is a standardized test that measures the English ability of non-native
speakers wishing to enroll in English-speaking universities. The test was originally
developed at the Center for Applied Linguistics and was first administered in 1964 by the
Modern Language Association. In 1965, the College Board and Educational Testing Service
(ETS) assumed responsibility for the continuation of the TOEFL testing program. Since 2005,
the TOEFL iBT has replaced the computer-based test (CBT) and the paper-based test (PBT),
although paper-based testing is still used in some areas. An extended description of TOEFL
is provided in Table 2.

Table 2

Test Purpose: The TOEFL iBT test measures test takers' ability to use and understand English
at the university level. It evaluates how well test takers combine their reading, listening,
speaking and writing skills to perform academic tasks. More than 100 colleges, agencies and
other institutions in more than 130 countries accept TOEFL scores. In addition, immigration
departments use TOEFL scores to issue residential and work visas, medical and licensing
agencies use scores for professional certification purposes, and individuals use scores to
measure their progress in learning English.

Test Structure: Generally, during the test, test takers are asked to read, listen and then
speak in response to a question; listen and then speak in response to a question; and read,
listen and then write in response to a question.
Reading: 60-80 minutes, 36-56 questions. Read 3 or 4 passages from academic texts and answer
questions. The questions are about main ideas, details, inferences, essential information,
sentence insertion, vocabulary, rhetorical purpose, overall ideas and completing tables.
Listening: 60-90 minutes, 34-51 questions. Listen to lectures, classroom discussions and
conversations and then answer questions. The conversations involve a student and a professor
or a campus service provider. Each lecture is a self-contained portion of an academic
lecture.
Speaking: 20 minutes, 6 tasks (2 independent + 4 integrated). Express an opinion on a
familiar topic; speak based on reading and listening tasks about academic lectures or
conversations about campus life.
Writing: 50 minutes, 2 tasks. Write essay responses based on reading and listening tasks;
support an opinion in writing.
The test may include extra questions in the reading or listening section that do not count
toward the score.
Scoring of the Test: In the TOEFL iBT test, the 4 sections have different criteria.
Reading
high (22-30): Test takers understand academic texts in English that require a wide range of
reading abilities, regardless of the difficulty of the texts. They have a very good command
of academic vocabulary and grammatical structure. They can understand and connect
information, make appropriate inferences and synthesize ideas. They can abstract major ideas
from a text.
intermediate (15-21): Test takers understand academic texts in English that require a wide
range of reading abilities, although their understanding of certain parts is limited. They
have difficulty understanding conceptually dense text.
low (0-14): Test takers understand some of the information in texts, but their understanding
is limited. They have a command of basic academic vocabulary, but their understanding of less
common vocabulary is inconsistent. They have limited ability to understand and connect
information and to identify the author's purpose.
Listening
high (22-30): Test takers understand conversations and lectures that present a wide range of
listening demands, such as difficult vocabulary, complex grammatical structures, abstract or
complex ideas, and making sense of unexpected or seemingly contradictory information.
intermediate (15-21): Test takers understand difficult vocabulary, complex grammatical
structures and ideas, but they have difficulty understanding unexpected or seemingly
contradictory information.
low (0-14): Test takers understand the main ideas and some important details in
conversations, but have difficulty understanding lectures and conversations with abstract
ideas.
Speaking
good (26-30): The test taker's speech is clear and fluent. The use of grammar and vocabulary
is effective. Ideas are developed coherently.
fair (18-25): The test taker's speech is mostly clear with only occasional errors. Grammar
and vocabulary are somewhat limited and include errors.
limited (10-17): Listeners sometimes have trouble understanding the speech because of
noticeable problems with pronunciation, grammar and vocabulary. The test taker is not able
to fully develop the ideas.
weak (0-9): The responses are incomplete.
Writing task 1
good (4.0-5.0): The responses relate the lecture to the reading, but may have slight
imprecision in the summary of the main points, and the use of English is occasionally
ungrammatical or unclear.
fair (2.5-3.5): The responses relate the lecture to the reading, but an important idea or
ideas may be missing, unclear or inaccurate. The responses may have grammatical mistakes or
vague/incorrect use of words.
limited (1.0-2.0): Test takers fail to understand the lecture or reading passage. They have
deficiencies in relating the lecture to the reading and make many grammatical errors.
Writing task 2
good (4.0-5.0): The responses are well organized and developed, but the use of English is
occasionally ungrammatical or unclear. The connection between ideas is a little weak.
fair (2.5-3.5): Test takers provide essays with reasons, examples and details, but may not
provide enough specific supporting ideas. They may have difficulty organizing and connecting
ideas. There are grammatical errors.
limited (1.0-2.0): The responses contain insufficient detail. There are many grammatical
errors and unclear expressions.
Statistical Distribution of Scores:
Total score for males: mean = 81.4, SD = 20.3
Total score for females: mean = 82.5, SD = 19.3
Standard Error of Measurement: reading: 3.35, listening: 3.20, speaking: 1.62, writing: 2.76
The SEM can be interpreted as a measure that defines a score range in which one's true
ability score lies with a certain level of probability. The smaller the SEM, the more precise
the scores will be.
Evidence of Reliability: The responses in the speaking and writing sections are sent to the
ETS Online Scoring Network and evaluated by 3 to 6 raters. The reliability estimates for
reading, listening and speaking are relatively high, while the estimate for writing is
lower. Reading: 0.85, listening: 0.85, speaking: 0.88, writing: 0.74.
Evidence of Validity:
• Reviews of research and empirical studies of language use at English-medium institutions
of higher education
• Pilot studies of tasks; systematic development of rubrics
• Investigations of discourse characteristics of written and spoken responses and of
strategies used in reading
• Factor analyses of test forms
• Relationships between test scores and self-assessments, academic placements, and local
assessments of international teaching performance
• Development of materials to help test users prepare for the test and interpret scores
appropriately; long-term empirical study of washback
(Validity Evidence Supporting the Interpretation and Use of TOEFL iBT™)
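
The interpretation of the SEM given in Table 2 can be illustrated with a small sketch. The SEM values are the ones reported in Table 2; the function name `score_band`, the 0-30 section scale used for clipping, and the assumption that measurement error is roughly normal (so that 1 SEM covers about 68% of cases and 1.96 SEM about 95%) are my own simplifications for classroom use, not something stated by ETS.

```python
# SEM values reported in Table 2 for the TOEFL iBT section scores.
SEM = {"reading": 3.35, "listening": 3.20, "speaking": 1.62, "writing": 2.76}


def score_band(section, observed, z=1.0, lowest=0.0, highest=30.0):
    """Return (lower, upper) = observed +/- z * SEM, clipped to the scale.

    Assumption: measurement error is roughly normal, so z=1.0 gives an
    interval of about 68% and z=1.96 an interval of about 95%.
    """
    margin = z * SEM[section]
    lower = max(lowest, observed - margin)
    upper = min(highest, observed + margin)
    return (round(lower, 2), round(upper, 2))


# A reading score of 24 is consistent with a true score of roughly 21 to 27.
print(score_band("reading", 24))
# Speaking has the smallest SEM, so its interval is the narrowest.
print(score_band("speaking", 24))
```

Such intervals help students see that a difference of one or two points between two attempts may reflect measurement error rather than a real change in ability.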

I envision using the tests in an English class for university learners in China. The
students are adults from 18 to 22 years old who want to continue their studies in the United
States. All students in the classroom are Chinese undergraduates. They are at different
English proficiency levels, but at least mid-intermediate. They understand the grammatical
structure of most sentences and most academic texts. They can communicate with others in
English easily, although their speech sometimes contains grammatical errors.

There are 5 classes every week, and each class has between 10 and 25 students. Students do
not need specific textbooks because I will prepare the teaching materials for every class.
The materials include tasks and texts from previous authentic tests. The original materials
are available on ets.org, where students can download them. Sometimes I revise the test
tasks according to students' needs and learning difficulties. Students have the opportunity
to finish the tasks in class, and I can help them with any questions or confusion.

Considering the teaching context and the results of the reviews, I think the TOEFL iBT test
is more appropriate for my students. The TOEFL iBT test is designed for test takers who plan
to study at higher education institutions and in English-language learning programs, which
corresponds to my students' purpose for studying. Moreover, the 4 sections of the TOEFL iBT
test are closely related to the academic and campus life that students may encounter in
their studies. For example, there are conversations between a student and a campus service
provider, as well as lectures and reading passages from various academic disciplines. These
are all very common situations on campus, so the tasks and materials are practical and
useful. Additionally, the interpretation of scores in the TOEFL iBT test is more concrete, so
students can understand why they received a score at a particular level. The guide to
understanding TOEFL scores also provides advice for improvement. The advice is helpful for
students' further study and for the revision of my lesson plans.

Furthermore, the evidence of validity for the TOEFL iBT test is complete. It is accessible
in a score and test report on ets.org, and each proposition has corresponding evidence. For
the IELTS Academic test, however, there is no information about validity on the official
website. The score interpretation for the IELTS Academic is not as concrete and specific as
that of the TOEFL iBT. It seems that the interpretation for the IELTS Academic only
indicates test takers' level of language proficiency and their use of English. It fails to
give students a clear direction for their further study or for how to improve their
performance.

However, a disadvantage of the TOEFL iBT test is its length. The test lasts 3 hours and 45
minutes, an hour longer than the IELTS Academic. Students need to stay highly focused for
almost 4 hours, and the test requires more time to administer.

References

Alderson, J. C. (2000). Assessing reading. Cambridge: Cambridge University Press.

Enright, M. (2011). TOEFL iBT® Research Insight Series 1 Volume 6: TOEFL Program History.
Educational Testing Service. Retrieved from
https://www.ets.org/s/toefl/pdf/toefl_ibt_insight_s1v6.pdf

ETS. (2010). TOEFL iBT® Research Insight Series 1 Volume 1: TOEFL iBT Test Framework and
Test Development. Educational Testing Service, p. 2. Retrieved from
https://www.ets.org/s/toefl/pdf/toefl_ibt_research_insight.pdf

ETS. (2011). TOEFL iBT® Research Insight Series 1 Volume 3: Reliability and Comparability of
TOEFL iBT Scores. Educational Testing Service, p. 6. Retrieved from
https://www.ets.org/s/toefl/pdf/toefl_ibt_research_s1v3.pdf

ETS. (2011). TOEFL iBT® Research Insight Series 1 Volume 4: Validity Evidence Supporting the
Interpretation and Use of TOEFL iBT Scores. Educational Testing Service, p. 3. Retrieved
from https://www.ets.org/s/toefl/pdf/toefl_ibt_insight_s1v4.pdf

ETS. (2014). A Guide to Understanding TOEFL iBT® Scores. Retrieved from
https://www.ets.org/Media/Tests/TOEFL/pdf/TOEFL_Perf_Feedback.pdf

ETS. (2018). About the TOEFL iBT® Test. Educational Testing Service. Retrieved from
https://www.ets.org/toefl/ibt/about

IELTS. (2017). IELTS for study. International English Language Testing System. Retrieved
from https://www.ielts.org/en-us/what-is-ielts/ielts-for-study

IELTS. (2017). IELTS scoring in detail. International English Language Testing System.
Retrieved from https://www.ielts.org/en-us/ielts-for-organisations/ielts-scoring-in-detail

IELTS. (2017). IELTS Test takers performance 2016. International English Language Testing
System. Retrieved from
https://www.ielts.org/en-us/teaching-and-research/test-taker-performance

IELTS. (2017). IELTS Test performance 2016. International English Language Testing System.
Retrieved from https://www.ielts.org/en-us/teaching-and-research/test-performance

Moore, T., Morton, J. & Price, S. (2007). Construct Validity in the IELTS Academic Reading
Test: A Comparison of Reading Requirements in IELTS Test Items and in University Study,
p. 6. Retrieved from
https://www.ielts.org/-/media/research-reports/ielts_rr_volume11_report4.ashx
