Key concepts in ELT

Corpus-aided language learning


Li-Shih Huang

A corpus is a large collection or database of machine-readable texts
involving natural discourse in diverse contexts (Bernardini 2000). Such
discourses can be spoken, written, computer-mediated, spontaneous, or
scripted and may represent a variety of genres (for example everyday
conversations, lectures, seminars, meetings, radio and television
programmes, and essays). Some readily available corpora include the
British National Corpus (BNC, http://www.natcorp.ox.ac.uk), which
contains 100 million words from written and spoken language in a variety of
contexts, the Michigan Corpus of Academic Spoken English (MICASE,
http://micase.elicorpora.info), which features 1.8 million words of
speech in various academic contexts, and the Corpus of Contemporary
American English (COCA), with 410 million words
(http://www.americancorpus.org).1
Although corpus linguistics (i.e. computer-assisted analysis
techniques for studying texts) is a young specialization, its usefulness
in teaching and learning has received growing attention and
recognition (for example Hunston 2002; Sinclair 2004; Conrad 2005;
O'Keeffe, McCarthy, and Carter 2007; Bennett 2010; Reppen 2010). In
particular, researchers have identified corpus data as resources that provide
descriptive insights relevant to how people use language and as tools that
enable students and instructors to analyse both how people use different
language forms at various levels of formality and how language fulfills
multiple speech functions across contexts. Corpus data suggest that
individuals often do not use language as specified in grammar books and
that word meanings vary across contexts and users (Biber and Reppen
2002).
Over the past ten years, a growing number of studies have shown how
learners can use corpus data to further their language learning (see
Hunston op.cit.; Boulton 2010). Numerous corpus linguists (for example
Gavioli and Aston 2001) have pointed out that learning activities centred on
analysing corpus data are consistent with current principles of
language-learning theory, that is, students develop more autonomy when they receive
guidance about how to observe language and make generalizations. Such
activities promote noticing and grammatical consciousness raising
(Schmidt 1990), which can enhance second language learning and
development. Despite the growing interest in corpora and corpus-aided
learning, however, many teachers believe that incorporating corpora into
their teaching would be too technically challenging or time consuming
(Boulton 2010). Yet, while some researchers have suggested substantial
training is necessary (for example Estling Vannestål and Lindquist 2007),
others have provided evidence that only a minimal amount of training is
needed (for example Boulton 2008). Some have also recommended using
paper-based materials generated from corpora as a viable alternative to
accessing corpora via computers (Boulton 2010).

ELT Journal Volume 65/4 October 2011; doi:10.1093/elt/ccr031
© The Author 2011. Published by Oxford University Press; all rights reserved.
Advance Access publication May 5, 2011

A key pedagogical approach for using corpora in language teaching and
learning is data-driven learning (DDL), which emerged in the mid-1980s.
DDL was defined as 'the use in the classroom of computer generated
concordances to get students to explore the regularities of patterning in the
target language, and the development of activities and exercises based on
concordance output' (Johns and King 1991: iii). As Johns (1994: 297) stated,
'what distinguishes the DDL approach is the attempt to cut out the
middleman as far as possible and to give direct access to the data so that the
learner can take part in building up his or her own profiles of meaning and
uses'. Furthermore, corpus data '[offer] a unique resource for the
stimulation of inductive learning strategies – in particular the strategies of
perceiving similarities and differences and of hypothesis formation and
testing' (ibid.). By extension, the corpus-aided discovery learning (CADL)
approach entails encouraging learners to take the role of language
researchers by systematically engaging in discovery learning (Gavioli 2000)
and in learning how to learn through observations, analyses, interpretations,
and presentations of language-use patterns in corpus data. In the CADL
approach, learning about language use is driven by a process of enquiry that
works toward understanding or problem solving, and corpora are used as
mediational tools (Vygotsky 1978) rather than as the basis for language
teaching and learning. Furthermore, instructors adhering to the CADL
approach play a critical role in facilitating or guiding the process of
discovery, which depends on the learners' needs, stages of learning, and
levels of proficiency.
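To make the notion of a concordance concrete, the following is a minimal sketch of a keyword-in-context (KWIC) display of the kind a DDL activity might start from. It is an illustration only, not the method of any study cited here: the function name `kwic`, the `width` parameter, and the sample sentences are all invented for this example, and a real classroom activity would draw on an actual corpus and a dedicated concordancer.

```python
import re


def kwic(text, keyword, width=30):
    """Return keyword-in-context lines for each whole-word match of keyword.

    `width` is the number of characters of context kept on each side
    (a hypothetical default chosen for this sketch).
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keyword forms a centre column.
        lines.append(f"{left:>{width}} [{match.group()}] {right}")
    return lines


# Invented sample sentences standing in for a real corpus.
sample = (
    "It depends on the context. Learners depend on examples. "
    "The result depends on how the data are read. "
    "Much depends on the teacher."
)

for line in kwic(sample, "depends"):
    print(line)
```

Because every hit is aligned on the keyword, a learner reading down the column can notice for themselves that "depends" is followed by "on" in each line, which is the kind of pattern-spotting and hypothesis formation that the DDL and CADL approaches aim to encourage.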
Researchers have generally agreed that corpus data enrich our
understanding of language use and are an important resource for language
teaching and learning. The use of corpora in language teaching is not
without controversies, however. Among the debates featured in Seidlhofer
(2003), for example, some scholars have advocated using real examples
only in the classroom (for example Sinclair 1997), while others, in contrast,
wonder whether the discourse in corpora, taken out of its original context,
can still be considered authentic, real, or natural, thereby questioning the
efficacy of analysing displaced language that may not be relevant to learners'
linguistic and sociocultural contexts. In response to Widdowson's (1998)
remark that corpora may provide samples of genuine language produced by
language users with real communication goals but do not necessarily
guarantee that learners can participate in discourse in ways that lead to
learning, researchers such as Gavioli and Aston (op.cit.) note that learners
can still authenticate language samples by adopting an observer's role to
critically analyse the data, which will raise their awareness of lexical,
grammatical, and textual issues as they restructure their views about
language use in real situations. Similarly, Carter (1998: 50–1) argues that
while real English from corpora can be unrealistic for classroom
instruction and thus modified language used in the classroom that is based
on learners' needs and levels might be more pedagogically viable and
realistic, learners should be provided with opportunities to develop a feel
for the language through corpus data. The validity of analysing corpora to
capture language use across seemingly limitless contexts or to describe the
workings of real English around the world has also been questioned. Some
scholars point out that communicative contexts are not restricted to native
speaker discourse, and, as such, language teaching should not be based
simply on descriptive facts generated from largely native speaker-oriented
corpora (Prodromou 1996).2
Despite these debates, technological advancements have undoubtedly
enhanced language learners' and instructors' access to corpora, and the
plethora of articles and books written for language-teaching researchers and
practitioners published during the past five years suggests that attention to
and interest in using corpora for teaching and learning purposes will
continue for the foreseeable future.
Notes
1 For more examples, visit http://corpus.byu.edu
and the International Corpus of English: http://icecorpora.net/ice.
2 The Vienna-Oxford International Corpus of
English (VOICE) (http://www.univie.ac.at/voice)
is one such corpus, which collects English spoken by
non-native language users in various contexts.
VOICE comprises one million words of naturally
occurring, non-scripted, face-to-face interactions
by over 1,200 speakers with 50 different first
languages.
References
Bennett, G. 2010. Using Corpora in the Language
Learning Classroom. Ann Arbor, MI: Michigan
University Press.
Bernardini, S. 2000. 'Systematising serendipity:
proposals for concordancing large corpora with
language learners' in L. Burnard and T. McEnery
(eds.). Rethinking Language Pedagogy from a Corpus
Perspective. Frankfurt am Main: Peter Lang.
Biber, D. and R. Reppen. 2002. 'What does frequency
have to do with grammar teaching?' Studies in Second
Language Acquisition 24: 199–208.
Boulton, A. 2008. 'Looking for empirical evidence
for DDL at lower levels' in B. Lewandowska-Tomaszczyk
(ed.). Corpus Linguistics, Computer Tools,
and Applications: State of the Art. Frankfurt am Main:
Peter Lang.
Boulton, A. 2010. 'Data-driven learning: taking the
computer out of the equation'. Language Learning
60/3: 534–572.
Carter, R. 1998. 'Orders of reality: CANCODE,
communications, and culture'. ELT Journal 52/1:
43–56.
Conrad, S. 2005. 'Corpus linguistics and L2 teaching'
in E. Hinkel (ed.). Handbook of Research in Second
Language Teaching and Learning. Mahwah, NJ:
Lawrence Erlbaum Associates.
Estling Vannestål, M. and H. Lindquist. 2007.
'Learning English grammar with a corpus:
experimenting with concordancing in a university
grammar course'. ReCALL 19/3: 329–50.
Gavioli, L. 2000. 'The learner as researcher:
introducing corpus concordancing in the classroom'
in G. Aston (ed.). Learning with Corpora. Houston,
TX: Athelstan/Bologna: CLUEB.
Gavioli, L. and G. Aston. 2001. 'Enriching reality:
language corpora in language pedagogy'. ELT
Journal 55/3: 238–46.
Hunston, S. 2002. Corpora in Applied Linguistics.
Cambridge: Cambridge University Press.
Johns, T. 1994. 'From printout to handout: grammar
and vocabulary teaching in the context of data-driven
learning' in T. Odlin (ed.). Perspectives on Pedagogical
Grammar. Cambridge: Cambridge University Press.
Johns, T. and P. King. (eds.). 1991. 'Classroom
concordancing'. English Language Research Journal
4: 27–45.
O'Keeffe, A., M. McCarthy, and R. Carter. 2007. From
Corpus to Classroom: Language Use and Language
Teaching. Cambridge: Cambridge University Press.
Prodromou, L. 1996. 'Correspondence'. ELT Journal
50/1: 88–9.
Reppen, R. 2010. Using Corpora in the Language
Classroom. Cambridge: Cambridge University Press.
Schmidt, R. 1990. 'The role of consciousness in
second language learning'. Applied Linguistics 11/2:
129–58.
Seidlhofer, B. 2003. Controversies in Applied
Linguistics. Oxford: Oxford University Press.
Sinclair, J. 1997. 'Corpus evidence in language
description' in A. Wichmann, S. Fligelstone,
T. McEnery, and G. Knowles (eds.). Teaching and
Language Corpora. New York, NY: Longman.
Sinclair, J. 2004. How to Use Corpora in Language
Teaching. Amsterdam: John Benjamins Publishing
Company.
Vygotsky, L. S. 1978. Mind in Society: The Development
of Higher Psychological Processes. Cambridge, MA:
Harvard University Press.
Widdowson, H. G. 1998. 'Context, community
and authentic language'. TESOL Quarterly 32/4:
705–16.
The author
Li-Shih Huang is an Associate Professor of Applied
Linguistics and Learning and Teaching Centre
Scholar-in-Residence at the University of Victoria,
Canada. Her current research examines academic
language learning needs and outcomes assessment,
corpus-aided discovery learning, and learner
strategies in language learning and language testing
contexts.
Email: lshuang@uvic.ca
