A Little Bit About: Differences in Native and Non-native Speakers' Use of Formulaic Language

Australian Journal of Linguistics
ISSN: 0726-8602 (Print) 1469-2996 (Online) Journal homepage: http://www.tandfonline.com/loi/cajl20
Hadi Kashiha & Swee Heng Chan
To cite this article: Hadi Kashiha & Swee Heng Chan (2015) A Little Bit About: Differences in
Native and Non-native Speakers' Use of Formulaic Language, Australian Journal of Linguistics,
35:4, 297-310, DOI: 10.1080/07268602.2015.1067132
To link to this article: http://dx.doi.org/10.1080/07268602.2015.1067132
Published online: 12 Sep 2015.
HADI KASHIHA AND SWEE HENG CHAN
Universiti Putra Malaysia
(Accepted 24 February 2015)
This paper examines the use of a type of formulaic expression, called lexical bundles, in
classroom discussions among English native and Malaysian non-native speakers. Lexical
bundles are frequently used in academic discourse, and contribute to the production of
coherence in speech and written language, as well as playing a central role in the
comprehension of academic speech. Previous research has shown that L2 speakers often
show a capacity to approximate native-like efficiency by using lexical bundles in their
speech and writing. However, it has not been established to what degree L2 speakers
follow native expressions or instead use their own variations or versions of formulaic
expressions. In order to investigate this gap, the most frequent four-word lexical
bundles were identified and analysed in two different corpora of classroom discussions
by native and non-native speakers, and compared in terms of discourse function. The
findings show that native speakers used more lexical bundles than their non-native
counterparts did. Native speakers also used more discourse organizing bundles, while
non-native speakers more frequently used lexical bundles as stance expressions. These
findings are discussed in terms of the pedagogical implications of exposing L2 speakers
to a wider variety of lexical bundles, and the discourse functions inherent in their use.
Keywords: Classroom Discussion; Discourse Function; Formulaic Expressions; Lexical
Bundles; Native and Non-native
© 2015 The Australian Linguistic Society
1. Introduction
Over the last two decades, there has been increasing interest in research on different
types of lexical phrases and formulaic expressions, as it has become understood that
they are more frequently used than freely generated sequences (Erman & Warren 2000).
These multi-word units are also referred to as prefabricated language items, which are
stored in memory and retrieved wholly at the time of use (Wray 2000, 2002). Further, a
language contains a variety of patterns such as lexical clusters which have specific
functions (Sinclair 1991). For example, the expression I want + you + to implies a type of
obligation. These memorized or internalized lexical clusters lead to native-like
competence, and mastering them helps to reduce the effort of processing information
(Howarth 1998; Pawley & Syder 1983; Schmitt 2004; Wray 2002, 2008). Previous
research on both academic discourse and informal speech has found that they exhibit
a large repertoire of formulaic language. For instance, Altenberg (1998) found that
almost 80% of the words in his informal speech corpus obtained from native speakers
were part of formulaic combinations. The use of these combinations can be so common
that in some cases they determine preferences in the language use of speech communities, and their absence in language production could signal a lack of maturity in the
community (Li & Schmitt 2009). Such sequences have been identified and studied by
different scholars who adopt different terminology including: recurrent word combinations (Altenberg 1998), prefabricated patterns (Granger 1998), lexical bundles
(Biber et al. 1999; Hyland 2008) and formulaic sequences (Schmitt & Carter 2004).
All the terms refer to a combination of more than two words which co-occur with a
high frequency in a given register. This study uses the term lexical bundles (Biber
et al. 1999) to refer to this type of multi-word unit.
Adopting a frequency-driven approach, lexical bundles were first examined and
categorized in academic speech and writing in The Longman Grammar of Spoken
and Written English (Biber et al. 1999). The researchers defined lexical bundles as
sequences of word forms that have a high level of co-occurrence in the register.
Bundles can be retrieved from a corpus using WordSmith Tools, a corpus analysis
program that allows for the input of criteria to provide cut-off points for the identification
of lexical bundles. Some examples of the lexical bundles found in the corpus of academic
discourse include: as can be seen, on the other hand and in terms of the.
Lexical bundles differ from other types of multi-word expressions, as they are
identified empirically, rather than intuitively (Cortes 2004). Unlike idioms, lexical
bundles are semantically transparent; their meaning can be easily understood from
the individual components. In contrast, idioms have meanings which cannot be
understood from the separated individual elements. Another characteristic of lexical
bundles is that they are often structurally incomplete, and usually encompass
fragmented clauses or phrases.
It is well established that different academic genres show the use of a distinct set of
lexical bundles, associated with typical communicative purposes (Biber & Barbieri
2007: 265). The functional and structural characteristics of lexical bundles have also
been found to be different across academic linguistic varieties. In recent years,
several corpus-based studies have compared the use of lexical bundles in the speech
and writing of native speakers (NSs) and non-native speakers (NNSs) (Adel &
Erman 2012; Aijmer 2009; Chen & Baker 2010; Conklin & Schmitt 2008; Crossley
& Salsbury 2011; DeCock 2004; Gilquin et al. 2007; Karabacak & Qin 2013; Nekrasova
2009; Salazar 2014). The majority of the results revealed that NSs, in general, use more
lexical bundles than their NNS counterparts, and their language production shows a
greater degree of variation. Adel and Erman (2012) did a comparative study on the
use of lexical bundles in the academic writing of Swedish and British university students.
The analysis of 325 student essays revealed that the bundles were more frequent in the
writing of NSs. Similarly, Karabacak and Qin (2013) analysed 29,532 newspaper articles
and compared them with the writing of three groups of university students (Turkish and
Chinese representing non-native, and American representing native speakers). They
found that Chinese students used the lowest number of word combinations, while the
Americans used the most five-word combinations. Ma (2009) compared the use of
lexical bundles in the essays of Chinese undergraduate students with those of English
NSs. Ma found that only one third of the lexical bundles used by NNSs were similar
to those of NSs. Further, NSs were able to use more lexical bundles constituted of
past tense verbs, and noun and prepositional phrases. By contrast, Pang (2009) discovered that Chinese students employed more lexical bundles than native students did in
argumentative writing, and they produced more phrases with active verbs and topic
related expressions. Finally, to identify the difficulties that L2 scientist writers encounter
while using lexical bundles in English, Salazar (2014) conducted a comparative study and built a
frequency-derived list of the most prevalent lexical bundles used in scientific writing.
The studies discussed above have highlighted the significance of lexical bundles in
academic language, and provided some insight into their use. However, an overview of
related literature reveals that there has been little analysis comparing the formulaicity
of L1 and L2 English speech using a corpus-based method. Among the few studies,
Crossley and Salsbury (2011) observed the development of lexical bundle use
among six adult L2 English learners and compared it to that of English NSs. They
discovered that the accuracy of lexical bundles by L2 learners increases as a function
of time spent learning English, and that their production begins to improve in parallel
with the frequency with which NSs use these bundles. Furthermore, like NSs, L2 learners
also used lexical bundles to serve pragmatic and syntactic functions. Aijmer (2009) also
compared the use of I don't know and (I) dunno by Swedish learners and English NSs. The
comparison showed that Swedish learners repeatedly used I don't know (dunno) to signal
speech management, while NSs used this expression to avoid asking direct questions.
In an attempt to fill this gap, this study compares the use of lexical bundles during
discussions of Malaysian L2 speakers of English with those of English NSs. The purpose is
to investigate how and to what extent L2 speakers approximate native-like competence
by using formulaic language. The following research questions will be answered:
(1) What are the high frequency lexical bundles occurring in the speech of L2 and L1
speakers? Which bundles are shared by the two groups of speakers?
(2) What are the discourse functions of the bundles used in L2 and L1 speech? Are
there any similarities and differences between the functions used?
2. Data and Method

This study is based on a corpus of 26 discussion session transcripts, which were
divided into two distinct categories, forming NS and NNS samples. The native data
consisted of six transcripts from the Michigan Corpus of Academic Spoken English
(MICASE). These six transcripts were the only transcripts available that focused on
discussion sessions undertaken by English native speakers. The transcripts ranged
from 5,685 to 16,609 words, with 54,732 words in total. The NNS transcripts were
obtained from a larger collection of audio recordings of ESL discussion sessions,
from an English proficiency test bank at a Malaysian university. In order to make
the two samples comparable, only 20 audio recordings, totalling 53,240 words, were selected
to represent the non-native sample. The audio recordings were deliberately selected
based on the intelligibility of discussion. The samples illustrated a reasonable level
of English language proficiency, so that they could be compared to the discussions
of the NSs. The transcriptions were carried out by the researchers from the recordings,
which lasted for between 20 and 25 minutes each. In order to ensure consistency, the
target transcripts were cross-checked by another rater.
To identify lexical bundles for analysis, a number of criteria were used following Biber
et al.'s (1999) research. First, for practical considerations, analysis was limited to
four-word lexical bundles. This decision was taken as previous studies have shown that the
four-word string is the most commonly used cluster length in this kind of discourse (e.g.
Adel & Erman 2012; Biber & Barbieri 2007; Biber & Conrad 1999; Biber et al. 2004; Chen
& Baker 2010; Cortes 2002, 2004; Hyland 2008). Further, it is believed that four-word
bundles are much more frequent than five-word bundles and represent a fuller range
of structures and functions than three-word combinations (Cortes 2004).
Another criterion was to set an objective frequency cut-off point for the
identification of lexical bundles. Biber et al. (1999: 992) describe lexical bundles as those
combinations of words which occur at least 10 times in a million words, and in at least five
different texts in their corpus. For this study, a string of words must occur at least 10 times per
hundred thousand words in order to be identified as a lexical bundle. A frequency
cut-off point is a useful tool to filter out strings which are too specific within a
given context (e.g. the Malaysian language learners) or those that include proper
nouns (e.g. in the United States). Counter to Biber et al. (1999), a criterion of
dispersion was not used in this study, as there were only six transcripts for the native
sample and there was a disparity in the number of texts between the corpora. The
computer application WordSmith Tools was used to automatically retrieve and
make a list of four-word bundles.
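The retrieval criteria above can be illustrated with a short Python sketch. It is not how WordSmith Tools works internally; it only shows, under simplifying assumptions, what counting four-word strings and applying a normalized cut-off of 10 occurrences per 100,000 words involves. The function names (tokenize, min_raw_count, four_word_bundles) and the tokenization rule are hypothetical choices made for this illustration.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase everything and keep letters and apostrophes only.
    return re.findall(r"[a-z']+", text.lower())


def min_raw_count(total_words, per_100k=10):
    # Smallest raw count whose normalized rate reaches the cut-off
    # (integer ceiling of per_100k * total_words / 100,000).
    return (per_100k * total_words + 99_999) // 100_000


def four_word_bundles(transcripts, per_100k=10):
    """Return {four-word string: raw count} for strings meeting the cut-off."""
    counts = Counter()
    total_words = 0
    for text in transcripts:
        tokens = tokenize(text)
        total_words += len(tokens)
        # Count within one transcript at a time, never across transcripts.
        for i in range(len(tokens) - 3):
            counts[" ".join(tokens[i:i + 4])] += 1
    threshold = min_raw_count(total_words, per_100k)
    return {bundle: n for bundle, n in counts.items() if n >= threshold}


# Corpus sizes reported in Section 2.
print(min_raw_count(54_732), min_raw_count(53_240))
```

Under this reading of the cut-off, a string would need at least six raw occurrences in either corpus, since 10 per 100,000 words corresponds to about 5.5 occurrences in 54,732 words and about 5.3 in 53,240 words.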
To investigate how the lexical bundles are used to frame discourse functions, the
identified bundles were classified functionally using the taxonomy proposed by
Biber et al. (2004). Discourse functions were identified through qualitative analysis,
and classified into three main categories: stance expressions, discourse organizers and
referential expressions. Stance expressions were served by two sub-functions: epistemic
and attitudinal/modality. Epistemic stance expressions show the degree of certainty or
probability, e.g. is likely to be, the fact that the. Attitudinal/modality stance expressions
reflect the speaker's attitude in relation to the following propositions: desire (do you
want to), intention/prediction (you need to know), obligation/directive (I want you
to) and ability (to be able to). Discourse organizers also have two sub-functions.
They either introduce a topic (want to talk about), or elaborate and clarify the
given information (on the other hand). Referential expressions elaborate on the
features of abstract or concrete entities. Specifically, they refer to place (in front of
you), time (at the same time) or even another text or a segment of speech (as we
discussed earlier). To validate the functions of the lexical bundles, the concordance lines
of each identified sequence, generated through the use of WordSmith Tools, were
examined individually by the researchers, to detect similarities and differences between
native and non-native language use.
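Concordance lines of the kind examined here can be approximated with a simple keyword-in-context routine. The sketch below is an illustration in Python, not the WordSmith Tools procedure; the function name concordance, the width parameter and the tokenization rule are assumptions introduced for the example. The sample sentence is taken from example (10) in Section 3.1.2.

```python
import re


def concordance(text, bundle, width=5):
    """Return (left context, bundle, right context) for each hit of `bundle`."""
    # Assumption: lowercase tokens made of letters and apostrophes only.
    tokens = re.findall(r"[a-z']+", text.lower())
    target = bundle.lower().split()
    n = len(target)
    lines = []
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + n:i + n + width])
            lines.append((left, " ".join(target), right))
    return lines


sample = ("but, what do you think people in Detroit feel about, "
          "their local government?")
for left, hit, right in concordance(sample, "what do you think", width=3):
    print(f"{left:>20} | {hit} | {right}")
```

Reading the right-hand context is one way to separate the two uses discussed later in the paper: a topic-introducing what do you think is followed by further material, whereas a turn-final one has an empty right context.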
3. Results and Discussion

An initial review of the two lists (NS and NNS) of retrieved bundles showed that the
discussion sessions of the NSs included a larger stock of lexical bundles. There were 73
different bundles in the native corpus and only 39 for the NNSs. In terms of raw
frequencies, there were 413 individual occurrences in the native corpus, compared with
241 in the non-native. The fact that English L2 speakers used fewer lexical bundles
than their NS counterparts confirms earlier findings by Adel and Erman (2012),
Chen and Baker (2010), Karabacak and Qin (2013) and Ma (2009). NSs were more
varied in the construction of lexical bundles, and the language production of L2
speakers showed a lack of register awareness, phraseological infelicities, and semantic
misuse (Gilquin et al. 2007: 319). It is also likely that NNSs used word combinations
that fell below the length criterion of four words, suggesting that they were less
inclined to use longer word strings.
The most common bundles used by NSs were I mean that is, I mean I think and I
think it is which occurred 23, 22 and 20 times, respectively. The most frequent bundles
used by the Malaysian speakers were I agree with you, I think that the and my point of
view, with a frequency of 45, 43 and 41 respectively. Only nine types of four-word
lexical bundles were commonly shared between the two groups of speakers. They
were: the end of the, what do you think, I think it is, at the same time, I think that
the, I agree with you, I would like to, is one of the and so we have to. Table 1 provides
the list of the 10 most frequent four-word lexical bundles found in the two corpora.
It was found that 64 bundles were used only by the NSs, while 30 bundles were
specific to non-native use. This distinctiveness suggests that each group relied on a
different range of linguistic realizations to form thought patterns. Table 2 provides
an overview of the distribution of lexical bundles used by the two groups.
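The shared and group-specific counts follow from simple set operations on the two bundle lists. As a hedged illustration, the Python sketch below applies them to the top-10 lists reported in Table 1 only, because the full lists of 73 and 39 bundles are not reproduced in the paper; the function name compare_bundles is hypothetical.

```python
# Top-10 lists as reported in Table 1 (not the full bundle lists).
NS_TOP10 = [
    "I mean that is", "I mean I think", "I think it is", "I mean it is",
    "do you think that", "I don't know if", "I think it was",
    "that is what I", "what do you think", "what is going on",
]
NNS_TOP10 = [
    "I agree with you", "I think that the", "my point of view",
    "what do you think", "the end of the", "the role of the",
    "I think it is", "I don't think that", "I think we should",
    "to the end of",
]


def compare_bundles(ns_bundles, nns_bundles):
    """Split two bundle lists into shared, NS-only and NNS-only types."""
    ns, nns = set(ns_bundles), set(nns_bundles)
    return {
        "shared": sorted(ns & nns),
        "ns_only": sorted(ns - nns),
        "nns_only": sorted(nns - ns),
    }


result = compare_bundles(NS_TOP10, NNS_TOP10)
print(result["shared"])
print(len(result["ns_only"]), len(result["nns_only"]))
```

On the top-10 lists alone, only I think it is and what do you think are shared. For the full lists, the same subtraction gives the figures reported above: 73 - 9 = 64 NS-only types and 39 - 9 = 30 NNS-only types.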
Table 1 The top 10 most frequent lexical bundles in the two corpora

NS                   Frequency    NNS                  Frequency
I mean that is       23           I agree with you     45
I mean I think       22           I think that the     43
I think it is        20           my point of view     41
I mean it is         20           what do you think    39
do you think that    19           the end of the       37
I don't know if      18           the role of the      37
I think it was       18           I think it is        36
that is what I       18           I don't think that   34
what do you think    16           I think we should    33
what is going on     14           to the end of        27

Table 2 Distribution of lexical bundles in native and non-native corpus

                            Samples
Type of bundles             NS      NNS
Total no. of occurrences    413     241
Shared                      9       9
Non-shared                  64      30

In the data, it was found that Malaysian L2 speakers tended to overuse certain
expressions compared to NSs, such as those that included the verb agree, as in agree
with your opinion, I do agree with, I agree with what, as well as expressions that
included the verb think, e.g. I don't think that, I think that the. Possibly, this is due to
their limited range of lexical verb choices, which could be linked to lexical bundles
used when expressing their stances. Some NNSs did attempt to use patterns that
involved discourse marker + verb phrase fragments, such as I mean you know, you
know that is or comparative expressions, such as as well as the, but the frequency
and variety were far lower than in the NS sample. The underuse of certain
structures points to the need to improve speaking strategies that involve a higher level of
deployment of lexical bundles. Very few comparisons for this study could be found
in the existing literature, in terms of the overuse and underuse of lexical bundles in
native and non-native speech. The closest formulaic language research has
been done in relation to academic writing (Adel & Erman 2012; Chen & Baker
2010; Salazar 2010; Staples et al. 2013); it found that L2 speakers underuse, overuse
and misuse NSs' lexical bundles, and were unable to understand the discursive
functions of bundles based on L1 conventions.
3.1. Discourse Functions of Lexical Bundles
Table 3 presents the proportions of the discourse functions of lexical bundles in the
two corpora analysed. It can be seen that NNSs used a higher proportion of stance
expressions (53.84%), in comparison to the native data (31.50%). Conversely, NSs
showed more inclination to use discourse organizers (39.72%), which was the least
used function in the NNS data. The two groups had very similar percentages of use
for referential expressions. This function characterizes almost one third of the bundles
in the native and non-native corpus (28.76% and 30.76% respectively). The
comparisons of each functional category and its sub-categories are discussed in the following
sections, with examples extracted from each corpus.

Table 3 Functional distribution of lexical bundles in the native and non-native corpus

Functions                  NS No. (%)    NNS No. (%)
Stance expressions         23 (31.50)    21 (53.84)
Discourse organizers       29 (39.72)    6 (15.38)
Referential expressions    21 (28.76)    12 (30.76)
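The percentages in Table 3 appear to be the number of bundle types in each function divided by the total number of types in the corpus (73 for NSs, 39 for NNSs), truncated rather than rounded to two decimal places. This is an inference from the reported figures, not a procedure stated in the paper. The Python sketch below, with the hypothetical helper pct_truncated, reproduces the table values under that assumption.

```python
NS_TYPES, NNS_TYPES = 73, 39  # bundle types per corpus, as reported in the text

# Type counts per function as (NS, NNS), copied from Table 3.
TABLE3_COUNTS = {
    "Stance expressions": (23, 21),
    "Discourse organizers": (29, 6),
    "Referential expressions": (21, 12),
}


def pct_truncated(count, total):
    """Percentage truncated (not rounded) to two decimals, via integer arithmetic."""
    return (count * 10_000 // total) / 100


def table3_percentages():
    return {
        name: (pct_truncated(ns, NS_TYPES), pct_truncated(nns, NNS_TYPES))
        for name, (ns, nns) in TABLE3_COUNTS.items()
    }


for name, (ns_pct, nns_pct) in table3_percentages().items():
    print(f"{name}: NS {ns_pct:.2f}%  NNS {nns_pct:.2f}%")
```

The counts in each column also sum to the corpus totals (23 + 29 + 21 = 73 and 21 + 6 + 12 = 39), which is consistent with each bundle type being assigned to exactly one main function.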
3.1.1. Stance expressions
In general, both groups of speakers in the study used stance expressions to articulate
their assessment of knowledge, according to a degree of certainty or uncertainty. These
expressions were also used to show the speakers' attitude towards propositions, such as
that of desire, intention, ability, etc. As can be seen in Table 4, over half of the lexical
bundles used by the Malaysian L2 speakers were of this type. The high percentage of
stance expressions could be attributed to NNSs' reliance on a wider range of stance
bundles deemed as important in carrying out oral discussions. It seems that the ESL
speakers had a marked preference for claiming, or emphasizing, direct
ownership as a functional purpose. This was more evident in the use of epistemic stance
expressions, which comprised one third of all the bundles in the NNS sample. Most
of these expressions included the first person I and were used repeatedly to show
the speakers' lack of certainty about their knowledge, or that they might have
inadequate information about the topic being discussed. For example:
(1) here in the context of Malaysia I don't think that it is possible since our
government not really control the access to the internet. (NNS)
Table 4 Stance expressions in native and non-native corpus

Functional categories    Sub-categories                    NS No. (%)    NNS No. (%)
1. Stance bundles        A. Epistemic stance               9 (12.32)     12 (30.76)
                         B. Attitudinal/modality stance
                         B1) Desire                        1 (1.36)      1 (2.56)
                         B2) Obligation/directive          4 (5.47)      5 (12.82)
                         B3) Intention/prediction          5 (6.84)      3 (7.69)
                         B4) Ability                       4 (5.47)      -
Total                                                      23 (31.50)    21 (53.84)

In terms of overall frequency, analysis revealed that NSs were more proficient in the
use of hedging markers to express their uncertainty. Hedging bundles, such as it is
possible to, is more likely to and a few variations of the verb know, were repeatedly used
in the native speech to serve this function. In the following example, the speaker used
the I focus expression I don't know if, to show that he was not sure about what he was
saying:
(2) I don't know if this is, you know true or not but, they say that the whole state,
system could, um be moving towards ending because (NS)
NSs also used variations of the modal verb would, such as it would be a, would be
necessary to, to realize this epistemic function. However, neither of these hedging
markers was found in the speech of NNSs. Instead, they preferred to use other
expressions, containing the verb think, to signal this function: I think that the, I
don't think that, I don't think we. The data suggest that L2 speakers could have
been trained to use such direct expressions to state personal opinions, and may not
be aware of other alternatives, such as it is likely to, to indicate a greater subtlety in
conveying degrees of certainty. Such a tendency to use more direct lexical bundles to
serve this function could have been partly dictated by cultural preferences in their L1.
For the other sub-categories of stance expressions, obligation/directive was also
more frequent in the discussion of NNSs. They seemed to use these types of chunks
to guide each other in negotiating the discussion, or to highlight the importance of
an event:
(3) So you have to think before you argue, you have to know what is your point of
view. (NNS)
(4) It's something you have to do by your own self. (NNS)
This again suggests that NNSs are more direct, and lack the knowledge and variety of
expressions to manage the flow of oral discourse. They appeared to be motivated to
impose obligation in the discussion, or to explicitly guide the discourse through
the use of such cues. This function was not prevalent in the NS speech, comprising
only 5% of the bundles.
As can be seen in Table 4, the two groups of speakers registered similar percentages
of use in the desire and intention/prediction sub-categories. This finding suggests that
such functions are characteristic of group discussion, regardless of a native or
non-native speech context. However, there was a difference in the construction of the
lexical bundles employed by the two groups. NNSs used verb(ing) expressions, such
as I am going to, to show intention. NSs used more complex expressions, such as I
was supposed to. A few bundles showing the ability function, such as will be able to, were
found in native speech but were completely absent from the non-native corpus. NSs
were able to use will be able to + verb, as opposed to bundles including will + verb,
showing that they have knowledge of the differences between the two lexical
bundles. The former can be construed as a double hedge, while the latter expresses
ability more directly.
3.1.2. Discourse organizers

Discourse organizer bundles were used by both groups, either to initiate their turns
in discussion, or make a logical link between their ideas. As shown in Table 5, this
function was more prevalent in the speech of native language users; discourse
organizer bundles featured more than twice as often as in the non-native corpus
(39.72% versus 15.38%). This finding indicates that L1 speakers' language was more
varied and they were more aware of the need to signal their turn. This was more
apparent in the use of the topic elaboration/clarification bundles, which were again
found to be more frequent in the native corpus (24.65%). NNSs were much less likely
to elaborate (2.56%); on the other hand was the only bundle of this type used by NNSs
to make a contrast between prior and coming information:
(5) and then the rest of the time maybe you can spend your time on studying or
doing other stuff other than using the internet. On the other hand, the government
plays a very important role in blocking some of the very sensitive websites. (NNS)
In contrast, NSs employed a greater variety of bundles with different grammatical
structures in order to elaborate or clarify a topic. This is consistent with the other
findings of this research and, again, signals their skill and competency in the language.
Examples of passive structure (is considered as a), discourse markers (you know I
mean), WH-clauses (what I mean by) and that clauses (that is to say) that serve
this function, were all found:
(6) that's a way of doing it, that is to say, to think of it, in terms not of what defines
authority but in a given period (NS)
(7) yeah and I mean that is a really important way of putting it too that, that um,
within the category that you could call social norms. (NS)
Both groups made use of topic introduction/focus bundles at a similar rate (15% for
NSs, and 12.85% for NNSs). They employed these bundles to initiate turns, and to focus
on the topic under discussion. However, there were differences between the types of
bundles and the way they were used. For example, L2 speakers deployed the formal
bundle I would like to to introduce the topic:
(8) I would like to suggest other ideas to how to improve English language skills by
government policy at schools, secondary and primary school. (NNS)
Table 5 Discourse organizers in native and non-native corpus

                           Sub-categories                        NS No. (%)    NNS No. (%)
2. Discourse organizers    A. Topic introduction                 11 (15)       5 (12.85)
                           B. Topic elaboration/clarification    18 (24.65)    1 (2.56)
Total                                                            29 (39.72)    6 (15.38)
However, NSs used more informal expressions such as I was just gonna or going to talk
about:
(9) therefore I'm going to talk about what makes somebody a king. What makes
somebody a king, in Augustus' time, is, possession of an office, and, the, protection
of the law. (NS)
Arguably, NNSs were more conscious about the use of overly formal language, perhaps
reflecting the style of training they have received. Another difference features in the
use of the question bundle: what do you think? This bundle was used by NSs to
introduce a topic, while NNSs principally employed it at the end of their turns, to give the
floor to the next speaker:
(10) maybe they would, put up a different coloured suh- street sign or something
and, you'd be a little happier, right? but, what do you think people in Detroit feel
about, their local government? (NS)
(11) So, I think parents, as I said earlier, should start from home, to provide them
with self control or mental guidance so that they don't fall into this kind of negative
effects. What do you think? (NNS)
3.1.3. Referential expressions

In general, both groups of speakers had a similar level of use for referential expressions.
It was found that among the referential expressions, identification/focus bundles were
the most common (see Table 6). These bundles were used by NSs to refer to something
that is important or to put emphasis on the noun or noun phrase following the bundle.
For example, the bundle those of you who identified a group of people:
(12) on the uh for those of you who are more inspired than others, I already have
homework two posted to my website. (NS)
Table 6 Referential expressions in native and non-native corpus

                              Sub-categories                    NS No. (%)    NNS No. (%)
3. Referential expressions    A. Identification/focus           13 (17.80)    6 (15.38)
                              B. Imprecision                    2 (2.73)      -
                              C. Specification of attributes
                              C1) Quantity specification        2 (2.73)      1 (2.56)
                              C2) Tangible framing              -             -
                              C3) Intangible framing            1 (1.36)      1 (2.56)
                              D. Time/Place/Text reference
                              D1) Place reference               -             -
                              D2) Time reference                1 (1.36)      4 (10.25)
                              D3) Text-deixis                   -             -
                              D4) Multi-functional reference    2 (2.73)      -
Total                                                           21 (28.76)    12 (30.76)
NNSs employed bundles such as each one of you to identify their peers, or is one of the
to single out something that is deemed important:
(13) Yeah, in my opinion, the solution is to use plastic bag. I think this is a good
solution that is one of the most important objects to reduce the rubbish. (NNS)
Identification/focus bundles also occasionally served a discourse organizing function
in NSs' discussions. For example, one of the things was used to initiate a discussion;
the lexical bundle introduced a main point, after which relevant details were
presented:
(14) One of the things that it would encompass is human sensibility. There's a
similarity here because both of these things are imposing, frameworks on, the numina,
so there's something really similar going on but (NS)
No instances of dual function were found in the non-native discussions. This might be
a result of L2 speakers' unfamiliarity with formulaic expressions that could have more
than one meaning, or that could be used differently according to the context. This
finding highlights the need for L2 speakers to gain more experience of these dualities
in order to enhance oral ability.
Table 6 shows that imprecision bundles, which would normally be expected in
extemporaneous speech, did not appear in the non-native corpus. NSs used the
bundle or something like that to indicate the approximate nature of a prior reference
(example 15), or and things like that to highlight that additional information exists but
is not being specifically referred to in the context (example 16):
(15) I think you could make it, somewhat broader than that. I think you could say,
that after 14, 25 or something like that, that technology and the ability to control
technology, is an important base for authority. (NS)
(16) and somebody at the equator isn't going to see many at all, or won't see any. So
circumpolarity and altitude and things like that are defined by where you are
(NS)
To use such bundles confidently requires a certain degree of language control, to
balance the level of expertise the speaker displays and prevent them appearing
overly unknowledgeable. L2 speakers lack the level of expertise to use this kind of
insertion. They may also wish to avoid expressions that highlight uncertainty, if
such expressions reveal a weakness in their ability to manage the flow of discussion.
Important variations were found among the sub-categories of specification of
attributes; NSs were more likely to use formulaic bundles such as the extent to which or
the degree to which, to identify varied quantities, as opposed to simpler expressions such
as a lot of the, which were used by the NNSs. This reaffirmed that L2 speakers were much
more limited in the range of bundles used.
There was another instance of functional duality in the native corpus for the quantity bundle, a little bit about, which also served to introduce a topic:
(17) I just wanna tell you a little bit about politics, politics isn't, politics doesn't just
run on individuals (NS)
The only bundle which was used to describe intangible and abstract attributes in the
native corpus was the concept of the. NNSs used the role of the to serve this function.
No bundle was found to describe tangible framing attributes in the speech of the two
groups, suggesting that this function is not relevant in group discussion.
Time reference bundles appeared more often in the non-native corpus, such as at
the same time, at the end of, the end of the, for about two minutes. These bundles
were not found in the list of multi-word combinations used by NSs, with the exception
of at the same time. As expected, no bundle was used to refer to text in the two corpora.
Obviously, this function is not necessary in a spoken discussion.
Finally, the bundles at the end of and the end of the were multi-functional only in the
speech of native speakers, who used them to refer to both time and place in the discussion depending on the context:
(18) remember his explanation at the end of the page where he said, sorry I guess I
wasn't able to really present to you the unity of consciousness, you've got, um, Descartes trying to do it with the soul theory (NS)
(19) it'll give you more time. so, um, you would put the Clovis example further back,
really at the start, and say what Gregory, at the end of the sixth century is doing. (NS)
4. Conclusion
The current study compared the use of formulaic expressions, in the form of four-word
lexical bundles, in the spoken academic discourse of native and Malaysian
L2 speakers. To do this, two different corpora of classroom discussion sessions
were compiled. The most frequent four-word lexical bundles in each corpus were
identified using WordSmith Tools. The frequencies of the target bundles used
were compared and discussed. The analysis revealed that NSs used more lexical
bundles than their L2 counterparts did. These results supported previous studies
into the use of lexical bundles among native and non-native language users in
written discourse (Ädel & Erman 2012; Chen & Baker 2010; Karabacak & Qin
2013; Ma 2009).
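As an illustration of the identification step described above, the following is a minimal Python sketch of counting four-word sequences in a transcript and keeping those that recur. It does not reproduce WordSmith Tools or the authors' procedure. The function names, the tokenisation rule, the `min_freq` cut-off and the two toy sentences are all assumptions made for the example and are not drawn from the study's corpora.

```python
import re
from collections import Counter

def count_bundles(transcript, n=4):
    """Count every contiguous n-word sequence in a transcript."""
    # Assumed tokenisation: lowercase, keep runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", transcript.lower())
    return Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )

def frequent_bundles(transcript, min_freq=2, n=4):
    """Keep only sequences that occur at least min_freq times (hypothetical cut-off)."""
    counts = count_bundles(transcript, n)
    return {bundle: c for bundle, c in counts.items() if c >= min_freq}

# Toy inputs invented for illustration only.
ns_sample = "i want to tell you a little bit about politics and a little bit about history"
nns_sample = "i agree with your opinion and at the same time i agree with your opinion"

ns = frequent_bundles(ns_sample)
nns = frequent_bundles(nns_sample)

# Bundles that recur in one sample but not in the other.
only_ns = set(ns) - set(nns)
only_nns = set(nns) - set(ns)
```

A real analysis would also normalise frequencies by corpus size and require a bundle to appear across several speakers or sessions before counting it; those thresholds are left out here because they are not specified in this section.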
This comparative study also found similarities and variations between NSs and
NNSs in the discourse functions of lexical bundles in spoken discourse. The speech
of NNSs contained many more stance expressions compared to NSs. In contrast,
they used fewer discourse organizer bundles. The two data sets reported a similar percentage of use of referential expressions. A detailed investigation found that NNSs
used some word combinations exclusive to their group. This included the tendency
to overuse agree- expressions, such as agree with your opinion, I do agree with, I agree
with what, while they underused discourse marker + verb phrase fragments, as well
as comparative structures. NSs demonstrated a higher variety of lexical bundles,
with examples specific to their corpus. They also used multi-functional bundles, something lacking from the non-native corpus. There are some limits to the study that need
to be acknowledged. The corpora analysed were restricted in terms of sample size, and
the topics under discussion may potentially influence the bundles used by speakers.
The findings of this study illustrate a dimension of language exploration made possible through advances in technology using corpus-based methodologies. In spite of the
frequent use of lexical bundles in academic settings, ELT course designers and practitioners have not kept up with research in this area, and should include them more prominently in their materials and syllabus. An examination of a popular text used for oral
interactions among NNSs revealed elements of managing an oral discussion, but a wider
coverage of different types of lexical bundles would be a definite aid. L2 speakers tended
to use much simpler expressions, likely due to a lack of exposure, training or practice.
There would be a definite advantage in developing exercises that could activate the
sequence of use through conscious predictions such as at the end which necessitates a
preposition (at the end of) or I agree + preposition before your opinion. In this way,
lower proficiency students would be helped to internalize lexical bundles with a focus
on grammatical structures. In the case of the Malaysian L2 speakers, the data suggest
that greater focus should be given to introducing discourse organizer bundles, and
the complex structures necessary to use lexical bundles in spoken discourse effectively.
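The prediction exercises suggested above could be generated from a bundle list along the following lines. This is only a sketch under stated assumptions: the helper `make_cloze` and the choice of which word to blank are hypothetical and are not part of the study.

```python
def make_cloze(bundle, blank_index=-1):
    """Blank out one word of a bundle and return (prompt, answer)."""
    words = bundle.split()
    answer = words[blank_index]
    words[blank_index] = "____"  # the learner predicts the missing word
    return " ".join(words), answer

# "at the end" necessitates a preposition: the learner supplies "of".
end_prompt, end_answer = make_cloze("at the end of")

# "I agree" + preposition before "your opinion": blank the third word.
agree_prompt, agree_answer = make_cloze("I agree with your opinion", blank_index=2)
```

Blanking the function word rather than a content word keeps the focus on the grammatical structure of the bundle, in line with the suggestion made for lower proficiency students.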
References
Ädel A & B Erman 2012 Recurrent word combinations in academic writing by native and non-native speakers of English: a lexical bundles approach English for Specific Purposes 31(2): 81–92.
Aijmer K 2009 'So er I just sort I dunno I think it's just because...': a corpus study of I don't know and dunno in learners' spoken English Language and Computers 68(1): 151–168.
Altenberg B 1998 On the phraseology of spoken English: the evidence of recurrent word combinations In A Cowie (ed.) Phraseology: theory, analysis and applications Oxford: Oxford University Press. pp. 101–122.
Biber D & F Barbieri 2007 Lexical bundles in university spoken and written registers English for Specific Purposes 26(3): 263–286. doi:10.1016/j.esp.2006.08.003
Biber D & S Conrad 1999 Lexical bundles in conversation and academic prose In H Hasselgård & S Oksefjell (eds) Out of Corpora Amsterdam: Rodopi. pp. 181–190.
Biber D, S Conrad & V Cortes 2004 If you look at...: lexical bundles in university teaching and textbooks Applied Linguistics 25: 371–405. doi:10.1093/applin/25.3.371
Biber D, S Johansson, G Leech, S Conrad & E Finegan 1999 Longman Grammar of Spoken and Written English London: Longman.
Chen YH & P Baker 2010 Lexical bundles in L1 and L2 academic writing Language Learning and Technology 14(2): 30–49.
Conklin K & N Schmitt 2008 Formulaic sequences: are they processed more quickly than nonformulaic language by native and nonnative speakers? Applied Linguistics 29(1): 72–89. doi:10.1093/applin/amm022
Cortes V 2002 Lexical bundles in published and student academic writing in history and biology Unpublished Doctoral Dissertation, Northern Arizona University.
Cortes V 2004 Lexical bundles in published and student disciplinary writing: examples from history and biology English for Specific Purposes 23(4): 397–423. doi:10.1016/j.esp.2003.12.001
Crossley S & TL Salsbury 2011 The development of lexical bundle accuracy and production in English second language speakers International Review of Applied Linguistics in Language Teaching 49(1): 1–26. doi:10.1515/iral.2011.001
DeCock S 2004 Preferred sequences of words in NS and NNS speech Belgian Journal of English Language and Literatures (BELL) 2: 225–246.
Erman B & B Warren 2000 The idiom principle and the open choice principle Text 20(1): 29–62.
Gilquin G, S Granger & M Paquot 2007 Learner corpora: the missing link in EAP pedagogy Journal of English for Academic Purposes 6(4): 319–335. doi:10.1016/j.jeap.2007.09.007
Granger S 1998 Prefabricated patterns in advanced EFL writing: collocations and formulae In A Cowie (ed.) Phraseology: theory, analysis, and applications Oxford: Oxford University Press. pp. 145–160.
Howarth P 1998 Phraseology and second language proficiency Applied Linguistics 19(1): 24–44. doi:10.1093/applin/19.1.24
Hyland K 2008 As can be seen: lexical bundles and disciplinary variation English for Specific Purposes 27(1): 4–21. doi:10.1016/j.esp.2007.06.001
Karabacak E & J Qin 2013 Comparison of lexical bundles used by Turkish, Chinese, and American university students Procedia-Social and Behavioral Sciences 70: 622–628. doi:10.1016/j.sbspro.2013.01.101
Li J & N Schmitt 2009 The acquisition of lexical phrases in academic writing: a longitudinal case study Journal of Second Language Writing 18(2): 85–102. doi:10.1016/j.jslw.2009.02.001
Ma GH 2009 Lexical bundles in L2 timed writing of English majors Foreign Language Teaching and Research 41: 54–60.
Nekrasova TM 2009 English L1 and L2 speakers' knowledge of lexical bundles Language Learning 59(3): 647–686. doi:10.1111/j.1467-9922.2009.00520.x
Pang P 2009 A study on the use of four-word lexical bundles in argumentative essays by Chinese English-majors: a comparative study based on WECCL and LOCNESS Teaching English in China 32: 25–45.
Pawley A & FH Syder 1983 Two puzzles for linguistic theory: nativelike selection and nativelike fluency In J Richards & R Schmidt (eds) Language and Communication London: Longman. pp. 191–225.
Salazar D 2010 Lexical bundles in Philippine and British scientific English Philippine Journal of Linguistics 41: 94–109.
Salazar D 2014 Lexical Bundles in Native and Non-native Scientific Writing: applying a corpus-based study to language teaching (Studies in Corpus Linguistics, 65) Amsterdam, Philadelphia, PA: John Benjamins.
Schmitt N 2004 Formulaic Sequences: acquisition, processing and use Amsterdam: John Benjamins.
Schmitt N & R Carter 2004 Formulaic sequences in action: an introduction In N Schmitt (ed.) Formulaic Sequences: acquisition, processing, and use Philadelphia, PA: John Benjamins. pp. 1–22.
Sinclair JM 1991 Corpus, Concordance, Collocation Oxford: Oxford University Press.
Staples S, J Egbert, D Biber & A McClair 2013 Formulaic sequences and EAP writing development: lexical bundles in the TOEFL iBT writing section Journal of English for Academic Purposes 12: 214–225. doi:10.1016/j.jeap.2013.05.002
Wray A 2000 Formulaic sequences in second language teaching: principle and practice Applied Linguistics 21(4): 463–489. doi:10.1093/applin/21.4.463
Wray A 2002 Formulaic Language and the Lexicon Cambridge: Cambridge University Press.
Wray A 2008 Formulaic Language: pushing the boundaries Oxford: Oxford University Press.
