Collocation, Colligation, and Encoding Dictionaries

COLLOCATION, COLLIGATION AND ENCODING DICTIONARIES. PART I: LEXICOLOGICAL ASPECTS

Dirk Siepmann: Universität-GH Siegen, Fachbereich 3, Adolf-Reichwein-Straße,

D-57068 Siegen, Germany (dsiepmann@t-online.de)

Abstract

This article attempts to synthesise recent advances in collocational theory into a

coherent framework for lexicological theory and lexicographic practice. By posing a

number of fundamental questions related to the definition of collocation, it critically

reviews frequency-based, semantic and pragmatic approaches to collocation. It is found,

among other things, that two types of collocation, namely ‘long-distance’ collocation

and collocation between semantic features, have suffered almost total neglect. This leads

to suggestions for a new division of the collocational spectrum and for a revised

definition of ‘collocation’ based on the notions of ‘usage norm’ (Steyer 2000) and

‘holisticity’ (Siepmann 2003). It is argued that this new view of collocation considerably

widens the dictionary maker’s brief, since future lexicography will have to provide a full

account of both structurally simple and structurally complex units, including fixed

expressions of regular syntactic-semantic composition (see Part II of this article, to be

published in the March issue of this journal).

1. Introduction

According to modern science, there is no such thing as ‘independent existence’;

at least since the advent of chaos theory, there has been full recognition that

all forms of life and material phenomena, whether at the micro-level or at the

macro-level, are interdependent. In linguistics, this realization has found its

fittest expression in the idea of linguistic rather than literary ‘intertextuality’,

whereby the meaning of one text and its constituent elements depends on

millions of other texts using similar or identical elements. Textual meaning is

thus created by the interplay of two types of repetition, viz. (a) collocation

(in the largest possible sense, including colligation and phraseology) and

(b) cohesion. It turns out that one instance of collocation and the entire

language are mutually illuminating, since the instance is understood in terms of

International Journal of Lexicography, Vol. 18 No. 4

doi:10.1093/ijl/eci042

the whole, and the whole in terms of the instance (cf. Hunston 2001: 31); taking

this a bit further, we might say that not only is each pattern necessary for

comprehending the sum total of similar patterns, but each pattern is

also a miniature version of that sum total, as shown by the fact that the

meaning of individual patterns (e.g. German ‘sonniges Gemüt’ [‘sunny

disposition’ = irrepressible high spirits] vs ‘sonnige Lage’ [sunny location]),

even if shorn of any context, is evident to the native speaker.

This relatively recent view of meaning creation (Hoey 1991, 1998, 2000,

Feilke 1994, 1996) seems much more in keeping with speakers’ intuitive

knowledge about language than earlier structuralist theories were.

The latter tended to assume that expressions such as ‘sonnige Lage’ have

both a compositional, literal meaning and a non-compositional, figurative

meaning (Feilke 1996: 128). In an intertextual or socially-based view of

meaning creation, the compositional meaning is exposed for what it is, namely

an abstraction of the linguist which has no base in the native speaker’s

mental lexicon; the expression ‘sonnige Lage’ is then considered to be a

‘holistic’ sign that is irreducible to the sum of its parts. In a related

development, computational and cognitive linguists have used corpus-linguistic

insights to work out models of language grounded in actual usage rather

than abstract general rules (Chandler 1993, Croft and Cruse 2003, Skousen

1989). In these models word or clause formation is by analogy with existing

exemplars, and it will be seen that such models can also be applied to

collocation.

This article reviews, one by one, the various criteria that have been invoked

over the last half century to define the notion of collocation,

pursuing a dual objective: (a) to show that none of these criteria apply in

all cases, so that we can at best give a prototypical definition of collocation,

and (b) to demonstrate that the problems associated with the definition

of collocation stem from the mechanistic, old-paradigm view of language

embodied in structuralist theories which try to impose theoretical abstractions

on an infinitely complex reality arising from communicative interaction and

the institutional practices such interaction puts in place. This will then allow

us to provide a more secure and more broadly based underpinning for the

treatment of colligation and collocation in lexicography. With the exception

of Steyer (2000), no such model has as yet been proposed.

The subject of collocation has been approached from two main angles:

on one side are the semantically-based approaches (e.g. Benson 1986, Mel’čuk

1998, González-Rey 2002, Hausmann 2003, Grossmann and Tutin 2003) which

assume a particular meaning relationship between the constituents of a

collocation; on the other is the frequency-oriented approach (e.g. Jones and

Sinclair 1974, Sinclair 1991, Sinclair 2004, Kjellmer 1994) which looks at

statistically significant cooccurrences of two or more words. This theoretical

distinction is paralleled by a geographical divide: the semantic approach has its

origins in continental European research into phraseology, while the frequency

approach is firmly rooted in British contextualism. There has until now been

surprisingly little exchange between the two groups, and when the semanticist

Hausmann (2003) claims to have won the war over collocation, one wonders if

that war has ever been fought.
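
The frequency-oriented approach described above can be illustrated with a minimal sketch. It counts how often two words occur within a fixed span of each other and scores each pair by pointwise mutual information (PMI). PMI is only one of several association measures in common use and is not necessarily the statistic applied in the studies cited; the function name, the window size of four tokens, the frequency threshold and the toy corpus are all assumptions made for illustration.

```python
import math
from collections import Counter


def cooccurrence_pmi(sentences, window=4, min_count=2):
    """Score word pairs by pointwise mutual information (PMI).

    sentences: list of token lists (hypothetical, pre-tokenised input).
    A pair is counted each time the second word follows the first
    within `window` tokens; pairs are stored in alphabetical order.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        word_counts.update(tokens)
        for i, node in enumerate(tokens):
            for collocate in tokens[i + 1 : i + 1 + window]:
                if collocate != node:
                    pair_counts[tuple(sorted((node, collocate)))] += 1

    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())
    scores = {}
    for (w1, w2), n in pair_counts.items():
        if n < min_count:  # ignore pairs seen too rarely to be informative
            continue
        p_pair = n / total_pairs
        p_w1 = word_counts[w1] / total_words
        p_w2 = word_counts[w2] / total_words
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Invented toy corpus; a real study would use millions of running words.
toy_corpus = [
    "they launch an appeal for funds".split(),
    "we launch an appeal today".split(),
    "she will take a step back".split(),
    "he must take a step forward".split(),
]
scores = cooccurrence_pmi(toy_corpus)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

A list produced in this way contains every recurrent pair, including combinations with function words such as an or a; it says nothing about which pairs form a sense unit, which is the point at issue between the two camps.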

A third, more recent approach to phrasemes and collocations (Feilke 1996,

2003) might be termed ‘pragmatic’, since it claims that the structural

irregularities and non-compositionality underlying such expressions are

diachronically and functionally subordinate to pragmatic regularities

determining the relationship between the situational context and linguistic forms.

In this view, collocation can best be explained via recourse to contextualisation

theory (Fillmore 1976).

In what follows, I shall argue that there is no reason to resort to the military

metaphor, let alone go to war on matters of collocation. It is much wiser to

unify the three approaches. Tersely stated, I shall argue the following theses:

(1) Only the frequency-based approach can provide a heuristic for discovering

the entire class of co-occurrences; in a way, it is safe from refutation, but

empty – it gives us all the raw material, but tells us nothing about how this

material came to be or how it is to be structured; it has also resulted in

lexicographic products of doubtful value, such as Kjellmer (1994) and

Sinclair (1995) (cf. Hausmann 2003: 319–320, Siepmann 1998).

(2) By contrast, the semantically-based approach is fragmentary – it cannot

account for all possible cases. It would nevertheless seem absurd to

abandon such an intuitively appealing approach at the first appearance of a

counterexample, since it has given rise to reliable collocational dictionaries

such as Langenscheidts Kontextwörterbuch Französisch-Deutsch.

(3) Likewise, as I shall explain below, a purely pragmatic approach relying on

the extralinguistic context cannot explain a large number of co-occurrences

operating at the level of semantic features.

(4) It follows from this that the debate between the various approaches is a

more/less rather than a yes/no issue. What is needed is an extension of the

semantically-based approach that will take account of strings of regular

syntactic composition which form a sense unit with a relatively stable

meaning. ‘Lexical bundles’ (Biber et al. 1999) such as je sais que c’est or it’s

been will not be included among the class of collocations (cf. Siepmann

2003). Although such sequences may perform similar or identical functions

across a range of texts, they have no meaning ‘by themselves’. In sharp

contrast, there are good theoretical and practical reasons for subsuming

under the notion of collocation such colligational patterns as regarde où tu

vas, dans les colonnes de (+ name of newspaper or magazine) or si elle est

prise à temps (referring to an illness), which have so far been regarded as

free sequences of words subject only to general rules of syntax and semantics.

For greater expository convenience, the various questions raised by the

discussion of the above theses will be broken down under five separate heads:

(1) How many elements make a collocation?

(2) What elements make a collocation?

(3) Are collocations arbitrary?

(4) Can we distinguish between collocations and phraseology on the one hand,

and collocations and free combinations on the other?

(5) Are collocations monosemous and monoreferential? Are there synonymic

collocations?

This will lead to a division of the collocational spectrum into four major

categories, all of which have their role to play in the making of dictionaries,

especially those aimed at the non-native speaker.

My theoretical arguments will be leavened with a large number of concrete

examples encountered during the ongoing compilation of three unabridged

bilingual thesauri intended mainly for non-native speakers of English, French

and German (the ‘Bilexicon’ project). All of these examples have been drawn

from the following authentic sources (for a detailed account of corpus

construction, see Siepmann 2005):

(a) electronic editions of wide-circulation quality newspapers and news

magazines (The Times, The Guardian, The Economist, Le Monde,

Le Monde Diplomatique, Süddeutsche Zeitung, Frankfurter Rundschau,

Der Spiegel);

(b) a large corpus of academic texts produced from reviews, journal articles,

doctoral theses and portions of books;

(c) the Internet;

(d) a 100-million word corpus of the language of motoring based mainly on

Internet sources.

Table 1 gives a breakdown of the sources used by corpus type, content, size,

baseline year and analysis software.

2. How many elements make a collocation?

It is accepted wisdom among European researchers that collocations are

binary units, and this is probably true for the majority of the class. Thus, the

most common type of collocation is the combination of a noun with a verb,

and there are hundreds of thousands of examples which confirm this point

of view (e.g. take a step, launch an appeal). Mel’čuk (most recently 2003)

argues that the constituents of such collocations tend to be linked by a

standard lexical function, such as Magn (rely on [Magn] = heavily, beautiful

Table 1: Corpora used in this study

| Corpus (Abbreviation) | Type | Content | Word Count | Baseline Year |
|---|---|---|---|---|
| Corpus of Academic English (CAE) | full-text | reviews, journal articles, doctoral theses and portions of books | 30 million | 1980 (less than 5% of texts predate 1980) |
| Corpus of Academic French (CAF) | full-text | reviews, journal articles, doctoral theses and portions of books | 30 million | 1980 (less than 5% of texts predate 1980) |
| Corpus of Academic German (CAG) | full-text | reviews, journal articles, doctoral theses and portions of books | 30 million | 1980 (less than 5% of texts predate 1980) |
| Corpus of English Fiction (FE) | full-text and samples | reviews, journal articles and portions of books from CAE | 50 million | 1980 |
| Corpus of French Fiction (FF) | full-text and samples | reviews, journal articles and portions of books from CAF | 50 million | 1980 |
| Corpus of German Fiction (FG) | full-text and samples | reviews, journal articles and portions of books from CAG | 50 million | 1980 |
| Corpus of English Motoring (CME) | full-text and samples | Internet forums and chatrooms, electronic magazines, transport sites, encyclopaedia and dictionary articles | 100 million | 1995 |
| British Newspapers and News Magazines (NE) | full-text | Issues of The Times, The Guardian and The Economist, published in London | 100 million | 1990 |

(Continued)
