3.2 Lexemes and lexical fields
3.2.1 Lemma
What are all the grammatical forms of be, cut, tree, nice, beautiful?
- be, am, are, is, were, was, been, ’s, ’m, ’re, ?being
- cut, cuts, (cut, cut), ?cutting
- tree, trees, tree’s, trees’
- nice, nicer, nicest
- beautiful
A lemma is the set of all inflectional forms of a word. This includes forms with grammatical affixes (tree, trees) and suppletive forms (go, went). Derivational affixes, such as the adjectival -ly, are not included. Of course, this requires a clear definition of inflection and derivation, and the boundary is disputed: some researchers might argue that the participial -ing is derivational rather than inflectional. There is also the issue of whether the past participle of verbs like cut, which is identical to the base form, is to be seen as a separate form or not.
On the technical side of research, you have to be aware of the decisions that were taken when the corpus data were lemmatized, i.e. which forms were counted under a lemma and which were not. A lemma in a corpus is not necessarily the same as a lemma as a linguistic concept.
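To make the effect of such decisions concrete, here is a minimal sketch in Python. It is not how any particular corpus tool lemmatizes; the table LEMMA_TABLE, the set DEBATABLE and the switch ing_is_inflection are hypothetical names introduced only for illustration. The forms are taken from the lists above, and the -ing forms marked with "?" are treated as optional members of the lemma.

```python
from collections import Counter

# Hypothetical hand-built table: form -> lemma (forms taken from the lists above).
LEMMA_TABLE = {
    "be": "be", "am": "be", "are": "be", "is": "be",
    "was": "be", "were": "be", "been": "be",
    "tree": "tree", "trees": "tree",
    "cut": "cut", "cuts": "cut",
    "go": "go", "went": "go",  # suppletive form
}

# Forms whose status is debatable (marked "?" in the lists above).
DEBATABLE = {"being": "be", "cutting": "cut"}


def lemmatize(token, ing_is_inflection=True):
    """Map a token to its lemma; unknown forms are returned unchanged."""
    form = token.lower()
    if form in LEMMA_TABLE:
        return LEMMA_TABLE[form]
    if ing_is_inflection and form in DEBATABLE:
        return DEBATABLE[form]
    return form


def lemma_counts(tokens, ing_is_inflection=True):
    """Count lemmas under one particular lemmatization decision."""
    return Counter(lemmatize(t, ing_is_inflection) for t in tokens)


tokens = ["Trees", "were", "cut", "while", "cutting", "went", "on"]
print(lemma_counts(tokens, ing_is_inflection=True))
print(lemma_counts(tokens, ing_is_inflection=False))
```

With the switch on, cutting is counted under the lemma cut; with it off, cutting is counted as an item of its own. The same text therefore yields different lemma frequencies depending on a decision that is usually hidden inside the tool.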
3.2.2 Distribution
How can we find out if something is a homonym if we do not know the meaning or want to keep intuition out of the picture?
Animal or sport utensil?
- Maybe I’m a fruitarian bat
- … with a straighter bat than some of the Englishmen
- The unfortunate starved bat was then returned
- And not simply a bat, but an autographed bat
(examples from The British National Corpus 2007)
> [pos = "AJ.*"] []? [hw = "bat"] BNC
In this example, the preceding adjective provides enough context to disambiguate the two meanings. If you expanded this to more co-occurrence patterns, e.g. with verbs, or even to different text types, two clearly distinct patterns would emerge. The animal bat eats, like other animals, whereas the utensil bat strikes, like other club-like devices. A giraffe rarely strikes, and a tennis racket doesn’t eat. The two meanings each belong to a distinct lexical field. Distribution plays a defining role in the structure of our lexicon.
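The logic of the query above can be imitated in a few lines of Python. This is only a sketch under stated assumptions: the function adjectives_before is a hypothetical helper, not part of any corpus tool, and the part-of-speech tags and headwords on the four example phrases were assigned by hand for illustration rather than copied from the BNC annotation. The pattern is the same as in the query: an adjective, optionally one arbitrary token, then the headword bat.

```python
import re

# Each token is (word, pos, headword). Tags and headwords are hand-assigned
# for this sketch and are NOT taken from the actual BNC annotation.
SENTENCES = [
    [("a", "AT0", "a"), ("fruitarian", "AJ0", "fruitarian"), ("bat", "NN1", "bat")],
    [("a", "AT0", "a"), ("straighter", "AJC", "straight"), ("bat", "NN1", "bat")],
    [("the", "AT0", "the"), ("unfortunate", "AJ0", "unfortunate"),
     ("starved", "VVN", "starve"), ("bat", "NN1", "bat")],
    [("an", "AT0", "an"), ("autographed", "AJ0", "autographed"), ("bat", "NN1", "bat")],
]


def adjectives_before(sentence, headword, adj_pattern=r"AJ.*"):
    """Collect adjectives matching: adjective, optional token, headword."""
    hits = []
    for i, (_word, _pos, hw) in enumerate(sentence):
        if hw != headword:
            continue
        # Look directly before the target (offset 1) and one token
        # further back (offset 2, i.e. the optional token is filled).
        for offset in (1, 2):
            j = i - offset
            if j >= 0 and re.fullmatch(adj_pattern, sentence[j][1]):
                hits.append(sentence[j][0].lower())
    return hits


found = [adj for s in SENTENCES for adj in adjectives_before(s, "bat")]
print(found)
```

The optional slot is what lets unfortunate be found in the third phrase even though starved stands between it and bat. Grouping the retrieved adjectives (or, in an extended version, verbs) by the company they keep is then a matter of counting, without appeal to intuition about meaning.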
3.2.3 Association
A key component of human memory is association. The lexicon is organized in associative networks, or semantic fields. What we frequently perceive together, we associate as belonging together. This is also referred to as spatial or temporal contiguity.
- law and …?
- order
- good or …?
- bad, evil
- the number of the …?
- ??beast
- spoils of …?
- ??war
The first words that come to mind when you read the first two fragments are most likely order and bad, giving law and order and good or bad. For the other two examples, more variation is to be expected. A metal fan might readily come up with beast, since the song of the same name is part of their cultural experience, and the phrase is therefore very frequent for them. Spoils of war, by contrast, might not be a phrase that everyone is familiar with. Spoils as a word is very rare, yet it is strongly associated with the phrase: if it is encountered, it occurs together with war more often than not.
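The difference between frequency and strength of association can be expressed as a relative frequency: of all the times a cue is encountered, how often is it followed by a given word? The following Python sketch illustrates this. The function continuation_probs is a hypothetical helper, and the observations are invented toy data chosen only to show the principle; they are not corpus counts.

```python
from collections import Counter


def continuation_probs(observations, cue):
    """Relative frequency of each word observed after the given cue."""
    following = Counter(nxt for c, nxt in observations if c == cue)
    total = sum(following.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in following.items()}


# Invented toy observations of (cue, continuation); NOT corpus counts.
observations = (
    [("spoils of", "war")] * 3
    + [("spoils of", "victory")] * 1
    + [("good or", "bad")] * 5
    + [("good or", "evil")] * 3
    + [("good or", "ill")] * 2
)

for cue in ("spoils of", "good or"):
    print(cue, continuation_probs(observations, cue))
```

In this toy data the cue spoils of is rarer than good or, but its strongest continuation accounts for a larger share of its occurrences. A rare item can thus be more strongly associated with its partner than a frequent one.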
References
The British National Corpus, version 3 (BNC XML Edition). 2007. Distributed by Bodleian Libraries, University of Oxford, on behalf of the BNC Consortium. http://www.natcorp.ox.ac.uk/