3 Semantics II

3.1 How do antonyms emerge?

We have seen quite a few examples of various types of antonymy in Justeson & Katz (1991). Let’s approach this from a different perspective by first thinking about synonyms, words with the same meaning, rather than the opposite. A typical example is the pair buy and purchase. Both words have to do with the exchange of money in one direction and the exchange of some sort of counter value in the other direction. Sameness can be claimed on the basis that substituting one word for the other in context won’t change the proposition expressed by the sentence.

  1. We’ve bought more of the chocolate you like.
  2. We’ve purchased more of the chocolate you like.

Roughly speaking, the real-world situation the two sentences describe is likely the same. Yet the second example sounds unusual in the context of chocolate. The substitution is also not possible in all contexts. Consider the following example, where a different sense of buy is used that roughly means believe.

  1. She told me she’d never eaten chocolate, but I don’t buy it.
  2. She told me she’d never eaten chocolate, but I don’t purchase it.

The reading believe is not available for purchase. True synonymy in the sense of perfect substitutability is rare if it exists at all. The only candidates might be dialectal variants of words with very specific meanings, such as German Brötchen, Schrippe and Semmel, which all denote a bread roll. There are always differences in connotation and, at the very least, differences in use. You find different distributions, collocations and fixed expressions. For example, the phrase Brötchen verdienen (‘earn a living’, literally ‘earn bread rolls’) is common in German, but Schrippen verdienen isn’t, even in regions where Schrippe is the more common word.

At this point, the existence of true synonyms depends on the very definition of meaning. In usage-based linguistics, the lines between denotation, connotation and distributional properties are blurred, on the assumption that all of these aspects are intertwined. The Principle of No Synonymy (Goldberg 1995) is a prominent idea from this paradigm. Meaning is seen as inseparable from use; therefore, co-occurrence patterns become extremely important.

If lexemes cannot have exactly the same meaning, can they even have opposite meanings? Does it even make sense to speak of opposite use? There is clearly an intuition for oppositeness. There are two main ways to explain antonymy. The traditional one is that antonyms can paradigmatically replace each other.

Paradigmatic, replaceable

  1. He was a good dog.
  2. He was a bad dog.
  3. I feel good today.
  4. I feel bad today.

In fact, that makes them extremely similar in their use. You would expect similar collocates, constructions and syntax.

Justeson & Katz (1991) argue against this view and propose that the intuition we have that some words have direct opposites is grounded in their syntagmatic co-occurrence.

Syntagmatic, co-occurrence

  1. There are good and bad dogs.
  2. Some dogs are good, some are bad.
  3. I feel neither good nor bad.
  4. Good jokes make people laugh, unlike bad ones.

One fascinating aspect of these observations is that, while synonyms occur in wildly different contexts, antonyms tend to occur together. The hypothesis is that we think of antonyms as antonyms exactly because we experience them together all the time. High and low are contiguous, i.e. they co-occur; high and flat are not antonyms because they don’t co-occur, not because of some objective difference in meaning components. In fact, there is usually only one antonym within a set of synonyms: big pairs with little and large with small, even though big and large are near-synonyms. If we assume a componential model based on truth conditions, this would be difficult to explain; logical meaning components alone do not explain speaker intuition.

3.2 Causal relationships

The findings from Justeson & Katz (1991) fit very well with cognitive concepts of memory and learning. An interesting interpretation of the findings would be that we don’t need any inherent meaning to explain antonyms; children would simply learn antonyms through language use. Here, however, we risk falling into a common trap: bias towards a specific theory. The findings might be consistent with more than one theory. In order to evaluate the usage-based interpretation, we should also consider alternative explanations.

Usage-based linguistics is most strongly contradicted by nativist theories, such as Universal Grammar. The nativist idea is that we are born with a capacity for language, including some deep underlying linguistic categories. In this view, children already have categories like antonymy (or, more generally, oppositeness) hard-wired in their brains. Language learning would then consist of categorizing new stimuli against these pre-existing categories. One could imagine that the structure for the antonym relationship is already given, and children merely learn which words are antonyms and, as a result, use them together more often than other word pairs.

Ultimately, it is a question of causality. Did co-occurrence cause the emergence of antonym pairs, or did the oppositeness of the lexemes cause the co-occurrence? We have arrived at a chicken-and-egg situation:

  1. Antonyms co-occur and children/learners associate them. Their oppositeness is a result of this.
  2. Antonyms have opposite meanings, children learn that first and, as a result, use them together.

One of the main routes to take is the search for an a priori definition of antonymy or oppositeness, which we would need in order to pin down a potential pre-language concept. Observational methods are not normally used to infer causal relationships. This is normally the job of experimental methods, where you can set up a controlled environment for language to be produced in, so as to control for as many confounding variables as possible. However, when it comes to children, the options are limited, and it is hardly possible to check whether there is a pre-language concept of antonymy. Without empirical data, the options are limited, but you can still theorize about the most likely explanations. Non-empirical methods are common in literary and cultural studies and also in philosophy.

When it comes to language and children, there have been a number of stories and questionable, unethical experiments. Friedrich II and the Nazis experimented on children and deprived them of language, and there have been multiple accounts of orphaned children who grew up alone or with animals. It is tempting to take these stories as evidence. However, none of these accounts were produced with the necessary academic rigor, and there was usually some sort of ideological agenda behind them. Anecdotal evidence is often full of contradictions and fantasy. Sometimes the stories are distorted to fit a particular belief, and sometimes contradicting aspects are simply left out. For the most part, observations that are non-reproducible are not viable as evidence, even if the reasons for their non-reproducibility are ethical.

In conclusion, many observations in corpus linguistics are simply evidence for correlations, and it is very hard to infer causal relationships without the help of experimental methods. If those are not available, it is necessary to have a good grasp of the philosophy of science to narrow down possible interpretations. In any case, you always have to be aware of the limitations of your data and methodology.

3.3 Co-occurrence/correlation/contiguity

Correlation is a concept that should be common knowledge. It is most commonly encountered in the context of statistics. Roughly speaking, a correlation is a close numeric relationship between data points. The most common type of correlation we encounter in corpus linguistics is co-occurrence, which is simply the observation that two linguistic structures happen in the same environment: mostly the same text, sentence or phrase, or even right after one another. Another con word we have encountered is contiguity. Same prefix (con-, Latin for ‘with, together’), same general idea, but different context. Contiguity is most often used in the sense of co-occurrence on the level of experience. It stresses the psychological aspects of perception and memory, i.e. what is perceived as occurring together. Contiguous stimuli aren’t necessarily correlated from an objective point of view. The main contrast to association that arises from contiguity is association that arises through similarity.
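To make co-occurrence a bit more tangible, here is a minimal counting sketch in Python (the sentences are made up for illustration; in practice they would come from a corpus):

from collections import Counter

# Toy sentences; in a real study these would come from a corpus.
sentences = [
    "the house was old and the barn was new",
    "old friends and new friends",
    "the old dog slept",
]

# Count how often "old" and "new" co-occur within the same sentence.
cooc = Counter()
for s in sentences:
    tokens = s.split()
    if "old" in tokens and "new" in tokens:
        cooc[("old", "new")] += 1

print(cooc[("old", "new")])  # 2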

Consider the following scenario:

Every time you leave your house, an elderly woman from next door screams “God bless!” out of her window. It is likely that you will be reminded of this grandma when you hear this phrase. You associate the two because there is temporal contiguity. You might likewise be reminded of the woman and the phrase when you leave your house and the old lady is not around, because there is also local contiguity.

Statistically speaking, there is no correlation between old women and yelling God bless out of windows. On a broader scale, however, the phrase might be correlated with age, i.e. more common among older speakers. The correlation might be weak, but it isn’t unreasonable to hypothesize. One piece of evidence alone, however, is not enough to establish a correlation.
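What establishing such a correlation could look like in practice is sketched below in Python. All the numbers are invented purely for illustration (we have no such data):

# Hypothetical data: speaker ages and how often each speaker uses the
# phrase per 1,000 words. All numbers are invented for illustration.
ages = [25, 31, 38, 44, 52, 58, 63, 70, 76, 81]
rates = [0.1, 0.0, 0.2, 0.1, 0.4, 0.3, 0.6, 0.5, 0.9, 0.8]

def pearson(xs, ys):
    # Pearson correlation coefficient between two equal-length lists.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

print(pearson(ages, rates))  # close to 1 = strong positive correlation

A value near 0 would mean no (linear) relationship, and with data like this you would need far more speakers before claiming anything.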

Now, on a linguistic level, there is definitely a relationship between God and bless. A linguistic con concept (no pun intended) is collocation, which refers to words that occur together significantly often. On the level of the individual piece of data, which is you experiencing this phrase again and again in this particular discourse situation, the two words simply co-occur. Co-occurrence is just things happening at the same time and/or in the same place. In order to know whether God bless is a collocation, you’d need more evidence about how frequent God and bless are overall.
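To make “significantly often” concrete, here is a minimal sketch of pointwise mutual information (PMI), one common association measure for collocations. The counts below are invented, not actual corpus values:

import math

# Invented counts for illustration; real values would come from a corpus.
N = 1_000_000       # total tokens in the corpus
f_god = 1200        # frequency of "God"
f_bless = 150       # frequency of "bless"
f_bigram = 90       # frequency of the bigram "God bless"

# PMI: how much more often the bigram occurs than expected if the two
# words were independent of each other.
p_observed = f_bigram / N
p_expected = (f_god / N) * (f_bless / N)
pmi = math.log2(p_observed / p_expected)

print(round(pmi, 2))  # clearly positive here, suggesting a collocation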

All these differences might be subtle, but they should rarely cause confusion. The biggest difference lies in the communicative context they are used in. In some sense, the relationship between some of those con words is one of synonymy, with which we have come full circle for this week. That they are very similar doesn’t mean, however, that they are interchangeable, especially not in academic prose.

If anything, the principle of no synonymy is even truer in academic language. ;)

3.4 Reproducing Justeson & Katz 1991

Using CQP and the Brown Corpus, which is available on our server, we can try to reproduce the results. If you want to look at a pair in detail, you can do the following:


BROWN;
old = [word = "old" %c & pos = "JJ.*"] expand to s;
young = [word = "young" %c & pos = "JJ.*"] expand to s;

young_and_old = intersect young old;

size young;
size old;
size young_and_old;

Here are all commands in detail:

  • BROWN — activate the Brown Corpus
  • old = ... — binds the query results to a variable called old
  • [...] — look for one token
  • [word = "old" ... — match the orthographic form old
  • %c — ignore case, i.e. also search for Old, OLD, oLD, etc.
  • ... pos = "JJ.*"] — also match pos-tags starting with JJ, which stand for adjectives (see cheatsheet or type info in CQP)
  • expand to s — extend the match to the entire surrounding sentence
  • intersect young old — keep only the matches that appear in both young and old (intersection = Schnittmenge)
  • size — print number of matches (absolute frequency)
  • ; — end command; this is only necessary if run as a script; it is otherwise equivalent to hitting enter

In summary, this searches for sentential co-occurrences in the sense of Justeson & Katz (1991), i.e. old and young within the same sentence.
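Whether the pair co-occurs more often than chance can then be checked by comparing the observed count against what we would expect if old and young were distributed independently over sentences. A minimal sketch in Python; all four numbers below are placeholders, not actual Brown Corpus figures:

# Plug in the output of the size commands and the number of sentences
# in the corpus; these values are placeholders.
n_sentences = 50_000   # total sentences in the corpus
n_old = 500            # sentences matching old
n_young = 300          # sentences matching young
n_both = 40            # sentences matching both

# Under independence, the probability that a sentence contains both
# words is the product of the individual probabilities.
expected = n_sentences * (n_old / n_sentences) * (n_young / n_sentences)

print(f"observed: {n_both}, expected: {expected:.1f}")
print(f"ratio: {n_both / expected:.1f}x more often than chance")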

Of course, doing this for all adjectives from the paper would be too tedious. We can instead find all adjective pairs within the same sentence.

adj = [pos = "JJ"] []* [pos = "JJ"] within s;

set PrettyPrint no;
group adj match lemma by matchend lemma > "cooccurrences.csv";

all = [pos = "JJ"];
group all match lemma > "all.csv";

This looks for two adjectives within a sentence with anything in between, and writes a frequency list of the pairs to a file. We also create a second file with a frequency list of all adjectives.

  • []* — a token without constraints, so any token, repeated zero or more times
  • within s — the match needs to be within a sentence, so []* cannot cross sentence boundaries
  • set PrettyPrint no — removes some formatting from the output so that we can work with it more easily later
  • group — make a frequency list (similar to count)
  • all — name of our variable holding results of [pos = "JJ"]
  • match lemma — count the lemma of the first token of the match (equivalent to match[0] lemma)
  • > — redirect output
  • "..." — file name to redirect to; in our case, all.csv and coocurrences.csv

For further processing, CQP is not the right tool anymore. The next step would be to filter out some combinations that we are not interested in and find the Deese antonyms etc. This is a job for scripting tools like R, Python or Excel, and we’ll go over it in the future.
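As a small preview, the first filtering step could look like the following Python sketch (pandas and the exact column layout of CQP’s group output are assumptions here; check your files and adapt):

import pandas as pd

# CQP's group output is tab-separated; verify the column order against
# your own files before relying on it.
pairs = pd.read_csv("cooccurrences.csv", sep="\t",
                    names=["lemma1", "lemma2", "freq"])
adjs = pd.read_csv("all.csv", sep="\t", names=["lemma", "freq"])

# Drop pairs of a lemma with itself and inspect the most frequent pairs.
pairs = pairs[pairs["lemma1"] != pairs["lemma2"]]
print(pairs.sort_values("freq", ascending=False).head(20))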

3.5 Homework

See homework 4.

3.5.1 Tip of the day

Use spreadsheets! You will inevitably have to enter some numbers into something like LibreOffice Calc, Microsoft Excel or Google Sheets at some point. We will benefit from spreadsheets throughout this module, but this is not where their utility stops. Being able to do some quick formulae and vlookups in Excel is a common skill needed outside uni.

Especially for teachers, spreadsheets are an essential skill: for grades, averages, homework, quick stats on exams, lesson planning, seating charts (oh, memories :D), what have you. If you know your way around Excel, you can speed up your tax returns (Steuererklärung) a lot, too. Many teachers end up working as freelancers. For a freelancer (and anyone else, really), gathering your receipts, bills and pay slips neatly arranged and categorized as data in a spreadsheet can save you endless amounts of time and even money.

This is not where it stops, though. Timetables and to-do lists are also neat to do in a spreadsheet if you need more fine-grained control over the layout than the clunky online calendar you are probably using gives you. Here are some things I have used spreadsheets for in the past: notes, training logs, travel plans, shopping lists. You could even use them for recipes or counting calories, if that’s what you’re into.

I myself have since moved past Excel/Calc and use only plain text files. If I need to do some maths or stats, I use .csv or .tsv files in combination with statistical software such as R. That might seem like the ultra-nerd level, but it isn’t difficult to learn at all, and it can save you additional time and frustration. Maintaining a CV, for example, is a breeze if you have everything as plain data and deal with the formatting in an automated fashion, and only when you need to.