5.3 Collocation as a probabilistic phenomenon

Collocations are co-occurrences of words that speakers perceive and memorize as connected. Some collocations are stronger, some are weaker, and some are so strong that one or all of their elements only occur together, which makes the combination an idiom or fixed expression.

  1. spoils of war: idiom
  2. declare a war: strong collocation
  3. fight a war: weak collocation
  4. describe a war: not a collocation
  5. have a war: unlikely combination
  6. the a war: ungrammatical

The examples above are ordered by how likely they are to be observed. We will get into the details of how to quantify this exactly in the lecture and in future seminar classes. If you are impatient, www.collocations.de has a very detailed guide to how that works. For now, the logic is simply that we consider not only the frequencies of the individual phrases, as in the examples above, but also the frequency of each word in the phrases, the total number of words in the corpus, and all attested combinations of [] a war.
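To make this logic a little more concrete before the lecture, here is a minimal sketch in Python. It uses pointwise mutual information (PMI), which is only one of several common association measures and not necessarily the one we will settle on in class. All counts are invented for illustration and do not come from any real corpus, and the names pmi, CORPUS_SIZE, SLOT_FREQ and counts are hypothetical.

```python
import math


def pmi(pair_freq, word_freq, slot_freq, corpus_size):
    """Pointwise mutual information (log base 2).

    Compares how often a word is observed in the slot '[] a war'
    with how often we would expect it there if the word and the
    slot were statistically independent.
    """
    observed = pair_freq / corpus_size
    expected = (word_freq / corpus_size) * (slot_freq / corpus_size)
    return math.log2(observed / expected)


# Hypothetical counts, invented for illustration only.
CORPUS_SIZE = 1_000_000  # total number of words in the corpus
SLOT_FREQ = 200          # all attested combinations of "[] a war"

counts = {
    # word: (frequency of "word a war", frequency of the word overall)
    "declare": (40, 500),
    "fight": (60, 3_000),
    "describe": (2, 4_000),
    "have": (5, 50_000),
}

scores = {
    word: pmi(pair, total, SLOT_FREQ, CORPUS_SIZE)
    for word, (pair, total) in counts.items()
}

# Rank the words from strongest to weakest association with the slot.
ranking = sorted(scores, key=scores.get, reverse=True)

for word in ranking:
    print(f"{word} a war: PMI = {scores[word]:.2f}")
```

With these made-up numbers, fight a war is the most frequent phrase, but declare a war receives the higher score, because declare is a much rarer word overall and a larger share of its occurrences falls into this slot. That is the sense in which the frequency of the phrase alone is not enough.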

The important takeaway is that a defining property of collocation is gradience, which means that it is a matter of degree. The more frequent a collocation is, relatively speaking, the more likely it is to be memorized as a unit. It is not a question of either/or, or of black and white. This idea has since been described for most if not all linguistic phenomena, including grammatical constructions, compound nouns, and even word classes, just to name a few.