Uncertainty Reduction: Ambiguity Resolution Mechanisms in Language

Ambiguity is a constant problem for any embodied cognitive agent with limited resources. Decisions need to be made, and their consequences understood, despite the probabilistic veil of uncertainty enveloping everything from sensory input to action execution. Clearly, there must be mechanisms for dealing with or resolving such ambiguity.

A nice sample domain for understanding ambiguity resolution is language, where problems of uncertainty have long been appreciated. The meaning of words in general (not to mention referents like "that" or "he") can be highly ambiguous (see "the gavagai problem"). Similar problems abound in grammar, famously in the case of garden path sentences ("the horse raced past the barn fell"), where grammatical ambiguities often go completely unnoticed until a disambiguating word is encountered ("fell").

Most accounts of language emphasize the distinction between semantics (the meanings of words) and syntax (the rules governing how words are put together - essentially, grammar). One might therefore suspect that ambiguity resolution in these two domains is separable. However, a classic Psychological Review article by MacDonald, Pearlmutter and Seidenberg (1994) describes a single ambiguity resolution mechanism that might operate on both semantics and syntax.

MacDonald et al emphasize that the same three issues turn up in both lexical and syntactic explorations of ambiguity resolution: the role of frequency information, the role of contextual constraints, and questions of modularity vs. distributed interactivity. I'll illustrate each with examples from both domains below:

Frequency Information. Words with multiple, approximately equally frequent meanings (e.g., "pitcher") show longer eye fixation times than words with either a single meaning or highly biased meanings (where one meaning is much more frequent than the others). Similarly, in grammatical processing, the interpretation of garden path sentences was presumed by Chomskian theory to be accomplished by a grammatical parser with no access to frequency information, and yet some work demonstrates that the frequency of the words used in garden path sentences may influence the interpretation subjects adopt to resolve those sentences' ambiguity.
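One way to make the frequency idea concrete is to treat each word's meanings as a probability distribution and measure its entropy: balanced multi-meaning words score highest, which is one way to formalize why they might demand longer fixations. This is my own toy sketch with invented counts, not anything from the paper:

```python
from math import log2

# Hypothetical per-meaning frequency counts (illustrative, not corpus data).
meaning_freqs = {
    "pitcher": {"baseball player": 50, "container": 50},       # balanced ambiguity
    "bank":    {"financial institution": 95, "river edge": 5}, # biased ambiguity
    "table":   {"furniture": 100},                             # unambiguous
}

def meaning_entropy(freqs):
    """Shannon entropy (in bits) of a word's meaning distribution.

    Balanced multi-meaning words score highest; single-meaning
    words score 0, mirroring the eye-fixation findings."""
    total = sum(freqs.values())
    probs = [f / total for f in freqs.values()]
    return -sum(p * log2(p) for p in probs if p > 0)

for word, freqs in meaning_freqs.items():
    print(word, round(meaning_entropy(freqs), 2))
```

On these made-up numbers, "pitcher" yields 1 bit of uncertainty, "bank" much less, and "table" none, tracking the reported ordering of fixation times.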

Contextual Information. In research on the influence of context on semantics, some studies have shown that words with multiple meanings have all the potential meanings activated automatically, while other studies have shown that the context in which the word appears does influence the extent to which certain meanings become activated (as determined through priming studies), even when the context doesn't seem to directly prime the ambiguous word's various meanings. Similarly, in research on syntax, context has been shown to influence the interpretation of garden path sentences, contradicting other accounts (e.g., minimal attachment algorithms) of garden path sentence processing.
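One simple way to picture contextual constraint (again a sketch of my own with made-up numbers, not the authors' model) is to treat meaning frequency as a prior and context as a compatibility score that reweights it:

```python
# Hypothetical values: equal frequency-based priors for "pitcher",
# reweighted by how well each meaning fits the preceding context
# (e.g., a sentence beginning "She filled the...").
prior = {"baseball player": 0.5, "container": 0.5}
context_fit = {"baseball player": 0.1, "container": 0.9}

def reweight(prior, fit):
    """Combine frequency and context multiplicatively, then renormalize."""
    raw = {m: prior[m] * fit[m] for m in prior}
    total = sum(raw.values())
    return {m: v / total for m, v in raw.items()}

posterior = reweight(prior, context_fit)
# "container" now dominates even though both meanings are equally frequent.
```

This captures the graded picture from the priming studies: context doesn't switch meanings off wholesale, it shifts how strongly each becomes activated.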

Representation: Modularity vs. Distributed Processing. Although many researchers emphasize that the multiple meanings of words seem to be accessed from memory, as though each meaning comprises a different record in a master database of all meanings (sometimes called the "mental lexicon"), other research has demonstrated that meanings interact with one another through the frequency and contextual effects described above. Thus, lexical access seems compatible with what might be expected from a distributed (i.e., connectionist) rather than modular (database-like, to simplify) representation. Similarly, in grammatical processing, early Chomskian theory presumed that grammatical rules were unrelated to the particular lexical entries of a particular language, whereas later Chomskian and related theories (e.g., Government and Binding theory) proposed a much tighter interaction between semantics and grammar, suggestive again of more distributed and less modular processing.

Rather than reflecting mere coincidence, MacDonald et al propose that the similar theoretical, methodological and empirical issues surrounding lexical and syntactic processing reflect a fundamentally similar mechanism underlying the resolution of ambiguity in all of linguistic processing.

Specifically, they suggest that the cortical representation of words is distributed, such that many neurons participate in the representation of many words, and that those representations differ mostly in the degrees to which various neurons contribute to them. Critically, these networks encode not only semantic information but also syntactic information (e.g., tense, voice, person, and gender). Nodes representing mutually compatible interpretations of a sentence are connected in an excitatory fashion, whereas those representing mutually incompatible interpretations are connected with inhibitory links; thus syntactic structures can be activated in a graded fashion, in contrast to the "all-or-none" selection of grammatical structures implied by other views.

In this system, ambiguity resolution is accomplished by a winner-take-all process, at both the level of individual words (which activate multiple meanings and associated grammatical structures) and at the level of the larger linguistic context (where the "winning" patterns of activity from previous words carry over to influence the activity elicited by the currently-processed word). The authors go on to account for a variety of syntactic ambiguities using this model, and demonstrate that the same lexical and contextual effects hold across these phenomena, as predicted by their unitary model of linguistic processing. It's interesting to note that some later models of past tense formation adopted a dual-route mechanism, one route relying on explicit rules and the other on associative memory. Although MacDonald et al advocate a single mechanism, they do not appear to have implemented their theory at full scope, so it's unclear what kinds of architectural changes might be necessary to make it work properly.
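The constraint-satisfaction dynamics can be sketched in a few lines. This is a toy interactive-activation network of my own construction, not MacDonald et al.'s implementation; the nodes, weights, and initial activations for the garden-path sentence are all illustrative:

```python
# Candidate interpretation nodes for "the horse raced past the barn fell".
nodes = ["raced=main-verb", "raced=reduced-relative", "fell=main-verb"]

# weights[i][j]: influence of node j on node i. Incompatible readings
# inhibit each other (negative); compatible readings excite (positive):
# "raced" as main verb cannot coexist with "fell" as main verb, while the
# reduced-relative reading of "raced" fits with it.
weights = [
    [0.0, -0.8, -0.8],
    [-0.8, 0.0, 0.6],
    [-0.8, 0.6, 0.0],
]

# Initial activation favors the frequent main-verb reading of "raced";
# encountering "fell" injects strong evidence for its own node.
act = [0.7, 0.3, 0.9]

def settle(act, weights, steps=50, rate=0.2):
    """Repeatedly update activations from weighted input, clamped to [0, 1]."""
    for _ in range(steps):
        net = [sum(weights[i][j] * act[j] for j in range(len(act)))
               for i in range(len(act))]
        act = [min(1.0, max(0.0, a + rate * n)) for a, n in zip(act, net)]
    return act

final = settle(act, weights)
```

After settling, the initially-favored main-verb reading of "raced" is driven to zero while the mutually compatible pair wins - a small-scale version of the graded, frequency- and context-sensitive winner-take-all process described above.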


It occurs to me that examining the responses to LOLcat-style sentences might yield interesting information on this issue....

By David Harmon (not verified) on 04 Oct 2007 #permalink

Without ever referring to neurons or the brain, there is an algorithm that removes ambiguity over any set of potential patterns, provided there is at least some criterion to discriminate between the patterns (of course), no matter how entangled their other traits may be (and no matter what the field of experience is or what the patterns are about).
This is the closure operator of Formal Concept Analysis, which in a sense implements our "pars pro toto" recognition capability.

A good (but very technical) paper on this is What Is A Concept? by Chris Hillman (as I remember, the first 4 or 5 pages are enough to grasp the idea behind FCA and closure operators).
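The closure operator is easy to illustrate with a toy formal context (the objects and attributes below are invented for the example, not taken from Hillman's paper): the closure of an attribute set B is B'' - all attributes shared by every object that has all of B.

```python
# A toy formal context: objects mapped to their attribute sets.
context = {
    "sparrow": {"flies", "has-feathers", "lays-eggs"},
    "penguin": {"has-feathers", "lays-eggs", "swims"},
    "bat":     {"flies", "nurses-young"},
}

def extent(attrs):
    """Objects possessing every attribute in `attrs`."""
    return {o for o, a in context.items() if attrs <= a}

def intent(objs):
    """Attributes shared by every object in `objs`."""
    sets = [context[o] for o in objs]
    return set.intersection(*sets) if sets else set.union(*context.values())

def closure(attrs):
    """B'' : the smallest closed attribute set containing `attrs`."""
    return intent(extent(attrs))

closure({"lays-eggs"})  # {"has-feathers", "lays-eggs"}
```

From the partial cue {"lays-eggs"}, the closure recovers everything that reliably co-occurs with it across the matching objects - the "pars pro toto" recognition the comment describes.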

If even computer scientists can do this, why couldn't the brain?

There is no need for any kind of structure in the traits involved, only that they can be probed; that matches the "single mechanism" of MacDonald.

By Kevembuangga (not verified) on 04 Oct 2007 #permalink