Context is paramount in word-sense disambiguation

Word-sense disambiguation (WSD) is the problem of identifying which sense of a word is used in a sentence. In this article I explore cases where WSD could be used to help clarify what a given sentence means.

In everyday use of language, people resolve the ambiguity of words - which word sense applies in a sentence - by looking at the context in which the word is used.  In modern NLP we often focus on disambiguating nouns and noun phrases, and overlook other sources of ambiguity in a sentence.

* Noun Ambiguity

Consider the phrase /the Roosevelt administration/ as it might occur in some longer text.  What does this mean?  There are multiple possibilities:

+ the US presidency of Franklin D. Roosevelt (FDR)
+ the US presidency of Theodore Roosevelt (TR)

Working out what our sample phrase means is a problem of /named entity recognition/ (NER), which is used to identify which of several entities sharing a common name a specific noun phrase refers to.

Shared names are common: according to US Census Bureau data from 2010, there were 44,935 distinct individuals named /John Smith/. [1]
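
As a rough illustration of how context can pick between the two readings of /the Roosevelt administration/, here is a minimal sketch that scores each candidate by counting cue words found in the surrounding sentence. The cue-word sets, the =CANDIDATES= table and the =rank_candidates= function are hypothetical and hand-picked for this example; a real system would draw candidates and context features from a knowledge base or a trained model.

```python
# Hypothetical candidate entities and cue words, chosen by hand for illustration.
CANDIDATES = {
    "Franklin D. Roosevelt": {"new", "deal", "depression", "1933"},
    "Theodore Roosevelt": {"hepburn", "act", "1906", "railroad"},
}


def rank_candidates(context: str) -> list[tuple[str, int]]:
    """Score each candidate by how many of its cue words occur in the context."""
    # Crude tokenizer: split on whitespace, strip punctuation, lowercase.
    tokens = {t.strip(".,;:\"'").lower() for t in context.split()}
    scores = [(name, len(cues & tokens)) for name, cues in CANDIDATES.items()]
    # Highest overlap first.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranking = rank_candidates(
    "The Roosevelt administration launched the New Deal during the Depression."
)
print(ranking[0][0])
```

Counting overlapping words is about the simplest possible use of context. It is only meant to show that the sentence around the phrase carries the evidence needed to choose a referent.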

* Verb Ambiguity

NER as usually implemented doesn't disambiguate verbs, just nouns and noun phrases.

Consider the following three triples (adapted from [2]):

1. <Man, Is-a, Person>

2. <Author, Is-a, Person>

3. <John, Is-a, Person>

The first triple is correct if "is-a" means a subsumption relationship, as in, the narrower category /man/ is included in the broader category /person/; colloquially, /man/ is a type of /person/.

The second triple makes sense if "is-a" means "is a role of", as in, an /author/ is a role of a /person/.

The third triple makes sense if "is-a" means "is an instance of", as in, the individual human being named /John/ is a specific instance of the category named /Person/.

Each of these triples makes sense if you choose the appropriate sense or meaning of "is-a".  If you use the wrong meaning of "is-a", the triple in question doesn't make sense.

The point here is that the verb "is" is highly overloaded; it has multiple meanings, each of which makes sense in some contexts and not in others.  It's the presence of multiple interpretative contexts that makes the verb "is" ambiguous.
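
One way to remove this ambiguity in data is to record the intended sense of "is-a" explicitly rather than leaving it implicit. The sketch below is an assumption about how that might look, not the notation used in [2]: the =IsA= enumeration, the =TRIPLES= list and the =read_out= helper are hypothetical names introduced for illustration.

```python
from enum import Enum


class IsA(Enum):
    """The three senses of "is-a" discussed above (hypothetical labels)."""
    SUBSUMPTION = "is a type of"
    ROLE = "is a role of"
    INSTANCE = "is an instance of"


# Each triple from the text, annotated with the sense that makes it valid.
TRIPLES = [
    ("Man", IsA.SUBSUMPTION, "Person"),
    ("Author", IsA.ROLE, "Person"),
    ("John", IsA.INSTANCE, "Person"),
]


def read_out(subject: str, sense: IsA, obj: str) -> str:
    """Render a triple as a sentence using its explicit sense."""
    return f"{subject} {sense.value} {obj}"


readings = [read_out(*triple) for triple in TRIPLES]
for line in readings:
    print(line)
```

Once the sense is part of the triple, a consumer of the data no longer has to guess which reading of "is-a" was meant.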

* Temporal Ambiguity

Consider again the phrase /the Roosevelt administration/.  This phrase could refer to either of two people (FDR or TR), and so is ambiguous as a noun phrase.  It is also ambiguous in a temporal sense.  TR had two terms in office and FDR had four, so /the Roosevelt administration/ may refer to any of these six possible terms in office.

For example, consider the claim "/The Roosevelt administration/ worked to pass the Hepburn Act in 1906."

If we could consult a knowledge graph or other knowledge base in which the dates of presidential terms are recorded, we could use the date in this example to constrain /the Roosevelt administration/ to Theodore Roosevelt, and specifically to TR's second term in office.
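
The temporal constraint can be sketched with a tiny hand-built table standing in for the knowledge base. The =TERMS= table and the =terms_active_in= function are hypothetical names for this illustration, and the term dates were entered by hand; a real system would query a curated knowledge graph instead.

```python
from datetime import date

# Stand-in for a knowledge base: (president, term number, start, end).
# Dates entered by hand for illustration only.
TERMS = [
    ("Theodore Roosevelt", 1, date(1901, 9, 14), date(1905, 3, 4)),
    ("Theodore Roosevelt", 2, date(1905, 3, 4), date(1909, 3, 4)),
    ("Franklin D. Roosevelt", 1, date(1933, 3, 4), date(1937, 1, 20)),
    ("Franklin D. Roosevelt", 2, date(1937, 1, 20), date(1941, 1, 20)),
    ("Franklin D. Roosevelt", 3, date(1941, 1, 20), date(1945, 1, 20)),
    ("Franklin D. Roosevelt", 4, date(1945, 1, 20), date(1945, 4, 12)),
]


def terms_active_in(year: int) -> list[tuple[str, int]]:
    """Return (president, term number) pairs whose term overlaps the given year."""
    year_start, year_end = date(year, 1, 1), date(year, 12, 31)
    return [
        (president, number)
        for president, number, start, end in TERMS
        if start <= year_end and end >= year_start
    ]


matches = terms_active_in(1906)
print(matches)
```

With the year 1906 as the only constraint, the six candidate terms reduce to a single one. A year that straddles two terms, such as 1905, would still leave more than one candidate and would need a finer-grained date.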

Noun and verb ambiguity are in a way related - these are different facets of the WSD problem.  Temporal ambiguity can be mentally located in a larger contextual environment, the surrounding "cloud of facts" that includes aspects of time, geography, and other dimensions that can be used to situate and understand what a sentence means.

[1] https://blog.timesunion.com/rogergreen/how-many-people-in-the-usa-have-your-name/1574/

[2] Mustafa Jarrar and Robert Meersman: Ontology Engineering - The DOGMA Approach. Book chapter (Chapter 3) in Advances in Web Semantics I, LNCS 4891, Springer, 2008.
