Learning Rules for PP-Attachment Disambiguation (York)

Application domain: Natural Language Processing
Source: Standard PP-attachment dataset + original annotation extracted from WordNet
Dataset size: $\sim$20,000 positive examples of 2 predicates, 4 intensionally defined background predicates and $\sim$23,000 clauses of 6 other background predicates
Data format: Prolog
Systems used: P-Progol
Pointers: kazakov,jc,suresh@cs.york.ac.uk

P-Progol is applied to the natural language processing task of learning rules for prepositional phrase (PP) attachment disambiguation (Kazakov et al.). The dataset consists of $\sim$20,000 positive examples of 2 ``almost'' disjoint target predicates, 4 intensionally defined background predicates and $\sim$23,000 clauses of 6 other background predicates.

The target predicates have the format n(Verb,Noun,Preposition,Noun) and v(Noun,Verb,Preposition,Noun), each describing one of the following two syntactic structures:
(VP (Verb NP(Noun PP(Prep Noun)))) or (VP (Verb NP(Noun) PP(Prep Noun))). The background predicates map word-forms to lexical entries, and lexical entries to semantic classes, e.g. begins (Verb) $\rightarrow$ (to) begin (Verb) $\rightarrow$ {begin, get, start out, commence}.
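
The two bracketings and the two-step background mapping can be illustrated with a small Python sketch. It is not part of the dataset, which is in Prolog. It assumes that n marks the structure with the PP inside the NP and v the structure with the PP attached to the VP; the function names and the one-entry toy lexicon are hypothetical and reuse only the "begins" example given above.

```python
# Toy illustration only: the tables below are hypothetical stand-ins for the
# WordNet-derived background predicates, not the actual dataset.

# word-form -> lexical entry (lemma), keyed by (word_form, part_of_speech)
LEMMA = {
    ("begins", "verb"): "begin",
}

# lexical entry -> semantic classes (here: sets of synonyms)
SEM_CLASSES = {
    ("begin", "verb"): [frozenset({"begin", "get", "start out", "commence"})],
}


def semantic_classes(word_form, pos):
    """Map a word-form to its lexical entry, then to its semantic classes."""
    lemma = LEMMA.get((word_form, pos))
    if lemma is None:
        return []
    return SEM_CLASSES.get((lemma, pos), [])


def bracket(attachment, verb, noun1, prep, noun2):
    """Render the syntactic structure assumed for each target predicate."""
    if attachment == "n":  # PP modifies the noun: it sits inside the NP
        return f"(VP ({verb} NP({noun1} PP({prep} {noun2}))))"
    if attachment == "v":  # PP modifies the verb: it is a sibling of the NP
        return f"(VP ({verb} NP({noun1}) PP({prep} {noun2})))"
    raise ValueError(f"unknown attachment class: {attachment!r}")
```

For instance, bracket("n", "Verb", "Noun", "Prep", "Noun") reproduces the first structure shown above, and semantic_classes("begins", "verb") returns the single synonym set from the example.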

Progol rules covering each of the classes are learned and then applied to associate semantic classes with a test example of a given class, thereby reducing semantic ambiguity in the phrase.
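
One possible reading of this disambiguation step is sketched below in Python. The rule representation is an assumption made for illustration: a rule names a target class and, for each of the four argument positions, either a required semantic class or no constraint. The sense inventory, the class labels and the rule itself are hypothetical and are not taken from the learned theory.

```python
# Hypothetical sense inventory: word -> candidate semantic-class labels.
SENSES = {
    "see": {"perceive", "understand"},
    "bank": {"financial_institution", "sloping_land"},
    "river": {"stream"},
}

# Hypothetical rule for the n class: one constraint per argument position,
# where None means "unconstrained".
RULE_N = ("n", (None, "sloping_land", None, "stream"))


def apply_rule(rule, attachment, words, senses=SENSES):
    """Return the senses left for each word if the rule covers the example.

    Returns None when the rule is for another class or when some
    constrained position has no candidate sense matching the rule.
    """
    rule_class, constraints = rule
    if rule_class != attachment:
        return None
    narrowed = []
    for word, required in zip(words, constraints):
        candidates = set(senses.get(word, set()))
        if required is not None:
            candidates &= {required}
            if not candidates:
                return None  # the rule does not cover this example
        narrowed.append(candidates)
    return narrowed


result = apply_rule(RULE_N, "n", ("see", "bank", "of", "river"))
```

In this toy case the rule covers the example and leaves "bank" with the single class sloping_land, while the unconstrained verb keeps both of its candidate classes.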

Bibliography

  1. Dimitar Kazakov, James Cussens, and Suresh Manandhar. On the duality of semantics and syntax: The PP attachment case. Unpublished.

