Application domain: | Natural Language Processing |
Source: | Standard PP-attachment dataset + original annotation extracted from WordNet |
Dataset size: | 20,000 examples; 23,000 background clauses |
Data format: | Prolog |
Systems used: | P-Progol |
Pointers: | kazakov,jc,suresh@cs.york.ac.uk |
P-Progol is applied to the natural language processing task of learning
rules for PP-attachment (prepositional phrase attachment) disambiguation
(Kazakov et al.). The dataset consists of 20,000 examples of 2 "almost"
disjunctive predicates, 4 intensionally defined background predicates,
and 23,000 clauses of 6 other background predicates.
The target predicates have the format
n(Verb,Noun,Preposition,Noun) and v(Noun,Verb,Preposition,Noun),
and describe one of the following two syntactic structures:
(VP (Verb NP(Noun PP(Prep Noun)))) or
(VP (Verb NP(Noun) PP(Prep Noun))).
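As a rough illustration of the two structures, the following Python sketch renders a verb, noun, preposition, noun quadruple in either bracketing. It assumes that n corresponds to the first structure (the PP sits inside the NP) and v to the second (the PP attaches to the VP), following the order in which they are listed above. The `PPExample` type, the `bracket` function and the sample words are hypothetical and do not come from the dataset.

```python
from typing import NamedTuple


class PPExample(NamedTuple):
    """One example; field names are illustrative, not the dataset's own."""
    attachment: str  # "n" or "v" (assumed: noun vs. verb attachment)
    verb: str
    noun: str
    prep: str
    pp_noun: str


def bracket(ex: PPExample) -> str:
    """Render the example in the bracketed structure for its class."""
    pp = f"PP({ex.prep} {ex.pp_noun})"
    if ex.attachment == "n":
        # PP nested inside the NP: (VP (Verb NP(Noun PP(Prep Noun))))
        return f"(VP ({ex.verb} NP({ex.noun} {pp})))"
    if ex.attachment == "v":
        # PP as a sibling of the NP: (VP (Verb NP(Noun) PP(Prep Noun)))
        return f"(VP ({ex.verb} NP({ex.noun}) {pp}))"
    raise ValueError(f"unknown attachment class: {ex.attachment!r}")


if __name__ == "__main__":
    # Made-up words, chosen only to show the two shapes.
    print(bracket(PPExample("n", "eat", "pizza", "with", "anchovies")))
    print(bracket(PPExample("v", "eat", "pizza", "with", "fork")))
```

The only difference between the two outputs is where the NP bracket closes, which is exactly the ambiguity the learned rules are meant to resolve.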
The background predicates map word forms into lexical entries and
semantic classes, e.g. begins (Verb) → (to) begin (Verb) →
{begin, get, start out, commence}.
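The two-step mapping in this example can be sketched as a pair of lookups. This is only an illustration in Python: the dataset itself is in Prolog, and the table names `LEXICAL_ENTRY` and `SEMANTIC_CLASSES` and the function `semantic_classes` are hypothetical stand-ins for the background predicates, whose actual names are not given here. The single entry is the one quoted in the text.

```python
# Step 1 (assumed shape): word form + part of speech -> lexical entry.
LEXICAL_ENTRY = {
    ("begins", "Verb"): ("begin", "Verb"),
}

# Step 2 (assumed shape): lexical entry -> list of semantic classes,
# each class being a set of synonymous words.
SEMANTIC_CLASSES = {
    ("begin", "Verb"): [
        frozenset({"begin", "get", "start out", "commence"}),
    ],
}


def semantic_classes(word_form: str, pos: str) -> list:
    """Chain the two lookups; return [] if either step has no entry."""
    entry = LEXICAL_ENTRY.get((word_form, pos))
    if entry is None:
        return []
    return SEMANTIC_CLASSES.get(entry, [])


if __name__ == "__main__":
    for cls in semantic_classes("begins", "Verb"):
        print(sorted(cls))
```

A lexical entry may in general belong to more than one semantic class, which is why the second lookup returns a list.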
Progol rules covering each of the classes are learned and then applied to associate semantic classes with a test example of a given class, thereby reducing semantic ambiguity in the phrase.
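One way to picture this disambiguation step is as a filter over candidate class assignments. The Python sketch below is a simplified assumption, not the form of the rules that Progol actually learns: each hypothetical rule is reduced to a tuple of semantic-class labels allowed for a given attachment class, and all words, labels and rules are invented for illustration.

```python
from itertools import product

# Hypothetical candidate semantic classes per word (made-up labels).
CANDIDATES = {
    "eat": ["consume"],
    "pizza": ["food"],
    "with": ["with"],
    "fork": ["tool", "road_split"],  # ambiguous word: two candidate classes
}

# Hypothetical learned rules, simplified to allowed class tuples
# (verb, noun, prep, pp_noun) for each attachment class.
RULES = {
    "v": {("consume", "food", "with", "tool")},
}


def consistent_readings(attachment: str, words: tuple) -> list:
    """Keep only the class assignments covered by a rule for this class."""
    rules = RULES.get(attachment, set())
    # Enumerate every combination of candidate classes for the words.
    combos = product(*(CANDIDATES.get(w, []) for w in words))
    return [combo for combo in combos if combo in rules]


if __name__ == "__main__":
    print(consistent_readings("v", ("eat", "pizza", "with", "fork")))
```

In this toy case the example starts with two possible readings of "fork" and ends with one, which is the sense in which applying the rules reduces semantic ambiguity.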