Transfer Rules Learning (STO)

This section describes joint work of STO with the end user Telia Research.

Application domain: Transfer Rules Learning
Source: Telia Research
Further specification: Data set of 246 QLF pairs
Pointers: http://www.dsv.su.se/ML/
Data complexity: 560 KB
Data format: Prolog

The data

The Core Language Engine (CLE) is a general-purpose device for mapping between natural language sentences and logical form representations of their meaning. It has been used as one part of a system called the Spoken Language Translator (SLT), which translates spoken English into spoken Swedish (and vice versa) within restricted domains. Two main components of the system are the Swedish and English versions of the CLE. Input sentences are analyzed by the source language version of the CLE as far as the level of quasi logical form (QLF) and then, instead of undergoing further interpretation, are transferred into another QLF whose constants and predicates correspond to word senses in the other language. The transfer rules used in this process can be viewed as a kind of meaning postulate. The target language CLE then generates an output sentence from the transferred QLF, using the same linguistic data as is used for the analysis of that language.

A transfer rule specifies a pair of QLF patterns (i.e. either atoms or compound terms, where the latter may contain variables). The first argument corresponds to a QLF in one language and the second argument to a QLF in the other. The QLF patterns in these rules can be QLF (sub)expressions, or such expressions with transfer variables showing the correspondence between the two arguments. Many transfer rules are simple relations between two atoms, e.g.

trans(flight_AirplaneTrip,flygning_Flygresa).

while others are more complicated, e.g.

trans([and,X1,form(_,verb(no,A,yes,M,D),V,Y1,_)],
      [and,X2,[island,form(_,verb(pres,A,no,M,D),V,Y2,_)]]):-
      trans(X1,X2), trans(Y1,Y2).

The transfer rules are normally hand-crafted through inspection of a set of non-transferrable QLF pairs, which is a tedious and time-consuming task. The main problem addressed here is how to use inductive logic programming (ILP) techniques to learn transfer rules automatically from examples. So far, one example set has been obtained from Telia Research AB. It consists of 246 QLF pairs of various sizes, ranging from QLFs corresponding to two-word phrases (e.g. "to Denver") to full sentence parses (e.g. "What is the cheapest one way fare from Boston to Washington"). Although the sentences are restricted to the Air Travel Information domain, 176 lexical item pairs were extracted from them.
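
To make the interplay between atomic and recursive transfer rules concrete, the following is a minimal sketch and is not taken from the Telia data. Only the first fact appears in the text above. The second fact (cheap_Inexpensive, billig_Prisvard) and the simplified conjunction rule are hypothetical, introduced purely for illustration.

```prolog
% Atomic (lexical) transfer rules.
% The first fact is quoted from the text; the second is a
% hypothetical word-sense pair used only for this illustration.
trans(flight_AirplaneTrip, flygning_Flygresa).
trans(cheap_Inexpensive, billig_Prisvard).

% Hypothetical structural rule: a conjunction is transferred by
% transferring its two conjuncts recursively. X1/X2 and Y1/Y2 play
% the role of transfer variables linking the two QLF patterns.
trans([and, X1, Y1], [and, X2, Y2]) :-
    trans(X1, X2),
    trans(Y1, Y2).
```

With these clauses loaded, the query trans([and, flight_AirplaneTrip, cheap_Inexpensive], T) binds T to [and, flygning_Flygresa, billig_Prisvard]. Because the rules are relations between two terms, the same clauses can also be queried with the first argument unbound to transfer in the opposite direction.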

The experiments

Three main features of the transfer rule learning problem are that i) often more than one clause should be produced from each example, ii) only positive examples are provided, and iii) the produced hypothesis should be recursive. Most previous ILP systems produce at most one clause from each positive example. This is a significant problem when learning transfer rules, since it is not practically feasible to provide an example for each clause to be induced; instead, a set of clauses should be produced from each example.

There are, however, techniques that overcome this: instead of producing clauses one by one, they produce a set of clauses by specialising an overly general hypothesis in the form of a logic program. A system based on this idea, called TRL, has been developed by STO. TRL first generates a set of overly general clauses, which are then specialised. The problem of over-generalisation due to the lack of negative examples is handled by assuming output completeness, i.e. for each source QLF in a pair, every transfer output other than the target QLF in that pair is assumed to be undesired. This is a reasonable assumption in the transfer rule learning domain, as normally only one transfer is desired (in the case of multiple transfers, the CLE uses an additional module to choose one of them).
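
The output completeness assumption can be illustrated with a small sketch. This is not the TRL algorithm, whose details are not given here; it only shows how positive pairs alone can expose an overly general hypothesis. All predicate names (example/2, general/2, special/2, implicit_negative/3, consistent/1) and the word-sense atoms in the second example are hypothetical, and the code assumes SWI-Prolog (for forall/2 and call/3).

```prolog
% Hypothetical positive examples: Source QLF paired with Target QLF.
example(flight_AirplaneTrip, flygning_Flygresa).
example(fare_Price, pris_Biljettpris).

% An overly general hypothesis: any source may be transferred to
% any target that occurs in the examples.
general(_Source, Target) :-
    example(_, Target).

% A specialised hypothesis that pairs each source with one target.
special(flight_AirplaneTrip, flygning_Flygresa).
special(fare_Price, pris_Biljettpris).

% implicit_negative(+Hyp, ?Source, ?Wrong)
% Under output completeness, any output that Hyp derives for an
% example source and that differs from the example target counts
% as a negative example.
implicit_negative(Hyp, Source, Wrong) :-
    example(Source, Target),
    call(Hyp, Source, Wrong),
    Wrong \== Target.

% consistent(+Hyp): Hyp derives every example target and produces
% no implicit negative example.
consistent(Hyp) :-
    forall(example(S, T), call(Hyp, S, T)),
    \+ implicit_negative(Hyp, _, _).
```

Here consistent(general) fails, because general/2 also transfers flight_AirplaneTrip to pris_Biljettpris, while consistent(special) succeeds. A specialisation step would have to remove such unwanted outputs without losing the example targets.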

Experiments are currently being performed using the following scheme. QLF pairs from (a subset of) the set of all available examples are randomly split into two disjoint sets, one used for learning and the other for testing. The generated rules are tested on at least two parameters: coverage and determinism. Coverage is almost straightforward to define, as the percentage of target QLFs in the test set that can be obtained from their sources via transfer, but measuring determinism poses some problems. Measures such as the average number of transfer outputs seem too rough to estimate the extent of the phenomenon. Moreover, not all non-determinism is bad: some may promote a wider choice at the generation end of the translation process. This raises the point that evaluating the quality of transfer rules should take into account features of the generation module. In the CLE-based translation system, the post-transfer module features, in addition to the target language grammar, a number of preference metrics, ranging from checks that reject certain (syntactically correct) QLFs to complex schemes that prefer certain expressions or word choices. These should be included in the determinism (and possibly quality) tests.
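
The evaluation scheme above can be sketched as follows. This is an illustrative sketch assuming SWI-Prolog and its lists, apply and random libraries, not the code used in the experiments. The predicates split/4, coverage/2 and average_outputs/2 are hypothetical names, and all trans/2 facts except the first are invented for the example.

```prolog
:- use_module(library(lists)).
:- use_module(library(apply)).
:- use_module(library(random)).

% Hypothetical learned rules; fare_Price has two competing outputs.
trans(flight_AirplaneTrip, flygning_Flygresa).
trans(fare_Price, pris_Biljettpris).
trans(fare_Price, avgift_Kostnad).
trans(cheap_Inexpensive, billig_Prisvard).

% split(+Pairs, +TrainFraction, -Train, -Test)
% Randomly split Source-Target pairs into two disjoint lists.
split(Pairs, Fraction, Train, Test) :-
    random_permutation(Pairs, Shuffled),
    length(Pairs, N),
    K is round(Fraction * N),
    length(Train, K),
    append(Train, Test, Shuffled).

% A test pair is covered if its target can be obtained from its source.
covered(Source-Target) :-
    trans(Source, Target).

% coverage(+Pairs, -Percent): percentage of covered test pairs.
coverage(Pairs, Percent) :-
    length(Pairs, N), N > 0,
    include(covered, Pairs, Covered),
    length(Covered, C),
    Percent is 100 * C / N.

% Number of distinct transfer outputs for the source of one pair.
output_count(Source-_, Count) :-
    findall(T, trans(Source, T), Ts),
    sort(Ts, Distinct),
    length(Distinct, Count).

% average_outputs(+Pairs, -Average): mean number of distinct outputs.
average_outputs(Pairs, Average) :-
    length(Pairs, N), N > 0,
    maplist(output_count, Pairs, Counts),
    sum_list(Counts, Sum),
    Average is Sum / N.
```

For the test list [flight_AirplaneTrip-flygning_Flygresa, fare_Price-pris_Biljettpris, cheap_Inexpensive-billig_Prisvard, ticket_Document-biljett_Dokument], coverage is 75 percent and the average number of outputs is 1. That average hides the fact that fare_Price has two outputs and ticket_Document has none, which is one way to see why such a measure is too rough.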

References

  1. Henrik Boström: Induction of Recursive Transfer Rules. In James Cussens, Saso Dzeroski (Eds.): Learning Language in Logic, pages 237--246, Lecture Notes in Computer Science 1925. Springer, Berlin, 2000.

