As FOIL, GOLEM is a ``classic'' among empirical ILP systems. It has
been applied successfully on real-world problems such as protein
structure prediction and finite element mesh design.
GOLEM copes efficiently with large datasets. It achieves this
efficiency because it avoids searching a large hypothesis space for
consistent hypotheses as, for instance, FOIL, but rather
constructs a unique clause covering a set of positive examples relative to
the available background knowledge. The principle is based on the relative
least general generalisations (rlggs) introduced by Plotkin. GOLEM embeds the construction of
rlggs in a covering approach. For the induction of a single clause, it
randomly selects several pairs of positive examples and computes their
rlggs. Among these rlggs, GOLEM chooses the one which covers the
largest number of positive examples and is consistent with the negative
examples. This clause is then further generalised. GOLEM randomly selects
a set of positive examples and constructs the rlggs of each of these
examples and the clause obtained in the first construction step. Again,
the rlgg with the greatest coverage is selected and generalised by the
same process. The generalisation process is repeated until the
coverage of the best clause stops increasing. GOLEM conducts a
postprocessing step, which reduces induced clauses by removing irrelevant
In the general case, the rlgg may contain infinitely many literals.
Therefore, GOLEM imposes some restrictions on the background
knowledge and hypothesis language which ensure that the length of rlggs
grows at worst polynomially with the number of positive examples. The
background knowledge of GOLEM is required to consist of ground facts.
For the hypothesis language, the determinacy restriction applies, that is,
for given values of the head variables of a clause, the values of the
arguments of the body literals are determined uniquely. The complexity of
GOLEM's hypothesis language is further controlled by two parameters,
i and j , which limit the number and depth of body variables in a
GOLEM learns Horn clauses with functors. It may be run as a batch
learner or in interactive mode where the induction can be controlled
manually. GOLEM is able to learn from positive examples only.
Negative examples are used for clause reduction in the postprocessing
step, as well as input/output mode declarations for the predicates the user
may optionally supply. For dealing with noisy data, GOLEM provides a
system parameter enabling the user to define a maximum number of negative
examples a hypothesis clause is allowed to cover.
S. Muggleton and C. Feng.
Efficient induction in logic programs.
In S. Muggleton, editor, Inductive Logic Programming, pages
281-298. Academic Press, 1992.
A further note on inductive generalization.
In Machine Intelligence, volume 6, pages 101-124. Edinburgh
University Press, 1971.
Automatic methods of inductive inference.
PhD thesis, University of Edinburgh, 1971.