CPROGOL

System:	CPROGOL
Version:	4.1, 4.2
Further specification:	empirical ILP system
Pointers:	http://www.doc.ic.ac.uk/~shm/index.html
Code:	C source code
References:	(Muggleton 1995, 1996)
Other comments:	Distributed together with manual and examples

PROGOL employs a covering approach like, e.g., FOIL. That is, it selects an example to be generalised and finds a consistent clause covering the example. All clauses made redundant by the found clause including all examples covered by the clause are removed from the theory. The example selection and generalisation cycle is repeated until all examples are covered. When constructing hypothesis clauses consistent with the examples, PROGOL conducts a general-to-specific search in the theta-subsumption lattice of a single clause hypothesis. In contrast to other general-to-specific searching systems, PROGOL computes the most specific clause covering the seed example and belonging to the hypothesis language. This most specific clause bounds the theta-subsumption lattice from below. On top, the lattice is bounded by the empty clause. The search strategy is an -like algorithm guided by an approximate compression measure. Each invocation of the search returns a clause which is guaranteed to maximally compress the data, however, the set of all found hypotheses is not necessarily the most compressive set of clauses for the given example set. PROGOL can learn ranges and functions with numeric data (integer and floating point) by making use of the built-in predicates ``is'', <, =<, etc.

The hypothesis language of PROGOL is restricted by the means of mode declarations provided by the user. The mode declarations specify the atoms to be used as head literals or body literals in hypothesis clauses. For each atom, the mode declaration indicates the argument types, and whether an argument is to be instantiated with an input variable, an output variable, or a constant. Furthermore, the mode declaration bounds the number of alternative solutions for instantiating the atom. The types are defined in the background knowledge by unary predicates, or by Prolog built-in functions.

Arbitrary Prolog programs are allowed as background knowledge. Besides the background theory provided by the user, standard primitive predicates are built into PROGOL and are available as background knowledge. Positive examples are represented as arbitrary definite clauses. Negative examples and integrity constraints are represented as headless Horn clauses. Using negation by failure (CWA), PROGOL is able to learn arbitrary integrity constraints.

PROGOL provides a range of parameters for controlling the generalisation process. These parameters specify the maximum cardinality of hypothesis clauses, a depth bound for the theorem prover, the maximum layers of new variables, and an upper bound on the nodes to be explored when searching for a consistent clause. PROGOL allows to relax consistency by setting an upper bound on the number of negatives that can be covered by an acceptable clause.

PROGOL4.2 (Muggleton 1996) is an upward compatible with version 4.1 but learns from positive-only data. The system is available from the author upon request. The paper (Muggleton 1996) describing PROGOL4.2, can be obtained as Postscript by anonymous ftp from ftp.comlab.ox.ac.uk in the file
pub/Packages/ILP/Papers/poslearn1.ps.

References

S. Muggleton. Inverse entailment and Progol. New Generation Computing, Special issue on Inductive Logic Programming, 13(3-4):245-286, 1995.
S. Muggleton. Learning from positive data. In S. Muggleton, editor, Proceedings of the 6th International Workshop on Inductive Logic Programming, pages 225-244. Stockholm University, Royal Institute of Technology, 1996.

back to index