Further specification: First order regression system
Code: C mixed with SICStus Prolog
References: Karalic and Bratko 1996
Handling numerical constraints in the normal ILP setting takes the form of inducing classification or regression rules that involve real numbers, predicting a discrete or real-valued class in the presence of background knowledge. Within the ILP project, a transformation approach to this problem was developed that uses propositional systems as subroutines. However, this approach works only for determinate background knowledge, which rules out its application to domains such as predicting the activity or mutagenicity of chemical compounds.
A new approach developed in ILP, called First Order Regression (FOR), is a combination of ILP and numerical regression. First-order logic descriptions are induced to carve out those subspaces that are amenable to numerical regression among real-valued variables. The program FORS (First Order Regression System) is an implementation of this idea, where numerical regression is focused on a distinguished continuous argument of the target predicate. This can be viewed as a generalisation of the usual ILP problem. Namely, the target predicate in usual ILP can be modified by adding an extra ``continuous'' attribute whose value is determined by the truth value of each example: 1.0 for positive examples and 0.0 for negative ones. The regression formulas would then involve only this attribute, and FORS would tend to find rules that cover subsets of only positive or only negative examples.
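As a toy illustration of this reformulation (the predicate, example facts, and representation below are hypothetical, chosen only to make the idea concrete, and are not taken from FORS itself):

```python
# Hypothetical illustration of recasting a binary ILP problem as
# regression: each example of the target predicate is given an extra
# continuous attribute, 1.0 for positive and 0.0 for negative examples.

# Hypothetical ground facts for a target predicate active(Compound).
positive = ["methane", "ethane"]   # active/1 holds
negative = ["water"]               # active/1 does not hold

# Regression examples over the added continuous attribute.
regression_examples = (
    [(compound, 1.0) for compound in positive] +
    [(compound, 0.0) for compound in negative]
)
```

A rule covering only positive examples would then predict the constant 1.0, and a rule covering only negative examples the constant 0.0, which is why regression over this attribute recovers the usual ILP behaviour.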
FORS uses a covering approach similar to that of FOIL. Clause construction proceeds top-down: the algorithm starts with the most general candidate clause, which covers the entire example set, and then specializes it by adding literals, using beam search to guide the algorithm through the space of possible clauses. As part of the system, pruning based on the minimum description length (MDL) principle was developed that can also handle continuous variables. It turned out that MDL pruning helps to build more comprehensible models while preserving their predictive performance. FORS can handle noisy data and can also model dynamic systems (i.e. learn from time series).
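The covering loop and top-down beam search described above can be sketched as follows. This is a deliberately simplified outline, not the FORS implementation: clauses are represented as lists of Python predicates, the regression model inside each clause is reduced to a constant (mean) predictor scored by squared error, and MDL pruning is omitted.

```python
# Simplified sketch of FORS-style covering with top-down beam search.
# Representation, refinement, and scoring are placeholders.

def refine(clause, literals):
    """Specialize a clause by appending one candidate literal."""
    return [clause + [lit] for lit in literals if lit not in clause]

def covers(clause, example):
    """A clause covers an example if every literal holds for it.
    The empty (most general) clause covers everything."""
    return all(lit(example) for lit in clause)

def error(clause, examples):
    """Squared error of a constant (mean) predictor on covered examples;
    a stand-in for the regression fit inside a clause."""
    covered = [y for x, y in examples if covers(clause, x)]
    if not covered:
        return float("inf")
    mean = sum(covered) / len(covered)
    return sum((y - mean) ** 2 for y in covered)

def learn(examples, literals, beam_width=3, steps=2):
    """Covering loop: build one clause at a time by top-down beam
    search, then remove the examples that clause covers."""
    theory = []
    while examples:
        beam = [[]]  # start from the most general clause
        for _ in range(steps):
            # Keep parents in the candidate pool so specialization
            # is only preferred when it actually reduces the error.
            candidates = beam + [c for cl in beam for c in refine(cl, literals)]
            candidates.sort(key=lambda c: error(c, examples))
            beam = candidates[:beam_width]
        best = min(beam, key=lambda c: error(c, examples))
        theory.append(best)
        remaining = [(x, y) for x, y in examples if not covers(best, x)]
        if len(remaining) == len(examples):
            break  # no progress; stop to avoid looping forever
        examples = remaining
    return theory
```

For instance, with numeric examples `[(1, 1.0), (2, 1.0), (-1, 0.0)]` and candidate literals testing the sign of the example, the first induced clause covers exactly the examples with target 1.0.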