FORS: First Order Regression System (LAI)

Further specification: First order regression system
Code: C mixed with SICStus Prolog
References: Karalic and Bratko 1997

Handling numerical constraints in the normal ILP setting takes the form of inducing classification or regression rules that involve real numbers and predict a discrete or real-valued class in the presence of background knowledge. In the ILP project, a transformation approach to this problem was developed, using propositional systems as subroutines. However, this approach works only for determinate background knowledge, which rules out domains such as predicting the activity or mutagenicity of chemical compounds.

A new approach developed in ILP, called First Order Regression (FOR), combines ILP with numerical regression. First-order logic descriptions are induced to carve out those subspaces that are amenable to numerical regression among real-valued variables. The program FORS (First Order Regression System) implements this idea, with numerical regression focused on a distinguished continuous argument of the target predicate. This can be viewed as a generalisation of the usual ILP problem: the target predicate in usual ILP can be extended with an extra "continuous" attribute whose value is determined by the truth of the example, 1.0 for positive examples and 0.0 for negative ones. The regression formulas would then involve only this attribute, and FORS would tend to find rules that cover subsets of positive-only or negative-only examples.
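The reduction from ordinary ILP classification to regression described above can be illustrated with a small Python sketch. It is not taken from FORS; the function names (to_regression_examples, constant_model_error) and the toy example identifiers are assumptions made for illustration. It shows why a constant regression model fits a positive-only or negative-only subset with zero error, while a mixed subset leaves residual error.

```python
from statistics import mean, pvariance

def to_regression_examples(positives, negatives):
    """Attach a continuous class value: 1.0 for positive, 0.0 for negative examples."""
    return [(ex, 1.0) for ex in positives] + [(ex, 0.0) for ex in negatives]

def constant_model_error(examples):
    """Return (prediction, mean squared error) of the best constant predictor."""
    values = [y for _, y in examples]
    return mean(values), pvariance(values)

# Hypothetical example identifiers; "a" and "b" are positive, "c" and "d" negative.
examples = to_regression_examples(["a", "b"], ["c", "d"])

# A subset containing only positive examples is fitted exactly by the constant 1.0.
pure = [e for e in examples if e[0] in {"a", "b"}]
# A subset mixing one positive and one negative example is not.
mixed = [e for e in examples if e[0] in {"a", "c"}]

print(constant_model_error(pure))
print(constant_model_error(mixed))
```

Under this encoding, any error-minimising rule learner is pushed towards clauses whose covered examples share the same truth value, which is the behaviour the text attributes to FORS.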

FORS uses a covering approach similar to that of FOIL. Clauses are built top-down: the algorithm starts with the most general candidate clause, which covers the entire example set, and then specialises it by adding literals. Beam search guides clause construction through the space of possible clauses. The system also includes pruning based on the minimum description length (MDL) principle, developed so that it can handle continuous variables as well. MDL pruning turned out to help build more comprehensible models while preserving their predictive performance. FORS can handle noisy data and can also model dynamic systems (learning from time series).
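The covering loop and the top-down beam search over clauses can be sketched in Python as follows. This is a simplified illustration and not the FORS implementation: literals here are propositional threshold tests rather than first-order literals with background knowledge, each clause predicts a constant (the mean of the covered targets) rather than a regression formula, and MDL pruning is omitted. All names (covers, score, candidate_literals, beam_search_clause, covering, predict) and the parameters beam_width, max_literals and min_cover are assumptions introduced for the sketch.

```python
from statistics import mean, pvariance

def covers(clause, x):
    """A clause is a tuple of literals (feature, op, threshold); all must hold."""
    return all((x[f] <= t) if op == "<=" else (x[f] > t) for f, op, t in clause)

def score(clause, examples, min_cover):
    """Lower is better: (variance of covered targets, minus coverage)."""
    ys = [y for x, y in examples if covers(clause, x)]
    if len(ys) < max(1, min_cover):
        return (float("inf"), 0)
    return (pvariance(ys), -len(ys))

def candidate_literals(examples):
    """Hypothetical refinement operator: threshold tests on observed values."""
    lits = set()
    for x, _ in examples:
        for f, v in x.items():
            lits.add((f, "<=", v))
            lits.add((f, ">", v))
    return sorted(lits)

def beam_search_clause(examples, beam_width=3, max_literals=2, min_cover=2):
    """Start from the empty (most general) clause and specialise by adding literals."""
    literals = candidate_literals(examples)
    beam, best = [()], ()
    best_score = score(best, examples, min_cover)
    for _ in range(max_literals):
        refinements = {c + (lit,) for c in beam for lit in literals if lit not in c}
        ranked = sorted(refinements,
                        key=lambda c: (score(c, examples, min_cover), len(c), c))
        beam = ranked[:beam_width]
        if not beam:
            break
        top_score = score(beam[0], examples, min_cover)
        if top_score < best_score:  # keep a refinement only if strictly better
            best, best_score = beam[0], top_score
    return best

def covering(examples, **search_options):
    """Learn clauses one at a time, removing the examples each clause covers."""
    rules, remaining = [], list(examples)
    while remaining:
        clause = beam_search_clause(remaining, **search_options)
        targets = [y for x, y in remaining if covers(clause, x)]
        rules.append((clause, mean(targets)))
        if not clause:  # the empty clause covers everything: default rule
            break
        remaining = [(x, y) for x, y in remaining if not covers(clause, x)]
    return rules

def predict(rules, x):
    """Apply the rules in the order learned; the first covering clause decides."""
    for clause, value in rules:
        if covers(clause, x):
            return value
    return None

# Toy data: the target is 10.0 for x <= 3 and 20.0 otherwise.
data = [({"x": v}, 10.0 if v <= 3 else 20.0) for v in range(1, 7)]
rules = covering(data)
print(rules)
print(predict(rules, {"x": 2}), predict(rules, {"x": 5}))
```

On the toy data the sketch learns one clause testing x <= 3 and then falls back to the empty clause as a default rule for the remaining examples. Treating the learned rules as an ordered list is a choice made for this sketch, not a documented property of FORS.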


  1. A. Karalic, I. Bratko: First Order Regression. Machine Learning, 1997.
