TILDE: Top-down Induction of Logical Decision Trees (KUL)

System: TILDE
Pointers: http://www.cs.kuleuven.ac.be/~hendrik/Tilde/tilde.html
Code: Prolog
References: Blockeel and De Raedt 1998

In the attribute-value learning domain, two major paradigms exist: decision tree learners, which follow a divide-and-conquer approach, and rule induction systems, which typically follow a covering approach. Because of these different search strategies, a rule set derived from a decision tree will usually look different from a rule set found directly by a typical rule induction system.

In the ILP domain, most systems have so far used the covering approach, although some authors (e.g., Boström 1995) have pointed out that the divide-and-conquer strategy can have advantages in some cases.

Recently, an algorithm has been developed at the K.U.Leuven that learns a predicate logic theory by means of so-called logical decision trees. Logical decision trees are a first-order upgrade of the classical decision trees used by propositional learners. Just as propositional rules can be derived from decision trees (each rule corresponds to a path from the root to some leaf, with the tests in the nodes on that path as conditions of the rule), clauses can be derived from logical decision trees (each test on the path from root to leaf now being a literal or conjunction of literals that becomes part of the clause). The resulting trees can be used directly to classify unseen examples, but they can also easily be transformed into a logic or Prolog program.
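
As a small illustration, consider a logical decision tree for a machine-maintenance task (a sketch in the spirit of the examples in Blockeel and De Raedt 1998; the predicates worn/1 and irreplaceable/1 and the class names are purely illustrative). The tree first tests whether some part is worn and, if so, whether that part is irreplaceable. Translating each root-to-leaf path into a clause, and using cuts to preserve the order in which the paths are tried, yields a Prolog program such as:

    % Tree:  worn(X) ?
    %          yes -> irreplaceable(X) ?  yes -> sendback,  no -> fix
    %          no  -> keep
    class(sendback) :- worn(X), irreplaceable(X), !.
    class(fix)      :- worn(_), !.
    class(keep).

Note that the literal introduced in an internal node (here worn(X)) reappears in the clauses for every leaf below it, which is exactly how a path in the tree corresponds to a conjunction of literals.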

The ILP setting used by this algorithm is the "learning from interpretations" setting (De Raedt and Dzeroski 1994), as also used by the Claudien (De Raedt and Dehaspe 1996) and ICL (De Raedt and Van Laer 1995) systems.

The TILDE system is a prototype implementation of this algorithm. It incorporates many features of Quinlan's C4.5, a state-of-the-art decision tree learner for attribute-value problems. In addition, a number of techniques specific to ILP are used: a language bias can be specified (types and modes of predicates), a form of lookahead is incorporated, and dynamic generation of literals (DGL) is possible. The latter, based on Claudien's call handling, is a technique that makes it possible, among other things, to fill in constants in a literal. For learning in numerical domains, a discretization procedure is available that can be used by the DGL procedure to find interesting constants.
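
By way of illustration, the bias declarations for the machine example above could look roughly as follows (a hypothetical sketch only, not necessarily TILDE's exact declaration format, which is described in the system's documentation):

    % Illustrative type and mode declarations (hypothetical syntax).
    % '-' marks an argument that may introduce a new variable,
    % '+' marks an argument that must already occur in the conjunction.
    type(worn(part)).
    type(irreplaceable(part)).
    rmode(worn(-P)).              % candidate literal introducing a part P
    rmode(irreplaceable(+P)).     % candidate literal testing an existing P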

Not surprisingly, this implementation performs as well as C4.5 on propositional problems (though at lower speed), and experiments on typical ILP data sets also show promising results with respect to predictive accuracy, efficiency, and theory complexity.

References

  1. H. Blockeel and L. De Raedt. Top-down induction of first-order logical decision trees. Artificial Intelligence, 1998.
  2. H. Boström. Covering vs. divide-and-conquer for top-down induction of logic programs. In Proceedings of the 14th International Joint Conference on Artificial Intelligence. Morgan Kaufmann, 1995.
  3. L. De Raedt and S. Dzeroski. First order jk-clausal theories are PAC-learnable. Artificial Intelligence, 70:375-392, 1994.
  4. L. De Raedt and L. Dehaspe. Clausal discovery. Machine Learning, 1996. To appear.
  5. L. De Raedt and W. Van Laer. Inductive constraint logic. In Proceedings of the 5th Workshop on Algorithmic Learning Theory, volume 997 of Lecture Notes in Artificial Intelligence. Springer-Verlag, 1995.

