TILDE: Top-down Induction of Logical Decision Trees (KUL)

System:TILDE
Pointers: http://www.cs.kuleuven.ac.be/~ml/Tilde/
Code:Prolog
References:Blockeel and De Raedt 1998

In the attribute-value learning domain, two major paradigms exist. There are decision tree learners, which follow a divide-and-conquer approach; and rule induction systems, typically following a covering approach. Because of these different search strategies, a rule set derived from a decision tree will usually look different from a rule set found directly by means of a typical rule induction system.

In the ILP domain, up till now most systems have used the covering approach, although some authors (e.g., Bostrom 1995) have already pointed out that the divide-and-conquer strategy can have advantages in some cases.

Recently, an algorithm has been developed at the K.U.Leuven that learns a predicate logic theory by means of so-called logical decision trees. Logical decision trees are a first-order logic upgrade of the classical decision trees used by propositional learners. In the same manner as propositional rules can be derived from decision trees (each rule corresponds to a path from the root to some leaf; the tests in the nodes on that path are conditions of the rule), clauses can be derived from logical decision trees (each test on the path from root to leaf now being a literal or conjunction of literals that is part of the clause). The resulting trees can directly be used for classification of unseen examples, but they can also easily be transformed into a logic or Prolog program.

The ILP setting used by this algorithm, is the ``learning from interpretations'' setting (De Raedt and Dzeroski 1994), as also used by the Claudien (De Raedt and Dehaspe 1996) and ICL (De Raedt and Van Laer 1995) systems.

The TILDE system is a prototype implementation of this algorithm. It incorporates many features of Quinlan's C4.5, which is a state-of-the-art decision tree learner for attribute-value problems. Next to these, a number of techniques are used that are specific to ILP: a language bias can be specified (types and modes of predicates), a form of lookahead is incorporated, and dynamic generation of literals (DGL) is possible. The latter, based on Claudien's call handling, is a technique that allows, among other things, to fill in constants in a literal. For learning in numerical domains, a discretization procedure is available that can be used by the DGL procedure to find interesting constants.

This implementation, not surprisingly, performs as well as C4.5 on propositional problems (except for lower speed), but experiments on typical ILP data sets also show promising results with respect to predictive accuracy, efficiency and theory complexity.

New developments

Development of the TILDE system was continued during the last year. In the previous report it was mentioned that a derivative of TILDE called CO.5 had been developed that can be used for regression and clustering. This derivative has in the mean time been improved [4] and integrated into the original system [1], resulting in a single system that can perform classification, clustering and regression. The resulting system was called TILDE2.0.

A second important development is that whereas in previous versions of TILDE examples were represented by interpretations (the "learning from interpretations" setting), the current version is able to work with data represented in the standard setting (learning from entailment). This makes it easier for users of other ILP systems to run TILDE on their data (no change of format is needed anymore). This change has been incorporated in TILDE2.1.

Finally, a separate version of TILDE called TILDELDS has been developed; this system was implemented so that can handle very large data sets. This version of TILDE employs an algorithm originally proposed by [5]. It has been shown [3] that TILDELDS is able to handle data sets of over 100MB or 100,000 examples. TILDELDS is only a prototype and (in its current form) will not be integrated with TILDE2.1; the main results obtained from this research are 1) experimental proof that induction of first order logical decision trees scales up linearly, making the processing of large data sets feasible, and 2) it has pointed to new methods for improving the efficiency of ILP systems [1].

Current work includes improving the user-friendliness of the system and further improving its performance (by incorporating the lessons learned during the development of TILDELDS. Our ultimate goal is to have an inductive system that has very few constraints with respect to the tasks that it can handle (classification/clustering/regression), the representation of the examples (simple or structured) and the size of the data set.

References

  1. H. Blockeel. Top-down induction of first order logical decision trees. PhD thesis, Department of Computer Science, Katholieke Universiteit Leuven, 1998.

  2. H. Blockeel and L. De Raedt. Top-down induction of first order logical decision trees. Artificial Intelligence, 101(1-2):285-297, June 1998.

  3. H. Blockeel, L. De Raedt, N. Jacobs, and B. Demoen. Scaling up inductive logic programming by learning from interpretations. Data Mining and Knowledge Discovery, 3(1), 1999.

  4. H. Blockeel, L. De Raedt, and J. Ramon. Top-down induction of clustering trees. In Proceedings of the 15th International Conference on Machine Learning, pages 55-63, 1998.

  5. M. Mehta, R. Agrawal, and J. Rissanen. SLIQ: A fast scalable classifier for data mining. In Proceedings of the Fifth International Conference on Extending Database Technology, 1996.


back to index