|Further specification:||empirical ILP system and probabilistic first order classifier|
|Pointers:||Contact: Uros Pompe, email@example.com|
|Code:||C++ source code (g++ preferred) and Sicstus prolog source (classification part)|
ILP-R is an inductive logic programming system and a first-order classifier. It efficiently induces theories in a restricted subset of function-free normal programs, and it can use the induced hypothesis to classify new instances via naive Bayesian reasoning. The contributions of the system are threefold.
It uses a weak syntactic declarative bias that restricts the set of possible hypotheses. The properties of this bias are then exploited to encode the current proof of a partially induced clause in such a way that the encoding grows at most linearly with the clause length.
It uses a non-myopic heuristic function based on RELIEF in our first-order learner. At the outer level, the learner uses a covering approach similar to the one used by FOIL. At the inner level, its top-down search for a consistent clause uses the RELIEF-based heuristic to estimate literal quality. This heuristic enables the system to detect strong dependencies between literals in cases where the usual impurity-function-based heuristics fail. Instead of adding a single literal to the body, our learner searches for the most informative conjunction of literals, which is then inserted into the clause.
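The key property of RELIEF is visible already in its propositional form: attribute weights are updated from nearest hits and misses, so attributes that matter only in conjunction (e.g. parity) still receive credit, where per-attribute impurity measures see no gain. The following is a minimal propositional sketch of the original Kira-Rendell RELIEF update, not ILP-R's first-order adaptation:

```python
import random

def relief(X, y, n_samples=None, seed=0):
    """Propositional RELIEF on 0/1 attributes (Kira & Rendell style sketch).

    X: list of equal-length tuples of 0/1 attribute values.
    y: list of class labels (two classes).
    Returns one quality weight per attribute.
    """
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    w = [0.0] * m
    diff = lambda a, p, q: 1.0 if p[a] != q[a] else 0.0
    dist = lambda p, q: sum(diff(a, p, q) for a in range(m))
    # Use every instance, or a random sample of n_samples instances.
    idx = range(n) if n_samples is None else [rng.randrange(n) for _ in range(n_samples)]
    for i in idx:
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        h = min(hits, key=lambda j: dist(X[i], X[j]))    # nearest hit
        ms = min(misses, key=lambda j: dist(X[i], X[j]))  # nearest miss
        for a in range(m):
            # Reward attributes that separate classes, penalize those
            # that vary within a class.
            w[a] += (diff(a, X[i], X[ms]) - diff(a, X[i], X[h])) / len(idx)
    return w
```

On a two-bit parity target with an extra irrelevant bit, both relevant attributes get positive weight and the irrelevant one negative weight, even though the information gain of each relevant attribute in isolation is zero.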
ILP-R implements a first-order classifier that uses the induced hypothesis to classify new instances. In addition to the usual procedural classification, the classifier can assess individual clauses probabilistically and then exploit this information to determine the class of an unseen example with the naive Bayesian approach. This sometimes leads to a significant increase in classification accuracy and average information score when classifying in multi-class domains.
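One way to picture the probabilistic use of clauses is to treat each induced clause as a boolean feature ("does the clause cover the example?") and combine the per-class coverage statistics naive-Bayesianly. This is an illustrative sketch under that assumption, with Laplace-smoothed estimates; the function names are hypothetical, not ILP-R's actual interface:

```python
import math

def train_clause_nb(clauses, examples, labels):
    """clauses: list of boolean functions (clause covers example?).
    Returns class list, class priors, and smoothed P(clause fires | class)."""
    classes = sorted(set(labels))
    n = len(examples)
    prior = {c: (labels.count(c) + 1) / (n + len(classes)) for c in classes}
    cond = {}
    for k, cl in enumerate(clauses):
        for c in classes:
            in_c = [e for e, yy in zip(examples, labels) if yy == c]
            fires = sum(bool(cl(e)) for e in in_c)
            cond[k, c] = (fires + 1) / (len(in_c) + 2)  # Laplace smoothing
    return classes, prior, cond

def classify_nb(e, clauses, classes, prior, cond):
    """Pick the class maximizing log P(c) + sum_k log P(clause_k outcome | c)."""
    def logp(c):
        s = math.log(prior[c])
        for k, cl in enumerate(clauses):
            p = cond[k, c]
            s += math.log(p if cl(e) else 1 - p)
        return s
    return max(classes, key=logp)
```

Unlike purely procedural classification, where an example not covered by any clause of the predicted class is simply misclassified, this combination lets every clause contribute evidence for or against each class.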
We evaluated ILP-R on a series of artificial domains and one real-world domain. We tested the RELIEF-based heuristic and compared its performance to information gain. In addition, we compared the classification accuracy of the hypotheses under procedural classification and under our naive Bayesian approach. On attributional domains our classifier performed comparably to some state-of-the-art propositional learners in both learning efficiency and classification accuracy. On parity domains the RELIEF-based estimation significantly outperformed information gain. The naive Bayesian classifier performed significantly better on the LED domain and on some of the mesh domains.
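The average information score referred to here is, to our understanding, the Kononenko-Bratko measure: each test classification is scored by how much the classifier's probability for the true class improves on the class prior, which rewards correct answers on rare classes more than on common ones. A sketch of the per-example score, assuming that formulation:

```python
import math

def info_score(prior, posterior):
    """Kononenko-Bratko information score of one classification.

    prior     = P(C):  prior probability of the example's true class C.
    posterior = P'(C): probability the classifier assigns to C.
    Positive when the classifier beats the prior, negative when it
    does worse, zero when it merely reproduces the prior.
    """
    if posterior >= prior:
        return math.log2(posterior) - math.log2(prior)
    return math.log2(1 - prior) - math.log2(1 - posterior)
```

Averaging this score over a test set yields the average information score; unlike plain accuracy, it does not reward a classifier for always predicting the majority class in a skewed multi-class domain.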