Traffic Accidents (York+KUL)

Application domain: Traffic Accidents
Source: Department of Transport, UK
Dataset size: 1413 accident reports described by 4 predicates of arity 25, 24, 16 and 19, resp.
Data format: Propositional data in C4.5 and Prolog form
Systems used: Progol, Tilde, C4.5
References: (Roberts et al. 1998)
Pointers: http://www.cs.york.ac.uk/~stephen/cnf.html
stephen@cs.york.ac.uk

This work (Roberts et al. 1998) presents an experimental comparison of two Inductive Logic Programming (ILP) algorithms, Progol and Tilde, with C4.5, a propositional learning algorithm, on a propositional dataset of road traffic accidents. Rebalancing methods are described for handling two features of this dataset: the skewed distribution of positive and negative examples, and the differing costs of errors of commission and omission in this domain. Before these methods are applied, all three algorithms perform worse than the majority-class baseline; after rebalancing, all perform significantly better. The conclusion drawn from the experimental results is that, on such a propositional dataset, ILP algorithms are competitive with propositional systems in predictive accuracy but are significantly outperformed in learning time.
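The summary above does not say which rebalancing methods were used, so the following is only a minimal sketch of one common option, random oversampling of the minority class, together with the majority-class baseline that the results are compared against. The function names and the toy labels are hypothetical and are not taken from the accident dataset or from the paper.

```python
import random
from collections import Counter


def majority_class_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def oversample_minority(examples, labels, seed=0):
    """Duplicate randomly chosen examples of the smaller classes
    until every class has as many examples as the largest one.

    Assumption: plain random oversampling; the paper may use a
    different rebalancing scheme.
    """
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(examples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())

    balanced_x, balanced_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            balanced_x.append(x)
            balanced_y.append(y)
    return balanced_x, balanced_y


# Toy, made-up data: 9 negative examples and 1 positive example.
labels = ["neg"] * 9 + ["pos"]
examples = list(range(10))

bal_x, bal_y = oversample_minority(examples, labels)
print(majority_class_accuracy(labels))  # baseline on the skewed data
print(majority_class_accuracy(bal_y))   # baseline after rebalancing
print(Counter(bal_y))
```

On skewed data the majority-class baseline is high, which is why a learner can fall below it; once the classes are balanced the baseline drops to one half for two classes, so accuracy gains become easier to detect.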

Bibliography

  1. S. Roberts, W. Van Laer, N. Jacobs, S. Muggleton, and J. Broughton. A comparison of ILP and propositional systems on propositional data. In C.D. Page, editor, Proc. of the 8th International Workshop on Inductive Logic Programming (ILP-98), LNAI 1446, pages 291-299, Berlin, 1998. Springer-Verlag.

