Application domain: |
Mutagenesis, regression-unfriendly |

Source: |
Oxford |

Dataset size: |
2 556 facts |

Data format: |
text |

Systems Used: |
STILL, DISTILL |

Pointers: |
http://www.lri.fr/~sebag/ |

Most experiments concern the 188-compound dataset, known as *regression-friendly* since numerical regression reaches around 86% predictive accuracy on it.

The following experiments concern the 42 other compounds, i.e. the
*regression-unfriendly* dataset, which is considered harder than the
188-compound dataset (the best prediction rate reported on this
dataset [SriKin96-ILP96] is 64%). This dataset comprises 29 inactive
vs 13 active compounds.
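As a point of reference for these figures, the class distribution alone fixes a majority-class baseline. The following small computation is not part of the original experiments, just a sanity check on the numbers above:

```python
# Class distribution of the regression-unfriendly Mutagenesis subset
inactive, active = 29, 13
total = inactive + active
# Accuracy of always predicting the majority class ("inactive")
majority_baseline = inactive / total
print(f"{total} compounds, majority-class baseline: {majority_baseline:.1%}")
# prints: 42 compounds, majority-class baseline: 69.0%
```

Always predicting the majority class would thus classify about 69% of the compounds correctly.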

M | (2nd param.) | OK (%) | ? (%) | Mis (%) | std. dev. | time (s)
0 | 1 | 88.9 | 1.9 | 9.26 | 15 | 1
0 | 2 | 88.3 | 3.3 | 8.33 | 15 | 1
0 | 3 | 80   | 8.3 | 11.7 | 18 | 2
0 | 4 | 78.3 | 18  | 3.33 | 22 | 4
1 | 1 | 73.3 | 8.3 | 18.3 | 19 | 1
1 | 2 | 90   | 0   | 10   | 14 | 1
1 | 3 | 86.7 | 1.7 | 11.7 | 17 | 2
1 | 4 | 83.3 | 3.3 | 13.3 | 18 | 2
2 | 1 | 68.3 | 3.3 | 28.3 | 16 | 1
2 | 2 | 73.3 | 0   | 26.7 | 13 | 1
2 | 3 | 85   | 1.7 | 13.3 | 18 | 1
2 | 4 | 86.7 | 1.7 | 11.7 | 17 | 1

The two parameters of stochastic subsumption are respectively set to 300 and 3; we focus on the influence of the two remaining parameters (the first two columns of the table above), one of which governs the consistency of the induced hypotheses (one extreme setting yielding perfect consistency) and the other their generality (one extreme setting yielding maximal generality).

The table above summarizes the predictive accuracy of STILL on the test set, averaged over 25 independent selections of a 4-example test set distributed as the whole dataset. The third, fourth and fifth columns respectively give the percentages of correctly classified, unclassified and misclassified test examples. Column 6 gives the standard deviation of the predictive accuracy, and column 7 the total computational time (induction plus classification of the test examples), in seconds on an HP-710 workstation.
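The test-set selection protocol above can be sketched as follows. This is a hypothetical reconstruction, not the authors' code; the example identifiers are placeholders, and "distributed as the whole dataset" is read as drawing each test set with the same class ratio as the 42 compounds (13 active vs 29 inactive, hence about 1 active per 4-example test set):

```python
import random

def stratified_test_split(actives, inactives, n_test=4, rng=random):
    """Draw a test set whose class ratio mirrors the whole dataset;
    the remaining examples form the training set."""
    n_active = round(n_test * len(actives) / (len(actives) + len(inactives)))
    test = rng.sample(actives, n_active) + rng.sample(inactives, n_test - n_active)
    train = [x for x in actives + inactives if x not in test]
    return train, test

# 25 independent selections, as in the experiments above
actives = [f"active_{i}" for i in range(13)]      # placeholder identifiers
inactives = [f"inactive_{i}" for i in range(29)]
splits = [stratified_test_split(actives, inactives) for _ in range(25)]
```

With these class sizes, each 4-example test set holds exactly one active and three inactive compounds, and the accuracies reported in the table would be averages over the 25 resulting test sets.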

The results obtained for reasonable values of these parameters are satisfactory, and the computational cost is negligible. The only drawback is the high variance of the predictive accuracy.

p   | Average OK (%) | Range        | Average variance | time (s)
10  | 89.8 | [82.5, 95]   | 12 | 1
20  | 88.8 | [82.5, 95]   | 13 | 2
30  | 92.1 | [85, 97.5]   | 12 | 4
40  | 93   | [87.5, 97.5] | 11 | 4
50  | 93.1 | [90, 97.5]   | 11 | 5
60  | 93.5 | [90, 97.5]   | 11 | 5
70  | 94.2 | [87.5, 97.5] | 11 | 8
80  | 93.4 | [87.5, 97.5] | 11 | 9
90  | 94.5 | [90, 97.5]   | 10 | 10
100 | 95.1 | [92.5, 97.5] | 10 | 12

The two parameters of stochastic subsumption are respectively set to 10 and 3, and we focus on the influence of the number of dimensions p.

The table above summarizes the predictive accuracy of DISTILL, with the following experimental setting. A run corresponds to a ten-fold cross-validation; column 2 gives the average predictive accuracy over 20 independent runs, and column 3 the range of variation of this accuracy. The average variance of the cross-validation is given in column 4, and column 5 gives the computational time (induction of hypotheses, mapping all examples into the p-dimensional space, and k-nearest-neighbor classification of the test examples), in seconds on an HP-710 workstation.
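The DISTILL classification step described above can be sketched as follows. This is a hypothetical illustration, not the system itself: each example is mapped onto a p-dimensional vector (here taken to be its distances to p induced hypotheses, used as prototypes), then classified by plain k-NN in that space. The examples, distance function and prototypes below are toy placeholders:

```python
import math

def map_to_rp(example, prototypes, dist):
    """Map an example into R^p: coordinate j is its distance to prototype j."""
    return [dist(example, h) for h in prototypes]

def knn_predict(x, train_vecs, train_labels, k=1):
    """k-NN with the Euclidean metric in the induced p-dimensional space."""
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda vl: math.dist(x, vl[0]))
    votes = [label for _, label in ranked[:k]]
    return max(set(votes), key=votes.count)

# Toy usage: scalar "examples", absolute difference as distance, p = 2.
prototypes = [0.0, 10.0]
dist = lambda a, b: abs(a - b)
train = [1.0, 2.0, 8.0, 9.0]
labels = ["active", "active", "inactive", "inactive"]
train_vecs = [map_to_rp(e, prototypes, dist) for e in train]
print(knn_predict(map_to_rp(3.0, prototypes, dist), train_vecs, labels))
# prints: active
```

The point of the mapping is that once examples live in R^p, any standard numerical learner (here k-NN) applies, which is what makes the per-run cost in the table grow roughly with p.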

The results obtained for sufficiently large values of p are satisfactory and degrade gracefully as p decreases; the computational cost remains moderate. DISTILL obtains slightly better and overall steadier performance than STILL, while involving one parameter fewer.
