| Application domain: | Mutagenesis, regression-unfriendly |
| Source: | Oxford |
| Dataset size: | 2 556 facts |
| Data format: | text |
| Systems Used: | STILL, DISTILL |
| Pointers: | http://www.lri.fr/~sebag/ |
Most experiments in the literature concern the 188-compound dataset, known as regression-friendly since numerical regression obtains around 86% predictive accuracy.
The following experiments concern the 42 remaining compounds, i.e. the regression-unfriendly dataset, which is considered harder than the 188-compound dataset (the best prediction rate reported on it [Srinivasan and King, ILP-96] is 64%). This dataset comprises 29 inactive and 13 active compounds.
| ε | M | OK (%) | ? (%) | Mis (%) | time (s) |
| 0 | 1 | 88.9 | 1.9 | 9.26 | 1 |
| 0 | 2 | 88.3 | 3.3 | 8.33 | 1 |
| 0 | 3 | 80 | 8.3 | 11.7 | 2 |
| 0 | 4 | 78.3 | 18 | 3.33 | 4 |
| 1 | 1 | 73.3 | 8.3 | 18.3 | 1 |
| 1 | 2 | 90 | 0 | 10 | 1 |
| 1 | 3 | 86.7 | 1.7 | 11.7 | 2 |
| 1 | 4 | 83.3 | 3.3 | 13.3 | 2 |
| 2 | 1 | 68.3 | 3.3 | 28.3 | 1 |
| 2 | 2 | 73.3 | 0 | 26.7 | 1 |
| 2 | 3 | 85 | 1.7 | 13.3 | 1 |
| 2 | 4 | 86.7 | 1.7 | 11.7 | 1 |
The two parameters of stochastic subsumption are respectively set to 300 and 3, and we focus on the influence of the parameters ε and M (the first two columns of the table above), where ε = 0 corresponds to perfect consistency and M = 1 to maximal generality.
The table above summarizes the predictive accuracy of STILL on the test set, averaged over 25 independent selections of a 4-example test set distributed as the whole dataset. The third, fourth and fifth columns respectively give the percentage of correctly classified, unclassified and misclassified test examples; the last column gives the total computational time (induction plus classification of the test examples), in seconds on an HP-710 workstation.
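As an illustration, the repeated-holdout protocol just described (independent test sets drawn with the same class distribution as the whole dataset, a learner that may leave test examples unclassified) can be sketched as below. The `trainer` interface and the stratified draw are illustrative assumptions, not STILL's actual code.

```python
import random
from collections import defaultdict

def repeated_holdout(examples, labels, trainer, test_size=4, runs=25, seed=0):
    """Average %correct / %unclassified / %misclassified over `runs`
    independent test sets whose class distribution mirrors the dataset.
    `trainer(train_x, train_y)` returns a predictor; the predictor may
    return None to leave a test example unclassified (the '?' column).
    Hypothetical interface, for illustration only."""
    rng = random.Random(seed)
    ok = unk = mis = tested = 0
    for _ in range(runs):
        by_class = defaultdict(list)
        for i, y in enumerate(labels):
            by_class[y].append(i)
        test = []
        for y, idxs in by_class.items():
            # draw from each class in proportion to its overall frequency
            k = max(1, round(test_size * len(idxs) / len(labels)))
            test += rng.sample(idxs, k)
        train = [i for i in range(len(labels)) if i not in test]
        predict = trainer([examples[i] for i in train],
                          [labels[i] for i in train])
        for i in test:
            pred = predict(examples[i])
            if pred is None:
                unk += 1
            elif pred == labels[i]:
                ok += 1
            else:
                mis += 1
        tested += len(test)
    return 100 * ok / tested, 100 * unk / tested, 100 * mis / tested
```

With the 29-inactive/13-active distribution above and `test_size=4`, each stratified test set contains three examples of the majority class and one of the minority class.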
The results obtained for reasonable values of ε and M are satisfactory, and the computational cost is negligible. One only regrets the high variance of the predictive accuracy.
| p | Average OK (%) | Range | time (s) |
| 10 | 89.8 | [82.5, 95] | 1 |
| 20 | 88.8 | [82.5, 95] | 2 |
| 30 | 92.1 | [85, 97.5] | 4 |
| 40 | 93 | [87.5, 97.5] | 4 |
| 50 | 93.1 | [90, 97.5] | 5 |
| 60 | 93.5 | [90, 97.5] | 5 |
| 70 | 94.2 | [87.5, 97.5] | 8 |
| 80 | 93.4 | [87.5, 97.5] | 9 |
| 90 | 94.5 | [90, 97.5] | 10 |
| 100 | 95.1 | [92.5, 97.5] | 12 |
The two parameters of stochastic subsumption are respectively set to 10 and 3, and we focus on the influence of the number of dimensions p.
The above table summarizes the predictive accuracy of DISTILL, with the following experimental setting. A run corresponds to a 10-fold cross-validation; column 2 indicates the average predictive accuracy over 20 independent runs and column 3 the range of variation of this predictive accuracy. The last column gives the computational time (induction of hypotheses, mapping of all examples onto the p-dimensional space, and k-NN classification of the test examples), in seconds on an HP-710 workstation.
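The evaluation loop just described (n-fold cross-validation over examples mapped onto a p-dimensional numeric space, then k-NN classification) can be sketched as follows; the Euclidean distance, k = 1 and the fold assignment are illustrative assumptions rather than DISTILL's actual components:

```python
import math
import random
from collections import Counter

def knn_label(train, x, k=1):
    """Majority label among the k nearest neighbours of x.

    `train` is a list of (vector, label) pairs; plain Euclidean
    distance is an assumption made for this sketch."""
    nearest = sorted(train, key=lambda vy: math.dist(vy[0], x))[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]

def cross_validate(vectors, labels, folds=10, k=1, seed=0):
    """One run of n-fold cross-validation; returns % correctly classified."""
    idx = list(range(len(vectors)))
    random.Random(seed).shuffle(idx)
    ok = 0
    for f in range(folds):
        test = idx[f::folds]  # every folds-th shuffled index forms one fold
        train = [(vectors[i], labels[i]) for i in idx if i not in test]
        ok += sum(knn_label(train, vectors[i], k) == labels[i] for i in test)
    return 100 * ok / len(vectors)
```

Averaging `cross_validate` over several seeds then corresponds to the 20 independent runs reported in the table.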
The results obtained for sufficient values of p are satisfactory and degrade gracefully as p decreases; the computational cost remains moderate. DISTILL obtains slightly better and overall steadier performance than STILL, while involving one parameter fewer.