Application domain: | Mutagenesis, regression-unfriendly |
Source: | Oxford |
Dataset size: | 2 556 facts |
Data format: | text |
Systems Used: | STILL, DISTILL |
Pointers: | http://www.lri.fr/~sebag/ |
Most experiments in the literature concern the 188-compound dataset, known as regression-friendly since numerical regression reaches around 86% predictive accuracy.
The following experiments concern the 42 remaining compounds, i.e. the regression-unfriendly dataset, which is considered harder than the 188-compound dataset (the best prediction rate reported on it [SriKin96-ILP96] is 64%). This dataset comprises 29 inactive vs 13 active compounds.
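As a point of reference, the class distribution stated above already fixes the accuracy of the trivial majority-class predictor; a short calculation (an illustration, not part of the original experiments):

```python
# Class distribution of the regression-unfriendly dataset (from the text):
# 29 inactive vs 13 active compounds, 42 in total.
inactive, active = 29, 13
total = inactive + active

# A classifier that always predicts the majority class ("inactive")
# attains this baseline accuracy.
baseline = inactive / total
print(f"majority-class baseline: {baseline:.1%}")  # → 69.0%
```

This makes the dataset's difficulty concrete: the best reported 64% rate sits below the majority-class baseline.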
| ε | M | OK (%) | ? (%) | Mis (%) | std. dev. | time (s) |
| 0 | 1 | 88.9 | 1.9 | 9.26 | – | 1 |
| 0 | 2 | 88.3 | 3.3 | 8.33 | – | 1 |
| 0 | 3 | 80 | 8.3 | 11.7 | – | 2 |
| 0 | 4 | 78.3 | 18 | 3.33 | – | 4 |
| 1 | 1 | 73.3 | 8.3 | 18.3 | – | 1 |
| 1 | 2 | 90 | 0 | 10 | – | 1 |
| 1 | 3 | 86.7 | 1.7 | 11.7 | – | 2 |
| 1 | 4 | 83.3 | 3.3 | 13.3 | – | 2 |
| 2 | 1 | 68.3 | 3.3 | 28.3 | – | 1 |
| 2 | 2 | 73.3 | 0 | 26.7 | – | 1 |
| 2 | 3 | 85 | 1.7 | 13.3 | – | 1 |
| 2 | 4 | 86.7 | 1.7 | 11.7 | – | 1 |
The two parameters of stochastic subsumption are set to 300 and 3, respectively, and we focus on the influence of the parameters ε and M (the first two columns of the table), where ε controls the consistency requirement (ε = 0 enforcing perfect consistency) and M the generality of the hypotheses.
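To illustrate the role of ε, a hypothesis may be accepted as "consistent" as long as it misclassifies at most ε training examples; the sketch below is a hypothetical helper for intuition only, not STILL's actual code:

```python
def consistent(hypothesis, examples, eps):
    """Illustrative epsilon-relaxed consistency test (hypothetical helper,
    not taken from STILL): a hypothesis is accepted as long as it
    misclassifies at most eps of the training examples."""
    errors = sum(1 for x, label in examples
                 if hypothesis(x) is not None and hypothesis(x) != label)
    return errors <= eps

# Toy usage: a 'hypothesis' is any function mapping an example to a label.
examples = [(1, "active"), (2, "active"), (3, "inactive")]
always_active = lambda x: "active"
print(consistent(always_active, examples, eps=0))  # → False (one error)
print(consistent(always_active, examples, eps=1))  # → True
```

With ε = 0 any misclassified example rejects the hypothesis; larger ε trades consistency for robustness to noise.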
The table above summarizes the predictive accuracy of STILL on the test set, averaged over 25 independent selections of a 4-example test set distributed as the whole dataset. The third, fourth and fifth columns respectively give the percentage of correctly classified, unclassified and misclassified test examples. Column 6 gives the standard deviation of the predictive accuracy; column 7 gives the total computational time (induction plus classification of the test examples), in seconds on an HP-710 workstation.
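The repeated-holdout protocol above (25 random 4-example test sets whose class ratio mirrors the whole dataset) can be sketched as follows; the sampling routine is an assumed reconstruction, not the authors' code:

```python
import random

def stratified_test_split(pos, neg, n_pos, n_neg, seed=None):
    """Draw a small test set whose class ratio mirrors the full dataset
    (illustrative sketch of the protocol, not the original code)."""
    rng = random.Random(seed)
    test = rng.sample(pos, n_pos) + rng.sample(neg, n_neg)
    train = [x for x in pos + neg if x not in test]
    return train, test

# 42 compounds: 13 active, 29 inactive. A 4-example test set that keeps
# roughly the 13:29 ratio contains 1 active and 3 inactive examples.
active = list(range(13))
inactive = list(range(13, 42))
train, test = stratified_test_split(active, inactive, 1, 3, seed=0)
print(len(train), len(test))  # → 38 4
```

Repeating this draw 25 times and averaging the test accuracies yields the figures reported in the table.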
The results obtained for reasonable values of ε and M are satisfactory, and the computational cost is negligible. The only drawback is the high variance of the predictive accuracy.
| p | Average OK (%) | Range | Average variance | time (s) |
| 10 | 89.8 | [82.5, 95] | – | 1 |
| 20 | 88.8 | [82.5, 95] | – | 2 |
| 30 | 92.1 | [85, 97.5] | – | 4 |
| 40 | 93 | [87.5, 97.5] | – | 4 |
| 50 | 93.1 | [90, 97.5] | – | 5 |
| 60 | 93.5 | [90, 97.5] | – | 5 |
| 70 | 94.2 | [87.5, 97.5] | – | 8 |
| 80 | 93.4 | [87.5, 97.5] | – | 9 |
| 90 | 94.5 | [90, 97.5] | – | 10 |
| 100 | 95.1 | [92.5, 97.5] | – | 12 |
The two parameters of stochastic subsumption are set to 10 and 3, respectively, and we focus on the influence of the number of dimensions p.
The above table summarizes the predictive accuracy of DISTILL, with the following experimental setting. A run corresponds to a 10-fold cross-validation; column 2 gives the predictive accuracy averaged over 20 independent runs, and column 3 gives the range of variation of this accuracy. Column 4 gives the average variance of the cross-validation, and column 5 the computational time (induction of the hypotheses, mapping of all examples onto the p-dimensional space, and k-NN classification of the test examples), in seconds on an HP-710 workstation.
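Under the reading that each example is mapped onto a p-dimensional vector (one coordinate per induced hypothesis) and then classified by nearest neighbours, the classification step can be sketched as below; the distance and the value of k are assumptions, not DISTILL's actual settings:

```python
from collections import Counter

def knn_classify(test_vec, train_vecs, train_labels, k=3):
    """Plain k-NN over the p-dimensional representation (sketch only)."""
    # Squared Euclidean distance between two p-dimensional vectors.
    dist = lambda u, v: sum((a - b) ** 2 for a, b in zip(u, v))
    # Keep the k training examples closest to the test example.
    neighbours = sorted(zip(train_vecs, train_labels),
                       key=lambda pair: dist(pair[0], test_vec))[:k]
    # Majority vote among the neighbours' labels.
    return Counter(label for _, label in neighbours).most_common(1)[0][0]

# Toy p = 3 representation: coordinate i says whether hypothesis i covers the example.
train_vecs = [(1, 1, 0), (1, 0, 0), (0, 0, 1), (0, 1, 1)]
train_labels = ["active", "active", "inactive", "inactive"]
print(knn_classify((1, 1, 1), train_vecs, train_labels, k=3))  # → active
```

The p hypotheses thus act as features, and the induction step only has to produce a representation in which neighbouring examples share the same class.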
The results obtained for sufficiently large values of p
are satisfactory and degrade gracefully as p decreases;
the computational cost remains moderate. DISTILL obtains slightly
better and noticeably steadier performance than STILL, while
involving one parameter fewer.