Structure-activity rules for modulating transmembrane calcium movement (OxUni)

Application domain: Structure-activity rules for modulating transmembrane calcium movement
Further specification: Data set describing 36 molecules with atom/bond facts and two propositional features
Pointers: Contact Ashwin Srinivasan
Data complexity: 140 KB (ASCII)
Data format: Progol

The data

The compounds are a class of calcium-channel activators and have a template based on methyl 2,5-dimethyl-4-1H-pyrrole-3-carboxylate. Activity is measured as log(F), where F is the potency of the compound relative to an accepted standard calcium-channel activator. Initial experiments in the chemical literature used simple attributes to obtain structure-activity relationships (SARs). Later results showed that incorporating structural information using state-of-the-art CoMFA methodology can significantly improve activity predictions over the simple attributes alone. CoMFA is a very complex chemical/statistical process that involves aligning compounds to a 3-D grid, calculating many interaction energies, forming hundreds of new attributes, and resampling. It is therefore of interest to see whether structural descriptions obtained using ILP can match or better these predictions.

The experiments

This dataset has been part of a set of experiments that examined the use of ILP-derived rules to help predict the numerical activity of a molecule. In these experiments, structural concepts found by ILP were translated into Boolean-valued attributes for multiple linear regression. For the calcium-channel dataset, a single structural feature obtained using the ILP program Progol was found to significantly (P = 0.01) improve the linear model predicting activity. This new feature flags the existence of a double-bonded oxygen with a partial charge of at least -0.252. While the precision of this number is an artifact of the molecular modeling package used, an examination of the positions of such atoms shows that they occur in the carboxylate group of the template. Two points are of interest: groups added to the template are seen to modify the properties of the template itself, and this region of the template is far from the region postulated to be important by CoMFA. The best regression equation found using CoMFA has an r^2 value of 0.86, which is not significantly different from the value of 0.84 obtained using the ILP attribute. However, the ILP model is considerably simpler, as the CoMFA model is a function of hundreds of attributes formed by estimating interaction energies using computational chemistry.
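
To make the feature-construction step concrete, the following is a minimal Python sketch of how a structural concept like the one described above could be turned into a 0/1 attribute for regression. It is not the Progol rule itself. The record layouts (`Atom`, `Bond`), the element code "o", the bond-order encoding, and the two toy molecules are assumptions made for illustration and are not taken from the data set.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    # Hypothetical layout for an atom fact; the real Progol facts may differ.
    mol: str
    atom_id: str
    element: str
    charge: float


class Bond(NamedTuple):
    # Hypothetical layout for a bond fact; order 2 is assumed to mean "double".
    mol: str
    a1: str
    a2: str
    order: int


CHARGE_THRESHOLD = -0.252  # threshold quoted in the text


def has_flagged_oxygen(mol, atoms, bonds):
    """True if `mol` has a double-bonded oxygen with charge >= threshold."""
    double_bonded = set()
    for b in bonds:
        if b.mol == mol and b.order == 2:
            double_bonded.update((b.a1, b.a2))
    return any(
        a.mol == mol
        and a.element == "o"
        and a.atom_id in double_bonded
        and a.charge >= CHARGE_THRESHOLD
        for a in atoms
    )


# Toy molecules (invented values, for illustration only).
atoms = [
    Atom("m1", "m1_1", "c", 0.30),
    Atom("m1", "m1_2", "o", -0.20),   # double-bonded, charge above threshold
    Atom("m2", "m2_1", "c", 0.35),
    Atom("m2", "m2_2", "o", -0.40),   # double-bonded, charge below threshold
    Atom("m2", "m2_3", "o", -0.10),   # charge above threshold, single-bonded
]
bonds = [
    Bond("m1", "m1_1", "m1_2", 2),
    Bond("m2", "m2_1", "m2_2", 2),
    Bond("m2", "m2_1", "m2_3", 1),
]

# One Boolean-valued column, encoded as 0/1, ready to join other regressors.
features = {m: int(has_flagged_oxygen(m, atoms, bonds)) for m in ("m1", "m2")}
print(features)
```

The resulting 0/1 column would then be added alongside the existing attributes as one more regressor in the linear model.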


  1. A. Srinivasan and R.D. King. Feature construction with inductive logic programming: A study of quantitative predictions of biological activity aided by structural attributes. In S. Muggleton, editor, Proceedings of the 6th International Workshop on Inductive Logic Programming, pages 352-367. Stockholm University, Royal Institute of Technology, 1996.
