System | WARMR |
Version | 2.2 |
Code | MasterProLog 4.1 executable on SunSparcstations |
References | [2, 3, 4, 5] |
Pointers | http://www.cs.kuleuven.ac.be/~ldh/#software |
WARMR extends APRIORI [1] to discover frequent queries in data. WARMR looks at a level of the query lattice at a time, starting from the most general patterns, and iterates between candidate generation and candidate evaluation phases: in candidate generation, the lattice structure is used for pruning nonfrequent patterns from the next level; in the candidate evaluation phase, frequencies of candidates are computed with respect to the database. Pruning is based on monotonicity of generality with respect to frequency: if a pattern is not frequent then none of its specializations is frequent. So while generating candidates for the next level, all the patterns that are specializations of infrequent patterns can be pruned.
The new release of WARMR includes the simplified language bias formalism WARMODE, and has some additional parameters for handling huge datasets. WARMR extends APRIORI [1] to mine Association Rules in Multiple Relations (ARMR's ). The application of algorithms for efficiently generating association rules is so far restricted to cases where information is put together in a single relation. In W ARMR this restriction is overcome through the combination of the available algorithms with standard techniques from the field of inductive logic programmmg.
WARMR communicates with an Oracle7TM database, which is virtually partitioned into disjoint subsets of tuples with identical values for an identified key attribute. This key attribute is somewhat similar to notion of a clustering field as used in the database literature. The idea of partitioning the database also fits in the learning from interpretations paradigm (an interpretation here corresponds to a partition), introduced by [3] and related to other inductive logic programming settings in [2]. From a practical point of view a significant speed up can be obtained if the partioning scheme creates examples that fit in main memory. To minimize the number of database accesses, WARMR exploits and crucially depends on the fact that an example can be singled out from the database.