WARMR (KUL)

System WARMR
Version 2.2
Code MasterProLog 4.1 executable on SunSparcstations
References [2, 3, 4, 5]
Pointers http://www.cs.kuleuven.ac.be/~ldh/#software

WARMR extends APRIORI [1] to discover frequent queries in data. WARMR looks at a level of the query lattice at a time, starting from the most general patterns, and iterates between candidate generation and candidate evaluation phases: in candidate generation, the lattice structure is used for pruning nonfrequent patterns from the next level; in the candidate evaluation phase, frequencies of candidates are computed with respect to the database. Pruning is based on monotonicity of generality with respect to frequency: if a pattern is not frequent then none of its specializations is frequent. So while generating candidates for the next level, all the patterns that are specializations of infrequent patterns can be pruned.

The new release of WARMR includes the simplified language bias formalism WARMODE, and has some additional parameters for handling huge datasets. WARMR extends APRIORI [1] to mine Association Rules in Multiple Relations (ARMR's ). The application of algorithms for efficiently generating association rules is so far restricted to cases where information is put together in a single relation. In W ARMR this restriction is overcome through the combination of the available algorithms with standard techniques from the field of inductive logic programmmg.

WARMR communicates with an Oracle7TM database, which is virtually partitioned into disjoint subsets of tuples with identical values for an identified key attribute. This key attribute is somewhat similar to notion of a clustering field as used in the database literature. The idea of partitioning the database also fits in the learning from interpretations paradigm (an interpretation here corresponds to a partition), introduced by [3] and related to other inductive logic programming settings in [2]. From a practical point of view a significant speed up can be obtained if the partioning scheme creates examples that fit in main memory. To minimize the number of database accesses, WARMR exploits and crucially depends on the fact that an example can be singled out from the database.

References

  1. R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, and A.I. Verkamo. Fast discovery of association rules. In U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, editors, Advances in Knowledge Discovery and Data Mining, pages 307-328. The MIT Press, 1996.

  2. L. Dehaspe and L. De Raedt. Mining association rules in multiple relations. In Proceedings of the tth International Workshop on Inductive Logic Programming, volume 1297 of Lecture Notes in Artificial Intelligence, pages 125-132. Springer-Verlag,1997.

  3. L. Dehaspe, H. Toivonen, and R.D. King. Finding frequent substructures in chemical compounds. In Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining (KDD-98), pages 30-36. AAAI-Press, 1998.

  4. L. Dehaspe and H. Toivonen. Discovery of frequent Datalog patterns. Data Mining and Knowledge Discovery, 3(1):7-36, 1999.

  5. Frequent pattern discovery in first-order logic. Ph.D. dissertation, Katholieke Universiteit Leuven, December 1998.


back to index