Multi-Relational Data Mining (MRDM) is the multi-disciplinary field dealing with knowledge discovery from relational databases consisting of multiple tables. Mining data which consists of complex/structured objects also falls within the scope of this field, since the normalized representation of such objects in a relational database requires multiple tables. The field aims at integrating results from existing fields such as inductive logic programming, KDD, machine learning and relational databases; producing new techniques for mining multi-relational data; and applying such techniques in practice.
Typical data mining approaches look for patterns in a single relation of a database. For many applications, squeezing data from multiple relations into a single table requires much thought and effort and can lead to loss of information. An alternative for these applications is to use multi-relational data mining, which can analyze data from a multi-relational database directly, without the need to transfer the data into a single table first. Thus the relations mined can reside in a relational or deductive database. Multi-relational data mining can often also take into account background knowledge, which typically corresponds to views in the database.
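The information loss mentioned above can be made concrete with a minimal sketch. The two-table customer/order database and the function names below are hypothetical, invented purely for illustration; they are not taken from any MRDM system. The sketch shows the two usual ways of forcing a one-to-many relationship into a single table: a join, which repeats the parent row and distorts frequencies, and an aggregation, which discards the individual child tuples.

```python
from collections import defaultdict

# Hypothetical two-table database: customers and their orders (one-to-many).
customers = {1: {"region": "north"}, 2: {"region": "south"}}
orders = [
    {"customer_id": 1, "item": "book", "price": 12.0},
    {"customer_id": 1, "item": "laptop", "price": 900.0},
    {"customer_id": 1, "item": "pen", "price": 2.0},
    {"customer_id": 2, "item": "book", "price": 12.0},
]


def flatten_by_join(customers, orders):
    """Single table with one row per order; customer attributes are repeated."""
    return [
        {
            "customer_id": o["customer_id"],
            **customers[o["customer_id"]],
            "item": o["item"],
            "price": o["price"],
        }
        for o in orders
    ]


def flatten_by_aggregation(customers, orders):
    """Single table with one row per customer; orders reduced to count and total."""
    summary = defaultdict(lambda: {"n_orders": 0, "total": 0.0})
    for o in orders:
        s = summary[o["customer_id"]]
        s["n_orders"] += 1
        s["total"] += o["price"]
    return [
        {"customer_id": cid, **attrs, **summary[cid]}
        for cid, attrs in customers.items()
    ]


joined = flatten_by_join(customers, orders)
aggregated = flatten_by_aggregation(customers, orders)

# Join: customer 1 now occupies three rows, so "north" appears three times
# as often as "south" although there is exactly one customer per region.
north_rows = sum(1 for row in joined if row["region"] == "north")

# Aggregation: one row per customer again, but which items were bought
# can no longer be recovered from the table.
items_kept = any("item" in row for row in aggregated)
```

A single-table learner run on `joined` treats each order as an independent example, while one run on `aggregated` cannot express a pattern such as "customers who bought a laptop and a pen". A multi-relational method works on both tables as they are.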
Present MRDM approaches consider all of the main data mining tasks, including association analysis, classification, clustering, learning probabilistic models and regression. The pattern languages used by single-table data mining approaches for these data mining tasks have been extended to the multiple-table case. Relational pattern languages now include relational association rules, relational classification rules, relational decision trees, and probabilistic relational models, among others. MRDM algorithms have been developed to mine for patterns expressed in relational pattern languages. Typically, data mining algorithms have been upgraded from the single-table case: for example, distance-based algorithms for prediction and clustering have been upgraded by defining distance measures between examples/instances represented in relational logic.
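As a rough illustration of what a distance measure between relational instances might look like, the sketch below combines a mismatch rate over an instance's own attributes with a Jaccard distance over the set of tuples related to it. This is a deliberately simplified, assumed measure and not any specific published one; the instance layout (`attrs`, `related`), the function names and the `weight` parameter are all hypothetical.

```python
def attribute_distance(a, b):
    """Fraction of shared attributes on which the two instances differ."""
    keys = a.keys() & b.keys()
    if not keys:
        return 0.0
    return sum(a[k] != b[k] for k in keys) / len(keys)


def set_distance(xs, ys):
    """Jaccard distance between two sets of related tuples."""
    xs, ys = set(xs), set(ys)
    union = xs | ys
    if not union:
        return 0.0
    return 1.0 - len(xs & ys) / len(union)


def relational_distance(x, y, weight=0.5):
    """Weighted mix of attribute-level and related-set distance, in [0, 1]."""
    return (
        weight * attribute_distance(x["attrs"], y["attrs"])
        + (1.0 - weight) * set_distance(x["related"], y["related"])
    )


# Two hypothetical customers, each with own attributes and a set of
# purchased items drawn from a second relation.
a = {"attrs": {"region": "north", "segment": "retail"}, "related": {"book", "pen"}}
b = {"attrs": {"region": "north", "segment": "corporate"}, "related": {"book", "laptop"}}

d_ab = relational_distance(a, b)
d_aa = relational_distance(a, a)
```

Once such a measure is defined, a standard distance-based method (nearest-neighbour prediction, hierarchical clustering) can in principle be applied to relational instances without first flattening them.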
MRDM methods have been successfully applied across many application areas, ranging from the analysis of business data, through bioinformatics (including the analysis of complete genomes) and pharmacology (drug design) to Web mining (information extraction from text and Web sources).
The aim of the workshop is to bring together researchers and practitioners of data mining interested in methods for finding patterns in expressive languages from complex / multi-relational / structured data and their applications.
This workshop is the third of its kind. It follows the success of the first and second workshops on Multi-Relational Data Mining, held at SIGKDD 2002 and 2003, reports on which appear in SIGKDD Explorations [Vols 4(2) and 5(2)].
Further information on the workshops can be found at the MRDM-2002 and MRDM-2003 web sites.
Based on MRDM-2002, a special issue of SIGKDD Explorations [Vol 5(1)] was co-edited by Saso Dzeroski and Luc De Raedt.
Why is the topic of interest?
An increasing number of data mining applications involve the analysis of complex and structured types of data (such as sequences in genome analysis, HTML and XML documents) and require the use of expressive pattern languages. There is thus a clear need for multi-relational data mining (MRDM) techniques.
On the other hand, there is a wealth of recent work concerned with upgrading recently successful data mining approaches to relational logic. Kernel methods (support-vector machines) are a case in point: the development of kernels for structured and richer data types is a hot research topic. Another example is the development of probabilistic relational representations and methods for learning in them (e.g., probabilistic relational models, first-order Bayesian networks, stochastic logic programs, etc.).
Non-exclusive list of topics, listed in alphabetical order:
- Applications of (multi-)relational data mining
- Data mining problems that require (multi-)relational methods
- Distance-based methods for structured/relational data
- Inductive databases
- Kernel methods for structured/relational data
- Learning in probabilistic relational representations
- Link analysis and discovery
- Methods for (multi-)relational data mining
- Mining structured data, such as amino-acid sequences, chemical compounds, HTML and XML documents, ...
- Propositionalization methods for transforming (multi-)relational data mining problems to single-table data mining problems
- Relational neural networks
- Relational pattern languages
The interest of the KDD community in MRDM has increased sharply over the last few years. Evidence for this includes the success of the MRDM-2002 and MRDM-2003 workshops, as well as the MRDM tutorial at KDD-2003 (given by Saso Dzeroski and Luc De Raedt), all of which attracted many participants. To illustrate the interest in MRDM, the MRDM-2003 proceedings were downloaded 1014 times in the period 1 FEB to 10 MAR 2004.
Contact information of organizers
Saso Dzeroski
Jozef Stefan Institute, Jamova 39, SI-1000 Ljubljana, Slovenia.
phone: +386 1 477 3217, fax: +386 1 425 1083
Katholieke Universiteit Leuven, Department of Computer Science
Celestijnenlaan 200A, B-3001 Heverlee, Belgium
Program Committee Members
- Jean-Francois Boulicaut (University of Lyon)
- Diane Cook (University of Texas at Arlington)
- Luc Dehaspe (PharmaDM)
- Pedro Domingos (University of Washington)
- Peter Flach (University of Bristol)
- David Jensen (University of Massachusetts at Amherst)
- Kristian Kersting (Albert-Ludwigs-Universitaet Freiburg)
- Joerg-Uwe Kietz (kdlabs AG, Zurich)
- Ross King (University of Aberystwyth)
- Stefan Kramer (Technical University Munich)
- Nada Lavrac (Jozef Stefan Institute)
- Donato Malerba (University of Bari)
- Stan Matwin (University of Ottawa)
- Hiroshi Motoda (University of Osaka)
- David Page (University of Wisconsin at Madison)
- Alexandrin Popescul (University of Pennsylvania)
- Foster Provost (Stern School of Business, New York University)
- Celine Rouveirol (University Paris Sud XI)
- Michele Sebag (University Paris Sud XI)
- Arno Siebes (Universiteit Utrecht)
- Ashwin Srinivasan (IBM India)
- Takashi Washio (University of Osaka)
- Stefan Wrobel (Fraunhofer Institute for Autonomous Intelligent Systems, Sankt Augustin / University of Bonn)
Extended deadline for submissions: June 15, 2004
Notification: June 29, 2004
Camera ready: July 8, 2004
Workshop day: August 22, 2004
Available for download!