Contributing Disciplines
-
Databases. Indispensable if data mining is to run efficiently
on very large datasets (gigabytes to terabytes). Much data mining
work, however, is done on smaller datasets that fit in main memory (in-core).
-
Statistics. Basic algorithms for analyzing mostly numeric
data, such as regression or statistical clustering.
-
Machine Learning. Algorithms for analysis tasks that
involve non-numeric data, application background knowledge, or complex
hypotheses.
⇒ Statistics and Machine
Learning provide the "toolbox" of analysis methods used in a data
mining application, but are less concerned with data storage and data preparation
tasks (even though, in practice, every statistics and ML application
includes these steps!)
