Data Mining & Knowledge Discovery Task
 
Given: 
  • A large database of observations, transactions, cases, examples, ...
      - too large for in-memory algorithms (more than 100,000 to 1,000,000 entries)
      - stored in a database management system (RDBMS or other)
      - with real-life imperfections (incorrect, missing, redundant, skewed, or useless data)
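
As a rough illustration of the setting described above, the following minimal sketch scans a table in fixed-size chunks rather than loading it into memory, and counts two of the listed imperfections (missing values and redundant rows). SQLite stands in for the database management system here; the function name profile_table, the table name transactions, and its columns are hypothetical and chosen only for this example.

```python
import sqlite3


def profile_table(conn, table, columns, chunk_size=10_000):
    """Scan a table chunk by chunk, counting missing values and duplicate rows.

    Assumption: `table` and `columns` are trusted identifiers, since they are
    interpolated directly into the SQL text.
    """
    cols = ", ".join(columns)
    cur = conn.cursor()
    cur.execute(f"SELECT {cols} FROM {table}")

    missing = {c: 0 for c in columns}
    total = 0
    while True:
        # Only chunk_size rows are held in memory at any time.
        rows = cur.fetchmany(chunk_size)
        if not rows:
            break
        for row in rows:
            total += 1
            for col, value in zip(columns, row):
                if value is None:
                    missing[col] += 1

    # Redundancy is counted inside the database, not in Python memory.
    cur.execute(f"SELECT COUNT(*) FROM (SELECT DISTINCT {cols} FROM {table})")
    distinct = cur.fetchone()[0]
    return {"rows": total, "missing": missing, "duplicates": total - distinct}


# Hypothetical toy data: one missing amount, one missing customer, one repeated row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [("a", 10.0), ("b", None), ("a", 10.0), (None, 5.0)],
)
report = profile_table(conn, "transactions", ["customer", "amount"], chunk_size=2)
print(report)
```

The chunk size is deliberately tiny in the example so that the loop runs more than once; with a real database it would be set to whatever fits comfortably in memory.
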
Find:
  • "Knowledge" (patterns, regularities, rules, classifications) that is
      - valid
      - useful, interesting, novel, and understandable