Why Data Mining?
- Data volume is too large for classical analysis regimes:
  - number of records too large (10^8 - 10^12 bytes)
  - high dimensional data (many DB fields: 10^2 - 10^4)
  - how do you explore millions of records, tens or hundreds of fields, and find patterns?
- Networking, increased opportunity for access
  - web navigation, on-line product catalogs, travel and services info, …
- End user is not a statistician
- Cannot afford not to analyze the data:
  - Business: competitive advantage, more effective decisions
  - Science: lost opportunity for knowledge, maximize return per byte of data collected
  - Personal: information overload, navigation and cataloging of the new digital universe
- “Interesting” queries are difficult to express declaratively (e.g. in SQL).
  - put effective analysis tools in hands of end-users: owners of the data.
