KDD - what's new

So why “Data Mining”? (vs. Pattern Recognition or Statistics?)

What if the data is disk-resident and we can afford to scan it only a few times?
What if “random” sampling is not sufficient, or is not even more efficient than a full scan? What about a distributed system?
Exploratory data analysis (EDA) with large, high-dimensional data sets
New algorithms are beginning to appear (some without proper statistical foundation)
What is possible/impossible with massive data sets?
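
As one illustration of the "few scans" constraint raised above, the sketch below draws a uniform random sample of k records in a single sequential pass, without knowing the data set size in advance (reservoir sampling, Algorithm R). It is not taken from these notes: the function name reservoir_sample and its parameters are hypothetical, and the technique is only one possible answer to the sampling question, valid when a single sequential scan is affordable.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int,
                     seed: Optional[int] = None) -> List[T]:
    """Return k items chosen uniformly at random from a stream, in one pass.

    Hypothetical helper for illustration. Only k items are held in memory,
    so the stream can be far larger than RAM (e.g. records read from disk).
    """
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the new item survives with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Stand-in for a disk-resident table: any iterable read once, front to back.
    sample = reservoir_sample(range(1_000_000), k=5, seed=42)
    print(sample)
```

The sample costs exactly one scan, which is why sampling is not automatically cheaper than scanning: if the records cannot be addressed at random on disk, obtaining an unbiased sample may require reading everything anyway.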