KDD - what's new

Research Challenges for Data Mining (1)

efficient and sufficient sampling schemes
in-memory vs. disk-based data processing
choice of the right subset of techniques to span most tasks
interfaces to large warehouses, use of metadata to optimize access
client-server issues, where to perform the processing (where and when to mine)
exploiting parallelism, distributed computing over a network of workstations
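
Illustration for the first two items (not a scheme named on this slide): reservoir sampling draws a uniform random sample of fixed size k in a single pass, so the full data set can stay on disk while only k rows are held in memory. A minimal sketch, assuming the rows arrive as any Python iterable; the function name `reservoir_sample` and its parameters are hypothetical, chosen for this example.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(rows: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Return a uniform random sample of up to k rows in one pass (Algorithm R)."""
    rng = random.Random(seed)  # private generator so a seed gives repeatable runs
    reservoir: List[T] = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Keep the new row with probability k / (i + 1),
            # replacing a uniformly chosen existing entry.
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = row
    return reservoir


# Usage: the input can be a lazy stream, e.g. lines read from a large file,
# so memory use stays proportional to k rather than to the data size.
sample = reservoir_sample(range(1_000_000), k=100, seed=42)
```

Whether a sample of a given size is "sufficient" for a particular mining task is a separate question that this sketch does not address.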

General systems challenge: what will a system that enables exploration, visualization, and analysis over large databases look like?