Working Against a DB Directly
What to do if the data cannot fit in memory?
- Wide class of greedy methods driven purely by a small set of correlation counts (sufficient statistics)
- Problem: getting the desired counts using SQL results in many spurious (expensive) scans of the data
- One solution: redesign the way data is laid out on disk to bypass the problem
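
To make the scan problem concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table `t`, its binary columns `a`, `b`, `c`, and the helper names are all hypothetical, chosen only for illustration. It contrasts issuing one GROUP BY query per attribute pair (each query reads the table again) with streaming the rows once and updating every pairwise count along the way. This is not the disk-layout redesign mentioned above; it only shows why naive SQL counting repeats work.

```python
import sqlite3
from collections import Counter
from itertools import combinations

# Hypothetical toy table: three binary attributes a, b, c.
ROWS = [(1, 0, 1), (1, 1, 0), (0, 1, 1), (1, 1, 1), (0, 0, 0), (1, 0, 1)]
COLS = ["a", "b", "c"]


def make_db():
    """Build an in-memory database holding the toy table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", ROWS)
    return conn


def counts_via_sql(conn):
    """One GROUP BY query per attribute pair: each query rescans the table."""
    counts = {}
    for x, y in combinations(COLS, 2):
        # Column names come from the fixed COLS list, not from user input.
        query = f"SELECT {x}, {y}, COUNT(*) FROM t GROUP BY {x}, {y}"
        for vx, vy, n in conn.execute(query):
            counts[(x, vx, y, vy)] = n
    return counts


def counts_single_pass(conn):
    """Stream the rows once and update every pairwise count as we go."""
    counts = Counter()
    for row in conn.execute("SELECT a, b, c FROM t"):
        for (i, x), (j, y) in combinations(enumerate(COLS), 2):
            counts[(x, row[i], y, row[j])] += 1
    return dict(counts)
```

With k attributes the first approach issues k(k-1)/2 queries, while the second reads the data once; both produce the same pairwise counts on the toy table.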
