Notes on association rules
-
Assumes all data are categorical; there are no good
algorithms for numeric fields
-
Assumes the set of associations satisfying the
thresholds is SPARSE
-
Running time deteriorates to exponential in the number of items (in theory)
-
The set of associations can be very large; even
if it is retrieved efficiently, what to do next? Cluster? Visualize?
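
To make the exponential worst case above concrete: with n distinct items there are 2^n - 1 non-empty itemsets that could in principle be frequent. The sketch below is only an illustration; the function names and the three-item list are made up for the example.

```python
from itertools import combinations

def count_candidate_itemsets(n_items):
    """Number of non-empty itemsets over n_items distinct items."""
    return 2 ** n_items - 1

def enumerate_itemsets(items):
    """Yield every non-empty itemset; only feasible for tiny item lists."""
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)

# Hypothetical item list, chosen only for illustration.
items = ["beer", "diapers", "milk"]
all_sets = list(enumerate_itemsets(items))

print(len(all_sets), count_candidate_itemsets(len(items)))
print(count_candidate_itemsets(20))  # already over a million candidates
```

Brute-force enumeration like this is hopeless beyond a few dozen items, which is why the methods rely on only a sparse subset of itemsets meeting the thresholds.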
-
A fast, scalable method for finding the
frequent marginals
-
A convenient fast counting scheme.
-
An effective data reduction tool
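
One way to picture the counting scheme is a level-wise search: count single items, keep the frequent ones, then build larger candidates only from itemsets that survived the previous level. The sketch below assumes an Apriori-style procedure, which these notes do not name explicitly; the function name, the toy baskets, and the threshold are all illustrative.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets found in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count single items.
    counts = Counter(frozenset([item]) for t in transactions for item in t)
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)
    k = 2
    while current:
        prev = list(current)
        # Candidates of size k are unions of frequent (k-1)-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune any candidate that has an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        # One pass over the data counts all surviving candidates.
        counts = Counter(c for t in transactions for c in candidates if c <= t)
        current = {s: n for s, n in counts.items() if n >= min_support}
        result.update(current)
        k += 1
    return result

# Made-up baskets, for illustration only.
toy = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"milk", "diapers"},
    {"beer", "chips"},
]
freq = frequent_itemsets(toy, min_support=2)
for itemset, count in sorted(freq.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The pruning step is what keeps the search cheap when frequent itemsets are sparse: a candidate is never counted if any of its subsets already failed the threshold.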
-
They are used directly, sometimes inappropriately,
to visualize data:
-
e.g. during certain time periods, diapers
and beer are bought together
-
Danger: inferring a causal effect from what is
only co-occurrence
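
The usual rule measures can be computed on a toy example to show why they say nothing about causation. The baskets below are invented for illustration, and `rule_metrics` is a hypothetical helper, not a function from any library. Note that lift comes out the same for "diapers => beer" and "beer => diapers": it measures co-occurrence, not direction.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent => consequent."""
    n = len(transactions)
    n_ante = sum(1 for t in transactions if antecedent <= t)
    n_cons = sum(1 for t in transactions if consequent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_both / n
    confidence = n_both / n_ante if n_ante else 0.0
    lift = confidence / (n_cons / n) if n_cons else 0.0
    return support, confidence, lift

# Made-up baskets, for illustration only.
baskets = [
    {"diapers", "beer"},
    {"diapers", "beer"},
    {"diapers"},
    {"diapers", "milk"},
    {"milk"},
    {"bread"},
]

forward = rule_metrics(baskets, {"diapers"}, {"beer"})
backward = rule_metrics(baskets, {"beer"}, {"diapers"})
print("diapers => beer:", forward)
print("beer => diapers:", backward)
```

Confidence differs between the two directions, but neither number tells us whether buying one item leads to buying the other; a common cause, such as the time of the shopping trip, would produce the same counts.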
