The relational data model has simple and clear foundations on which significant
theoretical and systems research has flourished. By contrast, most research on data
mining has focused on algorithmic issues. A major open question is: ``What is an
appropriate foundation for data mining that can accommodate disparate mining
tasks?'' We address this problem by presenting a database model and an algebra for
data mining. The database model is based on the 3W-model introduced
by~\citeN{JLN00}. That model relied on black-box mining operators. A main
contribution of this paper is to open up these black boxes by using generic
operators in a data mining algebra. Two key operators in this algebra are
regionize, which creates regions (or models) from data tuples, and a restricted
form of looping called mining loop. We then study the resulting data mining algebra
and establish properties concerning its expressive power and complexity.
We present results in three directions: (1) expressiveness of the mining algebra;
(2) relations with alternative frameworks; and (3) interactions between regionize
and mining loop.