
Association Rules and Statistics

Han, J., & Kamber, M. (2001). Data mining: Concepts and techniques. San Francisco, CA: Morgan Kaufmann.

Hand, D., Mannila, H., & Smyth, P. (2001). Principles of data mining. Cambridge, MA: MIT Press.

Hayduk, L.A. (1987). Structural equation modelling with LISREL. Maryland: Johns Hopkins Press.

Jensen, D. (1992). Induction with randomization testing: Decision-oriented analysis of large data sets [doctoral thesis]. Washington University, Saint Louis, MO.

Kodratoff, Y. (2001). Rating the interest of rules induced from data and within texts. Proceedings of the 12th IEEE International Conference on Database and Expert Systems Applications (DEXA), Munich, Germany.

Megiddo, N., & Srikant, R. (1998). Discovering predictive association rules. Proceedings of the Conference on Knowledge Discovery in Data, New York.

Padmanabhan, B., & Tuzhilin, A. (2000). Small is beautiful: Discovering the minimal set of unexpected patterns. Proceedings of the Conference on Knowledge Discovery in Data, Boston, Massachusetts.

Piatetsky-Shapiro, G. (2000). Knowledge discovery in databases: 10 years after. Proceedings of the Conference on Knowledge Discovery in Data, Boston, Massachusetts.

Prum, B. (1996). Modèle linéaire: Comparaison de groupes et régression. Paris, France: INSERM.

Srikant, R. (2001). Association rules: Past, present, future. Proceedings of the Workshop on Concept Lattice-Based Theory, Methods and Tools for Knowledge Discovery in Databases, California.

Winer, B.J., Brown, D.R., & Michels, K.M. (1991). Statistical principles in experimental design. New York: McGraw-Hill.

Zaki, M.J. (2000). Generating non-redundant association rules. Proceedings of the Conference on Knowledge Discovery in Data, Boston, Massachusetts.

Zhu, H. (1998). On-line analytical mining of association rules [doctoral thesis]. Simon Fraser University, Burnaby, Canada.

KEY TERMS

Attribute-Oriented Induction: Association rules, classification rules, and characterization rules are written with attributes (i.e., variables). These rules are obtained from data by induction and not from theory by deduction.

Badly Structured Data: Data such as corpus texts or session logs often do not contain explicit variables. To extract association rules, it is necessary to create variables (e.g., keywords) after defining their values (frequency of occurrence in the corpus texts, or simply occurrence/non-occurrence).

Interaction: Two variables, A and B, are in interaction if their actions are not separate, i.e., if the effect of one depends on the value of the other.

Linear Model: A variable is fitted by a linear combination of other variables and of the interactions between them.

Pruning: The algorithms for extracting association rules are optimized for computational cost but not for other constraints. This is why results that do not satisfy special constraints have to be suppressed afterward.

Structural Equations: A system of several regression equations with numerous possibilities. For instance, the same variable can appear in different equations, and a latent variable (one not defined in the data) can be included.

Taxonomy: This belongs to clustering methods and is usually represented by a tree. It is often used in the classification of living organisms.

Tests of Regression Model: Regression models and analysis of variance models rely on numerous hypotheses, e.g., normal distribution of errors. These hypotheses make it possible to determine whether a coefficient of the regression equation can be considered null at a fixed level of significance.
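
As a rough illustration of the Badly Structured Data and Pruning entries, the sketch below creates binary keyword variables from a few short texts, computes support and confidence for rules between two keywords, and suppresses the rules that fall below chosen thresholds. The texts, the keyword list, the function names, and the threshold values are all hypothetical, and support and confidence thresholds merely stand in for the "special constraints" that the Pruning entry leaves unspecified.

```python
from itertools import permutations


def keyword_variables(texts, keywords):
    """Turn each text into binary variables: keyword present or absent."""
    rows = []
    for text in texts:
        words = set(text.lower().split())
        rows.append({k: (k in words) for k in keywords})
    return rows


def rules_with_measures(rows, keywords):
    """Compute support and confidence for every rule 'a -> b' between two keywords."""
    n = len(rows)
    rules = []
    for a, b in permutations(keywords, 2):
        count_a = sum(1 for r in rows if r[a])
        count_ab = sum(1 for r in rows if r[a] and r[b])
        if count_a == 0:
            continue  # confidence is undefined when the antecedent never occurs
        rules.append({
            "rule": (a, b),
            "support": count_ab / n,
            "confidence": count_ab / count_a,
        })
    return rules


def prune(rules, min_support, min_confidence):
    """Suppress rules that do not satisfy the (assumed) threshold constraints."""
    return [
        r for r in rules
        if r["support"] >= min_support and r["confidence"] >= min_confidence
    ]


# Hypothetical mini-corpus and keyword list, for illustration only.
texts = [
    "data mining finds rules",
    "mining rules from data",
    "statistics tests a model",
    "data and statistics",
]
keywords = ["data", "mining", "statistics"]

rows = keyword_variables(texts, keywords)
all_rules = rules_with_measures(rows, keywords)
kept = prune(all_rules, min_support=0.5, min_confidence=0.6)

for r in kept:
    print(r["rule"], r["support"], r["confidence"])
```

Frequency-valued variables, the other option mentioned in the Badly Structured Data entry, would replace the presence test with a word count and would require choosing how to discretize the counts before rules are extracted.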

This work was previously published in Encyclopedia of Data Warehousing and Mining, edited by J. Wang, pp. 74-77, copyright 2005 by
IGI Publishing, formerly known as Idea Group Publishing (an imprint of IGI Global).