On Mining Summaries by Objective Measures of Interestingness |
| |
Authors: | Naim Zbidi, Sami Faiz, Mohamed Limam |
| |
Affiliation: | (1) Institut Supérieur de Gestion de Tunis, 41, Rue de la Liberté-Cité, Bouchoucha, 2000, Tunis, Le Bardo, Tunisia; (2) Institut National des Sciences Appliquées et de Technologie, Boulevard de la Terre, Tunis, Cedex, BP. 676-1080, Tunisia |
| |
Abstract: | Knowledge discovery in databases aims to extract useful and understandable knowledge from large databases. The knowledge
discovery process consists of two steps: the data mining step and the evaluation step. This paper studies one part of the second
step, evaluating and ranking the interestingness of summaries generated from databases, using diversity measures. Sixteen
previously analyzed diversity measures of interestingness are used along with three measures not previously considered, drawn
from different well-known areas. The latter three measures are evaluated theoretically against five principles that a measure
must satisfy to qualify as acceptable for ranking summaries. A theoretical correlation study of the eight measures that satisfy
all five principles is presented, based on mathematical proofs. An empirical evaluation is then conducted using three real
databases, and a classification of the eight measures is deduced. The resulting classification is used to reduce the number of
measures to only two, which are the best over all criteria and produce dissimilar results. This helps users interpret the most
important discovered knowledge in their decision-making process. |
| |
Keywords: | data mining; diversity measures; association rules |
This article is indexed in SpringerLink and other databases. |
|