DBC: a condensed representation of frequent patterns for efficient mining
Authors: Artur Bykowski, Christophe Rigotti
Affiliation: Laboratoire d'Ingénierie des Systèmes d'Information, INSA Lyon, Bâtiment Blaise Pascal, F-69621, Villeurbanne Cedex, France

Abstract: Given a large set of data, a common data mining problem is to extract the frequent patterns occurring in this set. The idea presented in this paper is to extract a condensed representation of the frequent patterns, called disjunction-bordered condensation (DBC), instead of extracting the whole frequent pattern collection. We show that this condensed representation can be used to regenerate all frequent patterns and their exact frequencies. Moreover, this regeneration can be performed without any access to the original data. Practical experiments show that the DBC can be extracted very efficiently even in difficult cases, and that this extraction followed by the regeneration of the frequent patterns is much more efficient than the direct extraction of the frequent patterns themselves. We compared the DBC with another representation of frequent patterns previously investigated in the literature, called frequent closed sets. In nearly all the experiments we ran, the DBC was extracted much more efficiently than the frequent closed sets. In the other cases, the extraction times were very close.
Keywords: Data mining; Frequent patterns; Condensed representations
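
The abstract does not spell out how frequencies are regenerated without the data. As a rough illustration only, the sketch below assumes the mechanism suggested by the name "disjunction-bordered": if every transaction containing an itemset X also contains item a or item b, then the frequency of X ∪ {a, b} follows from three smaller frequencies by inclusion-exclusion. The toy data and the helper names (`support`, `rule_holds`, `derived_support`) are hypothetical and are not taken from the paper.

```python
# Toy transaction database (hypothetical data, for illustration only).
transactions = [
    {"x", "a"},
    {"x", "b"},
    {"x", "a", "b"},
    {"a", "b"},
    {"x", "a"},
]

def support(itemset, db):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)

def rule_holds(base, a, b, db):
    """True if every transaction containing `base` also contains a or b."""
    return all(a in t or b in t for t in db if base <= t)

def derived_support(base, a, b, db):
    """Infer support(base ∪ {a, b}) by inclusion-exclusion.

    Valid only when rule_holds(base, a, b, db) is True. Here the three
    smaller supports are recounted from `db` for brevity; a condensed
    representation would instead look them up from stored values.
    """
    return support(base | {a}, db) + support(base | {b}, db) - support(base, db)

base = frozenset({"x"})
holds = rule_holds(base, "a", "b", transactions)
inferred = derived_support(base, "a", "b", transactions)
direct = support(base | {"a", "b"}, transactions)
print(holds, inferred, direct)
```

On this toy data the inferred count matches the count obtained by scanning the transactions directly, which is the property that would let a larger pattern's frequency be omitted from a condensed collection under this assumption.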
This article is indexed in ScienceDirect and other databases.