
New algorithms for finding approximate frequent item sets

Authors:	Christian Borgelt, Christian Braune, Tobias Kötter, Sonja Grün

Affiliation:	1. European Centre for Soft Computing, c/ Gonzalo Gutiérrez Quirós s/n, 33600, Mieres (Asturias), Spain 2. Department of Computer Science, Otto-von-Guericke-University of Magdeburg, Universitätsplatz 2, 39106, Magdeburg, Germany 3. Department of Computer Science, University of Konstanz, Box 712, 78457, Constance, Germany 4. RIKEN Brain Science Institute, Wako-Shi, Saitama, 351-0198, Japan 5. Institute of Neuroscience and Medicine (INM-6), Research Center Jülich, Jülich, Germany

Abstract:	In standard frequent item set mining a transaction supports an item set only if all items in the set are present. However, in many cases this is too strict a requirement that can render it impossible to find certain relevant groups of items. By relaxing the support definition, allowing for some items of a given set to be missing from a transaction, this drawback can be amended. The resulting item sets have been called approximate, fault-tolerant or fuzzy item sets. In this paper we present two new algorithms to find such item sets: the first is an extension of item set mining based on cover similarities and computes and evaluates the subset size occurrence distribution with a scheme that is related to the Eclat algorithm. The second employs a clustering-like approach, in which the distances are derived from the item covers with distance measures for sets or binary vectors and which is initialized with a one-dimensional Sammon projection of the distance matrix. We demonstrate the benefits of our algorithms by applying them to a concept detection task on the 2008/2009 Wikipedia Selection for schools and to the neurobiological task of detecting neuron ensembles in (simulated) parallel spike trains.
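The relaxed support definition and the cover-based distances mentioned in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it neither computes the subset size occurrence distribution nor performs the clustering step. The function names, the `max_missing` parameter, the toy transactions, and the choice of the Jaccard distance as one possible set distance are all assumptions made for illustration.

```python
from typing import FrozenSet, List, Set


def exact_support(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> int:
    """Standard support: a transaction counts only if it contains every item."""
    return sum(1 for t in transactions if itemset <= t)


def approximate_support(
    itemset: FrozenSet[str],
    transactions: List[FrozenSet[str]],
    max_missing: int = 1,  # hypothetical tolerance parameter
) -> int:
    """Relaxed support: a transaction counts if at most `max_missing` items are absent."""
    return sum(1 for t in transactions if len(itemset - t) <= max_missing)


def cover(item: str, transactions: List[FrozenSet[str]]) -> Set[int]:
    """Cover of an item: indices of the transactions that contain it."""
    return {i for i, t in enumerate(transactions) if item in t}


def jaccard_distance(cover_a: Set[int], cover_b: Set[int]) -> float:
    """One possible set distance between two item covers (assumed, not from the paper)."""
    union = cover_a | cover_b
    if not union:
        return 0.0
    return 1.0 - len(cover_a & cover_b) / len(union)


# Toy data, invented for this example.
transactions = [
    frozenset("abc"),
    frozenset("ab"),
    frozenset("ac"),
    frozenset("bc"),
    frozenset("d"),
]
itemset = frozenset("abc")

print(exact_support(itemset, transactions))
print(approximate_support(itemset, transactions, max_missing=1))
print(jaccard_distance(cover("a", transactions), cover("b", transactions)))
```

On this toy data only one transaction contains all of a, b and c, yet three further transactions miss just one item, so the relaxed definition surfaces a group that the strict definition would largely overlook.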

This article is indexed in SpringerLink and other databases.
