Efficient single-pass frequent pattern mining using a prefix-tree |
| |
Authors: | Syed Khairuzzaman Tanbeer, Byeong-Soo Jeong, Young-Koo Lee |
| |
Affiliation: | Department of Computer Engineering, Kyung Hee University, 1 Seochun-dong, Kihung-gu, Youngin-si, Kyunggi-do 446-701, Republic of Korea |
| |
Abstract: | The FP-growth algorithm using the FP-tree has been widely studied for frequent pattern mining because it can dramatically improve performance compared to the candidate generation-and-test paradigm of Apriori. However, it still requires two database scans, which is inconsistent with efficient data stream processing. In this paper, we present a novel tree structure, called the CP-tree (compact pattern tree), that captures database information with one scan (insertion phase) and provides the same mining performance as the FP-growth method (restructuring phase). The CP-tree introduces the concept of dynamic tree restructuring to produce a highly compact frequency-descending tree structure at runtime. An efficient tree restructuring method, called the branch sorting method, which restructures a prefix-tree branch by branch, is also proposed in this paper. Moreover, the CP-tree provides full functionality for interactive and incremental mining. Extensive experimental results show that the CP-tree is efficient for frequent pattern mining, interactive mining, and incremental mining with a single database scan. |
| |
Keywords: | Data mining; Frequent pattern; Association rule; Incremental mining; Interactive mining |
This article is indexed in ScienceDirect and other databases. |
|
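The two phases named in the abstract (a single-scan insertion phase followed by a restructuring phase that yields a frequency-descending tree) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the names `Node`, `PrefixTree`, and `restructure` are hypothetical, lexicographic order is assumed as the initial insertion order, and the restructuring step simply rebuilds the tree path by path. It is not the paper's branch sorting method, which restructures the existing tree branch by branch.

```python
from collections import Counter


class Node:
    """One prefix-tree node: an item, a support count, and child nodes."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


class PrefixTree:
    """Hypothetical prefix-tree that also tracks per-item support counts."""

    def __init__(self):
        self.root = Node()
        self.item_counts = Counter()

    def insert(self, items, count=1, order=None):
        # Insertion phase: items are placed in a fixed order.
        # Assumption: lexicographic order unless a rank map is supplied.
        if order is None:
            ordered = sorted(set(items))
        else:
            ordered = sorted(set(items), key=lambda i: order[i])
        node = self.root
        for item in ordered:
            node = node.children.setdefault(item, Node(item))
            node.count += count
            self.item_counts[item] += count

    def paths(self):
        """Yield (items, count) for every transaction stored in the tree."""

        def walk(node, prefix):
            # Transactions ending at this node are those not passed to children.
            ends_here = node.count - sum(c.count for c in node.children.values())
            if node is not self.root and ends_here > 0:
                yield prefix, ends_here
            for child in node.children.values():
                yield from walk(child, prefix + [child.item])

        yield from walk(self.root, [])

    def node_count(self):
        """Number of nodes, excluding the root."""

        def size(node):
            return sum(1 + size(c) for c in node.children.values())

        return size(self.root)


def restructure(tree):
    """Rebuild the tree in frequency-descending item order.

    Simplification: this re-inserts every stored path into a new tree;
    it is NOT the branch sorting method proposed in the paper.
    """
    ranked = sorted(tree.item_counts, key=lambda i: (-tree.item_counts[i], i))
    order = {item: rank for rank, item in enumerate(ranked)}
    rebuilt = PrefixTree()
    for items, count in tree.paths():
        rebuilt.insert(items, count, order)
    return rebuilt


# Single scan over a toy database (insertion phase), then restructuring.
transactions = [["a", "b"], ["b", "c"], ["b", "d"], ["a", "b", "c"]]
tree = PrefixTree()
for t in transactions:
    tree.insert(t)
compact = restructure(tree)
```

In this toy database, item `b` occurs in every transaction, so placing it first lets all four transactions share one root branch; the rebuilt tree needs fewer nodes than the lexicographically ordered one while keeping the same item counts.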