Efficient incremental mining of contrast patterns in changing data |
| |
Authors: | James Bailey, Elsa Loekito |
| |
Affiliation: | Department of Computer Science and Software Engineering, The University of Melbourne, Australia |
| |
Abstract: | A contrast pattern is a set of items (itemset) whose frequency differs significantly between two classes of data. Such patterns describe distinguishing characteristics between datasets, are meaningful to human experts, have strong discriminating ability, and can be used to build powerful classifiers. Mining such patterns incrementally is important for evolving datasets, where transactions can be inserted or deleted and mining must be repeated after each change. When the change is small, mining from scratch is undesirable. Instead, the previously mined contrast patterns should be reused where possible to compute the new patterns. A primary example of evolving data is a data stream, in which the data is a sequence of continuously arriving transactions (or itemsets). In this paper, we propose an efficient technique for incrementally mining contrast patterns. In particular, our algorithm aims to avoid the redundant computation that can arise when transactions are inserted and deleted simultaneously, as is the case for data streams. In an experimental study using real and synthetic data streams, we show that our algorithm can be substantially faster than the previous approach. |
| |
Keywords: | Data mining; Contrast patterns; Databases |
This article is indexed in ScienceDirect and other databases. |
|
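
The abstract's definition of a contrast pattern, and the idea of updating results as transactions are inserted and deleted, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it is a naive counter that enumerates all small itemsets of each transaction, and the class name `ContrastCounter`, the labels `"pos"` and `"neg"`, the `max_len` cap, and the support-difference threshold `min_diff` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


class ContrastCounter:
    """Naive incremental support counts for two classes (illustrative only)."""

    def __init__(self, max_len=2):
        # Hypothetical cap on itemset size, to keep the enumeration small.
        self.max_len = max_len
        self.counts = {"pos": Counter(), "neg": Counter()}
        self.sizes = {"pos": 0, "neg": 0}

    def _itemsets(self, transaction):
        # Enumerate every itemset of size 1..max_len contained in the transaction.
        items = sorted(set(transaction))
        for k in range(1, self.max_len + 1):
            for combo in combinations(items, k):
                yield frozenset(combo)

    def insert(self, label, transaction):
        # Only the itemsets of the new transaction are touched.
        self.sizes[label] += 1
        for itemset in self._itemsets(transaction):
            self.counts[label][itemset] += 1

    def delete(self, label, transaction):
        # Undo the contribution of a transaction that was inserted earlier.
        self.sizes[label] -= 1
        for itemset in self._itemsets(transaction):
            self.counts[label][itemset] -= 1

    def support(self, label, itemset):
        # Fraction of the class's transactions that contain the itemset.
        n = self.sizes[label]
        return self.counts[label][frozenset(itemset)] / n if n else 0.0

    def contrasts(self, min_diff):
        # Itemsets whose support differs between the classes by at least min_diff.
        seen = set(self.counts["pos"]) | set(self.counts["neg"])
        return {
            s for s in seen
            if abs(self.support("pos", s) - self.support("neg", s)) >= min_diff
        }
```

A sliding window over a stream would call `delete` for the transaction leaving the window and `insert` for the one arriving, so only the counts of itemsets in those two transactions change. The sketch does not attempt to avoid the redundant work that the abstract says can arise when an insertion and a deletion affect the same patterns.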