An incremental mining algorithm for maintaining sequential patterns using pre-large sequences |
| |
Authors: | Tzung-Pei Hong, Ching-Yao Wang, Shian-Shyong Tseng |
| |
Affiliation: | 1. ISISE, Department of Civil Engineering, University of Coimbra, Coimbra, Portugal; 2. Graz University of Technology, Institute for Steel Structures, Graz, Austria; 1. Department of Clinical Neurophysiology, Georg-August University Medical Center Göttingen, Robert-Koch Str. 40, 37075 Göttingen, Germany; 2. Leibniz Research Center for Working Environment and Human Factors, Dortmund, Germany; 3. Department of Neurology, BG University Hospital Bergmannsheil, Ruhr-University Bochum, Bochum, Germany; 1. Faculty of Information Technology, Ho Chi Minh City University of Technology, Ho Chi Minh City, Vietnam; 2. College of Electronics and Information Engineering, Sejong University, Seoul, Republic of Korea; 3. Division of Data Science, Ton Duc Thang University, Ho Chi Minh City, Vietnam; 4. Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam; 5. Department of Electrical and Computer Engineering, University of Alberta, Edmonton, T6R 2V4 AB, Canada; 6. Department of Electrical and Computer Engineering, Faculty of Engineering, King Abdulaziz University, Jeddah, 21589, Saudi Arabia; 7. Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland |
| |
Abstract: | Mining useful information and knowledge from large databases has become an important research area in recent years. Among the kinds of knowledge that can be derived, sequential patterns in temporal transaction databases are especially important because they help model customer behavior. Earlier work usually assumed static databases in order to simplify the data-mining problem. In real-world applications, however, new transactions are added to databases frequently, so an efficient and effective mining algorithm that maintains sequential patterns as a database grows is needed. In this paper, we propose a novel incremental mining algorithm for maintaining sequential patterns, based on the concept of pre-large sequences, that reduces the need to rescan the original database. Pre-large sequences are defined by a lower support threshold and an upper support threshold; the band between the two thresholds acts as a buffer that prevents sequences from moving directly from large to small or vice versa. The proposed algorithm does not need to rescan the original database until the accumulated number of newly added customer sequences exceeds a safety bound, which depends on the database size. Thus, as the database grows, the number of new transactions allowed before a rescan is required also grows, and the proposed approach becomes increasingly efficient. |
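The two-threshold idea in the abstract can be illustrated with a small Python sketch. It is not the authors' algorithm: the function names are hypothetical, and the bound used here, (upper - lower) * d, is a conservative illustrative assumption rather than the safety bound derived in the paper. It rests on the assumption that each newly added transaction belongs to one customer and so raises a sequence's customer count by at most one.

```python
from math import floor


def classify(count: int, num_customers: int, lower: float, upper: float) -> str:
    """Label a sequence by its support ratio (count / number of customers)."""
    support = count / num_customers
    if support >= upper:
        return "large"
    if support >= lower:
        return "pre-large"
    return "small"


def illustrative_safety_bound(num_customers: int, lower: float, upper: float) -> int:
    """Assumed conservative bound, NOT the paper's formula.

    A sequence that is small in the original database has a count below
    lower * d. If each new transaction adds at most 1 to that count, then
    after at most (upper - lower) * d new transactions the count is still
    below upper * d, so the sequence cannot have become large.
    """
    return floor((upper - lower) * num_customers)


def needs_rescan(accumulated_new: int, num_customers: int,
                 lower: float, upper: float) -> bool:
    """True once the accumulated insertions exceed the illustrative bound."""
    return accumulated_new > illustrative_safety_bound(num_customers, lower, upper)


if __name__ == "__main__":
    # The bound scales with database size, matching the abstract's claim
    # that larger databases tolerate more insertions before a rescan.
    for d in (100, 1000, 10000):
        print(d, illustrative_safety_bound(d, lower=0.25, upper=0.5))
```

Because the bound is proportional to the number of customer sequences d, a tenfold larger database tolerates roughly ten times as many insertions before a rescan in this sketch, which is the scaling behavior the abstract describes.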
| |
Keywords: | |
This article is indexed in ScienceDirect and other databases. |
|