Applying the maximum utility measure in high utility sequential pattern mining
Affiliation: 1. Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan 701, Taiwan; 2. Department of Computer Science and Information Engineering, National University of Kaohsiung, Kaohsiung 811, Taiwan; 3. Department of Computer Science and Engineering, National Sun Yat-Sen University, Kaohsiung 804, Taiwan; 4. Department of Information Management, National University of Kaohsiung, Kaohsiung 811, Taiwan
Abstract: High utility sequential pattern mining has recently emerged as a popular topic because it considers the quantities, profits, and time order of items. In the existing approach, the utilities of subsequences in sequences are difficult to calculate because three kinds of utility calculation are involved. To simplify the calculation, this work presents a maximum utility measure, derived from the principle in traditional sequential pattern mining that a subsequence is counted only once per sequence. The maximum measure is thus used to simplify the utility calculation for subsequences during mining. In addition, an effective upper-bound model is designed to avoid information loss in mining, and an effective projection-based pruning strategy is designed to obtain more accurate sequence-utility upper bounds for subsequences. An indexing strategy is also developed to quickly find the relevant sequences for prefixes during mining, which reduces unnecessary search time. Finally, experimental results on several datasets show that the proposed approach performs well in both pruning effectiveness and execution efficiency.
Keywords: Data mining; High utility sequential pattern mining; Sequence utility; Upper bound; Projection
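
As a rough illustration of the maximum utility measure described in the abstract, the Python sketch below takes the utility of a subsequence in one sequence to be the largest total (quantity times unit profit) over all of its occurrences in that sequence, so each sequence contributes a single value. The data layout, the names `max_utility_in_sequence` and `database_utility`, and the choice to sum per-sequence maxima over the database are assumptions made for this example, not the paper's implementation.

```python
from typing import Dict, List, Set

# Hypothetical layout: a sequence is a time-ordered list of itemsets,
# and each itemset maps an item to its purchased quantity.
Sequence = List[Dict[str, int]]
Pattern = List[Set[str]]


def max_utility_in_sequence(pattern: Pattern, sequence: Sequence,
                            profits: Dict[str, float]) -> float:
    """Return the largest utility over all occurrences of `pattern` in
    `sequence`, or 0 if the pattern does not occur."""
    m = len(pattern)
    # best[j] = best utility of matching the first j pattern itemsets so far.
    best: List[float | None] = [0.0] + [None] * m
    for element in sequence:
        # Go from high j to low j so one element matches at most one itemset.
        for j in range(m, 0, -1):
            itemset = pattern[j - 1]
            if best[j - 1] is None:
                continue
            if not all(item in element for item in itemset):
                continue
            gain = sum(element[item] * profits[item] for item in itemset)
            candidate = best[j - 1] + gain
            if best[j] is None or candidate > best[j]:
                best[j] = candidate
    return best[m] if best[m] is not None else 0.0


def database_utility(pattern: Pattern, database: List[Sequence],
                     profits: Dict[str, float]) -> float:
    """Assumed aggregate: sum of the per-sequence maximum utilities."""
    return sum(max_utility_in_sequence(pattern, s, profits) for s in database)


profits = {"a": 2, "b": 5, "c": 1}
s1 = [{"a": 1, "b": 2}, {"a": 3}, {"b": 1, "c": 4}]
s2 = [{"a": 1}, {"b": 1}]
print(max_utility_in_sequence([{"a"}, {"b"}], s1, profits))
print(database_utility([{"a"}, {"b"}], [s1, s2], profits))
```

In `s1` the pattern `<a, b>` occurs twice, with utilities 7 and 11; only the larger one is kept, which mirrors the idea that a subsequence counts once per sequence.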
This article is indexed in ScienceDirect and other databases.