首页 | 本学科首页   官方微博 | 高级检索  
     

变阶马尔科夫模型算法实现
引用本文:王兴,吴艺,林劼,卓一帆.变阶马尔科夫模型算法实现[J].计算机系统应用,2018,27(4):10-17.
作者姓名:王兴  吴艺  林劼  卓一帆
作者单位:中南大学 信息科学与工程学院, 长沙 410075;福建师范大学 数学与信息学院, 福州 350108,福建师范大学 数学与信息学院, 福州 350108,福建师范大学 数学与信息学院, 福州 350108,福建师范大学 数学与信息学院, 福州 350108
基金项目:国家自然科学基金(61472082);福建省自然科学基金(2014J01220)
摘    要:如何快速有效对历史数据进行统计建模和规律挖掘具有重要意义.鉴于模型在实际数据挖掘应用的局限及马尔科夫模型的良好统计特性,设计实现了基于后缀数组和后缀自动机的变阶马尔科夫模型.算法在后缀树形结构实现的基础上,引入后缀链,实现各状态子序列的快速跳转,能动态自适应计算不同阶长概率的需求.实验结果表明:相比传统马尔科夫模型,模型能在线性时间和空间复杂度内,构建历史数据的概率统计特征及各状态后缀子序列之间的链接关系,大大降低了存储空间和时间,能实现大规模数据的在线学习和应用.

关 键 词:马尔科夫模型  变阶马尔科夫模型  字典树  后缀数组  后缀自动机
收稿时间:2017/8/17 0:00:00
修稿时间:2017/9/15 0:00:00

Algorithm Implementation of Variable Order Markov Model
WANG Xing,WU Yi,LIN Jie and ZHUO Yi-Fan.Algorithm Implementation of Variable Order Markov Model[J].Computer Systems& Applications,2018,27(4):10-17.
Authors:WANG Xing  WU Yi  LIN Jie and ZHUO Yi-Fan
Affiliation:School of Information Science and Engineering, Central South University, Changsha 410075, China;College of Mathematics and Informatics, Fujian Normal University, Fuzhou 350108, China,College of Mathematics and Informatics, Fujian Normal University, Fuzhou 350108, China,College of Mathematics and Informatics, Fujian Normal University, Fuzhou 350108, China and College of Mathematics and Informatics, Fujian Normal University, Fuzhou 350108, China
Abstract:It is of great significance how to model and mine historical data quickly and effectively. Based on the statistical characteristics of Markov model, this study designs and implements a variable order Markov model based on suffix array and suffix automata, in view of the limitations of the model in practical data mining applications. Based on the realization of suffix tree structure, the suffix chain is introduced to realize the quick jump of each state subsequence, and the requirement of different order length probability can be dynamically and adaptively calculated. The experimental results show that compared with the traditional Markov model, the model constructs the link between suffix sequence characteristics of probability and statistics of historical data and the state in linear time and space complexity, which can greatly reduce the storage space and time, and realize online learning and application of large data.
Keywords:Markov model  variable Markov model  trie tree  suffix array  suffix automation
点击此处可从《计算机系统应用》浏览原始摘要信息
点击此处可从《计算机系统应用》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号