A Multi-repeat Pattern Matching Algorithm Based on Suffix Arrays |
| |
Cite this article: | ZHANG Li-Xiang, WANG Su-Yi. A Multi-repeat Pattern Matching Algorithm Based on Suffix Arrays[J]. 佳木斯工学院学报, 2010(5): 721-724, 727. |
| |
Authors: | ZHANG Li-Xiang WANG Su-Yi |
| |
Affiliations: | [1] Pingliang Medical College, Pingliang, Gansu 744000 [2] Hebei University, Baoding, Hebei 071000 |
| |
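Abstract: | Searching for multi-repeat patterns in massive amounts of information takes O(n^2) time with general-purpose search methods. To improve the efficiency of pattern search, the algorithm Epattern searcher H is proposed. It is implemented with the suffix array, a data structure that saves space, and its design also draws on the idea of filtering algorithms, which improves its running speed. The algorithm was tested on finding high-frequency words in English novels, and the experimental results show a time complexity of O(n). |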
Keywords: | Hamming distance; pattern matching; filtration; suffix array |
A Multi-repeat Pattern Matching Algorithm Based on Suffix Arrays |
| |
Authors: | ZHANG Li-Xiang WANG Su-Yi |
| |
Affiliation: | 1. Pingliang Medical College, Pingliang 744000, China; 2. Hebei University, Baoding 071000, China |
| |
Abstract: | Finding multi-repeat patterns in vast amounts of information takes O(n^2) time with ordinary string-matching algorithms, so more efficient pattern-matching algorithms are needed. Here, a fast pattern-matching algorithm based on suffix arrays is proposed. When used to find high-frequency words in English text, it runs in O(n) time. |
| |
Keywords: | Hamming distance; pattern matching; filtration; suffix arrays |
This article is indexed in 维普 (VIP) and other databases. |
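The abstract names suffix arrays, repeated patterns, and Hamming distance but does not give the algorithm itself. As a rough illustration only, the Python sketch below shows how a suffix array can locate every occurrence of a pattern and list substrings that repeat. It is not the authors' algorithm: the function names (build_suffix_array, find_occurrences, repeated_substrings, hamming) are hypothetical, the construction is a naive sort rather than a linear-time method, and the filtering step mentioned in the abstract is not reproduced.

```python
from bisect import bisect_left, bisect_right
from itertools import groupby


def build_suffix_array(text):
    """Return suffix start positions sorted by suffix (naive, not linear time)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return all start positions of `pattern`, found by binary search on `sa`."""
    m = len(pattern)
    # Truncating sorted suffixes to m characters keeps them in sorted order.
    # Materializing the list costs O(n*m) memory; done here for clarity only.
    prefixes = [text[i:i + m] for i in sa]
    lo = bisect_left(prefixes, pattern)
    hi = bisect_right(prefixes, pattern)
    return sorted(sa[lo:hi])


def repeated_substrings(text, sa, k, min_count=2):
    """Map each length-k substring occurring at least `min_count` times to its count."""
    # Suffixes that share a length-k prefix sit next to each other in `sa`.
    grams = (text[i:i + k] for i in sa if i + k <= len(text))
    result = {}
    for gram, group in groupby(grams):
        count = sum(1 for _ in group)
        if count >= min_count:
            result[gram] = count
    return result


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


text = "banana"
sa = build_suffix_array(text)
print(sa)
print(find_occurrences(text, sa, "ana"))
print(repeated_substrings(text, sa, 2))
```

Because all suffixes beginning with the same prefix are contiguous in the suffix array, both the lookup and the repeat count reduce to finding one contiguous range, which is the property that makes the structure attractive for repeated-pattern search.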
|