首页 | 本学科首页   官方微博 | 高级检索  
     

基于基本操作序列的编辑距离顺序验证
引用本文:张润梁,牛之贤.基于基本操作序列的编辑距离顺序验证[J].计算机科学,2016,43(Z6):51-54.
作者姓名:张润梁  牛之贤
作者单位:太原理工大学计算机科学与技术学院 太原030024,太原理工大学计算机科学与技术学院 太原030024
摘    要:两字符串的编辑距离是从一个串转换到另一个串所需要的最少基本操作数。编辑距离广泛应用于字符串近似匹配、字符串相似连接等领域。动态规划法利用编辑距离矩阵来计算两个串的编辑距离,需要计算矩阵中的所有元素,时间效率低。改进的方法改变了矩阵中元素的计算次序,减少了需要比对的元素,但仍需要比对一半以上的元素,时间效率还有待提高。提出基于基本操作序列的编辑距离顺序验证方法。首先,分析了基本操作序列的可列性,给出了列举基本操作序列的方法。然后依次顺序验证基本操作数从小到大的基本操作序列直到某一序列通过验证,得到其编辑距离。在阈值为2的字符串近似搜索实验中发现,所提方法比动态规划类方法具有更高的效率。

关 键 词:近似串搜索  编辑距离  顺序验证  基本操作序列

Sequential Verification Algorithm to Compute Edit Distance Based on Edit Operation Sequence
ZHANG Run-liang and NIU Zhi-xian.Sequential Verification Algorithm to Compute Edit Distance Based on Edit Operation Sequence[J].Computer Science,2016,43(Z6):51-54.
Authors:ZHANG Run-liang and NIU Zhi-xian
Affiliation:College of Computer Science and Technology,Taiyuan University of Technology,Taiyuan 030024 and College of Computer Science and Technology,Taiyuan University of Technology,Taiyuan 030024
Abstract:The edit distance between two strings is the minimum number of edit operations required to transform one into another.The edit distance is widely used in approximate string match,string similarity joins and etc.Dynamic programming algorithm(DPA) uses an edit distance matrix to compute the edit distance between two strings,which needs to compute all the elements in the matrix and has poor time efficiency.The progressive method changes the calculation orders of the elements to reduce the calculation numbers,which still needs to compute half of the elements and whose time efficiency needs to be improved.In our paper,we proposed a sequential verification algorithm to compute the edit distance based on the edit operation sequence.First,we analyzed the enumerable nature of edit operation sequence and gave a way to enumerate the sequences.Then,we got the result though sequentially verifying the edit operation sequences ordered by its edit operation numbers until the verification being successful.Experiments on approximate string search with threshold 2 show that compared with the DPA,our method achieves high performance.
Keywords:Approximate string search  Edit distance  Sequential verification  Edit operation sequence
点击此处可从《计算机科学》浏览原始摘要信息
点击此处可从《计算机科学》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号