首页 | 本学科首页   官方微博 | 高级检索  
     


Approximate regular expression pattern matching with concave gap penalties
Authors:J. R. Knight  E. W. Myers
Affiliation:(1) Department of Computer Science, University of Arizona, Gould-Simpson Building 721, 85721 Tucson, AZ, USA
Abstract:Given a sequenceA of lengthM and a regular expressionR of lengthP, an approximate regular expression pattern-matching algorithm computes the score of the optimal alignment betweenA and one of the sequencesB exactly matched byR. An alignment between sequencesA=a1a2 ... aM andB=b1b2... bN is a list of ordered pairs, lang(i1,j1), (i2j2), ..., (it,jtt)rang such that ik < ik+1 and jk < jk+1. In this case the alignmentaligns symbols aik and bjk, and leaves blocks of unaligned symbols, orgaps, between them. A scoring schemeS associates costs for each aligned symbol pair and each gap. The alignment's score is the sum of the associated costs, and an optimal alignment is one of minimal score. There are a variety of schemes for scoring alignments. In a concave gap penalty scoring schemeS={delta, w}, a function delta(a, b) gives the score of each aligned pair of symbolsa andb, and aconcave function w(k) gives the score of a gap of lengthk. A function w is concave if and only if it has the property that, for allk > 1, w(k + 1) –w(k) lew(k) –w(k –1). In this paper we present an O(MP(logM + log2P)) algorithm for approximate regular expression matching for an arbitrarydelta and any concavew.This work was supported in part by the National Institute of Health under Grant RO1 LM04960.
Keywords:Pattern matching  Regular expressions  Concave gap penalties  Approximate matching
本文献已被 SpringerLink 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号