首页 | 本学科首页   官方微博 | 高级检索  
     


On Pattern Frequency Occurrences in a Markovian Sequence
Authors:M. Régnier  W. Szpankowski
Affiliation:(1) INRIA, Rocquencourt, 78153 Le Chesnay Cedex, France. Mireille.Regnier@inria.fr., FR;(2) Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA. spa@cs.purdue.edu., US
Abstract:Consider a given pattern H and a random text T generated by a Markovian source. We study the frequency of pattern occurrences in a random text when overlapping copies of the pattern are counted separately. We present exact and asymptotic formulae for moments (including the variance), and probability of r pattern occurrences for three different regions of r , namely: (i) r=O(1) , (ii) central limit regime, and (iii) large deviations regime. In order to derive these results, we first construct certain language expressions that characterize pattern occurrences which are later translated into generating functions. We then use analytical methods to extract asymptotic behaviors of the pattern frequency from the generating functions. These findings are of particular interest to molecular biology problems (e.g., finding patterns with unexpectedly high or low frequencies, and gene recognition), information theory (e.g., second-order properties of the relative frequency), and pattern matching algorithms (e.g., q -gram algorithms).
Keywords:. Frequency of pattern occurrences   Markov source   Autocorrelation polynomials   Languages   Generating functions   Asymptotic analysis   Large deviations.
本文献已被 SpringerLink 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号