The subsequence composition of a string |
| |
Authors: | Alberto Apostolico, Fabio Cunial |
| |
Affiliation: | 1. Dipartimento di Ingegneria dell’Informazione, Università di Padova, Via Gradenigo 6/A, Padova, Italy; 2. College of Computing, Georgia Institute of Technology, 801 Atlantic Drive, Atlanta, GA 30318, USA |
| |
Abstract: | Words that appear as constrained subsequences in a text-string are considered as possible indicators of the host string structure, hence also as a possible means of sequence comparison and classification. The constraint consists of imposing a bound on the number ω of positions in the text that may intervene between any two consecutive characters of a subsequence. A subset of such ω-sequences is then characterized that consists, in intuitive terms, of sequences that could not be enriched with more characters without losing some occurrence in the text. A compact spatial representation is then proposed for these representative sequences, within which a number of parameters can be defined and measured. In the final part of the paper, such parameters are empirically analyzed on a small collection of text-strings endowed with various degrees of structure. |
| |
Keywords: | Constrained subsequences; Special subsequences; Suffix graph; Core equivalence classes; String complexity measures |
This article is indexed in ScienceDirect and other databases. |
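
The gap constraint described in the abstract can be illustrated with a minimal sketch. It assumes that a word occurs as an ω-sequence when its characters appear in order in the text with at most ω text positions strictly between any two consecutive matched characters; the paper's exact definition may differ in detail. The function names `omega_end_positions` and `occurs_as_omega_sequence` are hypothetical and are not taken from the paper, and this direct scan is not the suffix-graph representation the authors propose.

```python
def omega_end_positions(text, word, omega):
    """Return the sorted text positions at which some gap-bounded
    occurrence of `word` ends.

    Assumption: at most `omega` positions of `text` may lie strictly
    between two consecutive matched characters.
    """
    if not word:
        return []
    # Positions where the first character of the word matches.
    ends = {i for i, ch in enumerate(text) if ch == word[0]}
    for ch in word[1:]:
        extended = set()
        for q in ends:
            # The next match may sit at q+1 .. q+omega+1 (inclusive).
            for p in range(q + 1, min(q + omega + 2, len(text))):
                if text[p] == ch:
                    extended.add(p)
        ends = extended
    return sorted(ends)


def occurs_as_omega_sequence(text, word, omega):
    """True if `word` has at least one gap-bounded occurrence in `text`."""
    return bool(omega_end_positions(text, word, omega))
```

With `omega = 0` the check reduces to ordinary substring matching; larger values of `omega` admit progressively sparser occurrences, which is the sense in which ω bounds the intervening positions.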
|