首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 0 毫秒
1.
One of the particularities of information encoded as DNA strands is that a string u contains basically the same information as its Watson-Crick complement, denoted here as θ(u). Thus, any expression consisting of repetitions of u and θ(u) can be considered in some sense periodic. In this paper, we give a generalization of Lyndon and Schützenberger’s classical result about equations of the form ul=vnwm, to cases where both sides involve repetitions of words as well as their complements. Our main results show that, for such extended equations, if l?5,n,m?3, then all three words involved can be expressed in terms of a common word t and its complement θ(t). Moreover, if l?5, then n=m=3 is an optimal bound. These results are established based on a complete characterization of all possible overlaps between two expressions that involve only some word u and its complement θ(u), which is also obtained in this paper.  相似文献   

2.
In the field of combinatorics on words, it is known that {u, v}, the language with two words is a code if and only if . But up to now general characterization for codes consisting of any three words has not been found. In 1998, Fan, Shyr and Yu provided a characterization for codes consisting of three d-primitive words. In this paper, we give characterizations for a code with three words in which one of the words is d-primitive.Received: 7 June 2004, Published online: 29 October 2004  相似文献   

3.
Using a combinatorial characterization of digital convexity based on words, one defines the language of convex words. The complement of this language forms an ideal whose minimal elements, with respect to the factorial ordering, appear to have a particular combinatorial structure very close to the Christoffel words. In this paper, those words are completely characterized as those of the form uwkv where k≥1, w=uv and u,v,w are Christoffel words. Also, by considering the most balanced among the unbalanced words, we obtain a second characterization for a special class of minimal non-convex words that are of the form u2v2 corresponding to the case k=1 in the previous form.  相似文献   

4.
Well-known results on the avoidance of large squares in (full) words include the following: (1) Fraenkel and Simpson showed that we can construct an infinite binary word containing at most three distinct squares; (2) Entringer, Jackson and Schatz showed that there exists an infinite binary word avoiding all squares of the form xx such that |x|≥3, and that the bound 3 is optimal; (3) Dekking showed that there exists an infinite cube-free binary word that avoids all squares xx with |x|≥4, and that the bound of 4 is best possible. In this paper, we investigate these avoidance results in the context of partial words, or sequences that may have some undefined symbols called holes. Here, a square has the form uv with u and v compatible, and consequently, such a square is compatible with a number of full words that are squares over the given alphabet. We show that (1) holds for partial words with at most two holes. We prove that (2) extends to partial words having infinitely many holes. Regarding (3), we show that there exist binary partial words with infinitely many holes that avoid cubes and have only eleven full word squares compatible with factors of it. Moreover, this number is optimal, and all such squares xx satisfy |x|≤4.  相似文献   

5.
It is known that the set of all primitive words and the set of all $d$ -primitive words are disjunctive languages. In this paper we prove that the set of all $p$ -primitive words is disjunctive. We also prove that the set of all primitive but not $p$ -primitive words, the set of all balanced but not $p$ -primitive words, and the set of all $d$ -primitive but not $p$ -primitive words are disjunctive languages.  相似文献   

6.
A frame is a square uu, where u is an unbordered word. Let F(n) denote the maximum number of distinct frames in a binary word of length n. We count this number for small values of n and show that F(n) is at most ⌊n/2⌋+8 for all n and greater than 7n/30−? for any positive ? and infinitely many n. We also show that Fibonacci words, which are known to contain plenty of distinct squares, have only a few frames. Moreover, by modifying the Thue-Morse word, we prove that the minimum number of occurrences of frames in a word of length n is ⌈n/2⌉−2.  相似文献   

7.
In this paper, I address the problem wherein the same English word permits one of its complement positions to be satisfied by phrases of different categories. A well-known example of such an English word is the copula to be, whose complements include adjective phrases, noun phrases, prepositional phrases and adverbial phrases. I provide a way to treat such words, in particular verbs, as single lexical items through a conservative extension of the usual treatment of word classification as a pair comprising a part of speech category and a complement list. I then show how a further conservative extension of complement lists permits a satisfactory formalization of doubly complemented English verbs which are synonymous under a permutation of their complements. These verbs include, but are not limited to, so-called double object constructions.  相似文献   

8.
We study repetitions in infinite words coding exchange of three intervals with permutation (3, 2, 1), called 3iet words. The language of such words is determined by two parameters, ε,?. We show that finiteness of the index of 3iet words is equivalent to boundedness of the coefficients of the continued fraction of ε. In this case, we also give an upper and a lower estimate on the index of the corresponding 3iet word. Our main tool is the connection between a 3iet word with parameters ε,? and sturmian words with slope ε.  相似文献   

9.
We define a new relation on words by a finite series of insertions and/or deletions of palindromic subwords. In particular we concentrate on insertion or deletion of Watson–Crick palindromes. We show that the new relation ∼θ is, in fact, an equivalence relation where θ is any arbitrary antimorphic involution that is not the identity on the letters of the alphabet. We also show that the set of all θ-palindromic free words defined in (Daley et al. in preparation) is ∼θ-independent. Using the relation we define a new subclass of primitive words which we call as ∼θ-primitive words and show that the class of all ∼θ-primitive words is closed under circular permutations. We also define ∼θ-conjugates and ∼θ-commutativity and study the properties of such words and show that they are similar to that of conjugate words and words that commute.  相似文献   

10.
A square is the concatenation of a nonempty word with itself. A word has period p if its letters at distance p match. The exponent of a nonempty word is the quotient of its length over its smallest period. In this article we give some new results on the trade-off between the number of squares and the number of maximal-exponent powers in infinite binary words, in the three cases where the maximal exponent is 7/3, 5/2, and 3. These are the only threshold values related to the question.  相似文献   

11.
Stemming is a process of reducing a derivational or inflectional word to its root or stem by stripping all its affixes. It is been used in applications such as information retrieval, machine translation, and text summarization, as their pre-processing step to increase efficiency. Currently, there are a few stemming algorithms which have been developed for languages such as English, Arabic, Turkish, Malay and Amharic. Unfortunately, no algorithm has been used to stem text in Hausa, a Chadic language spoken in West Africa. To address this need, we propose stemming Hausa text using affix-stripping rules and reference lookup. We stemmed Hausa text, using 78 affix stripping rules applied in 4 steps and a reference look-up consisting of 1500 Hausa root words. The over-stemming index, under-stemming index, stemmer weight, word stemmed factor, correctly stemmed words factor and average words conflation factor were calculated to determine the effect of reference look-up on the strength and accuracy of the stemmer. It was observed that reference look-up aided in reducing both over-stemming and under-stemming errors, increased accuracy and has a tendency to reduce the strength of an affix stripping stemmer. The rationality behind the approach used is discussed and directions for future research are identified.  相似文献   

12.
This paper studies in a probabilistic framework some topics concerning the way words (strings) can overlap, and relationship of this to the height of digital trees associated with this set of words. A word is defined as a random sequence of (possibly infinite) symbols over a finite alphabet. A key notion of analignment matrix {C ij } i,j=1 n is introduced whereC ij is the length of the longest string that is a prefix of theith and thejth word. It is proved that the height of an associated digital tree is simply related to the alignment matrix through some order statistics. In particular, using this observation and proving some inequalities for order statistics, we establish that the height of adigital trie under anindependent model (i.e., all words are statistically independent) is asymptotically equal to 2 logα n wheren is the number of words stored in the trie and α is a parameter of the probabilistic model. This result is generalized in three directions, namely we considerb-tries,Markovian model (i.e., dependency among letters in a word), and adependent model (i.e., dependency among words). In particular, when consecutive letters in a word are Markov dependent (Markovian model), then we demonstrate that the height converges in probability to 2 logθ n where θ is a parameter of the underlying Markov chain. On the other hand, for suffix trees which fall into the dependent model, we show that the height does not exceed 2 logκ n, where κ is a parameter of the probabilistic model. These results find plenty of applications in the analysis of data structures built over digital words.  相似文献   

13.
While the theory of languages of words is very mature, our understanding of relations on words is still lagging behind. And yet such relations appear in many new applications such as verification of parameterized systems, querying graph-structured data, and information extraction, for instance. Classes of well-behaved relations typically used in such applications are obtained by adapting some of the equivalent definitions of regularity of words for relations, leading to non-equivalent notions of recognizable, regular, and rational relations. The goal of this paper is to propose a systematic way of defining classes of relations on words, of which these three classes are just natural examples, and to demonstrate its advantages compared to some of the standard techniques for studying word relations. The key idea is that of a synchronization of a pair of words, which is a word over an extended alphabet. Using it, we define classes of relations via classes of regular languages over a fixed alphabet, just {1,2} for binary relations. We characterize some of the standard classes of relations on words via finiteness of parameters of synchronization languages, called shift, lag, and shiftlag. We describe these conditions in terms of the structure of cycles of graphs underlying automata, thereby showing their decidability. We show that for these classes there exist canonical synchronization languages, and every class of relations can be effectively re-synchronized using those canonical representatives. We also give sufficient conditions on synchronization languages, defined in terms of injectivity and surjectivity of their Parikh images, that guarantee closure under intersection and complement of the classes of relations they define.  相似文献   

14.
A set of words is factorially balanced if the set of all the factors of its words is balanced. We prove that if all words of a factorially balanced set have a finite index, then this set is a subset of the set of factors of a Sturmian word. Moreover, characterizing the set of factors of a given length n of a Sturmian word by the left special factor of length n−1 of this Sturmian word, we provide an enumeration formula for the number of sets of words that correspond to some set of factors of length n of a Sturmian word.  相似文献   

15.
A balanced word is one in which any two factors of the same length contain the same number of each letter of the alphabet up to one. Finite binary balanced words are called Sturmian words. A Sturmian word is bispecial if it can be extended to the left and to the right with both letters remaining a Sturmian word. There is a deep relation between bispecial Sturmian words and Christoffel words, that are the digital approximations of Euclidean segments in the plane. In 1997, J. Berstel and A. de Luca proved that palindromic bispecial Sturmian words are precisely the maximal internal factors of primitive Christoffel words. We extend this result by showing that bispecial Sturmian words are precisely the maximal internal factors of all Christoffel words. Our characterization allows us to give an enumerative formula for bispecial Sturmian words. We also investigate the minimal forbidden words for the language of Sturmian words.  相似文献   

16.
If a partial differential equation is reduced to an ordinary differential equation in the form u(ξ)=G(u,θ1,…,θm) under the traveling wave transformation, where θ1,…,θm are parameters, its solutions can be written as an integral form . Therefore, the key steps are to determine the parameters' scopes and to solve the corresponding integral. When G is related to a polynomial, a mathematical tool named complete discrimination system for polynomial is applied to this problem so that the parameter's scopes can be determined easily. The complete discrimination system for polynomial is a natural generalization of the discrimination △=b2−4ac of the second degree polynomial ax2+bx+c. For example, the complete discrimination system for the third degree polynomial F(w)=w3+d2w2+d1w+d0 is given by and . In the paper, we give some new applications of the complete discrimination system for polynomial, that is, we give the classifications of traveling wave solutions to some nonlinear differential equations through solving the corresponding integrals. In finally, as a result, we give a partial answer to a problem on Fan's expansion method.  相似文献   

17.
Blanchet-Sadri et al. have shown that Avoidability, or the problem of deciding the avoidability of a finite set of partial words over an alphabet of size k≥2, is NP-hard [F. Blanchet-Sadri, R. Jungers, J. Palumbo, Testing avoidability on sets of partial words is hard, Theoret. Comput. Sci. 410 (2009) 968-972]. Building on their work, we analyze in this paper the complexity of natural variations on the problem. While some of them are NP-hard, others are shown to be efficiently decidable. Using some combinatorial properties of de Bruijn graphs, we establish a correspondence between lengths of cycles in such graphs and periods of avoiding words, resulting in a tight bound for periods of avoiding words. We also prove that Avoidability can be solved in polynomial space, and reduces in polynomial time to the problem of deciding the avoidability of a finite set of partial words of equal length over the binary alphabet. We give a polynomial bound on the period of an infinite avoiding word, in the case of sets of full words, in terms of two parameters: the length and the number of words in the set. We give a polynomial space algorithm to decide if a finite set of partial words is avoided by a non-ultimately periodic infinite word. The same algorithm also decides if the number of words of length n avoiding a given finite set of partial words grows polynomially or exponentially with n.  相似文献   

18.
Recently the second two authors characterized quasiperiodic Sturmian words, proving that a Sturmian word is non-quasiperiodic if and only if, it is an infinite Lyndon word. Here we extend this study to episturmian words (a natural generalization of Sturmian words) by describing all the quasiperiods of an episturmian word, which yields a characterization of quasiperiodic episturmian words in terms of their directive words. Even further, we establish a complete characterization of all episturmian words that are Lyndon words. Our main results show that, unlike the Sturmian case, there is a much wider class of episturmian words that are non-quasiperiodic, besides those that are infinite Lyndon words. Our key tools are morphisms and directive words, in particular normalized directive words, which we introduced in an earlier paper. Also of importance is the use of return words to characterize quasiperiodic episturmian words, since such a method could be useful in other contexts.  相似文献   

19.
In this paper we study the construction and the enumeration of binary words in $\{0,1\}^*$ having more 1’s than 0’s and avoiding a set of cross-bifix-free patterns. We give a particular succession rule, called jumping and marked succession rule, which describes the growth of such words according to their number of ones. Moreover, the problem of associating a word to a path in the generating tree obtained by the succession rule is solved by introducing an algorithm which constructs all binary words having more 1’s than 0’s and then kills those containing the forbidden patterns. Finally, we give the generating function of such words according to the number of ones.  相似文献   

20.
We give a short and elementary proof of the following stronger version of Duval's conjecture: let u be an unbordered word, and v a word of length |u|-1, such that v is not a prefix of u. Then uv contains an unbordered word of length at least |u|+1.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号