Similar Articles
20 similar articles found (search time: 31 ms)
1.
Stream X-machines are a general and powerful computational model. By coupling the control structure of a stream X-machine with a set of formal grammars, a new machine called a generalised stream X-machine with underlying distributed grammars, acting as a translator, is obtained. This new mechanism provides a hierarchy of computational models. If the grammars are of a particular class, say regular or context-free, then finite sets are translated into finite sets when the ≤k and =k derivation strategies are used, and regular or context-free sets, respectively, are obtained for the ≥k, * and terminal derivation strategies. In both cases, regular or context-free grammars, the regular sets are translated into non-context-free languages. Moreover, any language accepted by a Turing machine may be written as a translation of a regular set performed by a generalised stream X-machine with underlying distributed grammars based on context-free rules, under the =k derivation strategy. On the other hand, the languages generated by some classes of cooperating distributed grammar systems may be obtained as images of regular sets through some X-machines with underlying distributed grammars. Further relations between the families of languages computed by generalised stream X-machines and the families of languages generated by cooperating distributed grammar systems are established. Finally, an example dealing with the specification of a scanner system illustrates the use of the introduced mechanism as a formal specification model. Received September 1999 / Accepted in revised form October 2000
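To make the translator idea concrete, here is a minimal sketch (ours, in Python, not the paper's formalism) of a stream X-machine: a finite control whose arcs carry processing functions that consume one input symbol, update a memory value, and emit output. In the generalised machine each processing function would be realised by a distributed-grammar derivation; the stubs `double` and `flush` are illustrative placeholders for such derivations.

```python
# Minimal stream X-machine sketch: arcs carry processing functions
# phi(memory, symbol) -> (new_memory, output_string). The grammar-based
# translation of the generalised machine is stubbed with plain functions.

def make_machine(init, transitions):
    """transitions: (state, input_symbol) -> (next_state, phi)."""
    def run(stream, memory):
        state, out = init, []
        for sym in stream:
            state, phi = transitions[(state, sym)]
            memory, emitted = phi(memory, sym)
            out.append(emitted)
        return "".join(out)
    return run

# Coupling finite control with memory lets the regular input set a^n#
# be translated into a^n b^n, beyond finite-state translation power.
double = lambda mem, s: (mem + 1, s)       # count and copy the a's
flush = lambda mem, s: (mem, "b" * mem)    # emit one b per counted a
run = make_machine("q0", {("q0", "a"): ("q0", double),
                          ("q0", "#"): ("q1", flush)})
print(run("aaa#", 0))                      # -> aaabbb
```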

2.
3.
We consider the problem of fixed-interval smoothing of a continuous-time, partially observed nonlinear stochastic dynamical system. Existing results for such smoothers require the use of two-sided stochastic calculus. The main contribution of the paper is to present a robust formulation of the smoothing equations. Under this robust formulation, the smoothing equations are nonstochastic parabolic partial differential equations (with random coefficients) and, hence, the technical machinery associated with two-sided stochastic calculus is not required. Furthermore, the robust smoothed state estimates are locally Lipschitz in the observations, which is useful for numerical simulation. As examples, finite-dimensional robust versions of the Benes and hidden Markov model smoothers and smoothers for piecewise linear dynamics are derived; these finite-dimensional smoothers do not involve stochastic integrals.
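As a discrete-time point of reference for the hidden Markov model smoother mentioned above, the sketch below implements the textbook forward-backward fixed-interval smoother. It is not the paper's continuous-time robust formulation; the names `A`, `likelihoods`, and `pi0` are ours.

```python
import numpy as np

def fixed_interval_smoother(A, likelihoods, pi0):
    """A: (n, n) transition matrix; likelihoods: list of length-n arrays of
    observation likelihoods per time step; pi0: initial state distribution.
    Returns smoothed marginals gamma[t, i] = P(x_t = i | all observations)."""
    T, n = len(likelihoods), len(pi0)
    alpha = np.zeros((T, n))
    beta = np.ones((T, n))
    alpha[0] = pi0 * likelihoods[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):                      # forward (filtering) pass
        alpha[t] = (alpha[t - 1] @ A) * likelihoods[t]
        alpha[t] /= alpha[t].sum()
    for t in range(T - 2, -1, -1):             # backward pass
        beta[t] = A @ (likelihoods[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()
    gamma = alpha * beta                       # combine the two passes
    return gamma / gamma.sum(axis=1, keepdims=True)

A = np.array([[0.9, 0.1], [0.2, 0.8]])
liks = [np.array([0.8, 0.1]), np.array([0.1, 0.7]), np.array([0.75, 0.2])]
print(fixed_interval_smoother(A, liks, np.array([0.5, 0.5])))
```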

4.
Although Hidden Markov Models (HMMs) are still the mainstream approach to speech recognition, their intrinsic limitations, such as the use of first-order Markov models and the assumption of independent and identically distributed frames, lead to the extensive use of higher-level linguistic information to produce satisfactory results. Therefore, researchers began investigating the incorporation of various discriminative techniques at the acoustical level to induce more discrimination between speech units. As is known, k-nearest neighbour (k-NN) density estimation is discriminant by nature and is widely used in the pattern recognition field. However, its application to speech recognition has been limited to a few experiments. In this paper, we introduce a new segmental k-NN-based phoneme recognition technique. In this approach, a group-delay-based method generates phoneme boundary hypotheses, and an approximate version of k-NN density estimation is used for the classification and scoring of variable-length segments. During decoding, the construction of the phonetic graph starts from the best phoneme boundary setting and progresses through splitting and merging segments using the remaining boundary hypotheses and constraints such as phoneme duration and broad-class similarity information. To perform the k-NN search, we take advantage of a similarity search algorithm called Spatial Approximate Sample Hierarchy (SASH). One major advantage of the SASH algorithm is that its computational complexity is independent of the dimensionality of the data. This allows us to use high-dimensional feature vectors to represent phonemes. By using phonemes as units of speech, the search space is very limited and the decoding process fast. Evaluation of the proposed algorithm with the sole use of the best hypothesis for every segment, and excluding phoneme transitional probabilities, context-based information, and language model information, results in an accuracy of 58.5% with correctness of 67.8% on the TIMIT test dataset.
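The core scoring step can be sketched as follows (our reconstruction, with brute-force search standing in for the SASH index and an ad hoc fixed-length segment embedding):

```python
import numpy as np

def segment_vector(frames, n_chunks=3):
    """Map a variable-length segment to a fixed vector by averaging the
    frames over n_chunks sub-intervals and concatenating the means."""
    chunks = np.array_split(np.asarray(frames), n_chunks)
    return np.concatenate([c.mean(axis=0) for c in chunks])

def knn_scores(x, refs, labels, k=5):
    """Score each phoneme class by the fraction of the k nearest labelled
    reference vectors belonging to it (the k-NN posterior estimate)."""
    d = np.linalg.norm(refs - x, axis=1)
    nearest = labels[np.argsort(d)[:k]]
    return {c: float(np.sum(nearest == c)) / k for c in set(labels.tolist())}

rng = np.random.default_rng(0)
refs = rng.normal(size=(100, 6))
refs[50:] += 2.0                              # separate the two classes
labels = np.array(["aa"] * 50 + ["iy"] * 50)
seg = rng.normal(size=(12, 2)) + 2.0          # 12 frames of 2-dim features
print(knn_scores(segment_vector(seg), refs, labels))
```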

5.
A device called a recursive graph, defining context-free languages, is described. Previously such a device was used by Conway [1] for compiler design; Conway called it a “transition diagram”. Roughly, a recursive graph is a finite set of finite state graphs, which are used in a recursive manner. A recursive graph covers the essential features of both standard devices: it describes the syntactical structure as grammars do, and it represents a method for recognition and parsing as push-down automata do. A notion of k-determinacy is introduced for recursive graphs and for a restricted kind of them, called simple recursive graphs. The k-deterministic simple recursive graphs are more general than LL(k) grammars [5] with respect to parsing power, but equal with respect to language generation power. The more general k-deterministic recursive graphs cannot parse the full set of LR(k) grammars [4], but they can recognize the full set of LR(k) languages. The set of languages recognized by (simple) recursive graphs is the set of context-free languages.
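The recursive use of finite state graphs is easy to demonstrate directly; the following recognizer (our sketch, equivalent to a recursive transition network) encodes the grammar S → a S b | c, so it accepts a^n c b^n:

```python
# Each nonterminal names a finite state graph; an edge is either
# (terminal, next_state, None) or (None, next_state, graph_to_call).
# By convention here, state 3 is the final state of every graph.
graphs = {
    "S": {0: [("a", 1, None), ("c", 3, None)],
          1: [(None, 2, "S")],            # recursive call edge
          2: [("b", 3, None)],
          3: []},
}

def ends(graph, s, pos=0, state=0):
    """Return the set of input positions where a traversal of `graph`
    started at (pos, state) can reach the final state."""
    result = set()
    if state == 3:
        result.add(pos)
    for term, nxt, call in graphs[graph][state]:
        if call is not None:              # descend into the called graph
            for p in ends(call, s, pos, 0):
                result |= ends(graph, s, p, nxt)
        elif pos < len(s) and s[pos] == term:
            result |= ends(graph, s, pos + 1, nxt)
    return result

print(len("aacbb") in ends("S", "aacbb"))   # True
print(len("aacb") in ends("S", "aacb"))     # False
```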

6.
I discuss a stochastic model of language learning and change. During a syntactic change, each speaker makes use of constructions from two different idealized grammars at variable rates. The model incorporates regularization in that speakers have a slight preference for using the dominant idealized grammar. It also includes incrementation: the population is divided into two interacting generations. Children can detect correlations between age and speech. They then predict where the population’s language is moving and speak according to that prediction, which represents a social force encouraging children not to sound outdated. Both regularization and incrementation turn out to be necessary for spontaneous language change to occur on a reasonable time scale and run to completion monotonically. Chance correlation between age and speech may be amplified by these social forces, eventually leading to a syntactic change through prediction-driven instability.
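The following toy simulation (ours; the functional forms are illustrative guesses, not the paper's equations) shows how a regularization bias plus trend-extrapolating children can amplify a small initial fluctuation into a completed change:

```python
import random

def step(adults, children, bias=0.05, incr=0.5):
    """One generation: children estimate the trend between generations,
    extrapolate it (incrementation), and nudge toward the dominant
    grammar (regularization); 1 = new grammar, 0 = old grammar."""
    pa = sum(adults) / len(adults)
    pc = sum(children) / len(children)
    target = pc + incr * (pc - pa)        # predict where the language is going
    if target > 0.5:                      # slight preference for the
        target += bias                    # dominant idealized grammar
    elif target < 0.5:
        target -= bias
    target = min(1.0, max(0.0, target))
    new_children = [1 if random.random() < target else 0 for _ in children]
    return children, new_children         # children become the adults

adults = [0] * 95 + [1] * 5               # 5% initial innovators
children = list(adults)
for _ in range(30):
    adults, children = step(adults, children)
print(sum(adults) / len(adults))          # typically settles at 0.0 or 1.0
```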

7.
N-gram models are the most widely used language models in large vocabulary continuous speech recognition. Since the size of the model grows rapidly with respect to the model order and available training data, many methods have been proposed for pruning the least relevant N-grams from the model. However, correct smoothing of the N-gram probability distributions is important and performance may degrade significantly if pruning conflicts with smoothing. In this paper, we show that some of the commonly used pruning methods do not take into account how removing an N-gram should modify the backoff distributions in the state-of-the-art Kneser-Ney smoothing. To solve this problem, we present two new algorithms: one for pruning Kneser-Ney smoothed models, and one for growing them incrementally. Experiments on Finnish and English text corpora show that the proposed pruning algorithm provides considerable improvements over previous pruning algorithms on Kneser-Ney smoothed models and is also better than the baseline entropy-pruned Good-Turing smoothed models. The models created by the growing algorithm provide a good starting point for our pruning algorithm, leading to further improvements. The improvements in Finnish speech recognition over the other Kneser-Ney smoothed models are statistically significant as well.
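To make the interplay concrete, here is a minimal interpolated Kneser-Ney bigram model (our sketch, with a fixed discount and no pruning): the backoff distribution counts the contexts a word follows rather than its raw frequency, which is exactly the quantity that naive pruning fails to update.

```python
from collections import Counter, defaultdict

def train_kn(tokens, D=0.75):
    """Interpolated Kneser-Ney bigram probabilities with fixed discount D."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    ctx_count = Counter(tokens[:-1])
    follow = defaultdict(set)             # word types seen after each context
    precede = defaultdict(set)            # context types seen before each word
    for u, w in bigrams:
        follow[u].add(w)
        precede[w].add(u)
    total_types = len(bigrams)
    def p(w, u):
        cont = len(precede[w]) / total_types          # continuation probability
        if ctx_count[u] == 0:
            return cont                               # unseen context: back off
        lam = D * len(follow[u]) / ctx_count[u]       # redistributed mass
        return max(bigrams[(u, w)] - D, 0) / ctx_count[u] + lam * cont
    return p

p = train_kn("a b a b a c".split())
print(p("b", "a"), p("c", "a"))           # discounted count + continuation mass
```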

8.
A structural characterization of reflexive splicing languages has recently been given in [P. Bonizzoni, C. De Felice, R. Zizza, The structure of reflexive regular splicing languages via Schützenberger constants, Theoretical Computer Science 334 (2005) 71-98] and [P. Bonizzoni, G. Mauri, Regular splicing languages and subclasses, Theoretical Computer Science 340 (2005) 349-363], showing surprising connections between long-standing notions in formal language theory, the syntactic monoid and the Schützenberger constant, and the splicing operation. In this paper, we provide a procedure to decide whether a regular language is a reflexive splicing language, based on the above-mentioned characterization, which is given in terms of a finite set of constants for the language. The procedure relies on the notion of label-equivalence, which induces a finite refinement of the syntactic monoid of a regular language L. A finite set of representatives for label-equivalent classes of constant words in L is defined, and it is proved that such a finite set provides the splice sites of splicing rules generating the language L.
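The splicing operation itself is simple to state in code; the sketch below (ours) applies one rule r = (u1, u2, u3, u4): cut x between u1 and u2, cut y between u3 and u4, and recombine the left part of x with the right part of y.

```python
def splice(x, y, rule):
    """All words x1·u1·u4·y2 obtainable from x = x1·u1·u2·x2 and
    y = y1·u3·u4·y2 under the rule (u1, u2, u3, u4)."""
    u1, u2, u3, u4 = rule
    results = set()
    i = x.find(u1 + u2)
    while i != -1:
        j = y.find(u3 + u4)
        while j != -1:
            results.add(x[:i + len(u1)] + y[j + len(u3):])
            j = y.find(u3 + u4, j + 1)
        i = x.find(u1 + u2, i + 1)
    return results

print(splice("aabbb", "cccdd", ("aa", "b", "c", "dd")))   # {'aadd'}
```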

9.
Language modeling is the problem of predicting words based on histories containing words already hypothesized. Two key aspects of language modeling are effective history equivalence classification and robust probability estimation. The solution of these aspects is hindered by the data sparseness problem. Application of random forests (RFs) to language modeling deals with the two aspects simultaneously. We develop a new smoothing technique based on randomly grown decision trees (DTs) and apply the resulting RF language models to automatic speech recognition. This new method is complementary to many existing ones dealing with the data sparseness problem. We study our RF approach in the context of n-gram type language modeling in which n − 1 words are present in a history. Unlike regular n-gram language models, RF language models have the potential to generalize well to unseen data, even when histories are longer than four words. We show that our RF language models are superior to the best known smoothing technique, the interpolated Kneser–Ney smoothing, in reducing both the perplexity (PPL) and word error rate (WER) in large vocabulary state-of-the-art speech recognition systems. In particular, we will show statistically significant improvements in a contemporary conversational telephony speech recognition system by applying the RF approach only to one of its many language models.
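The randomly grown decision trees can be pictured as follows (our toy reconstruction; the real question selection and smoothing are more sophisticated): each tree asks random identity questions about history positions, and the forest averages the per-leaf word distributions.

```python
import random
from collections import Counter, defaultdict

def grow_tree(histories, depth=2, rng=random):
    """Pick `depth` random questions of the form: does position p hold word w?"""
    questions = [(p, w) for p in range(len(histories[0]))
                 for w in {h[p] for h in histories}]
    return [rng.choice(questions) for _ in range(depth)]

def leaf_id(tree, hist):
    return tuple(hist[p] == w for p, w in tree)

def train_forest(data, n_trees=10, vocab=50, k=0.1):
    hists = [h for h, _ in data]
    forest = []
    for _ in range(n_trees):
        tree = grow_tree(hists)
        leaves = defaultdict(Counter)
        for h, w in data:
            leaves[leaf_id(tree, h)][w] += 1   # per-leaf word counts
        forest.append((tree, leaves))
    def p(w, h):                               # average over the forest
        probs = []
        for tree, leaves in forest:
            c = leaves[leaf_id(tree, h)]
            probs.append((c[w] + k) / (sum(c.values()) + k * vocab))
        return sum(probs) / len(probs)
    return p

data = [(("the", "cat"), "sat"), (("the", "dog"), "ran"), (("a", "cat"), "sat")]
p = train_forest(data)
print(p("sat", ("the", "cat")))
```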

10.
11.
12.
We propose an algebra of languages and transformations as a means of compositional syntactic language extension. The algebra provides a layer of high-level abstractions built on top of languages (captured by context-free grammars) and transformations (captured by constructive catamorphisms).
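A constructive catamorphism is a fold over the parse tree with one function per production, so transformations compose by swapping or combining these function tables. A minimal sketch (ours, in Python rather than the paper's setting):

```python
def cata(alg, tree):
    """Fold `alg` (a map from node label to function) over a parse tree
    given as nested tuples (label, child, ...); non-tuples are leaves."""
    if not isinstance(tree, tuple):
        return tree
    label, *children = tree
    return alg[label](*[cata(alg, c) for c in children])

# Parse tree of 1 + (2 * 3) over a toy expression grammar.
t = ("add", ("num", 1), ("mul", ("num", 2), ("num", 3)))

evaluate = {"num": lambda n: n,
            "add": lambda a, b: a + b,
            "mul": lambda a, b: a * b}
render = {"num": str,
          "add": lambda a, b: f"({a}+{b})",
          "mul": lambda a, b: f"({a}*{b})"}

print(cata(evaluate, t), cata(render, t))   # 7 (1+(2*3))
```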

13.
A new and simple method for the inference of regular grammars or finite automata is presented. It is based on the analysis of the successive appearances of the terminal symbols in the learning strings. It is shown that for syntactic pattern recognition applications, this method is more efficient than other algorithms already proposed.
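A sketch of the successor idea (ours; the published algorithm's details may differ): record which terminals can start a string, which can follow each terminal, and which can end one, then read off a regular grammar with one nonterminal per terminal.

```python
from collections import defaultdict

def infer(samples):
    """Infer a regular grammar from the successor relation of terminals."""
    starts, finals, follows = set(), set(), defaultdict(set)
    for s in samples:
        starts.add(s[0])
        finals.add(s[-1])
        for a, b in zip(s, s[1:]):
            follows[a].add(b)
    rules = [f"S -> {a} A_{a}" for a in sorted(starts)]
    for a in sorted(follows):
        rules += [f"A_{a} -> {b} A_{b}" for b in sorted(follows[a])]
    rules += [f"A_{a} -> ε" for a in sorted(finals)]
    return rules

for r in infer(["ab", "aab", "abb"]):
    print(r)                 # generalizes the sample to a+ b+
```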

14.
Culik II and Cohen introduced the class of LR-regular grammars, an extension of the LR(k) grammars.

In this paper we consider an analogous extension of the LL(k) grammars, called the LL-regular grammars. The relation of this class of grammars to other classes of grammars will be shown. Every LL-regular grammar is an LR-regular grammar. Properties of LL(k) grammars can be generalized to properties of LL-regular grammars.

15.
The paper studies the family of LLP(k) grammars introduced by Lomet. Two new grammatical characterizations for this class are presented. The first one shows that these grammars form an extension of strict deterministic grammars; in particular, any strict deterministic grammar is an LLP(0) one. LLP(k) grammars share with this subclass many important mathematical properties. The second characterization proves LLP(k) grammars to be a natural cross between LL(k) and LR(k) grammars. Relationships between corresponding families of languages are also investigated.

16.
During the last decade, the most significant advances in the field of continuous speech recognition (CSR) have arisen from the use of hidden Markov models (HMM) for acoustic modeling. These models address one of the major issues for CSR: simultaneous modeling of temporal and frequency distortions in the speech signal. In the HMM, the temporal dimension is managed through an oriented state graph, each state accounting for the local frequency distortions through a probability density function. In this study, improvement of the HMM performance is expected from the introduction of a very effective non-parametric probability density function estimate: the k-nearest neighbors (k-nn) estimate. First, experiments on a short-term speech spectrum identification task are performed to compare the k-nn estimate and the widespread estimate based on mixtures of Gaussian functions. Then the adaptations implied by the integration of the k-nn estimate in an HMM-based recognition system are developed. An optimal training protocol is obtained based on the introduction of membership coefficients into the HMM parameters. The membership coefficients measure the degree of association between a reference acoustic vector and an HMM state. The training procedure uses the expectation-maximization (EM) algorithm applied to the membership coefficient estimation. Its convergence is shown according to the maximum likelihood criterion. This study leads to the development of a baseline k-nn/HMM recognition system which is evaluated on the TIMIT speech database. Further improvements of the k-nn/HMM system are finally sought through the introduction of temporal information into the representation space (delta coefficients) and the adaptation of the references (mainly, gender modeling and contextual modeling).
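The k-nn estimate referred to above has the textbook form p(x) ≈ k / (N · V_k(x)), with V_k(x) the volume of the smallest ball around x containing its k nearest references. A direct sketch (ours; the membership-coefficient training is omitted):

```python
import math
import numpy as np

def knn_density(x, refs, k=8):
    """k-nn density estimate p(x) ~ k / (N * V_k(x)) for one HMM state's
    reference vectors `refs` of shape (N, dim)."""
    d = np.sort(np.linalg.norm(refs - np.asarray(x), axis=1))
    r = d[k - 1]                               # radius to the k-th neighbour
    dim = refs.shape[1]
    v = math.pi ** (dim / 2) / math.gamma(dim / 2 + 1) * r ** dim
    return k / (len(refs) * v)

rng = np.random.default_rng(1)
refs = rng.normal(size=(500, 3))
# Should be roughly (2*pi)**-1.5 ~ 0.063, the standard normal density at 0.
print(knn_density(np.zeros(3), refs))
```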

17.
18.
This paper establishes a workspace theorem in terms of regular-controlled (context-free) grammars. It proves that, if, for a regular-controlled grammar H, there is a positive integer k such that H generates every sentence y ∈ L(H) by a derivation in which every sentential form x contains at most (k−1)|x|/k occurrences of nonterminals that are erased throughout the rest of the derivation, where |x| denotes the length of x, then the language of H is generated by a propagating regular-controlled grammar. An analogous workspace theorem is demonstrated for regular-controlled grammars with appearance checking. The paper provides an algorithm that removes all erasing rules from any regular-controlled grammar (possibly with appearance checking) that satisfies the workspace condition above without affecting the generated language. In its conclusion, the paper points out a relationship of the workspace theorems to other areas of formal language theory.
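In symbols, the workspace condition reads as follows (the name erased(x) is ours, not the paper's notation):

```latex
\exists k \ge 1 \;\; \forall y \in L(H) :\; \text{some derivation of } y
\text{ has every sentential form } x \text{ satisfying }
\operatorname{erased}(x) \le \frac{(k-1)\,|x|}{k}
```

where erased(x) counts the occurrences of nonterminals in x that are erased during the remainder of the derivation.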

19.
We define a class of probabilistic models in terms of an operator algebra of stochastic processes, and a representation for this class in terms of stochastic parameterized grammars. A syntactic specification of a grammar is formally mapped to semantics given in terms of a ring of operators, so that composition of grammars corresponds to operator addition or multiplication. The operators are generators for the time-evolution of stochastic processes. The dynamical evolution occurs in continuous time but is related to a corresponding discrete-time dynamics. An expansion of the exponential of such time-evolution operators can be used to derive a variety of simulation algorithms. Within this modeling framework one can express data clustering models, logic programs, ordinary and stochastic differential equations, branching processes, graph grammars, and stochastic chemical reaction kinetics. The mathematical formulation connects these apparently distant fields to one another and to mathematical methods from quantum field theory and operator algebra. Such broad expressiveness makes the framework particularly suitable for applications in machine learning and multiscale scientific modeling.
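As one concrete instance of the framework, stochastic chemical kinetics can be simulated by sampling such time-evolution operators with the Gillespie algorithm; in the sketch below (ours) each grammar rule is a reaction (reactants, products, rate), and a propensity is the rate times the product of reactant counts (adequate for the unary rules used here).

```python
import math
import random

def gillespie(rules, state, t_end):
    """Stochastic simulation: rules are (reactants, products, rate)."""
    t = 0.0
    while t < t_end:
        props = [rate * math.prod(state[s] for s in lhs)
                 for lhs, rhs, rate in rules]
        total = sum(props)
        if total == 0:
            break
        t += random.expovariate(total)        # exponential waiting time
        r = random.uniform(0, total)          # choose a rule by propensity
        for (lhs, rhs, rate), a in zip(rules, props):
            if r < a:
                for s in lhs:
                    state[s] -= 1
                for s in rhs:
                    state[s] += 1
                break
            r -= a
    return state

rules = [(("A",), ("B",), 1.0),               # A -> B at rate 1.0
         (("B",), ("A",), 0.5)]               # B -> A at rate 0.5
print(gillespie(rules, {"A": 100, "B": 0}, 10.0))
```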

20.