首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
A simple associationist neural network learns to factor abstract rules (i.e., grammars) from sequences of arbitrary input symbols by inventing abstract representations that accommodate unseen symbol sets as well as unseen but similar grammars. The neural network is shown to have the ability to transfer grammatical knowledge to both new symbol vocabularies and new grammars. Analysis of the state-space shows that the network learns generalized abstract structures of the input and is not simply memorizing the input strings. These representations are context sensitive, hierarchical, and based on the state variable of the finite-state machines that the neural network has learned. Generalization to new symbol sets or grammars arises from the spatial nature of the internal representations used by the network, allowing new symbol sets to be encoded close to symbol sets that have already been learned in the hidden unit space of the network. The results are counter to the arguments that learning algorithms based on weight adaptation after each exemplar presentation (such as the long term potentiation found in the mammalian nervous system) cannot in principle extract symbolic knowledge from positive examples as prescribed by prevailing human linguistic theory and evolutionary psychology.  相似文献   

2.
We define a class of probabilistic models in terms of an operator algebra of stochastic processes, and a representation for this class in terms of stochastic parameterized grammars. A syntactic specification of a grammar is formally mapped to semantics given in terms of a ring of operators, so that composition of grammars corresponds to operator addition or multiplication. The operators are generators for the time-evolution of stochastic processes. The dynamical evolution occurs in continuous time but is related to a corresponding discrete-time dynamics. An expansion of the exponential of such time-evolution operators can be used to derive a variety of simulation algorithms. Within this modeling framework one can express data clustering models, logic programs, ordinary and stochastic differential equations, branching processes, graph grammars, and stochastic chemical reaction kinetics. The mathematical formulation connects these apparently distant fields to one another and to mathematical methods from quantum field theory and operator algebra. Such broad expressiveness makes the framework particularly suitable for applications in machine learning and multiscale scientific modeling.   相似文献   

3.
Suppes  Patrick  Böttner  Michael  Liang  Lin 《Machine Learning》1995,19(2):133-152
We are developing a theory of probabilistic language learning in the context of robotic instruction in elementary assembly actions. We describe the process of machine learning in terms of the various events that happen on a given trial, including the crucial association of words with internal representations of their meaning. Of central importance in learning is the generalization from utterances to grammatical forms. Our system derives a comprehension grammar for a superset of a natural language from pairs of verbal stimuli like Go to the screw! and corresponding internal representations of coerced actions. For the derivation of a grammar no knowledge of the language to be learned is assumed but only knowledge of an internal language.We present grammars for English, Chinese, and German generated from a finite sample of about 500 commands that are roughly equivalent across the three languages. All of the three grammars, which are context-free in form, accept an infinite set of commands in the given language.  相似文献   

4.
This paper describes the winning entry to the Omphalos context free grammar learning competition. We describe a context-free grammatical inference algorithm operating on positive data only, which integrates an information theoretic constituent likelihood measure together with more traditional heuristics based on substitutability and frequency. The competition is discussed from the perspective of a competitor. We discuss a class of deterministic grammars, the Non-terminally Separated (NTS) grammars, that have a property relied on by our algorithm, and consider the possibilities of extending the algorithm to larger classes of languages. Editor: Georgios Paliouras and Yasubumi Sakakibara  相似文献   

5.
Grammatical inference – used successfully in a variety of fields such as pattern recognition, computational biology and natural language processing – is the process of automatically inferring a grammar by examining the sentences of an unknown language. Software engineering can also benefit from grammatical inference. Unlike these other fields, which use grammars as a convenient tool to model naturally occurring patterns, software engineering treats grammars as first-class objects typically created and maintained for a specific purpose by human designers. We introduce the theory of grammatical inference and review the state of the art as it relates to software engineering.  相似文献   

6.
Input-output HMMs for sequence processing   总被引:2,自引:0,他引:2  
We consider problems of sequence processing and propose a solution based on a discrete-state model in order to represent past context. We introduce a recurrent connectionist architecture having a modular structure that associates a subnetwork to each state. The model has a statistical interpretation we call input-output hidden Markov model (IOHMM). It can be trained by the estimation-maximization (EM) or generalized EM (GEM) algorithms, considering state trajectories as missing data, which decouples temporal credit assignment and actual parameter estimation. The model presents similarities to hidden Markov models (HMMs), but allows us to map input sequences to output sequences, using the same processing style as recurrent neural networks. IOHMMs are trained using a more discriminant learning paradigm than HMMs, while potentially taking advantage of the EM algorithm. We demonstrate that IOHMMs are well suited for solving grammatical inference problems on a benchmark problem. Experimental results are presented for the seven Tomita grammars, showing that these adaptive models can attain excellent generalization.  相似文献   

7.
A new tool, the stochastic programmed production system (SPPS), is proposed for representing rules and control mechanisms in rule-based systems. This approach is based on the stochastic programmed grammars, and its peculiarity is both its well-formalized semantics and its ability to represent most reasoning and control methods used in expert systems today. Since the proposed method is not limited by the particular heuristic organization of the rule-based system it is modeling, the SPPS is more general than existing knowledge-domain-independent expert-system development languages and may help bring a better understanding of the theory of expert systems.  相似文献   

8.
This paper describes approaches for machine learning of context free grammars (CFGs) from positive and negative sample strings, which are implemented in Synapse system. The grammatical inference consists of a rule generation by “inductive CYK algorithm,” mechanisms for incremental learning, and search. Inductive CYK algorithm generates minimum production rules required for parsing positive samples, when the bottom-up parsing by CYK algorithm does not succeed. The incremental learning is used not only for synthesizing grammars by giving the system positive strings in the order of their length but also for learning grammars from other similar grammars. Synapse can synthesize fundamental ambiguous and unambiguous CFGs including nontrivial grammars such as the set of strings not of the form ww with w∈{a,b}+.  相似文献   

9.
Inference of high-dimensional grammars is discussed. Specifically, techniques for inferring tree grammars are briefly presented. The problem of inferring a stochastic grammar to model the behavior of an information source is also introduced and techniques for carrying out the inference process are presented for a class of stochastic finite-state and context-free grammars. The possible practical application of these methods is illustrated by examples.  相似文献   

10.
This paper describes an evolutionary approach to the problem of inferring stochastic context-free grammars from finite language samples. The approach employs a distributed, steady-state genetic algorithm, with a fitness function incorporating a prior over the space of possible grammars. Our choice of prior is designed to bias learning towards structurally simpler grammars. Solutions to the inference problem are evolved by optimizing the parameters of a covering grammar for a given language sample. Full details are given of our genetic algorithm (GA) and of our fitness function for grammars. We present the results of a number of experiments in learning grammars for a range of formal languages. Finally we compare the grammars induced using the GA-based approach with those found using the inside-outside algorithm. We find that our approach learns grammars that are both compact and fit the corpus data well.  相似文献   

11.
Summary The present categorial approach to bottom-up parsing context free grammars treats two aspects of determinism. One is an abstraction of grammatical determinism from actual parsing strategies. The other is the transfer of determinism under grammar transformations. The approach is based on the characterization of a parse step as categorial limit, which on the one hand yields a convenient pattern for grammar type definition, and leads on the other hand in a transparent way to invariance results on deterministic grammars under homomorphic transformations.  相似文献   

12.
We study grammars used in grammatical genetic programming (GP) which create algorithms that control the base station pilot power in a femtocell network. The overall goal of evolving algorithms for femtocells is to create a continuous online evolution of the femtocell pilot power control algorithm in order to optimize their coverage. We compare the performance of different grammars and analyse the femtocell simulation model using the grammatical genetic programming method called grammatical evolution. The grammars consist of conditional statements or mathematical functions as are used in symbolic regression applications of GP, as well as a hybrid containing both kinds of statements. To benchmark and gain further information about our femtocell network simulation model we also perform random sampling and limited enumeration of femtocell pilot power settings. The symbolic regression based grammars require the most configuration of the evolutionary algorithm and more fitness evaluations, whereas the conditional statement grammar requires more domain knowledge to set the parameters. The content of the resulting femtocell algorithms shows that the evolutionary computation (EC) methods are exploiting the assumptions in the model. The ability of EC to exploit bias in both the fitness function and the underlying model is vital for identifying the current system and improves the model and the EC method. Finally, the results show that the best fitness and engineering performances for the grammars are similar over both test and training scenarios. In addition, the evolved solutions’ performance is superior to those designed by humans.  相似文献   

13.
The diagrammatic approach to user interfaces for computer-aided software development toolkits, visual query systems, and visual programming environments, is based on the use of diagrams and charts traditionally drawn on paper. In particular, the VLG system (Visual Language Generator) has been proposed to generate icon-oriented visual languages customized for given applications. The syntactical model underlying the interpretation of a visual language in VLG has been designed to describe icon-oriented visual languages. In order to enable the VLG system to apply to any kind of graphical languages, like diagrammatic ones, it is necessary to find a more general syntactical model able to support both their generation and interpretation. This paper addresses the comprehension of the features that a grammatical formalism for nonlinear languages must have to match any requirement for an efficient parsing. To this aim, relation grammars support an easy implementation of a general parsing algorithm for multidimensional languages, parametric with respect to the rewriting rules of the grammar. We compare the expressive power of relation grammars to grammatical formalisms for graph grammars  相似文献   

14.
15.
This paper describes an improved version of TBL algorithm [Y. Sakakibara, Learning context-free grammars using tabular representations, Pattern Recognition 38(2005) 1372–1383; Y. Sakakibara, M. Kondo, GA-based learning of context-free grammars using tabular representations, in: Proceedings of 16th International Conference in Machine Learning (ICML-99), Morgan-Kaufmann, Los Altos, CA, 1999] for inference of context-free grammars in Chomsky Normal Form. The TBL algorithm is a novel approach to overcome the hardness of learning context-free grammars from examples without structural information available. The algorithm represents the grammars by parsing tables and thanks to this tabular representation the problem of grammar learning is reduced to the problem of partitioning the set of nonterminals. Genetic algorithm is used to solve NP-hard partitioning problem. In the improved version modified fitness function and new delete specialized operator is applied. Computer simulations have been performed to determine improved a tabular representation efficiency. The set of experiments has been divided into 2 groups: in the first one learning the unknown context-free grammar proceeds without any extra information about grammatical structure, in the second one learning is supported by a partial knowledge of the structure. In each of the performed experiments the influence of partition block size in an initial population and the size of population at grammar induction has been tested. The new version of TBL algorithm has been experimentally proved to be not so much vulnerable to block size and population size, and is able to find the solutions faster than standard one.  相似文献   

16.
The high complexity of natural language and the huge amount of human and temporal resources necessary for producing the grammars lead several researchers in the area of Natural Language Processing to investigate various solutions for automating grammar generation and updating processes. Many algorithms for Context-Free Grammar inference have been developed in the literature. This paper provides a survey of the methodologies for inferring context-free grammars from examples, developed by researchers in the last decade. After introducing some preliminary definitions and notations concerning learning and inductive inference, some of the most relevant existing grammatical inference methods for Natural Language are described and classified according to the kind of presentation (if text or informant) and the type of information (if supervised, unsupervised, or semi-supervised). Moreover, the state of the art of the strategies for evaluation and comparison of different grammar inference methods is presented. The goal of the paper is to provide a reader with introduction to major concepts and current approaches in Natural Language Learning research.  相似文献   

17.
18.
The identification of overrepresented motifs in a collection of biological sequences continues to be a relevant and challenging problem in computational biology. Currently popular methods of motif discovery are based on statistical learning theory. In this paper, a machine-learning approach to the motif discovery problem is explored. The approach is based on a Self-Organizing Map (SOM) where the output layer neuron weight vectors are replaced by position weight matrices. This approach can be used to characterise features present in a set of sequences, and thus can be used as an aid in overrepresented motif discovery. The SOM approach to motif discovery is demonstrated using biological sequence datasets, both real and simulated  相似文献   

19.
This article reviews our work in the field of music processing (MP) using grammatical inference (GI), where regular grammars are used for modeling musical style. These models can be used to generate automatic composition (AC) and classify music by style (musical style identification) with their resulting applications. The latter, for instance, would improve content-based retrieval in multimedia databases, joining indexing by musical style to other suitable indexes. In this work, several GI techniques are used to learn from examples of melodies, stochastic grammars for different musical styles. Then, each of the learned grammars is used to generate new melodies (composition) or to classify test melodies (style identification). Our studies in this field show the need of proper music coding schemes, so different coding schemes are presented and compared. Results from our previous studies have been improved, achieving in style identification a classification error rate that ranges from 0.5 to 1.7%, depending on the corpus used.  相似文献   

20.
This paper describes the grammatical approach, an approach to the specification of XML processing tasks based on attribute grammars. This approach describes how to provide task-specific context-free grammars for XML documents, as well as how to decompose complex processing tasks into simpler ones with attribute-grammar fragments. The result is a high-level, syntax-directed declarative specification for the processing of XML documents, which facilitates the development and maintenance of complex XML processing applications while preserving the flexibility of general-purpose XML processing models. The grammatical approach is illustrated using Chasqui, an e-learning platform for building educational digital libraries of learning objects.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号