首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
A system of grammars using symmetric phrase structure and translation rules in a Lisp version of Prolog is shown to provide symmetric bidirectional translation between English and Chinese for a fragment of the two languages. It is argued that symmetric grammars and translation rules significantly reduce the total grammar writing requirement for translation systems, and that research on symmetric translation systems deserves further study.  相似文献   

2.
Stochastic syntax-directed translation schemata describe both the syntactic structure and the probability distribution of stochastic mappings between contextfree languages. The relationship between stochastic syntax-directed translation schemata and stochastic grammars and automata are presented by proving that a stochastic pushdown transducer can be constructed to define the same translations as a simple schema, and that the simple schema are characterized by stochastic contextfree grammars. Asymptotic properties of linear schemata are established by the theory of Markov chains. Since stochastic translations contain both input and output strings, their information content can be described. Equations are developed for both the information content and the rate of stochastic translations.  相似文献   

3.
The Cunei machine translation platform is an open-source system for data-driven machine translation. Our platform is a synthesis of the traditional example-based MT (EBMT) and statistical MT (SMT) paradigms. What makes Cunei unique is that it measures the relevance of each translation instance with a distance function. This distance function, represented as a log-linear model, operates over one translation instance at a time and enables us to score the translation instance relative to the specified input and/or the current target hypothesis. We describe how our system, Cunei, scores features individually for each translation instance and how it efficiently performs parameter tuning over the entire feature space. We also compare Cunei with three other open-source MT systems (Moses, CMU-EBMT, and Marclator). In our experiments involving Korean–English and Czech–English translation Cunei clearly outperforms the traditional EBMT and SMT systems.  相似文献   

4.
A cost function is developed, based on information-theoretic concepts, that measures the complexity of a stochastic context-free grammar, as well as the discrepancy between its language and a given stochastic language sample. This function is used to guide a search procedure that finds simple grammars whose languages are good fits to a sample. Reasonable results have been obtained in a variety of cases, including parenthesis and addition strings, Basic English (the first 25 sentences in English Through Pictures) and chain-encoded chromosome boundaries.  相似文献   

5.
A cost function is developed, based on information-theoretic concepts, that measures the complexity of a stochastic context-free grammar, as well as the discrepancy between its language and a given stochastic language sample. This function is used to guide a search procedure that finds simple grammars whose languages are good fits to a sample. Reasonable results have been obtained in a variety of cases, including parenthesis and addition strings, Basic English (the first 25 sentences in English Through Pictures) and chain-encoded chromosome boundaries.  相似文献   

6.
Offline grammar-based recognition of handwritten sentences   总被引:1,自引:0,他引:1  
This paper proposes a sequential coupling of a hidden Markov model (HMM) recognizer for offline handwritten English sentences with a probabilistic bottom-up chart parser using stochastic context-free grammars (SCFG) extracted from a text corpus. Based on extensive experiments, we conclude that syntax analysis helps to improve recognition rates significantly.  相似文献   

7.
8.
This paper summarizes some recent results concerned with the extension of formal languages to their corresponding stochastic versions. Weighted grammars and languages are first defined, and stochastic grammars and languages are defined as a special case of weighted grammars and languages. Fuzzy grammars and languages, which have some properties similar to weighted grammars and languages, are also discussed. Stochastic automata are defined from the language recognition viewpoint. Languages accepted by stochastic finite-state and pushdown automata, with and without a cutpoint, are studied. Weighted and stochastic programmed and indexed grammars, and stochastic nested stack automata are defined. Finally, some decidability problems of stochastic (weighted, fuzzy) languages are discussed, and problems for further research are suggested.This work was supported by the National Science Foundation Grant GK-18225.  相似文献   

9.
This paper describes an approach to the problem of articulating multimedia information based on parsing and syntax-directed translation that uses Relational Grammars. This translation is followed by a constraint-solving mechanism to create the final layout. Grammatical rules provide the mechanism for mapping from a representation of the content and context of a presentation to forms that specify the media objects to be realized. These realization forms include sets of spatial and temporal constraints between elements of the presentation. Individual grammars encapsulate the “look and feel” of a presentation and can be used as generators of such a style. By making the grammars sensitive to the requirements of the output medium, parsing can introduce flexibility into the information realization process.  相似文献   

10.
We provide a strongly polynomial algorithm for determining whether a given multi-type branching process is subcritical, critical, or supercritical. The same algorithm also decides consistency of stochastic context-free grammars.  相似文献   

11.
Summary An attribute grammar is one-visit if the attributes can be evaluated by walking through the derivation tree in such a way that each subtree is visited at most once. One-visit (1V) attribute grammars are compared with one-pass left-to-right (L) attribute grammars and with attribute grammars having only one synthesized attribute (1S).Every 1S attribute grammar can be made one-visit. One-visit attribute grammars are simply permutations of L attribute grammars; thus the classes of output sets of 1V and L attribute grammars coincide, and similarly for 1S and L-1S attribute grammars. In case all attribute values are trees, the translation realized by a 1V attribute grammar is the composition of the translation realized by a 1S attribute grammar with a deterministic top-down tree transduction, and vice versa; thus, using a result of Duske e.a., the class of output languages of 1V (or L) attribute grammars is the image of the class of IO macro tree languages under all deterministic top-down tree transductions.  相似文献   

12.
Inference of high-dimensional grammars is discussed. Specifically, techniques for inferring tree grammars are briefly presented. The problem of inferring a stochastic grammar to model the behavior of an information source is also introduced and techniques for carrying out the inference process are presented for a class of stochastic finite-state and context-free grammars. The possible practical application of these methods is illustrated by examples.  相似文献   

13.
We define a class of probabilistic models in terms of an operator algebra of stochastic processes, and a representation for this class in terms of stochastic parameterized grammars. A syntactic specification of a grammar is formally mapped to semantics given in terms of a ring of operators, so that composition of grammars corresponds to operator addition or multiplication. The operators are generators for the time-evolution of stochastic processes. The dynamical evolution occurs in continuous time but is related to a corresponding discrete-time dynamics. An expansion of the exponential of such time-evolution operators can be used to derive a variety of simulation algorithms. Within this modeling framework one can express data clustering models, logic programs, ordinary and stochastic differential equations, branching processes, graph grammars, and stochastic chemical reaction kinetics. The mathematical formulation connects these apparently distant fields to one another and to mathematical methods from quantum field theory and operator algebra. Such broad expressiveness makes the framework particularly suitable for applications in machine learning and multiscale scientific modeling.   相似文献   

14.
We propose Generate and Repair Machine Translation (GRMT), a constraint–based approach to machine translation that focuses on accurate translation output. GRMT performs the translation by generating a Translation Candidate (TC), verifying the syntax and semantics of the TC and repairing the TC when required. GRMT comprises three modules: Analysis Lite Machine Translation (ALMT), Translation Candidate Evaluation (TCE) and Repair and Iterate (RI). The key features of GRMT are simplicity, modularity, extendibility, and multilinguality.
An English–Thai translation system has been implemented to illustrate the performance of GRMT. The system has been developed and run under SWI–Prolog 3.2.8. The English and Thai grammars have been developed based on Head–Driven Phrase Structure Grammar (HPSG) and implemented on the Attribute Logic Engine (ALE). GRMT was tested to generate the translations for a number of sentences/phrases. Examples are provided throughout the article to illustrate how GRMT performs the translation process.  相似文献   

15.
Heterogeneous computer systems that are interconnected in today's computer networks lack efficient, general-purpose translation facilities for remote data. The provision of such facilities is potentially quite costly since the format of remote data is a function of the attributes of the remote architectures, operating systems, and software applications that maintain the data. This paper introduces a novel translation technique that parameterizes the translation process according to these attributes. Its formal specification is based on environment grammars, parameterized two-level grammars that lend themselves to the specification of classes of data languages with similar structures. We present a formal definition of environment grammars, discuss properties that permit their efficient parsing, and describe a data parsing method based on these properties. Examples illustrate the use of environment grammars for two different types of data languages. The viability of this parameterized technique has been demonstrated by an operational translation subsystem for data of heterogeneous relational database management systems.  相似文献   

16.
In this article, the first public release of GREAT as an open-source, statistical machine translation (SMT) software toolkit is described. GREAT is based on a bilingual language modelling approach for SMT, which is so far implemented for n-gram models based on the framework of stochastic finite-state transducers. The use of finite-state models is motivated by their simplicity, their versatility, and the fact that they present a lower computational cost, if compared with other more expressive models. Moreover, if translation is assumed to be a subsequential process, finite-state models are enough for modelling the existing relations between a source and a target language. GREAT includes some characteristics usually present in state-of-the-art SMT, such as phrase-based translation models or a log-linear framework for local features. Experimental results on a well-known corpus such as Europarl are reported in order to validate this software. A competitive translation quality is achieved, yet using both a lower number of model parameters and a lower response time than the widely-used, state-of-the-art SMT system Moses.  相似文献   

17.
Automatically generating program translators from source and target language specifications is a non-trivial problem. In this paper we focus on the problem of automating the process of building translators between operations languages, a family of DSLs used to program satellite operations procedures. We exploit their similarities to semi-automatically build transformation tools between these DSLs. The input to our method is a collection of annotated context-free grammars. To simplify the overall translation process even more, we also propose an intermediate representation common to all operations languages. Finally, we discuss how to enrich our annotated grammars model with more advanced semantic annotations to provide a verification system for the translation process. We validate our approach by semi-automatically deriving translators between some real world operations languages, using the prototype tool which we implemented for that purpose.  相似文献   

18.
基于语义模式分解的英语介词语义分析和汉译   总被引:1,自引:0,他引:1       下载免费PDF全文
在研究英语介词相关短语和句式的基础上,给出了语义模式的概念,构建了介词相关短语语义模式库、相关句式语义模式库、主虚量库和固定搭配知识库。根据介词相关短语语义模式特点,提出了一种基于语义模式分解的介词语义分析和汉译算法,结合相关句式语义模式库和固定搭配知识库,对英语介词进行语义分析和汉译。实验表明,利用该文方法能有效解决英语介词汉译的问题。  相似文献   

19.
音译是解决人名翻译的重要方法。在英汉人名音译问题中,翻译粒度问题一直是研究的重点之一。该文提出一种基于多粒度的英汉人名音译方法。将多种粒度的英文切分通过词图进行融合,并使用层次短语模型进行解码,从而缓解了由于切分错误而导致的音译错误,提高了系统的鲁棒性。实验结果表明基于多粒度的音译方法融合了基于各种粒度音译方法的优点,在准确率上提高了3.1%,在BLEU取得了2.2个点的显著提升。  相似文献   

20.
This paper describes an evolutionary approach to the problem of inferring stochastic context-free grammars from finite language samples. The approach employs a distributed, steady-state genetic algorithm, with a fitness function incorporating a prior over the space of possible grammars. Our choice of prior is designed to bias learning towards structurally simpler grammars. Solutions to the inference problem are evolved by optimizing the parameters of a covering grammar for a given language sample. Full details are given of our genetic algorithm (GA) and of our fitness function for grammars. We present the results of a number of experiments in learning grammars for a range of formal languages. Finally we compare the grammars induced using the GA-based approach with those found using the inside-outside algorithm. We find that our approach learns grammars that are both compact and fit the corpus data well.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号