Found 20 similar documents (search time: 0 ms)
1.
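An Example-Based Chinese-English Machine Translation Strategy  Cited by: 3 (self-citations: 0; other citations: 3)
This paper presents an example-based Chinese-English machine translation strategy, focusing on the design of the Chinese-English bilingual corpus and on the algorithm for matching Chinese sentences against that corpus. Given the characteristics of Chinese, matching is performed directly on Chinese characters, without word segmentation. Determining the boundaries of matched fragments is one of the difficulties of example-based machine translation, and the paper adopts a corresponding solution. The recombination of translated fragments into sentences is not studied in depth, because the strategy is intended for a multi-engine translation system, where it works alongside other translation strategies to improve the accuracy of the output. Example-based machine translation requires large bilingual corpora as the basis for translation, and building large corpora by hand is time-consuming and laborious, so the paper also attempts to construct the Chinese-English bilingual corpus automatically, including document-level alignment and word-level alignment.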
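The character-level matching described in this abstract can be illustrated with a small sketch. It is not the paper's algorithm: the example base `EXAMPLES`, the function name `best_match`, and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration. The point it shows is that sentences can be compared character by character, with no word segmentation, and that the matched fragments come with explicit boundaries.

```python
from difflib import SequenceMatcher

# Hypothetical toy example base: (Chinese sentence, English translation) pairs.
EXAMPLES = [
    ("我喜欢吃苹果", "I like eating apples"),
    ("他喜欢看电影", "He likes watching movies"),
    ("今天天气很好", "The weather is nice today"),
]


def best_match(sentence, examples=EXAMPLES):
    """Return (score, source, target, fragments) for the closest example.

    Matching works directly on characters, so no word segmentation is needed.
    Each fragment is (start, end, text), with boundaries in the input sentence.
    """
    best = None
    for src, tgt in examples:
        matcher = SequenceMatcher(None, sentence, src, autojunk=False)
        score = matcher.ratio()
        fragments = [
            (m.a, m.a + m.size, sentence[m.a:m.a + m.size])
            for m in matcher.get_matching_blocks()
            if m.size > 0  # skip the zero-length sentinel block
        ]
        if best is None or score > best[0]:
            best = (score, src, tgt, fragments)
    return best


score, src, tgt, fragments = best_match("我喜欢吃香蕉")
print(src, tgt, fragments)
```

A real system would still have to translate the unmatched remainder and assemble the output, which the abstract says is left to the other engines in the multi-engine setup.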
2.
This paper proposes a novel method for phrase-based statistical machine translation based on the use of a pivot language. To translate between languages $L_s$ and $L_t$ with limited bilingual resources, we bring in a third language, $L_p$, called the pivot language. For the language pairs $L_s$-$L_p$ and $L_p$-$L_t$, there exist large bilingual corpora. Using only $L_s$-$L_p$ and $L_p$-$L_t$ bilingual corpora, we can build a translation model for $L_s$-$L_t$. The advantage of this method lies in the fact that we can perform translation between $L_s$ and $L_t$ even if there is no bilingual corpus available for this language pair. Using BLEU as a metric, our pivot language approach significantly outperforms the standard model trained on a small bilingual corpus. Moreover, with a small $L_s$-$L_t$ bilingual corpus available, our method can further improve translation quality by using the additional $L_s$-$L_p$ and $L_p$-$L_t$ bilingual corpora.
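The abstract does not spell out how the two phrase tables are combined. One common way to build a source-to-target model from pivot resources, assumed here purely for illustration, is to marginalise over pivot phrases: p(t | s) = Σ_p p(p | s) · p(t | p). The function name `triangulate`, the table layout, and the toy probabilities below are hypothetical.

```python
from collections import defaultdict


def triangulate(src_to_pivot, pivot_to_tgt):
    """Combine two phrase tables into a source-to-target table.

    src_to_pivot[s][p] holds p(p | s); pivot_to_tgt[p][t] holds p(t | p).
    Returns table[s][t] = sum over p of p(p | s) * p(t | p).
    """
    table = defaultdict(lambda: defaultdict(float))
    for s, pivots in src_to_pivot.items():
        for p, prob_p_given_s in pivots.items():
            # Pivot phrases missing from the second table contribute nothing.
            for t, prob_t_given_p in pivot_to_tgt.get(p, {}).items():
                table[s][t] += prob_p_given_s * prob_t_given_p
    return {s: dict(targets) for s, targets in table.items()}


# Toy data (made up): French source, English pivot, German target.
src_to_pivot = {"maison": {"house": 0.8, "home": 0.2}}
pivot_to_tgt = {
    "house": {"Haus": 0.9, "Gebäude": 0.1},
    "home": {"Haus": 0.5, "Heim": 0.5},
}
print(triangulate(src_to_pivot, pivot_to_tgt))
```

Note that no $L_s$-$L_t$ data is used anywhere in this computation, which mirrors the abstract's claim that translation is possible without a direct bilingual corpus.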
3.
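Example-Based Resolution of Ambiguous Structures in Chinese-English Machine Translation  Cited by: 1 (self-citations: 0; other citations: 1)
Ambiguity is a prominent and pervasive feature of natural language, Chinese in particular, and it is one of the main processing difficulties for current Chinese-English machine translation systems. Based on an analysis of some common ambiguous structures in Chinese, the paper proposes an example-based method for resolving them. Because the examples associated with an ambiguous structure are highly representative "structurally", comparing a segment with these examples for similarity makes it possible to determine the internal structure of the segment to be disambiguated fairly accurately.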
4.
Gregor Thurmair 《Computers and the Humanities》1991,25(2-3):115-128
This paper describes developments in the area of machine translation (MT). First, the paper gives an overview of developments in Germany in general; then, special problems are discussed. The system taken as an example is METAL (Machine Translation and Analysis of Natural Language), where recent development work has centered around two main topics. (i) Efforts have been made to make the system really multilingual. The German-to-English prototype had to be expanded, some system components had to be readjusted, and additional problems had to be solved. Currently, analysis and synthesis components for German, English, French, Spanish, and Dutch are under development. All these languages use a common system kernel and a standard interface structure. (ii) The system had to be made user-friendly. This was an even more important task as, up to now, MT systems have not been well accepted by users. METAL tries to be more realistic, and also tries to support the main user interfaces in a much better way than has been done before. This is based on the conviction that there are several parameters which determine the real success of an MT system. It is not just translation quality which is decisive, it is also the integration of an MT system into the whole process of preparing and translating documents.
Gregor Thurmair is head of the Linguistics Department at Siemens Nixdorf Information Systems and project leader of the machine translation group, METAL. He is involved in projects in information retrieval (morphological analysis), speech understanding (parsing, semantics) and machine translation (METAL system). He has presented papers on morphology, semantics in speech understanding, transfer problems in MT, and grammar checking.
5.
Virtual manufacturing systems provide a useful means for products to be manufactured ‘right the first time’ without the need for physical testing on the shop floor. Earlier research was mostly on developing a virtual manufacturing environment. Over the years, simple graphical prediction and simulation gave way to complex multi-science predictions. Virtual systems such as Virtual Machine Tool, Virtual Machining, Virtual Assembly, Virtual Tooling and Virtual Prototype have been developed to support virtual manufacturing. Different systems and approaches have different targeted applications. This paper aims to provide a comprehensive review of existing virtual systems. Their focuses and approaches (i.e. virtual reality, Web-based techniques, mathematical modelling, hardware interactions and STEP-NC-based methodologies) are discussed in detail. To better understand the systems, we have categorized them into different groups according to their application domains. Discussions and concluding remarks are given based on the review.
6.
Leif Ibsen 《Software》1984,14(1):17-29
A portable compiler can be constructed by letting it generate code for a virtual machine, which is then implemented on the real target machines. The design of a virtual machine which is especially suitable as a target machine for compiled Ada programs is described. The main design goals, implementability on mini-computers and portability, are discussed and the resulting design is described in some detail. Some implementation strategies for the machine are proposed and the feasibility of the virtual machine approach is discussed.
7.
8.
We design a task mapper TPCM for assigning tasks to virtual machines, and an application-aware virtual machine scheduler TPCS oriented for parallel computing to achieve a high performance in virtual computing systems. To solve the problem of mapping tasks to virtual machines, a virtual machine mapping algorithm (VMMA) in TPCM is presented to achieve load balance in a cluster. Based on such mapping results, TPCS is constructed including three components: a middleware supporting an application-driven scheduling, a device driver in the guest OS kernel, and a virtual machine scheduling algorithm. These components are implemented in the user space, guest OS, and the CPU virtualization subsystem of the Xen hypervisor, respectively. In TPCS, the progress statuses of tasks are transmitted to the underlying kernel from the user space, thus enabling virtual machine scheduling policy to schedule based on the progress of tasks. This policy aims to exchange completion time of tasks for resource utilization. Experimental results show that TPCM can mine the parallelism among tasks to implement the mapping from tasks to virtual machines based on the relations among subtasks. The TPCS scheduler can complete the tasks in a shorter time than can Credit and other schedulers, because it uses task progress to ensure that the tasks in virtual machines complete simultaneously, thereby reducing the time spent in pending, synchronization, communication, and switching. Therefore, parallel tasks can collaborate with each other to achieve higher resource utilization and lower overheads. We conclude that the TPCS scheduler can overcome the shortcomings of present algorithms in perceiving the progress of tasks, making it better than schedulers currently used in parallel computing.
9.
10.
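Phrase alignment is a core module that determines the quality of a statistical machine translation system. This paper proposes a hierarchical phrase model based on phrase structure trees, which extends the hierarchical phrase model using the idea of the string-to-tree model. The model extracts translation rules by combining English phrase structure trees with aligned bilingual phrases, and uses heuristic strategies to obtain extended syntactic labels for the translation rules. A statistical machine translation system that uses these translation rules produces stable translation results across different data sets, and its average BLEU score on the training and test sets is higher than the BLEU scores of the phrase-based model and the hierarchical phrase-based model.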
11.
Sergei Nirenburg 《Machine Translation》1989,4(1):5-24
This paper provides an overview of the KBMT-89 project at Carnegie Mellon University's Center for Machine Translation, and therefore also of the special issue of this journal, which reports on the project. The knowledge-based approach to machine translation is presented and defended in a historical context. Various components of the system, key parts of which are described in subsequent papers of the issue, are introduced and paired with their computational motivations.
12.
Constructive machine translation evaluation  Cited by: 1 (self-citations: 0; other citations: 1)
Stephen Minnis 《Machine Translation》1993,8(1-2):67-75
When surveying the many methods currently employed in MT evaluation, it is not immediately obvious that the methods used serve to increase the knowledge of the properties being measured. This report describes a constructive machine translation evaluation method, aimed at addressing this issue.
Edited version of a presentation given to the International Working Group on the Evaluation of Machine Translation Systems, Vaud, Switzerland, April 1991.
13.
14.
This paper proposes a virtual operating system for operation training on manufacturing facilities and for manufacturing process simulation. The system is based on VRML and a browser/server structure, so the user only needs to install a free plug-in and run the package via Microsoft Internet Explorer. The paper first studies the system framework, structure models and concept models. It then presents a communication approach based on VRML, Java and HTML, which is key to realizing the virtual operation of CNC machines. An algorithm for material removal simulation based on a VRML Z-map is also presented; its advantages include a lower memory requirement and a faster computation speed. Finally, to validate the feasibility of the proposed approach, a CNC milling machine is taken as an illustrative example for the prototype development.
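The Z-map idea mentioned in this abstract can be sketched in a few lines. A Z-map stores the workpiece as a grid of heights, and a cut lowers every cell under the tool to the tool's bottom height. The sketch below is plain Python rather than VRML, assumes a flat-end cylindrical tool, and uses hypothetical names (`make_zmap`, `cut`); it is not the paper's implementation.

```python
def make_zmap(nx, ny, stock_height):
    """Create a Z-map: a ny-by-nx grid holding the stock height of each cell."""
    return [[stock_height] * nx for _ in range(ny)]


def cut(zmap, cx, cy, tool_radius, tool_z, cell=1.0):
    """Apply one flat-end tool position centred at (cx, cy) with bottom at tool_z.

    Every cell whose centre lies within the tool radius and is higher than
    tool_z is lowered to tool_z. Returns the volume of material removed.
    """
    removed = 0.0
    for j, row in enumerate(zmap):
        for i, height in enumerate(row):
            x, y = i * cell, j * cell
            inside = (x - cx) ** 2 + (y - cy) ** 2 <= tool_radius ** 2
            if inside and height > tool_z:
                removed += (height - tool_z) * cell * cell
                row[i] = tool_z
    return removed


zmap = make_zmap(5, 5, 10.0)
volume = cut(zmap, cx=2.0, cy=2.0, tool_radius=1.0, tool_z=8.0)
print(volume, zmap[2])
```

Because each cell stores a single number, memory grows with the grid resolution rather than with the complexity of the machined shape, which is consistent with the low memory requirement the abstract mentions.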
15.
16.
We propose a novel approach to cross-lingual language model and translation lexicon adaptation for statistical machine translation
(SMT) based on bilingual latent semantic analysis. Bilingual LSA enables latent topic distributions to be efficiently transferred
across languages by enforcing a one-to-one topic correspondence during training. Using the proposed bilingual LSA framework,
model adaptation can be performed by, first, inferring the topic posterior distribution of the source text and then applying
the inferred distribution to an n-gram language model of the target language and translation lexicon via marginal adaptation. The background phrase table is
enhanced with the additional phrase scores computed using the adapted translation lexicon. The proposed framework also features
rapid bootstrapping of LSA models for new languages based on a source LSA model of another language. Our approach is evaluated
on the Chinese–English MT06 test set using the medium-scale SMT system and the GALE SMT system measured in BLEU and NIST scores.
Improvement in both scores is observed on both systems when the adapted language model and the adapted translation lexicon
are applied individually. When the adapted language model and the adapted translation lexicon are applied simultaneously,
the gain is additive. At the 95% confidence interval of the unadapted baseline system, the gain in both scores is statistically
significant using the medium-scale SMT system, while the gain in the NIST score is statistically significant using the GALE
SMT system.
17.
John Hutchins 《Machine Translation》2005,19(3-4):197-211
In the last decade the dominant models of MT have been data-driven or corpus-based. Of the two main trends, statistical machine
translation and example-based machine translation (EBMT), the latter is much less clearly defined. In a review of the recently
published collection edited by Michael Carl and Andy Way, this essay surveys the basic processes, methods, main problems and
tasks of EBMT, and attempts to provide a definition of the essence of EBMT in comparison with statistical MT and traditional
rule-based MT.
Recent Advances in Example-based Machine Translation. Edited by Michael Carl and Andy Way. Dordrecht: Kluwer Academic Publishers, 2003. xxxi, 482pp. (Text, Speech and Language
Technology, vol. 21) ISBN: 1-4020-1400-7 (hardback), 1-4020-1401-5 (paperback).
18.
19.
Statistical machine translation systems are usually trained on large amounts of bilingual text (used to learn a translation
model), and also large amounts of monolingual text in the target language (used to train a language model). In this article
we explore the use of semi-supervised model adaptation methods for the effective use of monolingual data from the source language
in order to improve translation quality. We propose several algorithms with this aim, and present the strengths and weaknesses
of each one. We present detailed experimental evaluations on the French–English EuroParl data set and on data from the NIST
Chinese–English large-data track. We show a significant improvement in translation quality on both tasks.
20.
Word reordering is one of the challenging problems of machine translation. It is an important factor in the quality and efficiency of machine translation systems. In this paper, we introduce a novel reordering model based on an innovative structure named the phrasal dependency tree. The phrasal dependency tree is a modern syntactic structure which is based on dependency relationships between contiguous non-syntactic phrases. The proposed model integrates syntactic and statistical information in the context of a log-linear model aimed at dealing with reordering problems. It benefits from phrase dependencies, translation directions (orientations) and translation discontinuity between translated phrases. In comparison with well-known and popular reordering models such as distortion, lexicalised and hierarchical models, the experimental study demonstrates the superiority of our model in terms of translation quality. Performance is evaluated for Persian → English and English → German translation tasks using Tehran parallel corpus and WMT07 benchmarks, respectively. The results report 1.54/1.7 and 1.98/3.01 point improvements over the baseline in terms of BLEU/TER metrics on Persian → English and German → English translation tasks, respectively. On average our model had a significant impact on precision, with recall comparable to that of the lexicalised and distortion models.