1.
The development of a machine translation system is one of the most difficult computational tasks. Without a deep semantic analysis of both source and target languages, a machine translation system cannot generate good results. This paper describes a machine translation system based on a new method called the Integral Method, in which semantic analysis using an active dictionary plays a very important role.
2.
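Unsupervised neural machine translation trains a model using only large amounts of monolingual data, with no parallel data, but it struggles to establish links between two languages from distant language families. To address this problem, a new neural machine translation training method that uses no parallel sentence pairs is proposed. A bilingual dictionary is used to replace words in the monolingual data, establishing a link between the two languages, while two techniques, word-embedding fusion initialization and dual-encoder fusion training, strengthen the alignment of the two languages in a shared semantic space and thereby improve the performance of the translation system. Experiments show that the proposed method improves BLEU over a baseline unsupervised translation system by 2.39 and 1.29 on Chinese-English and English-Chinese, respectively, and that translation quality also improves markedly in monolingual-data experiments such as English-Russian and English-Arabic.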
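The dictionary-replacement step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `replace_with_dictionary`, the `rate` parameter, and the toy dictionary are all assumptions made for illustration, and the abstract does not say how many words are replaced or how a translation is chosen.

```python
import random

def replace_with_dictionary(tokens, bilingual_dict, rate=0.5, seed=0):
    """Swap some tokens for a dictionary translation (hypothetical helper).

    tokens         -- list of source-language tokens
    bilingual_dict -- maps a source token to a list of target translations
    rate           -- assumed probability of replacing a token that has an entry
    """
    rng = random.Random(seed)
    mixed = []
    for token in tokens:
        translations = bilingual_dict.get(token)
        if translations and rng.random() < rate:
            mixed.append(rng.choice(translations))
        else:
            mixed.append(token)
    return mixed

# Toy data, invented for this example.
toy_dict = {"cat": ["猫"], "milk": ["牛奶"]}
sentence = "the cat drinks milk".split()

mixed_all = replace_with_dictionary(sentence, toy_dict, rate=1.0)
mixed_none = replace_with_dictionary(sentence, toy_dict, rate=0.0)
print(mixed_all)
```

With `rate=1.0` every token that has a dictionary entry is replaced, which gives a mixed-language sentence; with `rate=0.0` the sentence is returned unchanged.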
3.
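Machine-readable dictionaries contain very rich lexical-semantic knowledge, which can be extracted automatically and used effectively in a wide range of natural language processing research. This study proposes a method that takes the English WordNet as its basic skeleton and combines two techniques, matching genus terms and matching definition content, to map WordNet synsets to entries in the Longman English-Chinese bilingual dictionary (朗文当代英汉双语词典). Through this mapping, WordNet synsets are labeled with Chinese translations. In the experiments, words are divided by degree of ambiguity into a monosemous group and a polysemous group. For monosemous words, at 100% coverage, the method achieves a precision of 97.7%. For polysemous words, it achieves a precision of 85.4% at a coverage of 63.4%.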
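As a rough illustration of the "matching definition content" idea, the sketch below scores each candidate dictionary sense by the word overlap between its definition and a synset gloss. The Jaccard measure, the stopword list, the threshold, and all names are assumptions. The abstract does not specify how definitions are compared, and the genus-term matching step is not shown.

```python
STOPWORDS = frozenset({"a", "an", "the", "of", "or", "and", "that", "is", "to", "in", "for"})

def content_words(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    cleaned = text.lower().replace(",", " ").replace(";", " ")
    return {word for word in cleaned.split() if word not in STOPWORDS}

def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def best_entry(synset_gloss, dictionary_senses, threshold=0.2):
    """Return the id of the best-overlapping sense, or None if below threshold."""
    gloss_words = content_words(synset_gloss)
    scored = [
        (jaccard(gloss_words, content_words(definition)), sense_id)
        for sense_id, definition in dictionary_senses.items()
    ]
    if not scored:
        return None
    score, sense_id = max(scored)
    return sense_id if score >= threshold else None

# Toy senses, invented for this example.
senses = {
    "bank_1": "an institution that accepts deposits and lends money",
    "bank_2": "land along the side of a river",
}
match = best_entry("a financial institution that accepts deposits", senses)
print(match)
```

Returning `None` below the threshold reflects the trade-off the abstract reports for polysemous words: refusing uncertain mappings lowers coverage in exchange for higher precision.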
4.
《International journal of man-machine studies》1993,38(2):313-330
Practical natural language processing (NLP) systems such as database front-ends, deductive databases or object-oriented databases are at the forefront of research into the next-generation intelligent database systems. The research described in this paper has been aimed at integrating front-end paradigms and rule-based deduction to provide a single powerful framework for database systems in Arabic. The lexicon stores only roots of verbs and uses a program intelligent enough to handle all derived forms automatically. This is significant, as these alone represent 70% of the total dictionary. As part of the discussion of this system, its utility in such NLP applications as parsing and machine translation is examined.
5.
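This paper studies data generalization and phrase generation methods for neural machine translation. Building on subword-based methods for alleviating the out-of-vocabulary and rare-word problems, it proposes using data generalization to further improve the translation of out-of-vocabulary and rare words, which alleviates the mistranslations that arise with subword methods. The paper gives a detailed experimental comparison of the subword-based and data-generalization-based methods and discusses the strengths and weaknesses of each. For the data generalization process, it proposes a consistency check method and a decoding optimization method. Because standard neural machine translation models translation at the word level, the paper also proposes a phrase generation method whose scale is controllable; using source-language phrases generated by this method further improves translation performance. Overall, on Chinese-English and English-Chinese translation tasks, translation performance improves by 1.3 and 1.2 BLEU, respectively, over the baseline system.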
6.
Information on subcategorization and selectional restrictions in a valency dictionary is important for natural language processing
tasks such as monolingual parsing, accurate rule-based machine translation and automatic summarization. In this paper we present
an efficient method of assigning valency information and selectional restrictions to entries in a bilingual dictionary, based
on information in an existing valency dictionary. The method is based on two assumptions: words with similar meaning have
similar subcategorization frames and selectional restrictions; and words with the same translations have similar meanings.
Based on these assumptions, new valency entries are constructed for words in a plain bilingual dictionary, using entries with
similar source-language meaning and the same target-language translations. We evaluate the effects of various measures of
semantic similarity.
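The second assumption in this abstract (words with the same translations have similar meanings) can be sketched as follows. This is only an illustration: the data layout, the frame notation, and the function `propose_frames` are hypothetical, and the semantic-similarity measures that the paper evaluates are left out entirely.

```python
def propose_frames(new_word, bilingual, valency):
    """Suggest valency frames for new_word from known words sharing a translation.

    bilingual -- maps a source word to a set of target-language translations
    valency   -- maps a source word to its valency frame (existing dictionary)
    Returns (known_word, frame, shared_translation_count) tuples, best first.
    """
    targets = bilingual.get(new_word, set())
    proposals = []
    for known_word, frame in valency.items():
        shared = targets & bilingual.get(known_word, set())
        if shared:
            proposals.append((known_word, frame, len(shared)))
    # More shared translations first; ties broken alphabetically.
    proposals.sort(key=lambda p: (-p[2], p[0]))
    return proposals

# Toy dictionaries, invented for this example.
bilingual = {
    "devour": {"taberu"},
    "eat": {"taberu", "kuu"},
    "run": {"hashiru"},
}
valency = {
    "eat": "N1[animate] V N2[food]",
    "run": "N1[animate] V",
}
proposals = propose_frames("devour", bilingual, valency)
print(proposals)
```

Here "devour" has no valency entry of its own, so it borrows the frame of "eat", the only known word with which it shares a translation.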
7.
Ajantha Herath Yasuaki Hyodo Y. Kunieda Takashi Ikeda Susantha Herath 《Information Sciences》1996,90(1-4):303-319
This paper presents the design and implementation techniques employed in a Japanese-to-Sinhalese machine translation system. The main result of this work is the successful application of Bunsetsu in generating meaningful translations for a flexible-grammar language. The system has been developed considering the similarities between Japanese Bunsetsu and Sinhalese units. Such efforts are being focused on determining the minimum reasonable grammatical knowledge necessary for machine translation. The principal characteristics of the system, the translation process, problems encountered during the development stages, present status, and future plans will be discussed.
8.
9.
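Semantics is currently a hot topic in research areas such as artificial intelligence, the Semantic Web, and semantic dictionaries, and it can effectively support technologies such as machine translation and natural language processing. Based on the distinctive grammatical properties of Tibetan, and drawing on Tibetan logical cases and computational linguistics, this article builds a fairly complete semantic field for Tibetan semantic relation extraction while preserving the original characteristics of Tibetan, thereby providing a foundational method for constructing a Tibetan semantic dictionary.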
10.
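This paper proposes using object-oriented theory to build base classes for machine translation dictionaries, successfully managing the electronic dictionaries of the various subject areas in machine translation with a single general pattern. The new method substantially improves the reliability, maintainability, and reusability of the machine translation system, and it has been applied successfully in the NHWIN Chinese-Japanese/Japanese-Chinese machine translation system.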
11.
Robert Krajewski Henryk Rybinski Marek Kozlowski 《Journal of Intelligent Information Systems》2016,47(3):491-514
The paper addresses the problem of automatic dictionary translation. The proposed method translates a dictionary by means of mining repositories in the source and target languages, without any directly given relationships connecting the two languages. It consists of two stages: (1) translation by lexical similarity, where words are compared graphically, and (2) translation by semantic similarity, where contexts are compared. In the experiments, the Polish and English versions of Wikipedia were used as text corpora. The method and its phases are thoroughly analyzed. The results allow implementing this method in human-in-the-middle systems.
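Stage (1), translation by lexical similarity, can be sketched with a generic string-similarity score. The abstract does not say which graphical comparison the authors use, so the choice of Python's `difflib.SequenceMatcher`, the threshold, and the function names below are assumptions. Stage (2), context comparison, is not shown.

```python
from difflib import SequenceMatcher

def lexical_similarity(a, b):
    """Similarity in [0, 1] based on matching character blocks."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def translate_by_lexical_similarity(source_word, target_vocab, threshold=0.8):
    """Return the most similar target word, or None if nothing is close enough."""
    best = max(
        target_vocab,
        key=lambda word: lexical_similarity(source_word, word),
        default=None,
    )
    if best is not None and lexical_similarity(source_word, best) >= threshold:
        return best
    return None

# Toy vocabulary, invented for this example (Polish source, English targets).
english_vocab = ["computer", "commuter", "compute"]
candidate = translate_by_lexical_similarity("komputer", english_vocab)
no_candidate = translate_by_lexical_similarity("pies", ["dog", "cat"])
print(candidate, no_candidate)
```

A purely graphical comparison only helps for cognates and loanwords such as "komputer"; a word like "pies" (Polish for "dog") has no similar-looking English counterpart, which is why a second, context-based stage is needed.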
12.
Capturing the underlying semantic relationships of sentences is helpful for machine translation. Variational neural machine translation approaches provide an effective way to model the uncertain underlying semantics in languages by introducing latent variables. Multitask learning is applied in multimodal machine translation to integrate multimodal data. However, these approaches usually lack a strong interpretation in utilizing out-of-text information in machine translation tasks. In this paper, we propose a novel architecture-free multimodal translation model, called variational multimodal machine translation (VMMT), under the variational framework which can model the uncertainty in languages caused by ambiguity through utilizing visual and textual information. In addition, the proposed model can eliminate the discrepancy between training and prediction in the existing variational translation models by constructing encoders only relying on source data. More importantly, the proposed multimodal translation model is designed as multitask learning in which the shared semantic representation for different modes is learned and the gap among semantic representation from various modes is reduced by incorporating additional constraints. Moreover, the information bottleneck theory is adopted in our variational encoder–decoder model, which helps the encoder to filter redundancy and the decoder to concentrate on useful information. Experiments on multimodal machine translation demonstrate that the proposed model is competitive.
13.
Higuchi T. Handa K. Takahashi N. Furuya T. Iida H. Sumita E. Oi O. Kitano H. 《Computer》1994,27(11):53-63
Describes the IXM2 associative processor and its main application in speech-to-speech translation. The IXM2 is a semantic memory system machine that began as a faithful implementation of the NETL semantic network machine and grew into a massively parallel SIMD machine that has demonstrated the power of large associative memories. Such processors can support robust performance in speech applications. In fact, the IXM2 with 73 transputers has outperformed a Cray in some language-translation tasks. We selected speech-to-speech translation as our main application because it is one of the grand challenges of massively parallel artificial intelligence. The social implications of successful automatic translation are enormous: for example, people who speak different languages could communicate in real time by using interpreting telephony.
14.
In this paper, we address the demanding task of developing intelligent systems equipped with machine creativity that can perform design tasks automatically. The main challenge is how to model human beings' creativity mathematically and mimic such creativity computationally. We propose a "synthesis reasoning model" as the underlying mechanism to simulate human beings' creative thinking when they are handling design tasks. We present the theory of the synthesis reasoning model, and the detailed procedure of designing an intelligent system based on the model. We offer a case study of an intelligent Chinese calligraphy generation system which we have developed. Based on implementation experiences of the calligraphy generation system as well as a few other systems for solving real-world problems, we suggest a generic methodology for constructing intelligent systems using the synthesis reasoning model.
15.
Fanchao QI Ruobing XIE Yuan ZANG Zhiyuan LIU Maosong SUN 《Frontiers of Computer Science》2021,15(5):155327
A sememe is defined as the minimum semantic unit of languages in linguistics. Sememe knowledge bases are built by manually annotating sememes for words and phrases. HowNet is the most well-known sememe knowledge base. It has been extensively utilized in many natural language processing tasks in the era of statistical natural language processing and proven to be effective and helpful to understanding and using languages. In the era of deep learning, although data are thought to be of vital importance, there are some studies working on incorporating sememe knowledge bases like HowNet into neural network models to enhance system performance. Some successful attempts have been made in the tasks including word representation learning, language modeling, semantic composition, etc. In addition, considering the high cost of manual annotation and update for sememe knowledge bases, some work has tried to use machine learning methods to automatically predict sememes for words and phrases to expand sememe knowledge bases. Besides, some studies try to extend HowNet to other languages by automatically predicting sememes for words and phrases in a new language. In this paper, we summarize recent studies on application and expansion of sememe knowledge bases and point out some future directions of research on sememes.
16.
17.
18.
《Future Generation Computer Systems》1986,2(2):77-82
A survey of the current machine translation systems is given, which includes not only activities in Japan, but also abroad, especially European, US and Canadian activities. Then the components of a machine translation system are explained from the standpoint of software, linguistic components, and users' demands. The importance of pre-editing and post-editing is stressed. The semantic and contextual processings are essential to obtain a better translation quality, which are the future problems to attack. Attention is given to the difficulty of contemplating a pivot method in machine translation instead of transfer methods, because the projection from a word or a phrase to a concept is very difficult if we want to have a very exact concept representation and translation. A new transfer method which accompanies the pre-transfer structural adjustment and post-transfer adjustment is explained. This method was adopted by the Japanese governmental project of machine translation which was directed by the author. Various mechanisms of structural transformations in the transfer and generation processes are explained, which are necessitated by the language translation between the two languages of different language families like Japanese and English. Finally some comments are given from the standpoint of users of machine translation systems. Systems always are imperfect, and users must use them after recognizing the possibilities and the limitations of the system.
19.
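As a basic language database and knowledge base, a corpus is the foundation on which natural language processing methods are implemented. With the wide application of statistical methods in natural language processing, corpus construction has become an important research topic. Automatic word segmentation is an indispensable foundational task for syntactic analysis, and its performance directly affects that analysis. Based on a statistical analysis of an 850,000-byte Tibetan corpus and a study of the distribution and grammatical functions of Tibetan words, this paper introduces the model of a dictionary-based automatic word segmentation system for Tibetan and presents the structure of the segmentation dictionary, the case-based chunking algorithm, and the restoration algorithm. The system lays a foundation for research on Tibetan input methods, the construction of Tibetan electronic dictionaries, Tibetan character and word frequency statistics, the design and implementation of search engines, the development of machine translation systems, network information security, Tibetan corpus construction, and Tibetan semantic analysis.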
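Dictionary-based segmentation in general can be illustrated with forward maximum matching, a standard baseline technique. This is not the paper's case-based chunking or restoration algorithm, whose details the abstract does not give. The function name, the toy lexicon, and the use of Chinese rather than Tibetan example text are all assumptions.

```python
def forward_max_match(text, lexicon, max_len=None):
    """Greedy left-to-right segmentation using the longest dictionary match.

    Characters not covered by any dictionary word are emitted one at a time.
    """
    if max_len is None:
        max_len = max(1, max((len(word) for word in lexicon), default=1))
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens

# Toy lexicon, invented for this example.
lexicon = {"机器", "翻译", "机器翻译", "系统"}
segmented = forward_max_match("机器翻译系统", lexicon)
with_unknown = forward_max_match("新机器", lexicon)
print(segmented, with_unknown)
```

Preferring the longest match keeps "机器翻译" (machine translation) as one token instead of splitting it into "机器" and "翻译". Language-specific processing, such as the case-marker handling the paper describes for Tibetan, would be layered on top of a lookup of this kind.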
20.
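Chinese and Uyghur differ considerably in syntactic structure and word order. For a complete Chinese-Uyghur machine translation system, it is necessary to analyze the source language and to express tense and voice accurately in the target language. To address the accuracy and word-order problems caused by the limited syntactic and semantic information contained in statistical machine translation models, this work establishes a set of transformation and matching rules, with the aim of using them in a hybrid machine translation approach to improve the performance of the translation system.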